Feature parity with Wooldridge and other packages #6

Open · 2 of 6 tasks · azev77 opened this issue Mar 30, 2021 · 8 comments

azev77 commented Mar 30, 2021

Wooldridge tweeted that every econometrics package should come with the following options for standard errors for virtually every command:

  • Robust to allow complete model misspecification of the modeled parts.
  • HAC to allow for unmodeled time series correlation.
  • Cluster to allow serial correlation in panels and cluster sampling/clustered assignment generally.
  • Driscoll-Kraay to allow unmodeled serial correlation in large-T panels as well as cross-sectional correlation.
    @PaulSoderlind has https://github.com/PaulSoderlind/PaulSoderlindCode/tree/master/DriscollKraay
  • Spatial HAC to allow unmodeled spatial correlation in the cross-sectional dimension (requires a cross-sectional ordering of the data).
  • Spatial HAC with clustering to allow for any kind of serial correlation and cross-sectional spatial correlation (small T).

"In Stata, I'd like to type things like"
glm y x1 ... xK, fam(logit) vce(hac nw 4)
xtpoisson y x1 ... xK, fe vce(shac lat(Lx) long(Ly) dist(50))
qreg y x1 ... xK, q(.25) vce(dk 4)
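
To make the Driscoll-Kraay bullet above concrete: a minimal Julia sketch, assuming a balanced panel with integer time index t, OLS residuals e, and a Bartlett kernel. The function name, arguments, and scaling are illustrative, not taken from Paul's repo.

using LinearAlgebra

# Driscoll-Kraay: HAC applied to the cross-sectional sums of the scores
function vcov_driscoll_kraay(X, e, t; nlags = 2)
    T = maximum(t)
    h = [vec(sum(X[t .== s, :] .* e[t .== s], dims = 1)) for s in 1:T]  # h_t = Σ_i x_it * e_it
    S = sum(h[s] * h[s]' for s in 1:T)
    for j in 1:nlags
        w = 1 - j / (nlags + 1)                      # Bartlett weight
        G = sum(h[s] * h[s-j]' for s in (j+1):T)     # lag-j autocovariance of h_t
        S += w * (G + G')
    end
    bread = inv(X'X)
    return bread * S * bread                         # (X'X)^-1 * S * (X'X)^-1
end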

azev77 commented Mar 30, 2021

I myself have written some (unoptimized) code too:

using LinearAlgebra, BenchmarkTools, DataFrames;
x =[ones(506) randn(506,3)]; n,p =size(x);
y = x*[1;2;3;4] + randn(506);

b_ls = x\y; y_ls = x*b_ls; e_ls = y-y_ls; s2_ls = ((e_ls'e_ls)/(n-p))[1];

# Sandwich: v(Ω; x=x) = (x'x)^-1 * (x'Ω*x) * (x'x)^-1
v(Ω; x=x) = (x'x) \ (x'Ω*x) / (x'x)

# Newey West Lags 
relu(x) = max(x,0)
NW(nlags,n) = [relu(1 - abs(j-i)/(nlags+1)) for i=1:n, j=1:n]

# SE: 12 methods
function SE(x, y, n, p, e_ls, s2_ls)
    H = x/(x'x) * x'     #H = x*inv(x'x)*x'     ŷ = Hy
    M = I - H            #                      ε̂ = My
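    # δ, δm, α below are leverage-based exponents for HC4, HC4m, HC5; e_ti = e_i/(1-h_i) feeds the jackknife HCJ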
    δ = min.(4,n/p .* diag(H) )
    δm = min.(1,n/p .* diag(H) ) + min.(1.5,n/p  .* diag(H) )
    mx = max(n*0.7*maximum(diag(H))/p, 4)
    α  = min.(mx, n/p .* diag(H))
    e_ti = e_ls ./ (diag(I-H))
    #
    HS   = v( Diagonal(fill(s2_ls,n)) )                      #Spherical
    HC0  = v( Diagonal(e_ls*e_ls') * 1.00 )
    HC1  = v( Diagonal(e_ls*e_ls') * (n/(n-p)) )
    HC2  = v( Diagonal(e_ls*e_ls') / (Diagonal(I-H)) )
    HC3  = v( Diagonal(e_ls*e_ls') / ((Diagonal(I-H))^2 ) )
    HC4  = v( Diagonal(e_ls*e_ls') / ((Diagonal(I-H)) .^ δ ) )
    HC4m = v( Diagonal(e_ls*e_ls') / ((Diagonal(I-H)) .^ δm ) )
    HC5  = v( Diagonal(e_ls*e_ls') / sqrt.((Diagonal(I-H)) .^ α ) )
    HCJ  = v( ((n-1)/n) * (Diagonal(e_ti*e_ti') - (1/n)*e_ti*e_ti') )
    # CRVE example w/ 3 clusters
    ix =[1:200 , 201:400 , 401:506]
    blocks = [e_ls[ix[1]]*e_ls[ix[1]]', e_ls[ix[2]]*e_ls[ix[2]]', e_ls[ix[3]]*e_ls[ix[3]]']
    #D      = Diagonal(blocks)
    G = size(ix,1);
    HLZ = v( cat(blocks...,dims=(1,2)) * ((G)/(G-1)) * ((n-1)/(n-p)) )
    #HBC = v( M^(-1/2) * cat(blocks...,dims=(1,2)) * M^(-1/2) )
    HBC = v( real(sqrt(Hermitian(M^-1))) * cat(blocks...,dims=(1,2)) * real(sqrt(Hermitian(M^-1))) )
    # Multi-way clustering
    # NW
    lags = Int(max(round(n^0.25),1.0))
    HNW = v( (e_ls*e_ls') .* NW(lags,n) )
    #
    # Spatially Correlated SE: Conley, Driscoll-Kraay
    # Bootstrap: Wildclustered Bootstrap
    #
    #return HS, HC0, HC1, HC2, HC3, HC4, HC4m, HC5, HCJ, HLZ, HNW
    cov = [HS, HC0, HC1, HC2, HC3, HC4, HC4m, HC5, HCJ, HLZ, HBC, HNW]
    se = [sqrt.(diag(x)) for x in cov]
    #
    se = reshape(vcat(se...), 4, 12)
    se = DataFrame(round.(se, sigdigits=3), :auto)
    rename!(se, [:HS, :HC0, :HC1, :HC2, :HC3, :HC4, :HC4m, :HC5, :HCJ, :HLZ, :HBC, :HNW])
    insertcols!(se, 1, :coef => ["β0","β1","β2","β3"])
    return se
end

se1 = SE(x, y, n, p, e_ls, s2_ls)

Here is a screenshot with part of the table:
[screenshot of the standard-error table omitted]

@PaulSoderlind

I have several of those, but in somewhat crude versions. In particular, the combination of autocorrelation and (some kind of) clustering tends to make things messy. I have the code, but it's not pretty.

Still, if anyone takes the lead on creating a common approach, I am happy to contribute.

/Paul S

@PaulSoderlind

Another thought/observation: things like
HC0 = v( Diagonal(e_ls*e_ls') * 1.00 )
work on balanced panels.

It becomes substantially messier with unbalanced panels. To keep the matrix notation, a set of interactive "data exists" dummies can be used (see Hayashi's text). Alternatively, create a loop.
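
For illustration, a minimal sketch of that loop alternative: accumulate the HC0 meat X'ΩX observation by observation, so that missing (i, t) cells in an unbalanced panel are simply skipped. All names here are illustrative.

using LinearAlgebra

function meat_hc0_unbalanced(X, e, observed)
    p = size(X, 2)
    S = zeros(p, p)
    for i in eachindex(observed)
        observed[i] || continue        # "data exists" dummy acts as a skip
        xi = view(X, i, :)
        S .+= e[i]^2 .* (xi * xi')     # x_i * e_i^2 * x_i'
    end
    return S                           # plug into (X'X)^-1 * S * (X'X)^-1
end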

matthieugomez commented Apr 1, 2021

@azev77 Yes it'd be great to have more types of standard errors!
@PaulSoderlind Yes, being able to write these loops without being slower is definitely one of the advantages of Julia. This is already the case for clustered standard errors, see https://github.com/matthieugomez/Vcov.jl/blob/b324f45b788b06e46301963fadb9da8ceaceb150/src/covarianceestimators/vcovcluster.jl#L172
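
For context, a stripped-down sketch of what such a cluster loop computes (this is not the Vcov.jl code linked above, just the idea): sum the scores x_i * e_i within each cluster, then take outer products of the cluster sums.

using LinearAlgebra

function meat_cluster(X, e, cluster)
    p = size(X, 2)
    sums = Dict{eltype(cluster), Vector{Float64}}()
    for i in axes(X, 1)                       # one pass over observations
        s = get!(sums, cluster[i], zeros(p))
        s .+= e[i] .* view(X, i, :)           # within-cluster score sum
    end
    S = zeros(p, p)
    for s in values(sums)                     # outer products of cluster sums
        S .+= s * s'
    end
    return S
end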

azev77 commented Apr 1, 2021

@matthieugomez have you defined the scope of Vcov.jl?

  • is it intended to work w/ other packages? GLM.jl, QuantileRegressions.jl, etc
  • are you open to including resampling methods? (wild clustered bootstrap?)

matthieugomez commented Apr 6, 2021

The goal is for developers (like GLM etc.) to depend on this package and use it to compute standard errors. This would simplify things for users (i.e. they could simply specify Vcov.robust() as an argument to fit) without requiring every package to implement its own standard-error computation.
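
For reference, this is roughly what that already looks like downstream in FixedEffectModels.jl, which depends on Vcov.jl (the data frame below is made up for illustration); the hope is that GLM.jl and others could accept the same estimator objects.

using DataFrames, FixedEffectModels

df = DataFrame(y = randn(100), x = randn(100), firm = rand(1:10, 100))
reg(df, @formula(y ~ x), Vcov.robust())        # heteroskedasticity-robust
reg(df, @formula(y ~ x), Vcov.cluster(:firm))  # clustered by firm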

IljaK91 commented Jun 22, 2022

Is there any movement in this direction? Working out these things would allow many people to move some workflows from R to Julia (regression -> custom standard errors -> LaTeX regression table -> inclusion in paper). I am happy to join the effort if someone else has a good idea of how to tackle this.

azev77 commented Aug 24, 2022
