Update: I have added a post on importance sampling here.
Quick note: This code was written in Julia 0.6 and so will not be compatible with some aspects of my previous posts.
I recently replicated Petrin and Seo’s 2016 paper on using optimization as a source of identification in the random coefficients framework. Their method is more computationally complex than BLP’s, making efficiency a primary concern. Because it is built on BLP’s basic methodology (it still uses the contraction mapping and calculates marginal costs analogously), I had to go back and streamline my baseline random coefficients estimator. This page only introduces the new code and what changed from my previous version. For a detailed discussion of the mechanics of BLP, see my previous post and the resources included there.
Data Set-up
I include the data-cleaning portion of my code because the previous version is not compatible with Julia 0.6. It is nearly identical to what I have posted before, however, so you only need to worry about this section if the old import and cleaning code no longer works for you.
using DataFrames
using DataFramesMeta
using Base.LinAlg.BLAS
using CSV               # Read Data
using BenchmarkTools
using NLopt
using Distributions     # For Halton Draws
using StatsFuns
using Primes

cars = CSV.read("[filepath]/Automobile Data.csv", header = true, allowmissing = :none);

cars[:ln_hpwt] = log.(cars[:hpwt])
cars[:ln_space] = log.(cars[:space])
cars[:ln_mpg] = log.(cars[:mpg])
cars[:ln_mpd] = log.(cars[:mpd])
cars[:ln_price] = log.(cars[:price])
cars[:trend] = cars[:market] - 1 + 70
cars[:cons] = 1;

regSet = @linq cars |>
    @by(:model_year, s_0 = log.(1 - sum(:share)))

regSet = join(cars, regSet, on = :model_year);

regSet = @linq regSet |>
    @transform(s_i = log.(:share)) |>
    @transform(dif = :s_i - :s_0);

regSet[:dif_2] = log.(regSet[:share]) - log.(regSet[:share_out]);
regSet[:ln_price] = log.(regSet[:price]);

regSet = sort!(regSet, [:market, :firmid])

markets = convert(Matrix, regSet[:, [:market]]);
marks = unique(markets)
firms = convert(Matrix, regSet[:, [:firmid]]);
modlist = unique(regSet[:newmodv])
modLoc = regSet[:newmodv];

X = convert(Array{Float64,2}, regSet[:, [:cons, :hpwt, :air, :mpd, :space]])                  # Demand Variables
W = convert(Array{Float64,2}, regSet[:, [:cons, :ln_hpwt, :air, :ln_mpg, :ln_space, :trend]]) # Supply Variables
price = convert(Array{Float64,2}, regSet[:, [:price]])                                        # Price

delta_0 = convert(Array, round.(regSet[:,:dif_2], 20));

βs = [-7.061, 2.883, 1.521, -0.122, 3.46]
γs = [0.952, 0.477, 0.619, -0.415, -0.046, 0.019]
α = 43.501
σs = [3.612, 4.628, 1.818, 1.050, 2.056];
Here’s where the changes actually start to matter. My previous function for computing marginal costs was very inefficient because it performed a lot of unnecessary calculations: it computed the derivatives of shares with respect to price across all models in the market. However, these derivatives are only needed for models owned by the same company (e.g. Ford only cares about the substitution between cars it owns, and not about the substitution to other brands). The code below determines the start and end rows of each firm's products so that I can index the data efficiently and perform the relevant calculations firm by firm.
J_f = Array{Int64}(0,2)
count = 0
for ms = 1:20
    midx = find(markets .== ms)
    tfirms = unique(firms[midx])
    for f in tfirms
        J_f = vcat(J_f, [count + findfirst(firms[midx], f)  count + findlast(firms[midx], f)])
    end
    count += length(midx)
end;
Similarly, I find the first and last index of each market so that I can index the data more efficiently. In this case, though, there isn't much of a speed difference between calling the find function once per market and calling it thousands of times.
# Beginning and end of the market ids
idxs = [findfirst(markets, i) for i = 1:20]
idxe = [findlast(markets, i) for i = 1:20]
midx = hcat(idxs, idxe);
Model Type
As always, I define a custom type to hold all of the variables I will use throughout estimation. This way nothing is defined in global scope (in Julia, referencing globally defined variables inside a function can seriously slow down your program).
type ModelData
    price::Array{Float64, 1}
    X::Array{Float64, 2}
    alpha::Float64
    beta::Array{Float64, 1}
    gamma::Array{Float64, 1}
    sigma::Array{Float64, 1}
    delta::Array{Float64, 1}
    xi::Array{Float64, 1}
    marketidx::Array{Int64, 1}       # Market IDs in a vector
    midx::Array{Int64, 2}            # Beginning and end index for each market
    J_f::Array{Int64, 2}             # Beginning and end index for each firm's products
    y_it::Array{Float64, 2}          # Random income draws
    v_ik::Array{Float64, 2}          # Random coefficient draws
    unobs_weight::Array{Float64, 1}
    p_ijt::Array{Float64, 2}         # Array for individual choice probability
    s_jt::Array{Float64, 1}          # Array for implied market share
    act_s::Array{Float64, 1}         # Array for actual market shares
    mc::Array{Float64, 1}            # Array for marginal costs
    optinst::Array{Float64,2}        # Array of Optimal Instruments
    W::Array{Float64, 2}

    function ModelData( price, X, alpha, beta, gamma, sigma, delta, xi,
                        marketidx, midx, J_f, y_it, v_ik, unobs_weight,
                        p_ijt, s_jt, act_s, mc, optinst)
        return new( price, X, alpha, beta, gamma, sigma, delta, xi,
                    marketidx, midx, J_f, y_it, v_ik, unobs_weight,
                    p_ijt, s_jt, act_s, mc, optinst, zeros(X) )
    end
end
Generating Instruments
The instruments are generated in the same manner as before. I include two options for calculating them. First, when normal == 1, they are calculated in a way consistent with BLP 1995 (replicating the mistake in their original code) and can be used to recreate their Table III. Otherwise, the instruments are calculated following the description in Section 5.1 of their paper (for more information, see the discussion in their working paper as well).
function gen_inst( inX, normal = 1 )
    totMarket = similar(inX)
    totFirm = similar(inX)

    for m in marks
        sub = inX[find(markets .== m),:]
        firminfo = firms[find(markets .== m),:]
        #modelinfo = modLoc[find(markets .== m),:]

        sameFirm = convert(Array{Float64,2}, firminfo .== firminfo')
        #sameFirm = sameFirm - diagm(diag(sameFirm))
        #sameProduct = ones(sameFirm) - diagm(ones(size(sub,1), 1)[:])

        z_1 = similar(sub)
        for i = 1:size(sub, 2)
            if normal == 1
                z_1[:,i] = sum((sub[:,i] .* sameFirm), 1)'                      # Prime on sameFirm is key
            else
                z_1[:,i] = sum((sub[:,i] .* (sameFirm .- eye(sameFirm)))', 1)'  # Prime on sameFirm is key
            end
        end
        totFirm[find(markets .== m),:] = z_1

        # Within Market
        sub = inX[find(markets .== m),:]
        z_1 = similar(sub)
        sameFirm = firminfo .== firminfo'
        for i = 1:size(sub, 2)
            if normal == 1
                z_1[:,i] = sum(sub[:,i] .* (sameFirm .+ .!sameFirm), 1)
            else
                z_1[:,i] = sum((sub[:,i] .* (.!sameFirm)), 1)
            end
        end
        totMarket[find(markets .== m),:] = z_1
    end

    return [totFirm, totMarket]
end

Z_x_1, Z_x_2 = gen_inst(X,0)
Z_w_1, Z_w_2 = gen_inst(W,0)
Z_x = hcat(X, Z_x_1, Z_x_2, convert(Array{Float64,1})
Z_w = hcat(W, Z_w_1, Z_w_2, convert(Array{Float64,1}, regSet[:mpd]))
Z_ = [Z_x zeros(Z_w); zeros(Z_x) Z_w];
Random Draws
Following a recommendation in Skrainka and Judd, I use different draws for each market. BLP used the same set of draws in every market, which is not ideal because the simulation error will then be correlated across markets. By using a different set of draws, the simulation error tends to cancel out across markets, resulting in more precise estimates. Because we still use the same number of draws within each market, the extra precision comes at little additional computational cost. I scale the number of draws and the weights by the number of markets (in this case 20) to ensure that the full set of draws can be partitioned by market.
srand(41318)

ns = 1000*20
v_ik1 = randn(6, ns)'

m_t = repeat(incomeMeans, inner = [ns, 1])
#y_it = exp(m_t + sigma_v * repeat(v_ik[:,end], outer = [length(incomeMeans),1]));
y_it1 = exp.(incomeMeans .+ sigma_v * v_ik1[:,end]');

unobs_weight = ones(ns)'/(ns)*20;
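To see how the scaling works: with ns = 20 × 1,000 = 20,000 draws in total, each draw receives weight 20/ns = 1/1,000, so the 1,000 draws assigned to any single market have weights that sum to one, just as if that market had been simulated on its own.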
Contraction Mapping and Marginal Costs
Here is the bulk of the code. I have combined the calculation of the contraction mapping and the marginal costs into a single function. While I have written extensively about most of this code already (again, here), there is one significant change. Reynaerts, Varadhan, and Nash (2012) note that the contraction mapping can also be formulated as a root-finding problem. With this insight, they explore three methods of solving for the mean utilities: Newton’s Method, SQUAREM, and DF-SANE. Having implemented each of these methods, I found that a combination of the contraction mapping and Newton’s method worked best for this problem. Newton’s method has quadratic convergence but is unstable from starting points far away from the solution. I therefore use the contraction mapping to find a region of attraction and then switch to Newton’s method to converge quickly to the solution (for those of you familiar with John Rust’s bus engine replacement model, this is almost identical to what he used in his nested fixed point estimator).
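For reference, writing the observed market shares as $S$ and the simulated shares as $s(\delta)$, the two updates that the function below alternates between are the standard BLP/Nevo contraction and a Newton step applied to the same system in logs:

$$ \delta^{k+1} = \delta^{k} + \ln S - \ln s(\delta^{k}), \qquad \delta^{k+1} = \delta^{k} - \left[ \frac{\partial \ln s(\delta^{k})}{\partial \delta'} \right]^{-1} \left( \ln s(\delta^{k}) - \ln S \right). $$

The code switches from the first update to the second once the maximum relative share error falls below 0.1.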
While I don’t use the spectral methods SQUAREM or DF-SANE here, I have found them to be superior in a few contexts. First, with large datasets, calculating the Jacobian for Newton’s method can be very expensive. SQUAREM and DF-SANE approximate the Jacobian with a diagonal matrix and so are much less computationally burdensome. You are then trading off computational complexity against the rate of convergence (Newton’s method converges quadratically while the other two converge superlinearly), often in favor of the spectral methods. Second, when using sparse grids for numerical integration, you potentially have to deal with negative market shares. This becomes an issue for the contraction mapping, which would otherwise need to be combined with Newton’s method for stability. The spectral methods are far more stable than Newton’s method and are therefore more robust when using sparse grids. I have not found a situation in which the contraction mapping by itself was superior to any of these methods.
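To give a flavor of what a spectral update looks like, here is a minimal sketch of a single SQUAREM (S3) step for a generic fixed-point map G; for BLP, G could be δ ↦ δ + log.(S) .- log.(s(δ)). This is only an illustration and is not part of the estimation code below; squarem_step and G are names I am using purely for exposition.

# Minimal SQUAREM (S3) step for a fixed-point problem δ = G(δ).
# A sketch only: no safeguards, monotonicity checks, or convergence logic.
function squarem_step(G, δ)
    δ1 = G(δ)                  # first fixed-point evaluation
    δ2 = G(δ1)                 # second fixed-point evaluation
    r  = δ1 .- δ               # first-order residual
    v  = (δ2 .- δ1) .- r       # second-order (curvature) term
    a  = -norm(r) / norm(v)    # scalar "spectral" steplength (S3 scheme)
    return δ .- 2a .* r .+ a^2 .* v
end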
function gen_mm2(m::ModelData)
    α = m.alpha
    v_ik = m.v_ik
    J_f = m.J_f
    midx = m.midx

    for i = 1:length(m.p_ijt)
        m.p_ijt[i] = 0.0
    end

    m.delta .= exp.(m.delta)
    step = Int(size(m.v_ik,1)/size(m.midx,1))

    # iterate over markets
    for mi = 1:size(m.midx,1)
        muidx1 = step*(mi - 1) + 1;
        muidx2 = step*mi

        eps = 1.0
        count = 1

        r1 = midx[mi,1];
        r2 = midx[mi,2];

        # precalculate random coeffs
        mui = zeros(r2-r1+1, size(m.p_ijt,2))   # Allocate Array
        randcoeff = (m.sigma' .* m.v_ik[muidx1:muidx2,:])
        yp = m.y_it[r1:r2,muidx1:muidx2].*m.price[r1:r2]

        # calculate random coefficient
        gemm!('N', 'T', 1.0, m.X[r1:r2,:], randcoeff, 0.0, mui)
        mui .-= α*yp
        mui .= exp.(mui);

        marketmat = ones(r2-r1+1, r2-r1+1)

        # contraction mapping/fixed point
        while (eps > 1e-14)*(200 > count)
            numer = mui.*m.delta[r1:r2];
            s0 = 1.0./(1.0 + sum(numer,1))
            m.p_ijt[r1:r2,:] .= numer .* s0;
            m.s_jt[r1:r2] .= m.p_ijt[r1:r2,:] * m.unobs_weight[muidx1:muidx2]

            # Finding the Fixed Point
            if eps > 0.1
                # Nevo Contraction
                m.delta[r1:r2] .= m.delta[r1:r2] .* m.act_s[r1:r2]./m.s_jt[r1:r2]
                eps = maximum(abs.((m.s_jt[r1:r2]./m.act_s[r1:r2]) - 1))
            else
                # Newton Root-finding
                m.delta[r1:r2] .= log.(m.delta[r1:r2])
                tmp2 = m.p_ijt[r1:r2,:] .* m.unobs_weight[muidx1:muidx2]'
                Jf = Base.LinAlg.I - (tmp2*m.p_ijt[r1:r2,:]')./m.s_jt[r1:r2,:]
                diff = log.(m.s_jt[r1:r2]) .- log.(m.act_s[r1:r2])
                stp = -Jf\diff
                m.delta[r1:r2,:] .= m.delta[r1:r2,:] .+ stp
                eps = maximum(abs.(stp))
                m.delta[r1:r2] .= exp.(m.delta[r1:r2])
            end
            count += 1
        end
    end
    m.delta .= log.(m.delta)

    # calculate mc
    for j = 1:size(J_f, 1)
        idx1 = J_f[j,1];
        idx2 = J_f[j,2];

        muidx = searchsortedlast(m.midx[:,1], idx1)
        muidx_1 = (muidx-1)*step + 1
        muidx_2 = (muidx)*step

        rp = m.p_ijt[idx1:idx2,:]
        wα = m.unobs_weight[muidx_1:muidx_2] .* m.y_it[idx1,muidx_1:muidx_2] * α
        nfirms = idx2 - idx1 + 1;

        tmp2 = rp .* wα'
        tmpvec = diagm(rp * wα);
        gemm!('N', 'T', -1.0, m.p_ijt[idx1:idx2,:], tmp2, 1.0, tmpvec)

        b = tmpvec\m.s_jt[idx1:idx2]
        m.mc[idx1:idx2] .= m.price[idx1:idx2] .- b
    end

    for i = 1:length(m.mc)
        m.mc[i] = ifelse(m.mc[i] < 0, 0.001, m.mc[i])
    end

    return nothing
end
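For completeness, the marginal cost block at the end of gen_mm2 is the standard multi-product Bertrand-Nash markup calculation restricted to each firm's own products. With the price derivative of utility equal to $-\alpha y_{i}$, as coded above, the first-order conditions for firm $f$ can be written as

$$ s_{j} = \sum_{k \in \mathcal{J}_f} \Delta_{jk} \left( p_{k} - mc_{k} \right), \qquad \Delta_{jk} = \sum_{i} w_{i} \, \alpha y_{i} \, p_{ij} \left( \mathbf{1}\{j = k\} - p_{ik} \right), $$

so that $mc = p - \Delta^{-1} s$, which is exactly what the line b = tmpvec\m.s_jt[idx1:idx2] computes firm by firm.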
Estimation
The rest of the code is pretty much the same as before. The only real change I made was to use an initial approximation of the optimal GMM weighting matrix rather than the identity matrix in the first step of my estimation routine. This appears to better approximate BLP’s method.
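Concretely, the first stage below starts from the 2SLS-style weight $W_0 = (Z'Z)^{-1}$, runs the linear IV-GMM step once to obtain residuals $\hat{\varepsilon}$, and then searches over the nonlinear parameters using the approximate optimal weight

$$ W_1 = \left( \frac{1}{N} \sum_{i} z_{i} z_{i}' \hat{\varepsilon}_{i}^{2} - \bar{g} \bar{g}' \right)^{-1}, \qquad \bar{g} = \frac{1}{N} \sum_{i} z_{i} \hat{\varepsilon}_{i}, $$

which is what the cholfact(Hermitian(mm'mm/size(mm,1) .- g.*g')) line below computes.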
d_share = convert(Array{Float64,1}, regSet[:share])

m = ModelData( price[:], X, α, βs, γs, σs, deepcopy(delta_0), deepcopy(delta_0),
               markets[:], midx, J_f, incy, v_ik1, unobs_weight1[:],
               zeros(size(X,1), Int(size(v_ik1,1)/20)),
               convert(Array{Float64,1}, deepcopy(d_share)),
               convert(Array{Float64,1}, deepcopy(d_share)),
               deepcopy(delta_0), zeros(size(X,1), 19));

@time gen_mm2(m)
mean(m.delta)
0.256679 seconds (10.21 k allocations: 152.647 MiB, 14.09% gc time)
-0.5655685632777482
function GMM(param, m, wght)
    m.alpha = abs(param[1])
    m.sigma = abs.(param[2:end])

    gen_mm2(m)

    Z_ = m.optinst
    y_ = vcat(m.delta, log.(m.mc))
    X_ = [m.X zeros(m.W); zeros(m.X) m.W]

    zx = Z_'*X_
    βols = (zx'wght*zx)\(zx'wght*Z_'y_)
    ϵ = y_ - X_*βols
    g = Z_'*ϵ/length(ϵ)

    m.beta = βols[1:size(m.X,2)]
    m.gamma = βols[size(m.X,2)+1:end]

    res = g'wght*g*length(g)
    println(m.alpha, m.sigma, res)
    return res
end
m.W = W
m.optinst = Z_;

tp = [α, σs...];

wght = inv(Z_'Z_);

y_ = vcat(m.delta, log.(m.mc))
X_ = [m.X zeros(m.W); zeros(m.X) m.W]
zx = Z_'*X_
βols = (zx'wght*zx)\(zx'wght*Z_'y_)
ϵ = y_ - X_*βols;

mm = Z_ .* ϵ
g = mean(mm, 1)
wght = inv(cholfact(Hermitian(mm'mm/size(mm,1) .- g.*g')));

opt = Opt(:LN_COBYLA, 6)
initial_step!(opt, abs.(tp)*0.1)
ftol_rel!(opt, 1e-4)
xtol_rel!(opt, 1e-4)
maxeval!(opt, 250)
min_objective!(opt, (x,y) -> GMM(x, m, wght))
@time minf_, minx_, ret_ = NLopt.optimize(opt, tp)
323.130396 seconds (3.91 M allocations: 130.944 GiB, 6.67% gc time)
(4.330589470676522, [66.6355, 0.578001, 12.0295, 3.8127, 0.0768498, 5.59248], :FTOL_REACHED)
The code in my previous post took twice as long to run for the same number of draws, so this is a modest improvement. We can now just repeat the above for the second stage to get the final estimates.
m.alpha = minx_[1]
m.sigma = abs.(minx_[2:end])
gen_mm2(m);

y_ = vcat(m.delta, log.(m.mc))
X_ = [m.X zeros(m.W); zeros(m.X) m.W]
zx = Z_'*X_
βols = (zx'wght*zx)\(zx'wght*Z_'y_)
ϵ = y_ - X_*βols;

mm = Z_ .* ϵ
g = mean(mm, 1)
wght2 = inv(cholfact(Hermitian(mm'mm/size(mm,1) .- g.*g')));

opt = Opt(:LN_COBYLA, 6)
initial_step!(opt, abs.(tp)*0.01)
ftol_rel!(opt, 1e-4)
maxeval!(opt, 250)
min_objective!(opt, (x,y) -> GMM(x, m, wght2))
@time minf_, minx_, ret_ = NLopt.optimize(opt, tp)
314.578066 seconds (3.66 M allocations: 121.229 GiB, 6.55% gc time)
(4.786798337602201, [60.0746, 4.58298, 6.35377, 2.11136, -1.02675e-6, 4.18812], :FTOL_REACHED)
Note that the parameter estimates are pretty far off from what BLP actually estimate. This is because 1,000 random draws per market do a poor job of approximating the market share integrals. I have previously discussed some methods to reduce simulation error, such as Halton draws and quadrature, and will provide some notes on importance sampling in a future post.
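As a point of reference, here is a minimal sketch of how Halton draws could be constructed with the Primes and StatsFuns packages loaded at the top of this post. The halton helper and the choice of bases are purely illustrative; this is not the code used in my earlier posts.

# Radical-inverse (Halton) sequence in base b: a sketch, not production code.
function halton(b::Int, n::Int)
    x = zeros(n)
    for i = 1:n
        f = 1.0
        k = i
        while k > 0
            f /= b
            x[i] += f * (k % b)
            k = div(k, b)
        end
    end
    return x
end

bases = primes(13)                              # [2, 3, 5, 7, 11, 13]: one base per dimension
u = hcat([halton(b, 1000) for b in bases]...)   # 1000 quasi-uniform points in 6 dimensions
v = norminvcdf.(u)                              # map to standard normal draws via StatsFuns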
References
- Berry, Steven, James Levinsohn, and Ariel Pakes. “Automobile prices in market equilibrium.” Econometrica: Journal of the Econometric Society (1995): 841-890.
- Reynaerts, Jo, R. Varadhan, and John C. Nash. “Enhancing the Convergence Properties of the BLP (1995) Contraction Mapping.” (2012).
- Skrainka, Benjamin S., and Kenneth L. Judd. “High performance quadrature rules: How numerical integration affects a popular model of product differentiation.” (2011).