Update: I have added a post on importance sampling here.
Quick note: This code was written in Julia 0.6 and so will not be compatible with some aspects of my previous posts.
I recently replicated Petrin and Seo’s 2016 paper on using optimization as a source of identification in the random coefficients framework. Their method is more computationally complex than BLP’s, making efficiency a primary concern. Because it is built on BLP’s basic methodology (it still uses the contraction mapping and calculates marginal costs analogously), I had to go back and streamline my baseline random coefficients estimator. This page only introduces the new code and what changed from my previous version. For a detailed discussion of the mechanics of BLP, see my previous post and the resources included there.
Data Set-up
I include the data-cleaning portion of my code because the previous version is not compatible with Julia 0.6. It is nearly identical to what I have posted before, however, so you only need to worry about this section if the old import and cleaning code no longer works for you.
using DataFrames
using DataFramesMeta
using Base.LinAlg.BLAS
using CSV               # Read Data
using BenchmarkTools
using NLopt
using Distributions     # For Halton Draws
using StatsFuns
using Primes

cars = CSV.read("[filepath]/Automobile Data.csv", header = true, allowmissing = :none);

cars[:ln_hpwt] = log.(cars[:hpwt])
cars[:ln_space] = log.(cars[:space])
cars[:ln_mpg] = log.(cars[:mpg])
cars[:ln_mpd] = log.(cars[:mpd])
cars[:ln_price] = log.(cars[:price])
cars[:trend] = cars[:market] - 1 + 70
cars[:cons] = 1;

regSet = @linq cars |>
    @by(:model_year, s_0 = log.(1 - sum(:share)))

regSet = join(cars, regSet, on = :model_year);

regSet = @linq regSet |>
    @transform(s_i = log.(:share)) |>
    @transform(dif = :s_i - :s_0);

regSet[:dif_2] = log.(regSet[:share]) - log.(regSet[:share_out]);
regSet[:ln_price] = log.(regSet[:price]);

regSet = sort!(regSet, [:market, :firmid])

markets = convert(Matrix, regSet[:, [:market]]);
marks = unique(markets)
firms = convert(Matrix, regSet[:, [:firmid]]);
modlist = unique(regSet[:newmodv])
modLoc = regSet[:newmodv];

X = convert(Array{Float64,2}, regSet[:, [:cons, :hpwt, :air, :mpd, :space]])                  # Demand Variables
W = convert(Array{Float64,2}, regSet[:, [:cons, :ln_hpwt, :air, :ln_mpg, :ln_space, :trend]]) # Supply Variables
price = convert(Array{Float64,2}, regSet[:, [:price]])                                        # Price

delta_0 = convert(Array, round.(regSet[:,:dif_2], 20));

βs = [-7.061, 2.883, 1.521, -0.122, 3.46]
γs = [0.952, 0.477, 0.619, -0.415, -0.046, 0.019]
α = 43.501
σs = [3.612, 4.628, 1.818, 1.050, 2.056];
Here’s where the changes actually start to matter. My previous function for computing marginal costs was very inefficient because it performed a lot of unnecessary calculations: it computed the derivatives of shares with respect to price across all models in the market. However, these derivatives are only needed for models owned by the same company (e.g. Ford only cares about the substitution between cars it owns, and not about the substitution to other brands). The code below determines the start and end rows of each firm's products so that I can index the data efficiently and perform the relevant calculations firm by firm.
J_f = Array{Int64}(0,2)
count = 0
for ms = 1:20
    midx = find(markets .== ms)
    tfirms = unique(firms[midx])
    for f in tfirms
        J_f = vcat(J_f, [count + findfirst(firms[midx], f)  count + findlast(firms[midx], f)])
    end
    count += length(midx)
end;
Similarly, I find the first and last index of each market so that I can index the data more efficiently. In this case, though, there isn't much of a speed difference between calling the find function once per market and calling it thousands of times.
# Beginning and end of the market ids
idxs = [findfirst(markets, i) for i = 1:20]
idxe = [findlast(markets, i) for i = 1:20]
midx = hcat(idxs, idxe);
Model Type
As always, I define a custom type to hold all of the variables I will use throughout estimation. This way nothing is defined in global scope (in Julia, referencing globally defined variables inside a function can seriously slow down your program).
type ModelData
    price::Array{Float64, 1}
    X::Array{Float64, 2}
    alpha::Float64
    beta::Array{Float64, 1}
    gamma::Array{Float64, 1}
    sigma::Array{Float64, 1}
    delta::Array{Float64, 1}
    xi::Array{Float64, 1}
    marketidx::Array{Int64, 1}       # Market IDs in a vector
    midx::Array{Int64, 2}            # Beginning and end index for each market
    J_f::Array{Int64, 2}             # Beginning and end index for each firm's products
    y_it::Array{Float64, 2}          # Random income draws
    v_ik::Array{Float64, 2}          # Random coefficient draws
    unobs_weight::Array{Float64, 1}
    p_ijt::Array{Float64, 2}         # Array for individual choice probability
    s_jt::Array{Float64, 1}          # Array for implied market share
    act_s::Array{Float64, 1}         # Array for actual market shares
    mc::Array{Float64, 1}            # Array for marginal costs
    optinst::Array{Float64,2}        # Array of Optimal Instruments
    W::Array{Float64, 2}

    function ModelData( price, X, alpha, beta, gamma, sigma, delta, xi,
                        marketidx, midx, J_f, y_it, v_ik, unobs_weight,
                        p_ijt, s_jt, act_s, mc, optinst)
        return new( price, X, alpha, beta, gamma, sigma, delta, xi,
                    marketidx, midx, J_f, y_it, v_ik, unobs_weight,
                    p_ijt, s_jt, act_s, mc, optinst, zeros(X) )
    end
end
Generating Instruments
The instruments are generated in the same manner as before. I include two options for calculating them. First, when normal == 1, they are calculated in a way consistent with BLP 1995 (replicating the mistake in their original code) and can be used to recreate their Table III. Otherwise, the instruments are calculated following the description in Section 5.1 of their paper (for more information, see the discussion in their working paper as well).
function gen_inst( inX, normal = 1 )
    totMarket = similar(inX)
    totFirm = similar(inX)

    for m in marks
        sub = inX[find(markets .== m),:]
        firminfo = firms[find(markets .== m),:]
        #modelinfo = modLoc[find(markets .== m),:]

        sameFirm = convert(Array{Float64,2}, firminfo .== firminfo')
        #sameFirm = sameFirm - diagm(diag(sameFirm))
        #sameProduct = ones(sameFirm) - diagm(ones(size(sub,1), 1)[:])

        z_1 = similar(sub)
        for i = 1:size(sub, 2)
            if normal == 1
                z_1[:,i] = sum((sub[:,i] .* sameFirm), 1)'                      # Prime on sameFirm is key
            else
                z_1[:,i] = sum((sub[:,i] .* (sameFirm .- eye(sameFirm)))', 1)'  # Prime on sameFirm is key
            end
        end
        totFirm[find(markets .== m),:] = z_1

        # Within Market
        sub = inX[find(markets .== m),:]
        z_1 = similar(sub)
        sameFirm = firminfo .== firminfo'
        for i = 1:size(sub, 2)
            if normal == 1
                z_1[:,i] = sum(sub[:,i] .* (sameFirm .+ .!sameFirm), 1)
            else
                z_1[:,i] = sum((sub[:,i] .* (.!sameFirm)), 1)
            end
        end
        totMarket[find(markets .== m),:] = z_1
    end

    return [totFirm, totMarket]
end

Z_x_1, Z_x_2 = gen_inst(X,0)
Z_w_1, Z_w_2 = gen_inst(W,0)
Z_x = hcat(X, Z_x_1, Z_x_2, convert(Array{Float64,1})
Z_w = hcat(W, Z_w_1, Z_w_2, convert(Array{Float64,1}, regSet[:mpd]))
Z_ = [Z_x zeros(Z_w); zeros(Z_x) Z_w];
Random Draws
Following a recommendation in Skrainka and Judd, I use different draws for each market. BLP used the same set of draws in every market, which is not ideal because the simulation error will then be correlated across markets. By using a different set of draws, the simulation error tends to cancel out across markets, resulting in more precise estimates. Because we still use the same number of draws within each market, the extra precision comes at little additional computational cost. I scale the number of draws and the weights by the number of markets (in this case 20) to ensure that the full set of draws can be partitioned by market.
srand(41318)

ns = 1000*20
v_ik1 = randn(6, ns)'

m_t = repeat(incomeMeans, inner = [ns, 1])
#y_it = exp(m_t + sigma_v * repeat(v_ik[:,end], outer = [length(incomeMeans),1]));
y_it1 = exp.(incomeMeans .+ sigma_v * v_ik1[:,end]');

unobs_weight = ones(ns)'/(ns)*20;
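To see how the scaling works: with ns = 20 × 1,000 = 20,000 draws in total, each draw receives weight 20/ns = 1/1,000, so the 1,000 draws assigned to any single market have weights that sum to one, just as if that market had been simulated on its own.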
Contraction Mapping and Marginal Costs
Here is the bulk of the code. I have combined the calculation of the contraction mapping and the marginal costs into a single function. While I have written extensively about most of this code already (again, here), there is one significant change. Reynaerts, Varadhan, and Nash (2012) note that the contraction mapping can also be formulated as a root-finding problem. With this insight, they explore three methods of solving for the mean utilities: Newton’s Method, SQUAREM, and DF-SANE. Having implemented each of these methods, I found that a combination of the contraction mapping and Newton’s method worked best for this problem. Newton’s method has quadratic convergence but is unstable from starting points far away from the solution. I therefore use the contraction mapping to find a region of attraction and then switch to Newton’s method to converge quickly to the solution (for those of you familiar with John Rust’s bus engine replacement model, this is almost identical to what he used in his nested fixed point estimator).
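For reference, writing the observed market shares as $S$ and the simulated shares as $s(\delta)$, the two updates that the function below alternates between are the standard BLP/Nevo contraction and a Newton step applied to the same system in logs:

$$ \delta^{k+1} = \delta^{k} + \ln S - \ln s(\delta^{k}), \qquad \delta^{k+1} = \delta^{k} - \left[ \frac{\partial \ln s(\delta^{k})}{\partial \delta'} \right]^{-1} \left( \ln s(\delta^{k}) - \ln S \right). $$

The code switches from the first update to the second once the maximum relative share error falls below 0.1.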
While I don’t use the spectral methods SQUAREM or DF-SANE here, I have found them to be superior in a few contexts. First, with large datasets, calculating the Jacobian for Newton’s method can be very expensive. SQUAREM and DF-SANE approximate the Jacobian with a diagonal matrix and so are much less computationally burdensome. You are then trading off computational complexity against the rate of convergence (Newton’s method converges quadratically while the other two converge superlinearly), often in favor of the spectral methods. Second, when using sparse grids for numerical integration, you potentially have to deal with negative market shares. This becomes an issue for the contraction mapping, which would otherwise need to be combined with Newton’s method for stability. The spectral methods are far more stable than Newton’s method and are therefore more robust when using sparse grids. I have not found a situation in which the contraction mapping by itself was superior to any of these methods.
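To give a flavor of what a spectral update looks like, here is a minimal sketch of a single SQUAREM (S3) step for a generic fixed-point map G; for BLP, G could be δ ↦ δ + log.(S) .- log.(s(δ)). This is only an illustration and is not part of the estimation code below; squarem_step and G are names I am using purely for exposition.

# Minimal SQUAREM (S3) step for a fixed-point problem δ = G(δ).
# A sketch only: no safeguards, monotonicity checks, or convergence logic.
function squarem_step(G, δ)
    δ1 = G(δ)                  # first fixed-point evaluation
    δ2 = G(δ1)                 # second fixed-point evaluation
    r  = δ1 .- δ               # first-order residual
    v  = (δ2 .- δ1) .- r       # second-order (curvature) term
    a  = -norm(r) / norm(v)    # scalar "spectral" steplength (S3 scheme)
    return δ .- 2a .* r .+ a^2 .* v
end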
function gen_mm2(m::ModelData)
    α = m.alpha
    v_ik = m.v_ik
    J_f = m.J_f
    midx = m.midx

    for i = 1:length(m.p_ijt)
        m.p_ijt[i] = 0.0
    end

    m.delta .= exp.(m.delta)
    step = Int(size(m.v_ik,1)/size(m.midx,1))

    # iterate over markets
    for mi = 1:size(m.midx,1)
        muidx1 = step*(mi - 1) + 1;
        muidx2 = step*mi

        eps = 1.0
        count = 1

        r1 = midx[mi,1];
        r2 = midx[mi,2];

        # precalculate random coeffs
        mui = zeros(r2-r1+1, size(m.p_ijt,2))   # Allocate Array
        randcoeff = (m.sigma' .* m.v_ik[muidx1:muidx2,:])
        yp = m.y_it[r1:r2,muidx1:muidx2].*m.price[r1:r2]

        # calculate random coefficient
        gemm!('N', 'T', 1.0, m.X[r1:r2,:], randcoeff, 0.0, mui)
        mui .-= α*yp
        mui .= exp.(mui);

        marketmat = ones(r2-r1+1, r2-r1+1)

        # contraction mapping/fixed point
        while (eps > 1e-14)*(200 > count)
            numer = mui.*m.delta[r1:r2];
            s0 = 1.0./(1.0 + sum(numer,1))
            m.p_ijt[r1:r2,:] .= numer .* s0;
            m.s_jt[r1:r2] .= m.p_ijt[r1:r2,:] * m.unobs_weight[muidx1:muidx2]

            # Finding the Fixed Point
            if eps > 0.1
                # Nevo Contraction
                m.delta[r1:r2] .= m.delta[r1:r2] .* m.act_s[r1:r2]./m.s_jt[r1:r2]
                eps = maximum(abs.((m.s_jt[r1:r2]./m.act_s[r1:r2]) - 1))
            else
                # Newton Root-finding
                m.delta[r1:r2] .= log.(m.delta[r1:r2])
                tmp2 = m.p_ijt[r1:r2,:] .* m.unobs_weight[muidx1:muidx2]'
                Jf = Base.LinAlg.I - (tmp2*m.p_ijt[r1:r2,:]')./m.s_jt[r1:r2,:]
                diff = log.(m.s_jt[r1:r2]) .- log.(m.act_s[r1:r2])
                stp = -Jf\diff
                m.delta[r1:r2,:] .= m.delta[r1:r2,:] .+ stp
                eps = maximum(abs.(stp))
                m.delta[r1:r2] .= exp.(m.delta[r1:r2])
            end
            count += 1
        end
    end
    m.delta .= log.(m.delta)

    # calculate mc
    for j = 1:size(J_f, 1)
        idx1 = J_f[j,1];
        idx2 = J_f[j,2];

        muidx = searchsortedlast(m.midx[:,1], idx1)
        muidx_1 = (muidx-1)*step + 1
        muidx_2 = (muidx)*step

        rp = m.p_ijt[idx1:idx2,:]
        wα = m.unobs_weight[muidx_1:muidx_2] .* m.y_it[idx1,muidx_1:muidx_2] * α
        nfirms = idx2 - idx1 + 1;

        tmp2 = rp .* wα'
        tmpvec = diagm(rp * wα);
        gemm!('N', 'T', -1.0, m.p_ijt[idx1:idx2,:], tmp2, 1.0, tmpvec)

        b = tmpvec\m.s_jt[idx1:idx2]
        m.mc[idx1:idx2] .= m.price[idx1:idx2] .- b
    end

    for i = 1:length(m.mc)
        m.mc[i] = ifelse(m.mc[i] < 0, 0.001, m.mc[i])
    end

    return nothing
end
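For completeness, the marginal cost block at the end of gen_mm2 is the standard multi-product Bertrand-Nash markup calculation restricted to each firm's own products. With the price derivative of utility equal to $-\alpha y_{i}$, as coded above, the first-order conditions for firm $f$ can be written as

$$ s_{j} = \sum_{k \in \mathcal{J}_f} \Delta_{jk} \left( p_{k} - mc_{k} \right), \qquad \Delta_{jk} = \sum_{i} w_{i} \, \alpha y_{i} \, p_{ij} \left( \mathbf{1}\{j = k\} - p_{ik} \right), $$

so that $mc = p - \Delta^{-1} s$, which is exactly what the line b = tmpvec\m.s_jt[idx1:idx2] computes firm by firm.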
Estimation
The rest of the code is pretty much the same as before. The only real change I made was to use an initial approximation of the optimal GMM weighting matrix rather than the identity matrix in the first step of my estimation routine. This appears to better approximate BLP’s method.
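Concretely, the first stage below starts from the 2SLS-style weight $W_0 = (Z'Z)^{-1}$, runs the linear IV-GMM step once to obtain residuals $\hat{\varepsilon}$, and then searches over the nonlinear parameters using the approximate optimal weight

$$ W_1 = \left( \frac{1}{N} \sum_{i} z_{i} z_{i}' \hat{\varepsilon}_{i}^{2} - \bar{g} \bar{g}' \right)^{-1}, \qquad \bar{g} = \frac{1}{N} \sum_{i} z_{i} \hat{\varepsilon}_{i}, $$

which is what the cholfact(Hermitian(mm'mm/size(mm,1) .- g.*g')) line below computes.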
d_share = convert(Array{Float64,1}, regSet[:share])

m = ModelData( price[:], X, α, βs, γs, σs, deepcopy(delta_0), deepcopy(delta_0),
               markets[:], midx, J_f, incy, v_ik1, unobs_weight1[:],
               zeros(size(X,1), Int(size(v_ik1,1)/20)),
               convert(Array{Float64,1}, deepcopy(d_share)),
               convert(Array{Float64,1}, deepcopy(d_share)),
               deepcopy(delta_0), zeros(size(X,1), 19));

@time gen_mm2(m)
mean(m.delta)
0.256679 seconds (10.21 k allocations: 152.647 MiB, 14.09% gc time)
-0.5655685632777482
function GMM(param, m, wght)
    m.alpha = abs(param[1])
    m.sigma = abs.(param[2:end])

    gen_mm2(m)

    Z_ = m.optinst
    y_ = vcat(m.delta, log.(m.mc))
    X_ = [m.X zeros(m.W); zeros(m.X) m.W]

    zx = Z_'*X_
    βols = (zx'wght*zx)\(zx'wght*Z_'y_)
    ϵ = y_ - X_*βols
    g = Z_'*ϵ/length(ϵ)

    m.beta = βols[1:size(m.X,2)]
    m.gamma = βols[size(m.X,2)+1:end]

    res = g'wght*g*length(g)
    println(m.alpha, m.sigma, res)
    return res
end
m.W = W
m.optinst = Z_;

tp = [α, σs...];

wght = inv(Z_'Z_);

y_ = vcat(m.delta, log.(m.mc))
X_ = [m.X zeros(m.W); zeros(m.X) m.W]
zx = Z_'*X_
βols = (zx'wght*zx)\(zx'wght*Z_'y_)
ϵ = y_ - X_*βols;

mm = Z_ .* ϵ
g = mean(mm, 1)
wght = inv(cholfact(Hermitian(mm'mm/size(mm,1) .- g.*g')));

opt = Opt(:LN_COBYLA, 6)
initial_step!(opt, abs.(tp)*0.1)
ftol_rel!(opt, 1e-4)
xtol_rel!(opt, 1e-4)
maxeval!(opt, 250)
min_objective!(opt, (x,y) -> GMM(x, m, wght))
@time minf_, minx_, ret_ = NLopt.optimize(opt, tp)
323.130396 seconds (3.91 M allocations: 130.944 GiB, 6.67% gc time)
(4.330589470676522, [66.6355, 0.578001, 12.0295, 3.8127, 0.0768498, 5.59248], :FTOL_REACHED)
The code in my previous post took twice as long to run for the same number of draws, so this is a modest improvement. We can now just repeat the above for the second stage to get the final estimates.
m.alpha = minx_[1]
m.sigma = abs.(minx_[2:end])
gen_mm2(m);

y_ = vcat(m.delta, log.(m.mc))
X_ = [m.X zeros(m.W); zeros(m.X) m.W]
zx = Z_'*X_
βols = (zx'wght*zx)\(zx'wght*Z_'y_)
ϵ = y_ - X_*βols;

mm = Z_ .* ϵ
g = mean(mm, 1)
wght2 = inv(cholfact(Hermitian(mm'mm/size(mm,1) .- g.*g')));

opt = Opt(:LN_COBYLA, 6)
initial_step!(opt, abs.(tp)*0.01)
ftol_rel!(opt, 1e-4)
maxeval!(opt, 250)
min_objective!(opt, (x,y) -> GMM(x, m, wght2))
@time minf_, minx_, ret_ = NLopt.optimize(opt, tp)
314.578066 seconds (3.66 M allocations: 121.229 GiB, 6.55% gc time)
(4.786798337602201, [60.0746, 4.58298, 6.35377, 2.11136, -1.02675e-6, 4.18812], :FTOL_REACHED)
Note that the parameter estimates are pretty far off from what BLP actually estimate. This is because 1,000 random draws per market do a poor job of approximating the market share integrals. I have previously discussed some methods to reduce simulation error, such as Halton draws and quadrature, and will provide some notes on importance sampling in a future post.
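As a point of reference, here is a minimal sketch of how Halton draws could be constructed with the Primes and StatsFuns packages loaded at the top of this post. The halton helper and the choice of bases are purely illustrative; this is not the code used in my earlier posts.

# Radical-inverse (Halton) sequence in base b: a sketch, not production code.
function halton(b::Int, n::Int)
    x = zeros(n)
    for i = 1:n
        f = 1.0
        k = i
        while k > 0
            f /= b
            x[i] += f * (k % b)
            k = div(k, b)
        end
    end
    return x
end

bases = primes(13)                              # [2, 3, 5, 7, 11, 13]: one base per dimension
u = hcat([halton(b, 1000) for b in bases]...)   # 1000 quasi-uniform points in 6 dimensions
v = norminvcdf.(u)                              # map to standard normal draws via StatsFuns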
References
- Berry, Steven, James Levinsohn, and Ariel Pakes. “Automobile prices in market equilibrium.” Econometrica: Journal of the Econometric Society (1995): 841-890.
- Reynaerts, Jo, R. Varadhan, and John C. Nash. “Enhancing the Convergence Properties of the BLP (1995) Contraction Mapping.” (2012).
- Skrainka, Benjamin S., and Kenneth L. Judd. “High performance quadrature rules: How numerical integration affects a popular model of product differentiation.” (2011).