5 Conclusions

Through this example, we have learned that it is possible to arrive at similar model estimates using big data in place of a survey. However, this result depends on adjustments that must be made to one or the other data set to account for differences in the sampled universe and in the definitions of the variables. To arrive at similar model estimates using the CCP and SCF, one must first adjust for the fact that the CCP does not observe people with no credit records: some nonborrowers need to be dropped from the SCF sample or added to the CCP sample. Also, the similarity in the models seems to be driven by the predictability of the largest category of debt, mortgages. Models of auto debt arrive at very different estimates using CCP data rather than SCF data, even though the two sampled distributions of auto debt are very similar. Models of credit card and student loan debt show even more disparity.

While surveys usually collect demographic data and ask questions on multiple related topics, big data sets contain only the variables created for the data sets’ original purposes. In the demonstration above, we see both the potential and the limitations of merging in external data. The CCP data was augmented with ACS data by assigning tract-level measures according to location, age, and family structure. Estimates using the merged income and education data appear to do an adequate to good job of replicating individual observations. However, the attempt to represent the influence of children on borrowing is not successful. The prevalence of children in the borrower’s tract seems to represent something different from an indicator of children in the borrower’s own household: the model coefficients are much higher on the tract-level measure.
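
As an illustration of the tract-level augmentation summarized above, the following is a minimal sketch, not the procedure used in this paper. It assumes each credit record carries a tract identifier and an age, and that tract-level measures are keyed by tract and age bracket; matching on family structure is omitted for brevity. All names (`ACS_MEASURES`, `age_group`, `augment`), the age brackets, and every value are hypothetical.

```python
# Hypothetical tract-level measures keyed by (tract_id, age_group).
# Identifiers and values are illustrative only; they are not ACS or CCP data.
ACS_MEASURES = {
    ("T001", "25-34"): {"median_income": 52000, "share_college": 0.40,
                        "share_with_children": 0.35},
    ("T001", "35-44"): {"median_income": 61000, "share_college": 0.38,
                        "share_with_children": 0.55},
    ("T002", "25-34"): {"median_income": 44000, "share_college": 0.25,
                        "share_with_children": 0.42},
}

# Measures left empty when a record has no matching tract and age bracket.
MISSING = {"median_income": None, "share_college": None,
           "share_with_children": None}


def age_group(age):
    """Map an age in years to an assumed age bracket."""
    if age < 25:
        return "under 25"
    if age < 35:
        return "25-34"
    if age < 45:
        return "35-44"
    return "45+"


def augment(records, measures):
    """Return copies of the records with tract-level measures attached."""
    augmented = []
    for rec in records:
        key = (rec["tract_id"], age_group(rec["age"]))
        merged = dict(rec)  # copy so the input records are not modified
        merged.update(measures.get(key, MISSING))
        augmented.append(merged)
    return augmented


if __name__ == "__main__":
    sample = [
        {"id": 1, "tract_id": "T001", "age": 30},
        {"id": 2, "tract_id": "T999", "age": 50},  # no matching tract
    ]
    for row in augment(sample, ACS_MEASURES):
        print(row)
```

Note that every record in the same tract and age bracket receives the same `share_with_children` value. The merged variable therefore describes the borrower's neighborhood rather than the borrower's own household, which is consistent with the discrepancy noted above for the children measure.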