Free download of the English article "Big Data versus a survey" - Elsevier 2018

Persian title
Comparing Big Data with a survey
English title
Big Data versus a survey
Pages (Persian article)
0
Pages (English article)
32
Year of publication
2018
Publisher
Elsevier
Format of the English article
PDF
Product code
E6168
Fields related to this article
Economics
Specializations related to this article
Financial economics
Journal
The Quarterly Review of Economics and Finance
University
Research Economist - Research Department Federal Reserve Bank of Cleveland
Keywords
Big data, survey data, household debt
Abstract


Economists are shifting resources from work on survey data to work involving “Big Data.” This analysis is an empirical exploration of the trade-offs this substitution requires. Parallel models are estimated using Equifax credit bureau data and Survey of Consumer Finances data. After adjustments to account for different variable definitions and sampled populations, it is possible to arrive at similar models of total household debt. However, the estimates are sensitive to the adjustments. In this example, some external education and income measures are successfully integrated with the big data, but other external aggregates fail to adequately substitute for survey responses.

5 Conclusions

Through this example, we have learned that it is possible to arrive at similar model estimates using big data in place of a survey. However, this result is dependent on adjustments that must be made to one or the other data set to account for differences in the sampled universe and definitions of the variables. To arrive at similar model estimates using the CCP and SCF, one must first adjust for the CCP’s lack of observation of people with no credit records. Some nonborrowers need to be dropped from the SCF sample or added to the CCP sample. Also, the similarity in the models seems to be driven by the predictability of the largest category of debt, mortgages. Models of auto debt arrive at very different estimates using CCP data rather than SCF data even though the two sampled distributions of auto debt are very similar. Models of credit card and student loan debt show even more disparity.

While surveys usually collect demographic data and questions on multiple related topics, big data sets will only contain variables created for the data sets’ original purposes. In the demonstration above, we see both the potential and limitations of merging in external data. The CCP data was augmented with ACS data by assigning tract-level measures according to location, age, and family structure. Estimates using the merged income and education data appear to do an adequate to good job of replicating individual observations. However, in the case of representing the influence of children on borrowing, the attempt is not successful. The prevalence of children in the borrower’s tract seems to represent something different from an indicator of children in the borrower’s own household. The model coefficients are much higher on the tract-level measure.
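The two adjustments described in the conclusions can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's procedure: the records, the field names (`has_credit_record`, `tract`, `age_group`, `family`), and the helper functions `align_universe` and `attach_tract_measures` are all hypothetical stand-ins for the SCF, CCP, and ACS files, and every number is made up.

```python
from statistics import mean

# Hypothetical survey-style records (stand-in for SCF respondents).
survey = [
    {"id": 1, "total_debt": 120000.0, "has_credit_record": True},
    {"id": 2, "total_debt": 0.0, "has_credit_record": False},
    {"id": 3, "total_debt": 8000.0, "has_credit_record": True},
    {"id": 4, "total_debt": 0.0, "has_credit_record": True},
]

# Hypothetical credit-bureau-style records (stand-in for the CCP).
bureau = [
    {"id": "a", "tract": "T1", "age_group": "25-34", "family": "couple",
     "total_debt": 95000.0},
    {"id": "b", "tract": "T2", "age_group": "35-44", "family": "single",
     "total_debt": 15000.0},
    {"id": "c", "tract": "T9", "age_group": "25-34", "family": "single",
     "total_debt": 4000.0},
]

# Hypothetical tract-level aggregates (stand-in for ACS tables),
# keyed by (tract, age_group, family).
tract_measures = {
    ("T1", "25-34", "couple"): {"median_income": 52000.0, "share_college": 0.41},
    ("T2", "35-44", "single"): {"median_income": 67000.0, "share_college": 0.35},
}


def align_universe(records):
    """Drop survey respondents who would not appear in a credit bureau file."""
    return [r for r in records if r["has_credit_record"]]


def attach_tract_measures(records, measures):
    """Left-join tract-level aggregates; unmatched records get None values."""
    merged = []
    for r in records:
        extra = measures.get((r["tract"], r["age_group"], r["family"]))
        row = dict(r)  # copy so the input records are not modified
        row["median_income"] = extra["median_income"] if extra else None
        row["share_college"] = extra["share_college"] if extra else None
        merged.append(row)
    return merged


survey_aligned = align_universe(survey)
bureau_augmented = attach_tract_measures(bureau, tract_measures)

# Mean debt before and after aligning the sampled universe.
mean_debt_all = mean(r["total_debt"] for r in survey)
mean_debt_aligned = mean(r["total_debt"] for r in survey_aligned)
```

In this toy data, dropping the one respondent without a credit record raises mean debt, which shows why model estimates can be sensitive to the universe adjustment. The attached measures describe a tract-level cell rather than the individual borrower, which is the limitation the conclusions point to for the children measure.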

