Aaltodoc publication archive (Aalto University institutional repository)
School of Business | Department of Economics | Economics | 2016
Thesis number: 14370
Machine learning in applied econometrics: Deriving personal income drivers with randomized decision forests
|Year:||2016|
|Language:||eng|
|Department:||Department of Economics|
|Index terms:||taloustieteet; economic science; ekonometria; econometrics; tietämyksenhallinta; knowledge management; oppiminen; learning; varallisuus; wealth; kehitys; development; Yhdysvallat; United States|
|Key terms:||econometrics; machine learning; decision trees; causality; big data; random forests; income; American Community Survey|
In this paper I explore a modern field of research in applied econometrics: machine learning and the estimation of synthetic treatment effects.
Data generation is growing exponentially: smartphones, social media and networks of interconnected devices are generating information at an unprecedented pace. The size, structure and velocity of these information streams vary widely. The field of econometrics is also evolving: classic econometric models can produce biased results with big data and do not scale to modern data sets.
I propose the well-performing Random Forests algorithm for use in econometrics. To adapt this method for causal analysis, recent theory on causal decision trees is explored. The proposed framework is then tested by estimating personal income drivers for the top 1% of the U.S. population. The data used is the American Community Survey 5-year sample, consisting of approximately 20 million rows.
It appears that high income is in fact driven by four core factors: education, experience, working hours and gender. To rank these predictors, a synthetic treatment effect simulation is run. I find that investing in education after a master's degree has a significant positive effect on the likelihood of high income. Additionally, it appears that the negative gender income effect for females can be undone with a combination of work experience and exceptional work ethic.
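One plausible reading of the synthetic treatment effect simulation described above is sketched here; it is an illustration under stated assumptions, not the thesis's actual procedure. A "treatment" feature is switched on and then off for every row, all other columns are held fixed, and the average change in the predicted probability of high income is recorded. The function `predict_high_income` is a hypothetical hand-written stand-in for a fitted forest, and its coefficients, the column names and the randomly generated rows are all invented for the example.

```python
import math
import random


def predict_high_income(row):
    """Hypothetical stand-in for a fitted forest's predicted probability.

    The coefficients are invented for illustration only.
    """
    score = (
        -4.0
        + 0.8 * row["postgrad"]
        + 0.05 * row["experience"]
        + 0.03 * row["hours"]
        - 0.5 * row["female"]
    )
    return 1.0 / (1.0 + math.exp(-score))


def synthetic_treatment_effect(rows, predict, feature, treated=1, control=0):
    """Average change in predicted probability when `feature` is set to
    `treated` versus `control` on every row, other columns held fixed."""
    diffs = []
    for row in rows:
        # Copy the row with the feature overridden; the input is not mutated.
        p_treated = predict({**row, feature: treated})
        p_control = predict({**row, feature: control})
        diffs.append(p_treated - p_control)
    return sum(diffs) / len(diffs)


# Invented toy rows; these are not American Community Survey records.
random.seed(0)
rows = [
    {
        "postgrad": random.randint(0, 1),
        "experience": random.uniform(0, 40),
        "hours": random.uniform(20, 70),
        "female": random.randint(0, 1),
    }
    for _ in range(1000)
]

effects = {
    name: synthetic_treatment_effect(rows, predict_high_income, name)
    for name in ("postgrad", "female")
}

# Rank the simulated treatments by the size of their average effect.
for name, effect in sorted(effects.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{name}: {effect:+.4f}")
```

In practice the stand-in function would be replaced by the probability output of a trained forest. The resulting numbers describe how the model responds to the switch and carry a causal interpretation only under additional assumptions.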
Electronic publications are subject to copyright. The publications can be read freely and printed for personal use. Use for commercial purposes is forbidden.