Aaltodoc publication archive (Aalto University institutional repository)
School of Business | Department of Information and Service Economy | Quantitative Methods of Economics | 2013
Thesis number: 13524
Predicting website audience demographics based on browsing history
|Year:||2013|
|Language:||eng|
|Department:||Department of Information and Service Economy|
|Academic subject:||Quantitative Methods of Economics|
|Index terms:||economic science; media; internet; consumer behaviour; evaluation; ratings; statistical science|
» hse_ethesis_13524.pdf size:6 MB (5608679)
|Key terms:||demographic prediction; demographic targeting; browsing behavior; clickstream analysis; web user profiling; web analytics; classification; logistic regression; web cookies|
Objectives of the Study:
The objective of the study was to explore whether the demographics of web users can be predicted from their browsing behavior. To this end, the problem of predicting online audience demographics was addressed from three perspectives. First, the study examined the quality of the input data for predictive models and its impact on prediction accuracy. Second, it analyzed how the demographics of web users influence their online behavior. Finally, it focused on identifying factors useful for prediction.
Academic background and methodology:
The scientific literature records several previous attempts to predict online audience demographics, and some studies examine demographic differences in online behavior. However, the quality of input data for predictive models is almost entirely ignored. Two theoretical frameworks for the study were formed on the basis of the literature review. The other research method used in the study was statistical analysis, including t-tests, z-tests, ANOVA, linear regression and logistic regression models.
Findings and conclusions:
The study identified several factors that greatly degrade the quality of input data for models predicting online audience demographics. These factors reduce prediction accuracy in several ways, including smaller datasets, overestimation of the size of some demographic groups, and incorrect models. The study also indicated that demographic groups differ in their online behavior, including preferred website content, website visiting patterns over time, and likelihood of clicking online ads. Information on these aspects of online behavior can therefore be used to predict the demographics of web users.
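As a rough illustration of how behavioral features might feed a logistic regression classifier, the following minimal sketch fits a model to synthetic data. It is not the thesis's model or data: the two features (share of visits to one content category, share of visits in the evening), the binary group label, and all numeric values are hypothetical assumptions chosen only to make the example runnable.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def clamp(v, lo=0.0, hi=1.0):
    return max(lo, min(hi, v))


def make_toy_data(n, seed=0):
    """Synthetic (hypothetical) browsing features for two made-up groups.

    Feature 0: share of visits to one content category (differs by group).
    Feature 1: share of visits made in the evening (same for both groups).
    """
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        content_share = clamp(rng.gauss(0.7 if label else 0.3, 0.1))
        evening_share = clamp(rng.gauss(0.5, 0.15))
        X.append([content_share, evening_share])
        y.append(label)
    return X, y


def predict_proba(w, x):
    # w[0] is the intercept; the rest are feature weights.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def run_demo():
    X, y = make_toy_data(200)
    w = fit_logistic(X, y)
    hits = sum(int(predict_proba(w, xi) >= 0.5) == yi for xi, yi in zip(X, y))
    return w, hits / len(y)


weights, accuracy = run_demo()
print("weights (intercept first):", [round(v, 2) for v in weights])
print("training accuracy:", round(accuracy, 3))
```

Because the synthetic groups differ only in the first feature, the fitted weight on that feature should dominate. Real clickstream data would call for a held-out test set and, given the data-quality issues the study describes, careful checking of the demographic labels before any such model is trusted.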
Electronic publications are subject to copyright. The publications can be read freely and printed for personal use. Use for commercial purposes is forbidden.