
Predictive Modeling of Human Behavior: Supervised Learning from Telecom Metadata

Big data, and telecom metadata in particular, opens new opportunities for understanding human behavior by combining machine learning and big data processing methods with interdisciplinary knowledge of human behavior. This thesis develops new methods for predictive modeling of human behavior from anonymized telecom metadata, at the individual level and at the level of large groups. The methods were studied during research projects carried out in 2012-2016 in collaboration with Telecom Italia, Telefonica Research, MIT Media Lab and the University of Trento. It is shown that patterns of human dynamics can be reliably recognized from behavioral metrics derived from mobile phone and cellular network activity (call logs, SMS logs, Bluetooth interactions, internet consumption).
At the individual level, the results are validated on two use cases: detecting daily stress and estimating subjective happiness. An original approach is introduced for feature extraction, feature selection, recognition model training and validation. Experimental results based on ensemble stochastic classification and regression tree models are discussed. At the large group level, following big data for social good challenges, the problem of crime hotspot prediction is formulated and solved. The proposed approach uses demographic information together with human mobility characteristics derived from anonymized and aggregated mobile network data. The models, built on and evaluated against real crime data from London, achieve an accuracy of almost 70% when classifying whether a specific area of the city will be a crime hotspot in the following month. Electric energy consumption patterns are correlated with human behavior patterns in a highly nonlinear way. The second large-scale group behavior result is formulated as predicting next-week energy consumption from an analysis of human dynamics derived from anonymized and aggregated telecom data, processed from GSM network call detail records (CDRs). The proposed solution could serve energy producers and distributors as an essential complement to smart meter data, supporting better decisions: reducing total primary energy consumption by limiting production when demand is not predicted, reducing distribution costs through efficient, timely buy-side planning, and providing insights for peak load planning in geographic space. All the experimental results rely on the introduced methodology, which is efficient to implement for most multimedia and real-time applications thanks to its highly reduced, low-dimensional feature space and reduced machine learning pipelines. The indicators with strong predictive power are also discussed, opening new horizons for computational social science studies.

Identifier: oai:union.ndltd.org:unitn.it/oai:iris.unitn.it:11572/367800
Date: January 2017
Creators: Bogomolov, Andrey
Contributors: Bogomolov, Andrey, Pianesi, Fabio
Publisher: Università degli studi di Trento, place:TRENTO
Source Sets: Università di Trento
Language: English
Detected Language: English
Type: info:eu-repo/semantics/doctoralThesis
Rights: info:eu-repo/semantics/openAccess
Relation: firstpage:1, lastpage:78, numberofpages:78
