Artificial neural networks are fascinating machine learning algorithms. They used to be considered unreliable and computationally very expensive. It is now known that modern neural networks can be quite useful, but their high computational cost unfortunately remains. Statistical boosting is considered one of the most important ideas in machine learning. It is based on an ensemble of weak models that together form a powerful learning system. The first goal of this thesis is to compare these machine learning models on three use cases. The first use case deals with modeling the probability of burglary in the city of Chicago. The second is a typical example of customer churn prediction in the telecommunications industry, and the third is related to computer vision. The second goal of this thesis is to introduce an open-source machine learning platform called H2O. It includes, among other things, an interface for R, and it is designed to run in standalone mode or on Hadoop. The thesis also includes an introduction to Apache Hadoop, an open-source software library that allows for distributed processing of big data, and specifically to its open-source distribution, Hortonworks Data Platform.
Identifier | oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:264614 |
Date | January 2016 |
Creators | Sabo, Juraj |
Contributors | Bašta, Milan, Plašil, Miroslav |
Publisher | Vysoká škola ekonomická v Praze |
Source Sets | Czech ETDs |
Language | English |
Detected Language | English |
Type | info:eu-repo/semantics/masterThesis |
Rights | info:eu-repo/semantics/restrictedAccess |