31 |
Využití statistických metod při oceňování nemovitostí / Valuation of real estates using statistical methods
Funiok, Ondřej, January 2017
The thesis deals with the valuation of real estate in the Czech Republic using statistical methods. The work focuses on a complex task based on data from an advertising web portal. The aim of the thesis is to create a prototype statistical prediction model for valuing residential properties in Prague and to evaluate the possibilities for its wider use. The work is structured according to the CRISP-DM methodology. Regression trees and random forests are tested on the pre-processed data and used to predict real estate prices.
|
32 |
Reálná úloha dobývání znalostí / The real task of data mining
Trondin, Anton, January 2012
The diploma thesis "The real role of knowledge mining" is divided into two major parts, the theoretical and the practical. The theoretical part describes the basic concepts of data mining, the various methods and types of tasks used for knowledge discovery in databases, and the algorithms used in this area. The main focus is on the CRISP-DM methodology and the individual stages of knowledge discovery in databases. This methodology later serves as the basis for the practical part of the thesis, while other, less well-known data mining methods are not neglected. A list of paid and free software that can be used for knowledge mining in databases is presented at the end of the theoretical part. The second part of the thesis is a practical, step-by-step application of the CRISP-DM methodology to real data from the field of mobile communications. The data mining task in the practical part is the prediction of the behavior of mobile carrier customers. IBM SPSS Modeler was used as the main software for knowledge mining in the practical part. Key words: data mining, knowledge discovery in databases, churn management, prediction, CRISP-DM.
|
33 |
Aplikace metod DZD na otevřená data / Use of data mining techniques for open data
Prokůpek, Miroslav, January 2015
This diploma thesis examines the application of data mining methods to open data. It does so by solving analytical questions using the LISp-Miner system. The analytical questions are examined on data from the Czech Trade Inspection Authority from the perspective of the data owner. The procedure used to solve the analytical questions is 4ft-Miner. Four analytical questions are presented and resolved; they constitute the results of the work. The work includes a detailed description of the transformation of the relational database into a format suitable for data mining, as well as a detailed description of the data. The theoretical part deals with the GUHA method and the CRISP-DM methodology.
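As a rough illustration of the kind of rule that a 4ft-Miner task evaluates, the sketch below builds a fourfold (4ft) contingency table for an antecedent and a succedent and checks the founded implication quantifier, that is, confidence a/(a+b) of at least p with at least Base supporting rows. The records, attribute names and thresholds are invented for this example and are not taken from the thesis; LISp-Miner itself is not called here.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]
Condition = Callable[[Row], bool]


def fourfold_table(rows: List[Row], antecedent: Condition,
                   succedent: Condition) -> Tuple[int, int, int, int]:
    """Count the four frequencies (a, b, c, d) of the 4ft table.

    a: antecedent and succedent hold    b: antecedent only
    c: succedent only                   d: neither holds
    """
    a = b = c = d = 0
    for row in rows:
        ant, suc = antecedent(row), succedent(row)
        if ant and suc:
            a += 1
        elif ant:
            b += 1
        elif suc:
            c += 1
        else:
            d += 1
    return a, b, c, d


def founded_implication(table: Tuple[int, int, int, int],
                        p: float, base: int) -> bool:
    """True if a >= base and the confidence a / (a + b) is at least p."""
    a, b, _, _ = table
    return a >= base and (a + b) > 0 and a / (a + b) >= p


# Hypothetical inspection records (not real data).
ROWS: List[Row] = [
    {"region": "Prague", "sector": "retail", "fine": "yes"},
    {"region": "Prague", "sector": "retail", "fine": "yes"},
    {"region": "Prague", "sector": "retail", "fine": "no"},
    {"region": "Brno", "sector": "retail", "fine": "no"},
    {"region": "Brno", "sector": "services", "fine": "yes"},
    {"region": "Prague", "sector": "services", "fine": "no"},
]

# Rule under test: region(Prague) & sector(retail) => fine(yes)
table = fourfold_table(
    ROWS,
    antecedent=lambda r: r["region"] == "Prague" and r["sector"] == "retail",
    succedent=lambda r: r["fine"] == "yes",
)
print(table, founded_implication(table, p=0.6, base=2))
```

A real task would enumerate many candidate antecedents and succedents and keep only the rules that pass the quantifier; this sketch evaluates a single rule.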
|
34 |
Aplikace data miningu v podnikové praxi / Data mining applications in business practice
Trávníček, Petr, January 2011
Over the last decades, knowledge discovery from databases, one of the disciplines of information and communication technologies, has developed into its current state and is attracting increasing interest, not only from major business corporations. This diploma thesis deals with data mining, paying particular attention to its practical use in a business environment. The objective of the thesis is to review the possibilities of data mining applications and to break down implementation techniques, focusing on specific data mining methods and algorithms as well as on the adaptation of business processes. This objective is addressed in the theoretical part of the thesis, which covers the principles of data mining, the process of knowledge discovery from databases, commonly used data mining methods and algorithms, and finally the tasks typically implemented in this domain. A further objective is to present the benefits of data mining on a model example, which is shown in the practical part of the thesis. Besides an evaluation of the created data mining models, the practical part also contains a design of subsequent steps that would enable higher efficiency in some specific areas of the given business. I consider this design, together with the characterization of the knowledge discovery in databases process, to be the most beneficial contributions of the thesis.
|
35 |
Dolování dat / Data Mining
Stehno, David, January 2013
The aim of the thesis was to study and describe the CRISP-DM data mining methodology. A prediction was made from a collected database of calls to a call center, following the CRISP-DM methodology. In the modeling phase, four different methods were tested: k-NN, a neural network, linear regression and a support vector machine. The importance of the input attributes for further prediction was evaluated on the basis of different selections. The results and findings may provide data for more accurate forecasts in the future, not only of the number of calls but also of other indicators relevant to the call center.
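To make the first of the tested methods concrete, here is a minimal sketch of k-NN regression for call-volume prediction. The feature encoding (day of week, holiday flag), the history values and the function name `knn_predict` are assumptions made for illustration; the thesis's actual attributes, data and tooling are not described in the abstract.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], float]  # (feature vector, observed call count)


def knn_predict(train: List[Sample], query: Sequence[float], k: int = 3) -> float:
    """Predict a value as the mean target of the k nearest training samples.

    Distance is plain Euclidean distance on the raw features; a real model
    would normally scale or encode the features first.
    """
    if not 1 <= k <= len(train):
        raise ValueError("k must be between 1 and the number of training samples")
    nearest = sorted(train, key=lambda sample: math.dist(sample[0], query))[:k]
    return sum(target for _, target in nearest) / k


# Hypothetical history: (day of week 0-6, holiday flag) -> number of calls.
HISTORY: List[Sample] = [
    ((0, 0), 120.0),
    ((1, 0), 110.0),
    ((2, 0), 100.0),
    ((5, 0), 40.0),
    ((6, 0), 30.0),
    ((6, 1), 20.0),
]

# Forecast for a non-holiday Tuesday from its three nearest neighbours.
print(knn_predict(HISTORY, query=(1, 0), k=3))
```

The choice of k and of the distance measure are the main tuning decisions in this method; both are fixed arbitrarily here.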
|
36 |
Využití data miningu v personální agentuře / Utilization of Data Mining for Personnel Agency
Ondruš, Erik, January 2017
This master’s thesis looks into the use of data mining for segmentation and for the prediction of onboarding candidates of a recruitment agency. The results should make company processes for handling orders more effective and should also facilitate a more personal approach to candidates. The first chapter presents the essential theoretical foundations from the fields of Business Intelligence, data warehouses, data mining and marketing. An analysis of the current state follows, focusing on capturing the key processes involved in processing an order. The last chapter covers the proposed solution and its implementation on the Microsoft SQL Server 2014 platform. The thesis concludes with proposals for utilizing data mining in direct marketing.
|
37 |
DATA MINING IN PRACTICE : An application of the CRISP-DM framework in healthcare
Lind, Emma; Glas, Sofi, January 2022
With extensive data available in today's organizations, it has become increasingly important to secure valuable insights through data. As a result, the management of data to support decision-making processes is receiving increasing attention in organizations' IT strategies. The healthcare sector is no exception. However, there is an urgent need for tools that help organizations extract valuable insights from the rapidly growing volumes of data, and data mining is one of the most important steps in doing so. So far, the healthcare sector has not found a way to harness its full potential, due to limited methods for extracting useful knowledge hidden in large data sets. Knowledge gained from data mining can help healthcare to better serve patients, but there is only a limited picture of how data mining processes are applied in healthcare. Against this background, the purpose of this study is to investigate practical dimensions of the data mining process in healthcare and to identify barriers that can inhibit this process. To answer our research question, we used a qualitative case study with semi-structured interviews based on the CRISP-DM framework. Our findings indicate barriers that can inhibit the data mining process, which are related to the objectives, data availability and final reports.
|
38 |
Robustness of Machine Learning algorithms applied to gas turbines / Robusthet av maskininlärningsalgoritmer i gasturbiner
Cardenas Meza, Andres Felipe, January 2024
This thesis demonstrates the successful development of a software sensor for Siemens Energy's SGT-700 gas turbines using machine learning algorithms. Our goal was to enhance the robustness of measurements and redundancies, enabling early detection of sensor or turbine malfunctions and contributing to predictive maintenance methodologies. The research is based on a real-world case study, implementing the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology in an industrial setting. The thesis details the process from dataset preparation and data exploration to algorithm development and evaluation, providing a comprehensive view of the development process. This work is a step towards integrating machine learning into gas turbine systems. The data preparation process highlights the challenges that arise in the industrial application of data-driven methodologies due to inevitable data quality issues. It provides insight into potential future improvements, such as the constraint programming approach used for dataset construction in this thesis, which remains a valuable tool for future research. The range of algorithms proposed for the software sensor's development spans from basic to more complex methods, including shallow networks, ensemble methods and recurrent neural networks. Our findings explore the limitations and potential of the proposed algorithms, providing valuable insights into the practical application of machine learning in gas turbines. This includes assessing the reliability of these solutions, their role in monitoring machine health over time, and the importance of clean, usable data in driving accurate and satisfactory estimates of different variables in gas turbines. The research underscores that, while replacing a physical sensor with a software sensor is not yet feasible, integrating these solutions into gas turbine systems for health monitoring is indeed possible. This work lays the groundwork for future advancements and discoveries in the field. /
Denna avhandling dokumenterar den framgångsrika utvecklingen av en mjukvarusensor för Siemens Energy's SGT-700 gasturbiner med hjälp av maskininlärningsalgoritmer. Vårt mål var att öka mätkvaliteten samt införa redundans, vilket möjliggör tidig upptäckt av sensor- eller turbinfel och bidrar till utvecklingen av prediktiv underhållsmetodik. Forskningen baseras på en verklig fallstudie, implementerad enligt Cross Industry Standard Process for Data Mining-metodiken i en industriell miljö. Avhandlingen beskriver processen från datamängdsförberedelse och datautforskning till utveckling och utvärdering av algoritmer, vilket ger en heltäckande bild av utvecklingsprocessen. Detta arbete är ett steg mot att integrera maskininlärning i gasturbinssystem. Dataförberedelsesprocessen belyser de utmaningar som uppstår vid industriell tillämpning av datadrivna metoder på grund av oundvikliga datakvalitetsproblem. Det ger insikt i potentiella framtida förbättringar, såsom den begränsningsprogrammeringsansats som används för datamängdskonstruktion i denna avhandling, vilket förblir ett värdefullt verktyg för framtida forskning. Utvecklingen av mjukvarusensorn sträcker sig från grundläggande till mer komplexa metoder, inklusive ytliga nätverk, ensemblemetoder och återkommande neurala nätverk. Våra resultat utforskar begränsningarna och potentialen hos de föreslagna algoritmerna och ger värdefulla insikter i den praktiska tillämpningen av maskininlärning i gasturbiner. Detta inkluderar att bedöma tillförlitligheten hos dessa lösningar, deras roll i övervakning av maskinhälsa över tid och vikten av ren, användbar data för att generera korrekta och tillfredsställande uppskattningar av olika variabler i gasturbiner. Forskningen understryker att, medan det ännu inte är genomförbart att ersätta en fysisk sensor med en mjukvarusensor, är det verkligen möjligt att integrera dessa lösningar i gasturbinssystem för tillståndsövervakning. Detta arbete lägger grunden för vidare studier och upptäckter inom området. /
Esta tesis demuestra el exitoso desarrollo de un sensor basado en software para las turbinas de gas SGT-700 de Siemens Energy utilizando algoritmos de aprendizaje automático. Esto con el objetivo de contribuir a las metodologías de mantenimiento predictivo. La investigación se basa en un estudio industrial que implementa la metodología de Proceso Estándar de la Industria para la Minería de Datos, cuyo acrónimo en inglés es CRISP-DM. La tesis detalla el proceso desde la preparación del 'dataset', la exploración de datos hasta el desarrollo y evaluación de algoritmos, proporcionando una visión holística del proceso de desarrollo. Este trabajo representa un paso hacia la integración del aprendizaje automático en turbinas de gas. Nuestros hallazgos exploran las limitaciones y el potencial de los algoritmos propuestos, proporcionando un análisis sobre la aplicación práctica del aprendizaje automático en turbinas de gas. Esto incluye evaluar la confiabilidad de estas soluciones, su papel en la monitorización de la salud de la máquina a lo largo del tiempo, y la importancia de los datos limpios y utilizables para impulsar estimaciones precisas y satisfactorias de diferentes variables en las turbinas de gas. La investigación sugiere que, aunque reemplazar un sensor físico con un sensor basado en aprendizaje automático aún no es factible, sí es posible integrar estas soluciones en los sistemas de turbinas de gas para monitorear el estado de la máquina.
|
39 |
Praktické uplatnění technologií data mining ve zdravotních pojišťovnách / Practical applications of data mining technologies in health insurance companies
Kulhavý, Lukáš, January 2010
This thesis focuses on data mining technology and its possible practical use in health insurance companies. The thesis defines the term data mining and its relation to the term knowledge discovery in databases. Data mining is explained, inter alia, through methodologies that describe the individual phases of the knowledge discovery process (CRISP-DM, SEMMA). The thesis also covers possible practical applications, technologies and products available on the market (both free and commercial). An introduction to the main data mining methods and specific algorithms (decision trees, association rules, neural networks and other methods) serves as the theoretical basis on which practical applications on real data from real health insurance companies are built. These applications seek the causes of increased remittances and predict churn. I solved them in the freely available systems Weka and LISp-Miner. The objective is to introduce and demonstrate data mining capabilities on this type of data and to demonstrate the capabilities of the Weka and LISp-Miner systems in solving tasks according to the CRISP-DM methodology. The last part of the thesis is devoted to cloud and grid computing in conjunction with data mining. It offers an insight into the possibilities of these technologies and their benefits for data mining. The possibilities of cloud computing are presented on the Amazon EC2 system; grid computing can be used through the Weka Experimenter interface.
|
40 |
Building a decision support system to predict the number of visitors to an amusement park : Using an Artificial Neural Network and Statistical Analysis
Johansson, Benjamin; Almqvist, Elias, January 2018
In this thesis, we develop a decision support system for the amusement park Skara Sommarland. The aim is to predict how many visitors will come to the park in order to help the management allocate the correct number of personnel on any given day. To achieve this, the widely used Cross-Industry Standard Process for Data Mining (CRISP-DM) framework was applied to build a multiple linear regression (MLR) function and an artificial neural network (ANN) model. The data used to develop the models were Skara Sommarland’s historical business data and historical weather data for the surrounding area. Additionally, a fully functional web application was built which allowed the management at Skara Sommarland to use the predictions in their daily operations. The ANN outperformed the MLR and managed to achieve about 80% accuracy in predicting the number of visitors, reaching the initial data mining goal set by the project group. The conclusion of this thesis is that an ANN can be used to predict the number of visitors to an amusement park similar to Skara Sommarland. The final IT artifact produced can realistically help improve an amusement park’s operations by avoiding over- and under-staffing.
|