1 |
Expanding Data Mining Theory for Industrial Applications. January 2012
abstract: Data mining is widely accepted as a way to guide data-driven decision-making in many business problems. Recently, however, the scope of these problems has grown, and the methods are under scrutiny for their applicability and relevance to real-world circumstances. At the crossroads of innovation and standards, it is important to examine whether the current theoretical methods for industrial applications (which include KDD, SEMMA and CRISP-DM) cover all scenarios that could arise in practice. Do the methods require changes or enhancements? In this thesis I study the current methods, delineate their main ideas, and identify the shortcomings that posed challenges during practical implementation. Based on the experiments and research carried out, I propose an approach that illustrates business problems with higher accuracy and provides a broader view of the process. The approach is then applied to several case studies that highlight its different aspects. / Dissertation/Thesis / M.S. Computer Science 2012
|
2 |
Evaluating Frameworks for Implementing Machine Learning in Signal Processing: A Comparative Study of CRISP-DM, SEMMA and KDD. Dåderman, Antonia; Rosander, Sara. January 2018
Machine learning is when a computer learns from data and draws its own conclusions without being explicitly programmed to do so. Implementing machine learning effectively and correctly requires a structured framework to follow. Several frameworks exist today, but none is suited to every machine learning purpose. This thesis evaluates three chosen frameworks, CRISP-DM, SEMMA and KDD, for the purpose of implementing machine learning in signal processing. The study was conducted at Saab AB in Järfälla. The specific problem area of signal processing evaluated in the thesis was radar warning systems; the hypothesis is that they could become more efficient with machine learning. To evaluate the chosen frameworks, the authors studied what is demanded of a framework when implementing machine learning in the chosen problem area. The evaluation was a theoretical comparison; none of the frameworks was implemented. The frameworks were assessed with an evaluation method created by the authors, with the aim of finding a framework suitable for signal processing when developing the software for a radar warning system. The result is that CRISP-DM is the best suited of the three frameworks, because it originates from a business perspective, has clear guidelines for its use, and is easy to implement in an agile process such as Scrum.
|
3 |
Methodologies for knowledge discovery in databases: a comparative study / Metodologías para el descubrimiento de conocimiento en bases de datos: un estudio comparativo. Moine, Juan Miguel. 23 September 2013
To carry out the process of knowledge discovery in databases, known as data mining, in a systematic way, a methodology must be implemented.
Data mining methodologies are currently at an early stage of maturity, although some, such as CRISP-DM, are already being used successfully by project teams to manage their projects.
This work presents a comparative analysis of the most widely used data mining methodologies today. To accomplish this, and as the contribution of this thesis, a comparative framework is proposed that makes explicit the characteristics that should be taken into account when carrying out this comparison.
|
4 |
Datamining - theory and it's application / Datamining - teorie a praxe. Popelka, Aleš. January 2012
This thesis deals with the technology known as data mining. It first describes data mining as an independent discipline, then its processing methods and most common uses. Data mining is then explained with the help of methodologies that describe all parts of the knowledge discovery in databases process: CRISP-DM and SEMMA. The study also presents new data mining methods and particular algorithms: decision trees, neural networks and genetic algorithms. This material serves as a theoretical introduction, which is followed by a practical application that searches for the causes of meningoencephalitis development in a sample of patients. Decision trees in the Clementine system, one of the leading data mining tools, were used for the analysis.
|
5 |
Webový portál pro správu a klasifikaci informací z distribuovaných zdrojů / Web Application for Managing and Classifying Information from Distributed Sources. Vrána, Pavel. January 2011
This master's thesis deals with data mining techniques and the classification of data into specified categories. Its goal is to implement a web portal for administering and classifying data from distributed sources. To achieve this goal, it is necessary to test different methods and find the one most appropriate for classifying web articles. From the results obtained, an automated application will be developed for downloading and classifying data from different sources, which could ultimately replace a user who would otherwise process all the tasks manually.
|