  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
151

Příprava cvičení pro dolování znalostí z báze dat - klasifikace a predikce / Design of exercises for data mining - Classification and prediction

Martiník, Jan January 2009
My master's thesis, "Design of exercises for data mining - Classification and prediction", deals with the most frequently used methods of classification and prediction. For classification, it covers association rules, Bayesian classification, genetic algorithms, the nearest neighbor method, neural networks and decision trees. For prediction, it covers linear and non-linear prediction. The work also contains a detailed summary of decision trees and a detailed algorithm for creating a decision tree, including the development of the individual diagrams. The proposed algorithm is tested on two data sets downloaded from the Internet. The results are compared with each other, and the differences between the two implementations are described. The work is written to give the reader an idea of the individual methods and techniques of data mining, their advantages and disadvantages, and some of the issues directly related to this topic.
152

Návrh rozhodovacích stromů na základě evolučních algoritmů / Decision Tree Design Based on Evolutionary Algorithms

Benda, Ondřej January 2012
This master's thesis deals with two algorithms for data stream mining: Very Fast Decision Tree (VFDT) and Concept-adapting Very Fast Decision Tree (CVFDT). The principle of classification by a decision tree is explained. The basic idea behind the construction of the Hoeffding Tree, which is the basis of the VFDT and CVFDT algorithms, is described. These algorithms are then discussed in more detail. The thesis also covers the design of a Genetic Programming (GP) algorithm, which is used to create a classifier for image data. The resulting classifier is used as an alternative way of classifying objects in images within the Viola-Jones framework. The thesis discusses the implementation of the algorithms, which are written in Java. The GP algorithm is integrated into the "Image Processing Extension" library of RapidMiner. The VFDT and CVFDT algorithms are tested on synthetic and real-world text data. The GP algorithm is tested on image data classification, and the resulting classifier is then tested on face detection in images.
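The abstract above names the Hoeffding Tree as the basis of VFDT and CVFDT. As a rough illustration only, and not the thesis's Java implementation, the standard Hoeffding bound and the split test built on it can be sketched in Python. The function names `hoeffding_bound` and `should_split`, and the example numbers, are assumptions made for this sketch.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound: with probability 1 - delta, the true mean of a
    variable with the given range lies within this distance of the mean
    observed over n independent samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Split a leaf once the observed gap between the two best attributes
    exceeds the Hoeffding bound (hypothetical helper for illustration)."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# Example with assumed numbers: 1000 examples seen at a leaf, gain range 1.0.
eps = hoeffding_bound(value_range=1.0, delta=1e-7, n=1000)
clear_winner = should_split(0.30, 0.15, value_range=1.0, delta=1e-7, n=1000)
too_close = should_split(0.30, 0.25, value_range=1.0, delta=1e-7, n=1000)
```

The bound shrinks as more examples arrive, so a leaf that cannot yet be split with confidence simply waits for more data from the stream.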
153

Reporting - ERP systém / Reporting - ERP System

Pála, Milan January 2013
This work deals with creating a module for an existing ERP system. The module should be able to produce data on production progress, monitor production productivity, and warn when an issue is about to occur. The work evaluates the processing of a large amount of data and shows different ways to precalculate data. It also presents a draft of how to predict information from known data.
154

A COMPARATIVE STUDY OF DEEP-LEARNING APPROACHES FOR ACTIVITY RECOGNITION USING SENSOR DATA IN SMART OFFICE ENVIRONMENTS

Johansson, Alexander, Sandberg, Oscar January 2018
The purpose of the study is to compare three deep learning networks with each other to evaluate which network can produce the highest prediction accuracy. Accuracy is measured as the networks try to predict the number of people in the room where observation takes place. In addition to comparing the three deep learning networks with each other, we also compare the networks with a traditional machine learning approach in order to find out whether deep learning methods perform better than traditional methods do. This study uses design and creation, a methodology that places great emphasis on developing an IT product and uses the product as its contribution to new knowledge. The methodology has five different phases; we chose to make an iterative process between the development and evaluation phases. Observation is the data generation method used to collect data. Data generation lasted for three weeks, resulting in 31287 rows of data recorded in our database. One of our deep learning networks produced an accuracy of 78.2%, while the other two networks produced accuracies of 45.6% and 40.3% respectively. For our traditional method we used decision trees with two different formulas, which produced accuracies of 61.3% and 57.2% respectively. The result of this thesis shows that, out of the three deep learning networks included in this study, only one is able to produce a higher predictive accuracy than the traditional machine learning approaches. This result does not necessarily mean that deep learning approaches in general are able to produce a higher predictive accuracy than traditional machine learning approaches. Further work could include additional experimentation with the dataset and the hyperparameters of the deep learning networks, gathering more data and properly validating it, and comparing more deep learning and machine learning approaches.
155

AN AGENT-BASED SYSTEMATIC ENSEMBLE APPROACH FOR AUTO AUCTION PREDICTION

Alfuhaid, Abdulaziz Ataallah January 2018
No description available.
156

A Statistical Analysis of Medical Data for Breast Cancer and Chronic Kidney Disease

Yang, Kaolee 05 May 2020
No description available.
157

Probability of Default Machine Learning Modeling : A Stress Testing Evaluation

Andersson, Tobias, Mentes, Mattias January 2023
This thesis aims to assist in the development of machine learning models tailored for stress testing. The main objective is to create models that can predict loan defaults while considering the impact of macroeconomic stress. By achieving this, Nordea can continue the development of machine learning models for stress testing by utilizing the models as a basis for further advancement. The research begins with an analysis of historical loan data, encompassing diverse customer and macroeconomic variables that influence loan default rates. Leveraging machine learning algorithms, feature selection methods, data imbalance management and model training techniques, a set of predictive models is constructed. These models aim to capture the intricate relationships between the identified variables and loan defaults, ensuring their suitability for stress testing purposes. The subsequent phase of the research focuses on subjecting the developed models to simulated adverse economic conditions during stress testing. By evaluating the models' performance under various stressed scenarios, their ability to provide predictions is assessed. This stress testing process allows us to analyse the models' capability to incorporate a stressed scenario in their predictions. The thesis concludes with an evaluation of the developed machine learning models and their ability to identify defaulted loans in a stressed macroeconomy. By creating these models specifically tailored for stress testing loans, we provide a basis for further development within the area of stress testing modeling.
158

Tillämpning av maskininlärning för att införa automatisk adaptiv uppvärmning genom en studie på KTH Live-In Labs lägenheter / Using machine learning to implement adaptive heating; A study on KTH Live-In Labs apartments

Åsenius, Ingrid January 2020
The purpose of this study is to investigate whether it is possible to decrease Sweden's energy consumption through adaptive heating that uses climate data to detect occupancy in apartments using machine learning. The study was carried out using environmental data from one of the KTH Live-In Labs apartments. The data was first used to investigate the possibility of detecting occupancy through machine learning and was then used as input to an adaptive heating model to investigate potential benefits for energy consumption and heating costs. The results of the study show that occupancy can be detected using environmental data, but not with 100% accuracy. They also show that the features with the greatest impact on detecting occupancy are light and carbon dioxide, and that the best performing machine learning algorithm for the dataset used is the Decision Tree algorithm. The potential energy savings through adaptive heating were estimated to be up to 10.1%. The final part of the paper discusses how a value-creating service can be built around adaptive heating and its potential to reach the market.
159

An Analysis Of Misclassification Rates For Decision Trees

Zhong, Mingyu 01 January 2007
The decision tree is a well-known methodology for classification and regression. In this dissertation, we focus on the minimization of the misclassification rate for decision tree classifiers. We derive the necessary equations that provide the optimal tree prediction, the estimated risk of the tree's prediction, and the reliability of the tree's risk estimation. We carry out an extensive analysis of the application of Lidstone's law of succession for the estimation of the class probabilities. In contrast to existing research, we not only compute the expected values of the risks but also calculate the corresponding reliability of the risk (measured by standard deviations). We also provide an explicit expression of the k-norm estimation for the tree's misclassification rate that combines both the expected value and the reliability. Furthermore, our proposed and proven theorem on k-norm estimation suggests an efficient pruning algorithm that has a clear theoretical interpretation, is easily implemented, and does not require a validation set. Our experiments show that our proposed pruning algorithm quickly produces accurate trees that compare very favorably with those of two other well-known pruning algorithms, CCP of CART and EBP of C4.5. Finally, our work provides a deeper understanding of decision trees.
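For readers unfamiliar with Lidstone's law of succession mentioned in the abstract above, the following minimal Python sketch shows the standard smoothed estimate of class probabilities at a single tree leaf, (n_c + λ) / (N + λK). It is a generic illustration, not the dissertation's derivation; the function name `lidstone_probabilities`, the parameter `lam`, and the example labels are assumptions made here.

```python
from collections import Counter


def lidstone_probabilities(labels, classes, lam=1.0):
    """Estimate class probabilities from the labels that reached a leaf.

    Each class c gets (n_c + lam) / (N + lam * K), where n_c is the count
    of class c, N the number of labels, and K the number of classes.
    lam = 1 gives Laplace's rule; lam = 0 gives raw relative frequencies.
    """
    counts = Counter(labels)
    total = len(labels)
    k = len(classes)
    return {c: (counts.get(c, 0) + lam) / (total + lam * k) for c in classes}


# Hypothetical leaf with four training examples and three possible classes.
leaf_labels = ["yes", "yes", "yes", "no"]
probs = lidstone_probabilities(leaf_labels, classes=["yes", "no", "maybe"])
```

With smoothing, a class never observed at the leaf ("maybe" here) still receives a small non-zero probability, which matters for small leaves where raw frequencies are unreliable.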
160

Automated dust storm detection using satellite images. Development of a computer system for the detection of dust storms from MODIS satellite images and the creation of a new dust storm database.

El-Ossta, Esam E.A. January 2013
Dust storms are a natural hazard that has increased in frequency in recent years over the Sahara desert, Australia, the Arabian Desert, Turkmenistan and northern China, and has worsened during the last decade. Dust storms increase air pollution, impact urban and rural areas, and affect ground and air traffic. They damage human health, reduce the temperature, damage communication facilities, and reduce visibility, which delays both road and air traffic. Thus, it is important to know the causation, movement and radiation effects of dust storms. The monitoring and forecasting of dust storms is increasing in order to help governments reduce the negative impact of these storms. Satellite remote sensing is the most common method, but its use over sandy ground is still limited as the two share similar characteristics. Moreover, satellite remote sensing using true-colour images or estimates of aerosol optical thickness (AOT), and algorithms such as the deep blue algorithm, have limitations for identifying dust storms. Many researchers have studied the detection of dust storms during daytime in a number of different regions of the world, including China, Australia, America, and North Africa, using a variety of satellite data, but fewer studies have focused on detecting dust storms at night. The key elements of the present study are to use data from the Moderate Resolution Imaging Spectroradiometers on the Terra and Aqua satellites to develop a more effective automated method for detecting dust storms during both day and night, and to generate a MODIS dust storm database. / Libyan Centre for Remote Sensing and Space Science / Appendix A was submitted with extra data files which are not available online.
