  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Experiments with GMTI Radar using Micro-Doppler

Dilsaver, Benjamin Walter 24 June 2013 (has links) (PDF)
As objects move, their changing shape produces a signature that can be measured by a radar system. That signature is called the micro-Doppler signature. The micro-Doppler signature of an object is a distinguishing characteristic for certain classes of objects. In this thesis features are extracted from the micro-Doppler signature and are used to classify objects. The scope of the objects is limited to humans walking and traveling vehicles. The micro-Doppler features are able to distinguish the two classes of objects. With a sufficient amount of training data, the micro-Doppler features may be used with learning algorithms to predict unknown objects detected by the radar with high accuracy.
12

A machine learning approach for ethnic classification: the British Pakistani face

Khalid Jilani, Shelina, Ugail, Hassan, Bukar, Ali M., Logan, Andrew J., Munshi, Tasnim January 2017 (has links)
Ethnicity is one of the most salient clues to face identity. Analysis of ethnicity-specific facial data is a challenging problem and is predominantly carried out using computer-based algorithms. Current published literature focusses on the use of frontal face images. We addressed the challenge of binary ethnicity classification (British Pakistani or other ethnicity) using profile facial images. The proposed framework is based on the extraction of geometric features using 10 anthropometric facial landmarks, within a purpose-built, novel database of 135 multi-ethnic and multi-racial subjects and a total of 675 face images. Image dimensionality was reduced using Principal Component Analysis (PCA) and Partial Least Squares (PLS) Regression. Classification was performed using a linear Support Vector Machine (SVM). The results of this framework are promising, with 71.11% ethnic classification accuracy using PCA + SVM, and 76.03% using PLS + SVM.
13

An Evaluation of Classification Algorithms for Machinery Fault Diagnosis

Buzza, Matthew 15 June 2017 (has links)
No description available.
14

Automating debugging through data mining / Automatisering av felsökning genom data mining

Thun, Julia, Kadouri, Rebin January 2017 (has links)
Contemporary technological systems generate massive quantities of log messages. These messages can be stored, searched and visualized efficiently using log management and analysis tools. The analysis of log messages offers insights into system behavior such as performance, server status and execution faults in web applications. iStone AB wants to explore the possibility of automating its debugging process. Since iStone does most of its debugging manually, it takes time to find errors within the system. The aim was therefore to find solutions that reduce the time it takes to debug. Log messages in access and console logs were analyzed so that the most appropriate data mining techniques for iStone's system could be chosen. Data mining algorithms and log management and analysis tools were compared. The comparisons showed that the ELK Stack, together with a mixture of Eclat and a hybrid algorithm (Eclat and Apriori), were the most appropriate choices. To demonstrate their feasibility, the ELK Stack and Eclat were implemented. The results show that data mining and the use of a platform for log analysis can facilitate debugging and reduce the time it takes.
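The abstract names Eclat as the frequent-itemset miner that was implemented, but gives no details of that implementation. As a rough illustration only, the following minimal Python sketch shows the general Eclat idea: store, for each item, the set of transaction ids that contain it, then grow itemsets depth-first by intersecting those sets. The log events (`login`, `timeout`, `error500`), the five sessions and the support threshold are hypothetical and are not taken from the thesis.

```python
def eclat(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets."""
    # Vertical layout: item -> set of ids of the transactions that contain it.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # Every (item, tids) pair in candidates is frequent together with prefix.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[itemset] = len(tids)
            # Intersect with the tid sets of later items to grow the itemset.
            next_candidates = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_support:
                    next_candidates.append((other, shared))
            if next_candidates:
                extend(itemset, next_candidates)

    start = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend(frozenset(), start)
    return frequent


# Hypothetical log sessions, each reduced to the set of event types it contains.
sessions = [
    {"login", "timeout", "error500"},
    {"login", "timeout"},
    {"login", "error500"},
    {"timeout", "error500"},
    {"login", "timeout", "error500"},
]
frequent_itemsets = eclat(sessions, min_support=3)
for itemset, support in sorted(frequent_itemsets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

In this toy data every single event and every pair of events is frequent, while the full triple appears in only two sessions and is pruned. Because support is computed by set intersection, no second pass over the raw sessions is needed.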
15

Εξόρυξη γνώσης από ιατροβιολογικά δεδομένα / Biomedical data mining

Καλλά, Μαρία-Παυλίνα 28 February 2013 (has links)
Behind all the data that exist lies a huge treasure of knowledge that we cannot perceive, because the form of the information does not allow it. Methods and techniques were therefore developed that help us find this hidden knowledge and use it, mainly for the benefit of the public. The best-known such method, and the one studied here, is data mining. This work discusses the use of data mining methods on biomedical data. It begins with an overview of molecular biology and bioinformatics, then turns to knowledge discovery in databases. Data mining is examined in detail, with an emphasis on classification methods. Finally, the algorithms are applied to biomedical data, and the resulting conclusions and possible future extensions are presented.
16

Data-Driven Emptying Detection for Smart Recycling Containers

Rutqvist, David January 2018 (has links)
Waste management is one of the biggest challenges for modern cities, caused by urbanisation and increased population. Smart waste management tries to solve this challenge with the help of techniques such as the Internet of Things, machine learning and cloud computing. By utilising smart algorithms, the time when a recycling container is going to be full can be predicted. By continuously measuring the filling level of containers and then partitioning the filling level data between consecutive emptyings, a regression model can be used for prediction. To do this, accurate emptying detection is required. This thesis investigates different data-driven approaches to accurate emptying detection in a setting where the majority of the data are non-emptyings, i.e. suspected emptyings which manual examination has shown not to be actual emptyings. This is done by starting with the currently deployed legacy solution and increasing the performance step by step through optimisation and machine learning models. The final solution achieves a classification accuracy of 99.1% and a recall of 98.2% by using a random forest classifier on a set of features based on the filling level at different given time spans, compared with a recall of 50% for the legacy solution. It is concluded that the final solution, with a few minor practical modifications, is feasible for deployment in the next release of the system.
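The abstract reports both accuracy and recall, which matters because most of the data are non-emptyings. The short Python sketch below illustrates the point with made-up numbers (1000 suspected emptyings, of which 20 are real): a detector can miss half of the real emptyings and still look nearly perfect on accuracy. The counts and labels are hypothetical and do not come from the thesis.

```python
def accuracy_and_recall(y_true, y_pred):
    """Accuracy and recall for binary labels, where 1 marks an actual emptying."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # emptyings found
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-emptyings rejected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # emptyings missed
    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall


# Hypothetical data: 20 actual emptyings among 1000 suspected ones.
y_true = [1] * 20 + [0] * 980
# Hypothetical detector: finds 10 of the 20 emptyings and raises no false alarms.
y_pred = [1] * 10 + [0] * 10 + [0] * 980

acc, rec = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={acc:.3f} recall={rec:.3f}")
```

Here accuracy is 990/1000 while recall is only 10/20, which is why recall is the more informative figure when comparing a new detector against a baseline on imbalanced data.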
17

Mineração de dados aplicada à classificação do risco de evasão de discentes ingressantes em instituições federais de ensino superior

AMARAL, Marcelo Gomes do 08 July 2016 (has links)
Brazil's Federal Institutions of Higher Education (IFES) play an important role in the social and economic development of the country, contributing to technological and scientific advances and encouraging investment. A better use of the educational resources offered by these institutions therefore contributes to the evolution of higher education as a whole. An effective way to meet this need is to analyze the profile of incoming students and try to predict, as early as possible, undesirable cases of dropout: the earlier such cases are identified, the better they can be examined and addressed by the institution's administration. This work proposes an approach for the direct application of data mining techniques to classify incoming students according to their dropout risk. As a proof of concept, the proposed data mining approach was evaluated through experiments conducted at the Federal University of Pernambuco (UFPE). Some of the classification algorithms tested reached a classification accuracy of 73.9% using only socioeconomic data available at the student's admission to the institution, without any data that depends on academic history.
18

Análise temporal da sinalização elétrica em plantas de soja submetidas a diferentes perturbações externas / Temporal analysis of electrical signaling in soybean plants subjected to different external disturbances

Saraiva, Gustavo Francisco Rosalin 31 March 2017 (has links)
Plants are complex organisms with dynamic processes that, because of their sessile way of life, are influenced by environmental conditions at all times. Plants can accurately perceive and respond to different environmental stimuli intelligently, but this requires a complex and efficient signaling system. Electrical signaling in plants has been known for a long time, but has recently gained prominence as its relationship to the physiological processes of plants has become better understood. The objective of this thesis was to test the following hypotheses: time series of data obtained from the electrical signaling of plants contain non-random information with a dynamic and oscillatory pattern; these dynamics are affected by environmental stimuli; and there are specific patterns in the responses to stimuli. In a controlled environment, stressful environmental stimuli were applied to soybean plants, and the electrical signaling data were collected before and after the application of each stimulus. The time series obtained were analyzed using statistical and computational tools to determine the frequency spectrum (FFT), the autocorrelation of values and the approximate entropy (ApEn). To verify the existence of patterns in the series, classification algorithms from the area of machine learning were used. The analysis of the time series showed that the electrical signals collected from plants presented oscillatory dynamics with a power-law frequency distribution. The results make it possible to differentiate, with great efficiency, series collected before and after the application of the stimuli. The power spectral density (PSD) and autocorrelation analyses showed a great difference in the dynamics of the electrical signals before and after the application of the stimuli. The ApEn analysis showed a decrease in signal complexity after the application of the stimuli. The classification algorithms reached significant accuracy values in pattern detection and classification of the time series, showing that there are mathematical patterns in the different electrical responses of the plants. It is concluded that the time series of bioelectrical signals of plants contain discriminant information. The signals have oscillatory dynamics whose properties are altered by environmental stimuli. There are also mathematical patterns embedded in plant responses to specific stimuli.
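The abstract uses approximate entropy (ApEn) as a measure of signal complexity. The sketch below is a plain-Python version of the standard ApEn definition (template length `m`, tolerance `r`, Chebyshev distance, self-matches counted). The choices `m=2` and `r = 0.2 × standard deviation` are common defaults and are an assumption here, as are the two synthetic test signals; none of them are taken from the thesis.

```python
import math
import random


def approximate_entropy(series, m=2, r=None):
    """Approximate entropy of a 1-D sequence (lower means more regular)."""
    n = len(series)
    if r is None:
        # Assumed default tolerance: 0.2 times the population standard deviation.
        mean = sum(series) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
        r = 0.2 * std

    def phi(length):
        # All overlapping templates of the given length.
        templates = [series[i:i + length] for i in range(n - length + 1)]
        total = 0.0
        for a in templates:
            # Count templates within tolerance r (Chebyshev distance);
            # the template always matches itself, so the count is at least 1.
            matches = sum(
                1 for b in templates
                if max(abs(x - y) for x, y in zip(a, b)) <= r
            )
            total += math.log(matches / len(templates))
        return total / len(templates)

    return phi(m) - phi(m + 1)


# Synthetic signals for illustration only.
regular = [0.0, 1.0] * 50                       # strictly periodic
rng = random.Random(0)
noisy = [rng.gauss(0.0, 1.0) for _ in range(100)]  # white Gaussian noise

apen_regular = approximate_entropy(regular)
apen_noisy = approximate_entropy(noisy)
print(f"regular: {apen_regular:.4f}  noisy: {apen_noisy:.4f}")
```

The periodic signal yields a value close to zero and the noise a clearly larger one, so a drop in ApEn after a stimulus, as reported in the abstract, indicates that the signal has become more regular.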
19

Comparison of Machine learning algorithms on Predicting Churn within Music streaming service

Gaddam, Lahari, Kadali, Sree Lakshmi Hiranmayee January 2022 (has links)
Background: Customer churn prediction is a popular concern for big businesses and often helps companies with customer retention and revenue generation. Customer churn may lead to a huge loss of revenue, so it is important to analyze and determine the causes of churn. Moreover, it is easier to retain an existing customer than to acquire new clients. Therefore, to get a better understanding of churn prediction, this research work focuses on finding the best-performing machine learning model through a comparison of four machine learning models. The research also gives a brief report of the latest literature on churn analysis of music streaming services. Objectives: In this thesis work, we aim to research churn prediction in music streaming services. We focus on two main objectives. The first is a literature review of the latest research work on churn prediction in music streaming services. The second is to compare the performance of four supervised machine learning algorithms to find the best-performing algorithm for churn prediction. Methods: This thesis uses two methods, literature review and experimentation, to answer our research questions. We chose a literature review for RQ1 because it gives a better understanding of the selected problem and serves as the base for our research. Experimentation was chosen for RQ2 to build and train the selected machine learning models and validate the performance of the algorithms. Experimentation was chosen because it gives better results and predictions than surveys and reviews. Results: We selected four supervised classification algorithms for this research, namely logistic regression, Naive Bayes, k-nearest neighbours (KNN), and random forest (RF). After preprocessing the KKBox dataset and training the models, RF achieved the highest accuracy, 97%, compared to the other models.
Conclusions: We trained four models using the four machine learning algorithms for the prediction of churn in the music streaming service domain. After training the models on the KKBox dataset and experimenting, we concluded that RF has the best performance, with better accuracy and AUC score.
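The conclusions compare models by AUC as well as accuracy. As a small illustration of what that score measures, the Python sketch below computes ROC AUC in its pairwise form: the fraction of (churner, non-churner) pairs in which the churner receives the higher score, with ties counted as half. The labels and scores are invented for the example and have no connection to the KKBox data or the models in the thesis.

```python
def roc_auc(y_true, scores):
    """ROC AUC as the probability that a positive outscores a negative."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical churn labels (1 = churned) and two hypothetical models' scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores_informative = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1, 0.1]
scores_constant = [0.5] * 8  # a model that cannot rank customers at all

auc_informative = roc_auc(y_true, scores_informative)
auc_constant = roc_auc(y_true, scores_constant)
print(f"informative: {auc_informative:.3f}  constant: {auc_constant:.3f}")
```

Unlike accuracy, this score does not depend on a decision threshold, which is one reason the two metrics are often reported together for imbalanced problems such as churn.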
20

Contributions to evaluation of machine learning models. Applicability domain of classification models

Rado, Omesaad A.M. January 2019 (has links)
Artificial intelligence (AI) and machine learning (ML) present application opportunities and challenges that can be framed as learning problems. The performance of machine learning models depends on the algorithms and the data. Learning algorithms create a model of reality by learning from and testing on data, and their performance shows how well the assumed model agrees with reality. ML algorithms have been used successfully in numerous classification problems. With the growing popularity of ML models for many purposes in different domains, more formal validation of such predictive models is now required. There are many studies on model evaluation, robustness, reliability, and the quality of the data and of data-driven models. However, those studies do not yet consider the concept of the applicability domain (AD). The issue is that the AD is often not well defined, or not defined at all, in many fields. This work investigates the robustness of ML classification models from the applicability domain perspective. A standard definition of the applicability domain refers to the spaces in which the model provides results with a specific reliability. The main aim of this study is to investigate the connection between the applicability domain approach and classification model performance. We examine the usefulness of assessing the AD for the classification model, i.e. the reliability, reuse, and robustness of classifiers. The work is carried out using three approaches: first, assessing the applicability domain for the classification model; second, investigating the robustness of the classification model based on the applicability domain approach; third, selecting an optimal model using Pareto optimality.
The experiments in this work use different machine learning algorithms for binary and multi-class classification of healthcare datasets from public benchmark data repositories. In the first approach, the decision tree (DT) algorithm is used in the classification stage, and a feature selection method is applied to choose features for classification. The obtained classifiers are used in the third approach for the selection of models using Pareto optimality. The second approach is implemented in three steps: building the classification model, generating synthetic data, and evaluating the obtained results. The results of the study provide an understanding of how the proposed approach can help to define the model's robustness and applicability domain in order to provide reliable outputs. These approaches open opportunities for classification data and model management. The proposed algorithms are evaluated through a set of experiments on the classification accuracy of instances that fall in the domain of the model. For the first approach, considering all the features, the highest accuracy obtained is 0.98, with a threshold average of 0.34, for the Breast cancer dataset. After applying the recursive feature elimination (RFE) method, the accuracy is 0.96, with a threshold average of 0.27. For the robustness of the classification model based on the applicability domain approach, the minimum accuracy is 0.62 for the Indian Liver Patient data at r=0.10, and the maximum accuracy is 0.99 for the Thyroid dataset at r=0.10. For the selection of an optimal model using Pareto optimality, the optimally selected classifier gives an accuracy of 0.94, with a threshold average of 0.35. This research investigates critical aspects of the applicability domain as related to the robustness of classification ML algorithms. However, the performance of machine learning techniques depends on the degree of reliable predictions of the model.
In the literature, the robustness of an ML model can be defined as the ability of the model to produce a testing error close to the training error. Moreover, these properties can describe the stability of the model's performance when it is tested on new datasets. In conclusion, this thesis introduced the concept of the applicability domain for classifiers and tested its use in case studies on health-related public benchmark datasets. / Ministry of Higher Education in Libya
