  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
41

Remotely Sensed Data Assimilation Technique to Develop Machine Learning Models for Use in Water Management

Zaman, Bushra 01 May 2010
Increasing population and water conflicts are making water management one of the most important issues of the present world. It has become absolutely necessary to find ways to manage water more efficiently. Technological advancement has introduced various techniques for data acquisition and analysis, and these tools can be used to address some of the critical issues that challenge water resource management. This research used machine learning techniques and information acquired through remote sensing to solve problems related to soil moisture estimation and crop identification on large spatial scales. In this dissertation, solutions were proposed in three problem areas that are important to the decision-making process in water management for irrigated systems. A data assimilation technique was used to build a learning machine model that generated soil moisture estimates commensurate with the scale of the data. The research was extended by developing a multivariate machine learning algorithm to predict root zone soil moisture in both space and time. Finally, a model was developed for supervised classification of multi-spectral reflectance data using a multi-class machine learning algorithm. The procedure was designed for classifying crops, but the model is data dependent, can be used with other datasets, and hence can be applied to other landcover classification problems. The dissertation compared the performance of the relevance vector machine and the support vector machine in estimating soil moisture. A multivariate relevance vector machine algorithm was tested for spatio-temporal prediction of soil moisture, and a multi-class relevance vector machine model was used for classifying different crop types. It was concluded that the classification scheme may uncover important data patterns, contributing greatly to knowledge bases and to scientific and medical research.
The results of the soil moisture models would give farmers and irrigators a rough idea of the moisture status of their fields and of their productivity. The models are part of a framework devised to provide tools that support irrigation system operational decisions. This information could help in the overall improvement of agricultural water management practices for large irrigation systems. Conclusions were reached based on the performance of these machines in estimating soil moisture from remotely sensed data, forecasting the spatial and temporal variation of soil moisture, and classifying data. These solutions offer a new perspective on problem-solving by introducing methods that have not previously been attempted.
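The relevance vector machine models themselves cannot be reproduced from the abstract, but the support vector half of the comparison can be sketched with scikit-learn. Everything below is a synthetic stand-in: the three predictors and the moisture signal are invented, not the remotely sensed data used in the dissertation.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical remotely sensed predictors (e.g. backscatter, NDVI, surface
# temperature) driving an invented soil-moisture signal with small noise.
X = rng.uniform(0, 1, size=(300, 3))
y = 0.4 * X[:, 0] - 0.2 * X[:, 1] + 0.1 * X[:, 2] + rng.normal(0, 0.02, 300)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Epsilon-SVR with an RBF kernel: the support vector counterpart of the
# relevance vector regression compared in the dissertation.
model = SVR(kernel="rbf", C=10.0, epsilon=0.01).fit(X_tr, y_tr)
rmse = np.sqrt(np.mean((model.predict(X_te) - y_te) ** 2))
print(f"test RMSE: {rmse:.4f}")
```

In practice the same fit/predict pattern would be applied to co-registered satellite pixels, with RMSE evaluated against in-situ moisture probes.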
42

Exploring Alarm Data for Improved Return Prediction in Radios : A Study on Imbalanced Data Classification

Färenmark, Sofia January 2023
The global tech company Ericsson has been tracking the return rate of its products for over 30 years, using it as a key performance indicator (KPI). These KPIs play a critical role in making sound business decisions, identifying areas for improvement, and planning. To enhance the customer experience, the company highly values the ability to predict the number of returns in advance each month. However, predicting returns is a complex problem affected by multiple factors that determine when radios are returned. Analysts at the company have observed indications of a potential correlation between alarm data and the number of returns. This paper aims to address the need for better prediction models to improve return rate forecasting for radios, utilizing alarm data. The alarm data, stored in an internal database, includes logs of activated alarms at various sites, technical and logistical information about the products, and historical records of returns. The problem is approached as a classification task, where radios are classified as either "return" or "no return" for a specific month, using the alarm dataset as input. However, because far fewer radios are returned than distributed, the dataset suffers from a heavy class imbalance. The class imbalance problem has garnered considerable attention in machine learning in recent years, as traditional classification models struggle to identify patterns in the minority class of imbalanced datasets. A method that addresses the class imbalance problem was therefore required to construct an effective prediction model for returns, and this paper adopts a systematic approach inspired by similar problems.
It applies the feature selection methods LASSO and Boruta, along with the resampling technique SMOTE, and evaluates several classifiers, including the support vector machine (SVM), random forest classifier (RFC), decision tree (DT), and a neural network (NN) with class weights, to identify the best-performing model. As accuracy is not a suitable evaluation metric for imbalanced datasets, AUC and AUPRC values were calculated for all models to assess the impact of feature selection, weights, resampling, and the choice of classifier. The best model was the NN with weights, achieving a median AUC of 0.93 and a median AUPRC of 0.043. Both the LASSO+SVM+SMOTE and LASSO+RFC+SMOTE models demonstrated similar performance, with median AUC values of 0.92 and 0.93 and median AUPRC values of 0.038 and 0.041, respectively. The AUPRC baseline for this dataset was 0.005. Furthermore, the results indicated that resampling is necessary for successful classification of the minority class. Thorough pre-processing and a balanced split between the test and training sets are crucial before applying resampling, as the technique is sensitive to noisy data. While feature selection improved performance to some extent, it could also lead to unreliable results due to noise. The choice of classifier had less impact on model performance than resampling and feature selection.
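The internal Ericsson data and models are not available, but the core recipe (a heavily imbalanced split, a class-weighted classifier, and AUPRC rather than accuracy) can be sketched with scikit-learn on synthetic data. Logistic regression with `class_weight="balanced"` stands in for the thesis's weighted neural network; the roughly 1% positive rate is illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Imbalanced toy data: ~1% positives, mimicking the return/no-return split.
X, y = make_classification(n_samples=20000, n_features=20, weights=[0.99],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced" reweights the minority class, the same idea as
# the weighted neural network evaluated in the thesis.
clf = LogisticRegression(max_iter=1000, class_weight="balanced").fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]

auc = roc_auc_score(y_te, scores)
auprc = average_precision_score(y_te, scores)  # AUPRC; baseline = positive rate
print(f"AUC={auc:.3f}  AUPRC={auprc:.3f}  baseline={y_te.mean():.3f}")
```

Note how the AUPRC baseline equals the positive rate of the test set, which is why the thesis reports 0.005 as its baseline rather than 0.5.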
43

Reading buddies : cross-age tutoring as empowering pedagogy for young English language learners

Moriarty, Kristen S. January 2018
Globalization and the movement of workers in the high-technology industries of Silicon Valley have far-reaching effects on the school systems that serve their children. This study takes place in a neighborhood public school in the heart of the area known as Silicon Valley, California, during the early implementation of the Common Core State Standards. During the time of this study, the student population in the valley was growing in number and diversity due to developments in the high-technology industries, and the education system was recovering from drastic budget cuts while embracing a nationwide curriculum movement aimed at greater standardization, high-stakes testing, and accountability. As the teacher, in the role of participant observer and researcher, I employed ethnographic methods of data collection, including video recordings, observations, interviews, reflective journals, and video journaling; student interactions were recorded and analyzed through Bernstein's theories of pedagogic interactions as well as sociocultural learning theory and the work of Vygotsky. The results indicate that Reading Buddies could be an example of an 'empowering pedagogy' that gives linguistically and socially marginalized children a voice in an educational milieu driven by high-stakes testing and accountability with an emphasis on the use of English. The study highlights strategies used by young children acquiring English as an additional language to interact with and co-construct the meaning of English-language texts during weekly Reading Buddy sessions. Seeing the diversity found in the classrooms as a strength and a benefit to the education system, this study explores how allowing space for children to bring everyday knowledge, home languages, and personal experiences into literacy practices impacts their interactions with English-language texts.
44

Geometric Methods for Mining Large and Possibly Private Datasets

Chen, Keke 07 July 2006
With the wide deployment of data-intensive Internet applications and continued advances in sensing technology and biotechnology, large multidimensional datasets, possibly containing privacy-conscious information, have been emerging. Mining such datasets has become increasingly common in business integration, large-scale scientific data analysis, and national security. The proposed research aims at exploring the geometric properties of the multidimensional datasets utilized in statistical learning and data mining, and at providing novel techniques and frameworks for mining very large datasets while protecting the desired data privacy. The first main contribution of this research is the development of the iVIBRATE interactive visualization-based approach for clustering very large datasets. The iVIBRATE framework uniquely addresses the challenges in handling irregularly shaped clusters, domain-specific cluster definition, and cluster-labeling of the data on disk. It consists of the VISTA visual cluster rendering subsystem and the Adaptive ClusterMap Labeling subsystem. The second main contribution is the development of the "Best K Plot" (BKPlot) method for determining the critical clustering structures in multidimensional categorical data. The BKPlot method uniquely addresses two challenges in clustering categorical data: how to determine the number of clusters (the best K) and how to identify the existence of significant clustering structures. The method consists of the basic theory, the sample BKPlot theory for large datasets, and the testing method for identifying no-cluster datasets. The third main contribution of this research is the development of the theory of geometric data perturbation and its application in privacy-preserving data classification involving single-party or multiparty collaboration.
The key to geometric data perturbation is finding a good randomly generated rotation matrix and an appropriate noise component that together provide a satisfactory balance between privacy guarantee and data quality, considering possible inference attacks. When geometric perturbation is applied to collaborative multiparty data classification, it is challenging to unify the different geometric perturbations used by different parties. We study three protocols under the data-mining-service-oriented framework for unifying the perturbations: 1) the threshold-satisfied voting protocol, 2) the space adaptation protocol, and 3) the space adaptation protocol with a trusted party. The tradeoffs between privacy guarantee, model accuracy, and cost are studied for these protocols.
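A minimal NumPy sketch of the rotation-plus-noise idea: a random orthogonal matrix preserves pairwise Euclidean distances, so distance-based learners behave almost identically on the perturbed data while individual attribute values are hidden. The thesis's actual perturbation design and inference-attack analysis are more involved; this only illustrates the geometric core.

```python
import numpy as np

rng = np.random.default_rng(0)

def geometric_perturbation(X, noise_scale=0.01, rng=rng):
    """Rotate the dataset by a random orthogonal matrix and add small noise."""
    d = X.shape[1]
    # QR decomposition of a Gaussian matrix yields a random orthogonal matrix,
    # which is an isometry: it preserves all pairwise distances exactly.
    Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
    return X @ Q.T + rng.normal(scale=noise_scale, size=X.shape)

X = rng.normal(size=(100, 5))
Xp = geometric_perturbation(X)

# Pairwise distances are approximately preserved (exactly, up to the noise).
d_orig = np.linalg.norm(X[0] - X[1])
d_pert = np.linalg.norm(Xp[0] - Xp[1])
print(f"{d_orig:.3f} vs {d_pert:.3f}")
```

Because kNN and kernel classifiers depend only on such distances, a model trained on `Xp` should score nearly the same as one trained on `X`, which is the "data quality" half of the privacy/quality balance.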
45

Large Data Clustering And Classification Schemes For Data Mining

Babu, T Ravindra 12 1900
Data Mining deals with extracting valid, novel, potentially useful, and general abstractions, easily understood by humans, from large data. Data is large when the number of patterns, the number of features per pattern, or both are large, its size being beyond the capacity of a computer's main memory. Data Mining is an interdisciplinary field involving database systems, statistics, machine learning, visualization, and computational aspects; the focus of data mining algorithms is scalability and efficiency. Clustering and classifying large data is an important activity in Data Mining. Clustering algorithms are predominantly iterative, requiring multiple scans of the dataset, which is very expensive when the data is stored on disk. In the current work we propose different schemes that have both theoretical validity and practical utility in dealing with such large data. The schemes broadly encompass data compaction, classification, prototype selection, use of domain knowledge, and hybrid intelligent systems. The proposed approaches can be broadly classified as: (a) compressing the data by some means in a non-lossy manner, then clustering as well as classifying the patterns in their compressed form directly through a novel algorithm; (b) compressing the data in a lossy fashion such that a very high degree of compression and abstraction is obtained in terms of 'distinct subsequences', then classifying the data in this compressed form to improve prediction accuracy; (c) with the help of incremental clustering, a lossy compression scheme, and a rough set approach, obtaining simultaneous prototype and feature selection; (d) demonstrating that prototype selection and data-dependent techniques can reduce the number of comparisons in a multiclass classification scenario using SVMs; and (e) making use of domain knowledge of the problem and the data under consideration to show that a very high classification accuracy is obtained with fewer iterations of AdaBoost.
The schemes have pragmatic utility. The prototype selection algorithm is incremental, requiring a single dataset scan, and has linear time and space requirements. We provide results obtained with a large, high-dimensional handwritten (hw) digit dataset. The compression algorithm is based on simple concepts, and we demonstrate that classifying the compressed data reduces the computation time required by a factor of 5, with the prediction accuracy on compressed and original data being exactly the same, 92.47%. With the proposed lossy compression scheme and pruning methods, we demonstrate that even with a reduction in distinct subsequences by a factor of 6 (690 to 106), the prediction accuracy improves. Specifically, with the original data containing 690 distinct subsequences, the classification accuracy is 92.47%; with an appropriate choice of pruning parameters, the number of distinct subsequences reduces to 106 with a corresponding classification accuracy of 92.92%. The best classification accuracy of 93.3% is obtained with 452 distinct subsequences. With the scheme of simultaneous feature and prototype selection, we improved classification accuracy to better than that obtained with kNNC, viz., 93.58%, while significantly reducing the number of features and prototypes, achieving a compaction of 45.1%. In the case of hybrid schemes based on SVMs, prototypes, and a domain-knowledge-based tree (KB-Tree), we demonstrated a reduction in SVM training time by 50% and testing time by about 30% compared to the complete data, and an improvement of classification accuracy to 94.75%. In the case of AdaBoost the classification accuracy is 94.48%, which is better than those obtained with NNC and kNNC on the entire data; the training time is reduced because prototypes are used instead of the complete data. Another important aspect of the work is the design of a KB-Tree (with a maximum depth of 4) that classifies 10-category data in just 4 comparisons.
In addition to the hw data, we applied the schemes to network intrusion detection data (the 10% dataset of KDDCUP99) and demonstrated that the proposed schemes provided a lower overall cost than the reported values.
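The abstract does not spell out the prototype selection algorithm, but its stated properties (incremental, single dataset scan, linear time and space) match the classic leader-clustering scheme, sketched here as an illustration on invented 2-D blobs rather than the thesis's handwritten digit data.

```python
import numpy as np

def leader_prototypes(X, threshold):
    """Single-scan leader clustering: each pattern either joins the first
    leader within `threshold` or becomes a new leader (prototype). The data
    is scanned exactly once, and cost is linear in the number of patterns
    for a fixed number of leaders."""
    leaders = []
    for x in X:
        for ld in leaders:
            if np.linalg.norm(x - ld) <= threshold:
                break  # x is absorbed by an existing prototype
        else:
            leaders.append(x)  # x starts a new prototype
    return np.array(leaders)

rng = np.random.default_rng(0)
# Three well-separated blobs; a suitable threshold keeps one leader per blob.
X = np.vstack([rng.normal(c, 0.1, size=(50, 2)) for c in (0.0, 3.0, 6.0)])
protos = leader_prototypes(X, threshold=1.0)
print(len(protos), "prototypes from", len(X), "patterns")
```

Downstream classifiers (SVM, NNC) are then trained on the prototypes instead of all patterns, which is where the reported training-time reductions come from.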
46

Τεχνικές ταξινόμησης σεισμογραμμάτων / Seismogram classification techniques

Πίκουλης, Βασίλης 01 October 2008
Seismic events that occur in a confined region, meaning that the distance separating the sources is very small compared to the distance between the sources and the recording station, are known in the literature as similar seismic events and have been under study for the past two decades. The re-estimation of the hypocenter parameters, or relocation, of similar events gives an estimation error between one and two orders of magnitude lower than that produced by conventional location procedures. As a result, this approach yields a much more detailed image of the seismicity of the region under study, from which an exact mapping of the region's active faults can be derived. The relocation procedure is in fact a complex one, consisting of three basic steps: 1. Identification of groups of similar seismic events. 2. Estimation of the arrival time differences between events of the same group. 3. Solution of the inverse problem.
The first of the above steps, the identification of the seismic families in the given catalog, plays an important role in the overall success of the procedure: only a correct solution of this problem ensures that the requirements for applying the procedure are met, and hence that the geological analysis based on its outcome is meaningful. The problem also arises in other geological applications, such as automatically locating the fault of origin of an unknown event by comparison with available representative families. Identifying seismic families is a classification problem and, as such, requires the solution of two subproblems: the matching problem and the clustering problem. The object of the first is the comparison of all possible event pairs in the catalog in order to locate all similar pairs, while the second concerns grouping the similar pairs into seismic families. In this work, taking into consideration the particularities introduced into this classification problem by the special nature of seismograms and the specific requirements of the application, we propose a comparison method based on a generalized form of the correlation coefficient and a graph-based clustering technique as an effective and efficient solution to the problem at hand.
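A toy sketch of the two subproblems: a maximum normalized cross-correlation over lags stands in for the thesis's generalized correlation coefficient, and connected components serve as the simplest graph-based grouping. The signals and the 0.8 threshold are illustrative, not from the thesis.

```python
import numpy as np

def max_norm_xcorr(a, b):
    """Maximum normalized cross-correlation over all lags (1 = identical)."""
    a = (a - a.mean()) / (a.std() * len(a))
    b = (b - b.mean()) / b.std()
    return float(np.max(np.correlate(a, b, mode="full")))

def families(signals, threshold=0.8):
    """Group signals into 'families': connected components of the graph whose
    edges link pairs with correlation above `threshold`."""
    n = len(signals)
    adj = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if max_norm_xcorr(signals[i], signals[j]) >= threshold:
                adj[i].append(j)
                adj[j].append(i)
    seen, groups = set(), []
    for i in range(n):                      # depth-first component search
        if i in seen:
            continue
        stack, comp = [i], []
        while stack:
            k = stack.pop()
            if k not in seen:
                seen.add(k)
                comp.append(k)
                stack.extend(adj[k])
        groups.append(sorted(comp))
    return groups

t = np.linspace(0, 1, 200)
s1 = np.sin(2 * np.pi * 5 * t)             # "event" waveform
s2 = np.roll(s1, 7)                        # same waveform, shifted arrival
s3 = np.sign(np.sin(2 * np.pi * 13 * t))   # unrelated waveform
fams = families([s1, s2, s3])
print(fams)
```

The lag-maximized correlation is what makes the matcher robust to differing arrival times, which is exactly why plain sample correlation at zero lag would fail here.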
47

Desenvolvimento e aplicação de Heurística para calcular pesos e bias iniciais para o “Back-Propagation” treinar Rede Neural Perceptron Multicamadas / Development and application of a heuristic to calculate initial weights and biases for Back-Propagation training of a Multilayer Perceptron Neural Network

Silva, Aldemário Alves da 18 August 2017
The training of a Multilayer Perceptron Neural Network (MLPNN) by an exact algorithm that finds the maximum accuracy is NP-hard. Thus, the Back-Propagation algorithm is used, which needs a starting point (initial weights and biases) to compute the training of the MLPNN. This research developed and applied a heuristic algorithm, HeCI (Heuristic to Calculate Initial Weights and Biases), which processes the training data and returns a starting point for Back-Propagation. HeCI uses Principal Component Analysis, the Least Squares Method, and the probability density function of the Gaussian normal distribution, employs two strategic configurations, and partially controls the number of MLPNN training epochs. Experimentally, HeCI was used with Back-Propagation in MLPNN training to recognize patterns and solve data classification problems. Six case studies with datasets from health, business, and botany were used in the experiments. The methodology of this research uses deductive analysis by the experimental method with a quantitative approach and hypothesis tests: the Friedman test with Tukey HSD post-hoc test, and the Wilcoxon-Mann-Whitney test.
The accuracy results showed significant improvement, as attested by the hypothesis tests, indicating the statistical robustness of the results produced by HeCI.
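HeCI's exact procedure is not given in the abstract, so the sketch below only illustrates the general idea of a data-driven starting point: taking first-layer weights from the leading principal components so hidden pre-activations start centered and decorrelated. The least-squares and Gaussian-density steps of HeCI are simplified away; `pca_init` and its details are this sketch's assumptions, not the thesis's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def pca_init(X, n_hidden):
    """Initialize first-layer weights from the leading principal directions
    of the training data, with biases that center each projection at zero."""
    Xc = X - X.mean(axis=0)
    # Right singular vectors of the centered data are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_hidden]            # shape (n_hidden, n_features), orthonormal rows
    b = -W @ X.mean(axis=0)      # center each hidden pre-activation at zero
    return W, b

X = rng.normal(size=(200, 8))
W, b = pca_init(X, n_hidden=4)

# Hidden pre-activations start centered, which typically gives gradient
# descent a better-conditioned starting point than purely random weights.
H = X @ W.T + b
print(np.round(H.mean(axis=0), 6))
```

Back-propagation would then proceed from `(W, b)` instead of random initial values, which is the role HeCI's output plays in the thesis.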
48

Classificação da marcha em parkinsonianos: análise dos algoritmos de aprendizagem supervisionada / Classification of the parkinsonian gait: analysis of supervised learning algorithms

Souza, Hugo Araújo 12 April 2017
Parkinson's disease is the second most prevalent neurodegenerative disease in the elderly, although its prevalence and incidence vary according to age, sex, and race/ethnicity. Studies indicate that prevalence increases with age, with an estimate of 5 to 26 cases per 100,000 people per year: approximately 1% among individuals aged 65-69, and ranging from 3% to 14.3% among the elderly over 85. The most common clinical signs in the inflammatory process include the presence of resting tremor, muscle stiffness, bradykinesia, and postural instability. Diagnosing the disease is not a simple task; although there are known patterns of stages in the disease's progression through the human organism, many patients do not follow this progression because of the heterogeneity of the manifestations that may arise. Gait analysis has become an attractive, non-invasive quantitative mechanism that can aid in the detection and monitoring of PD patients. Feature extraction is a very important task for the quality of the data to be used by the algorithms, its main objective being to reduce the dimensionality of the data in a classification process. Dimensionality reduction makes it possible to identify which attributes are important and facilitates visualization of the data. For data related to human gait, the purpose is to detect relevant attributes that may help in identifying gait cycle phases, such as the stance and swing phases, cadence, stride length, velocity, etc. To do this, it is necessary to identify and select the most relevant attributes, as well as the classification method. This work evaluates the performance of supervised learning algorithms in classifying human gait characteristics in an open database, and identifies which attributes are most relevant to classifier performance in aiding the identification of gait characteristics in PD patients.
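One common way to rank gait attributes by relevance is impurity-based feature importance from a random forest, sketched below with scikit-learn. The open gait database is not reproduced here, so synthetic data stands in, and the feature names are purely illustrative labels, not the study's actual attributes.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for a gait dataset: 3 informative and 3 noise features.
X, y = make_classification(n_samples=500, n_features=6, n_informative=3,
                           n_redundant=0, random_state=0)
names = ["cadence", "stride_length", "velocity",
         "stance_time", "swing_time", "step_width"]  # illustrative labels only

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances give a first cut at which attributes matter,
# one of several selection strategies such a study might compare.
ranking = sorted(zip(names, rf.feature_importances_), key=lambda p: -p[1])
for name, imp in ranking:
    print(f"{name:14s} {imp:.3f}")
```

In a real gait study this ranking would be cross-checked against wrapper or filter methods, since impurity importances can be biased toward high-cardinality features.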
49

An Empirical Study of Machine Learning Techniques for Classifying Emotional States from EEG Data

Sohaib, Ahmad Tauseef, Qureshi, Shahnawaz January 2012
With the great advancement in robot technology, smart human-robot interaction is considered by researchers to be a key goal. If a robot can identify the emotions and intentions of a human interacting with it, robots become far more useful. Electroencephalography (EEG) is considered an effective way of recording a human's emotions and motivations via the brain, and various machine learning techniques have been used successfully to classify EEG data accurately; k-nearest neighbor, Bayesian networks, artificial neural networks, and support vector machines are among the suitable techniques. The aim of this thesis is to evaluate different machine learning techniques for classifying EEG data associated with specific affective/emotional states. Different methods based on different signal processing techniques are studied to find a suitable way to process the EEG data, and various numbers of EEG features are used to identify those which give the best results for different classification techniques. Different methods are designed to format the EEG dataset, and the formatted datasets are then evaluated on various machine learning techniques to find which can accurately classify EEG data according to the associated affective/emotional states. The research method includes an experiment whose aim was to elicit various emotional states in subjects as they looked at different pictures while their EEG data was recorded. The recorded EEG data was then processed, formatted, and evaluated as above, and the experiment confirms the choice of a technique for improving the accuracy of results.
According to the results, the support vector machine ranks first and the regression tree second in classifying EEG data associated with specific affective/emotional states, with accuracies up to 70.00% and 60.00% respectively. The SVM outperforms the RT overall, although regression trees are known for providing better accuracies on diverse EEG data.
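The thesis's recorded EEG data is not public, so the evaluation loop can only be sketched: two classifiers compared by stratified cross-validated accuracy on synthetic feature vectors standing in for per-epoch EEG features (e.g. band powers). Scikit-learn's decision tree stands in for the regression-tree classifier; the class separation below is invented.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Two invented affective states, 120 epochs each, 10 features per epoch.
n = 120
X = np.vstack([rng.normal(0.0, 1.0, size=(n, 10)),
               rng.normal(0.8, 1.0, size=(n, 10))])
y = np.array([0] * n + [1] * n)

accs = {}
for name, clf in [("SVM", SVC(kernel="rbf")),
                  ("Tree", DecisionTreeClassifier(random_state=0))]:
    # Stratified 5-fold cross-validation; report mean accuracy per classifier.
    accs[name] = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: {accs[name]:.2f}")
```

Swapping in real band-power features and additional classifiers (kNN, naive Bayes, a neural network) would reproduce the comparison structure the thesis describes.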
50

Inteligentní klient pro hudební přehrávací server MPD / Intelligent Client for Music Player Daemon

Wagner, Tomáš January 2012
This master's thesis concerns the design and implementation of an intelligent client application for the Music Player Daemon (MPD) that searches for and presents metadata related to the content being played. The design is preceded by a theoretical analysis covering agent systems, methods of data classification, web communication protocols, and languages for describing HTML documents. The MPD server and the communication protocol used by client applications are analyzed as well. Furthermore, the thesis surveys existing client applications that present metadata. The final chapters describe the design and implementation of the intelligent client, the methods used to solve the implementation problems encountered, and the testing results.
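MPD's client protocol is line-oriented text: the server greets a connecting client with `OK MPD <version>`, the client sends commands such as `currentsong`, and each reply is a block of `key: value` lines terminated by a line reading `OK` (or `ACK ...` on error). A minimal offline sketch of parsing such a reply; the sample song values are invented, and a real client would read this text from a TCP socket (port 6600 by default).

```python
def parse_mpd_response(text):
    """Parse one key: value response block from MPD's text protocol into a
    dict, stopping at the terminating "OK" line and raising on "ACK" errors."""
    song = {}
    for line in text.splitlines():
        if line == "OK":
            break
        if line.startswith("ACK"):
            raise RuntimeError(line)
        key, _, value = line.partition(": ")
        song[key] = value
    return song

# A currentsong reply as an MPD server might send it (illustrative values).
reply = "file: music/track01.flac\nArtist: Example Artist\nTitle: Example Title\nOK\n"
song = parse_mpd_response(reply)
print(song["Artist"], "-", song["Title"])
```

The parsed artist/title pair is exactly what such a client would feed to external metadata services when looking up information about the playing track.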
