41

A comparison of driving characteristics and environmental characteristics using factor analysis and k-means clustering algorithm

Jung, Heejin 19 September 2012
The dissertation aims to classify drivers based on driving and environmental behaviors. The research determined significant factors using factor analysis, identified different driver types using k-means clustering, and studied how the same drivers map in each classification domain. The research consists of two study cases. In the first study case, a new variable is proposed and then used for classification; the drivers were divided into three groups, and two alternatives were designed to evaluate the environmental impact of driving behavior changes. In the second study case, two types of data sets were constructed: driving data and environmental data. The driving data represent the driving behavior of individual drivers; the environmental data represent emissions and fuel consumption estimated by microscopic energy and emissions models. Significant factors were explored in each data set using factor analysis, and a pair of factors was defined for each data set. Each pair of factors was used for one k-means clustering: driving clustering and environmental clustering. The factors were then used to identify groups of drivers in each clustering domain. In the driving clustering, drivers were grouped into three clusters; in the environmental clustering, into two. The groups from the driving clustering were compared to the groups from the environmental clustering in terms of emissions and fuel consumption, and the three driver groups from the driving clustering were also mapped into the environmental domain. The results indicate that the differences in driving patterns among the three driver groups significantly influenced emissions of HC, CO, and NOx. It was determined that average target operating acceleration and braking substantially influenced HC, CO, and NOx emissions. Therefore, if drivers were to change their driving behavior to be more defensive, emissions of HC, CO, and NOx would be expected to decrease. It was also found that spacing-based driving tended to produce fewer emissions but consume more fuel than the other groups, while speed-based driving produced relatively more emissions. The defensively moderate drivers, on the other hand, consumed less fuel and produced fewer emissions. / Ph. D.
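A minimal sketch of the two-stage pipeline this abstract describes, factor analysis followed by k-means, written with scikit-learn on synthetic stand-in data; the feature dimensions, sample size, and tooling are illustrative assumptions, not the dissertation's actual variables:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in for per-driver measures (e.g., mean acceleration,
# braking rate, speed variance); the dissertation derives its variables
# from instrumented driving data.
X = rng.normal(size=(60, 6))

# Factor analysis reduces the standardized measures to a pair of factors.
X_std = StandardScaler().fit_transform(X)
factors = FactorAnalysis(n_components=2, random_state=0).fit_transform(X_std)

# k-means on the factor scores yields three driver groups, as in the
# driving-domain clustering.
groups = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(factors)
print(groups)
```

The factor scores, not the raw measures, are what get clustered, which matches the structure the abstract describes for both the driving and environmental domains.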
42

Wide Area Power System Monitoring Device Design and Data Analysis

Khan, Kevin Jamil Hiroshi 14 September 2006
The frequency disturbance recorder (FDR) is a cost-effective data acquisition device used to measure power system frequency at the distribution level. FDRs are time-synchronized via global positioning system (GPS) timing, and the data they record are time-stamped to allow comparative analysis between FDRs. The data are transmitted over the Internet to a central server, where they are collected and stored for post-mortem analysis. Currently, most of the analysis is done with power system frequency. The purpose of this study is to take a first in-depth look at the angle data collected by FDRs. Different data conditioning techniques are proposed and tested before one is chosen; the chosen technique is then used to extract usable angle data for angle analysis of eight generation trip events. The angle differences are then used to create surface-plot angle-difference movies for further analysis. A new event detection algorithm based on k-means is also presented, proposed as a simple and fast alternative to the current detection method. Next, this thesis examines several GPS modules and recommends one as a replacement for the current GPS chip, which is no longer in production. Finally, the manufacturing process for creating an FDR is documented. This thesis may have raised more questions than it answers, and it is hoped that this work will lay the foundation for further analysis of angles from FDR data. / Master of Science
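The thesis describes its k-means event detector only at a high level; one plausible reading, sketched below under that assumption with simulated frequency data, is to cluster summary features of short measurement windows and treat the minority cluster as the disturbance:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(4)
# Simulated 60 Hz system frequency with a generation-trip-like dip.
f = 60.0 + rng.normal(0, 0.002, 3000)
f[1500:1800] -= np.linspace(0, 0.05, 300)

# Per-window features: mean deviation from 60 Hz and net slope.
win = 50
windows = f.reshape(-1, win)
feats = np.column_stack([windows.mean(axis=1) - 60.0,
                         windows[:, -1] - windows[:, 0]])

# Two clusters: "normal" and "event"; the minority cluster is flagged.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(feats)
event = np.argmin(np.bincount(labels))
print("event windows:", np.where(labels == event)[0])
```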
43

Classification of ADHD Using Heterogeneity Classes and Attention Network Task Timing

Hanson, Sarah Elizabeth 21 June 2018
Since the 1990s, ADHD diagnosis and medication rates have risen rapidly, and the trend continues today. These sharp increases have been met with both public and clinical criticism: detractors state that over-diagnosis is a problem and that healthy children are being unnecessarily medicated and labeled as disabled, while others say that ADHD is under-diagnosed in some populations. Critics often state that multiple factors introduce subjectivity into the diagnosis process, meaning that a final diagnosis may be influenced by more than the desire to protect a patient's wellbeing; some of these factors include standardized testing, legislation affecting special education funding, and the diagnostic process itself. In an effort to circumvent these extraneous factors, this work further develops a potential method of using EEG signals to accurately discriminate between ADHD and non-ADHD children, using features that capture spectral and perhaps temporal information from evoked EEG signals. KNN has been shown in prior research to be an effective tool for discriminating between ADHD and non-ADHD, so several different KNN models are created using features derived in a variety of fashions: one takes into account the heterogeneity of ADHD, and another seeks to exploit differences in executive functioning between ADHD and non-ADHD subjects. The results of this classification method vary widely depending on the sample used to train and test the KNN model. With unfiltered Dataset 1 data over the entire ANT1 period, the most accurate EEG channel pair achieved an overall vector classification accuracy of 94%, and the 5th percentile of classification confidence was 80%. These metrics suggest that KNN applied to EEG signals taken during the ANT task would be a useful diagnostic tool. However, the most accurate channel pair for unfiltered Dataset 2 data achieved an overall accuracy of 65% and a 5th percentile of classification confidence of 17%. The same method that worked so well for Dataset 1 did not work well for Dataset 2, and no conclusive reason for the difference was identified, although several methods to remove possible sources of noise were applied. Using target-time-linked intervals did appear to marginally improve results in both datasets, although the changes in accuracy of intervals relative to target presentation vary between them. Separating subjects into heterogeneity classes does yield good classification accuracy (up to 83%) for some classes, but results are poor (about 50%) for others. A much larger data set is necessary to determine whether the very positive results found with Dataset 1 extend to a wider population. / Master of Science
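A minimal sketch of the KNN discrimination step, with synthetic stand-ins for the per-trial EEG features; the real features, channel pairs, and labels come from the ANT recordings and are not reproduced here:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)
# Synthetic stand-ins for per-trial spectral features from one EEG
# channel pair; the loc values just separate the two toy classes.
X = np.vstack([rng.normal(loc=0.5, size=(40, 8)),   # "ADHD"
               rng.normal(loc=0.0, size=(40, 8))])  # "non-ADHD"
y = np.array([1] * 40 + [0] * 40)

# 5-nearest-neighbor classification, scored with 5-fold cross-validation.
knn = KNeighborsClassifier(n_neighbors=5)
print(cross_val_score(knn, X, y, cv=5).mean())
```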
44

Agrupamento de textos utilizando divergência Kullback-Leibler / Texts grouping using Kullback-Leibler divergence

Willian Darwin Junior 22 February 2016
O presente trabalho propõe uma metodologia para agrupamento de textos que possa ser utilizada tanto em busca textual em geral como mais especificamente na distribuição de processos jurídicos para fins de redução do tempo de resolução de conflitos judiciais. A metodologia proposta utiliza a divergência Kullback-Leibler aplicada às distribuições de frequência dos radicais (semantemas) das palavras presentes nos textos. Diversos grupos de radicais são considerados, formados a partir da frequência com que ocorrem entre os textos, e as distribuições são tomadas em relação a cada um desses grupos. Para cada grupo, as divergências são calculadas em relação à distribuição de um texto de referência formado pela agregação de todos os textos da amostra, resultando em um valor para cada texto em relação a cada grupo de radicais. Ao final, esses valores são utilizados como atributos de cada texto em um processo de clusterização utilizando uma implementação do algoritmo K-Means, resultando no agrupamento dos textos. A metodologia é testada em exemplos simples de bancada e aplicada a casos concretos de registros de falhas elétricas, de textos com temas em comum e de textos jurídicos e o resultado é comparado com uma classificação realizada por um especialista. Como subprodutos da pesquisa realizada, foram gerados um ambiente gráfico de desenvolvimento de modelos baseados em Reconhecimento de Padrões e Redes Bayesianas e um estudo das possibilidades de utilização de processamento paralelo na aprendizagem de Redes Bayesianas. / This work proposes a text-grouping methodology that can be used both for textual search in general and, more specifically, for distributing legal cases with the aim of reducing the time needed to resolve judicial disputes. The proposed methodology applies the Kullback-Leibler divergence to the frequency distributions of the word stems occurring in the texts. Several groups of stems are considered, formed according to their occurrence frequency across the texts, and the distributions are taken with respect to each of those groups. For each group, divergences are computed against the distribution of a reference text formed by aggregating all sample texts, yielding one value per text for each group of stems. Finally, those values are used as attributes of each text in a clustering process driven by a K-Means implementation, producing the grouping of the texts. The methodology is tested on simple toy examples and applied to real cases of electrical failure records, texts with common themes, and legal texts, and the results are compared with an expert's classification. As byproducts of the research, a graphical development environment for models based on Pattern Recognition and Bayesian Networks, and a study on the possibilities of using parallel processing in Bayesian Network learning, were also produced.
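A toy sketch of the core computation, assuming a single stem group for brevity (the thesis forms several groups by occurrence frequency, yielding one divergence value per group per text): each text's stem distribution is compared with a reference distribution aggregated over all texts, and the divergences become the clustering attributes.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import CountVectorizer

# Toy stand-ins for the document collection; a real pipeline would stem
# the words before counting.
texts = ["contrato de compra e venda de imovel",
         "venda de imovel objeto de contrato",
         "falha eletrica em transformador da rede",
         "registro de falha na rede eletrica"]

counts = CountVectorizer().fit_transform(texts).toarray().astype(float)

# Per-text stem distributions, smoothed to avoid log(0).
p = counts + 1e-9
p /= p.sum(axis=1, keepdims=True)

# Reference distribution: the aggregation of all sample texts.
ref = counts.sum(axis=0) + 1e-9
ref /= ref.sum()

# Kullback-Leibler divergence of each text from the reference, used as
# the clustering attribute (one value here; one per stem group in the thesis).
kl = (p * np.log(p / ref)).sum(axis=1).reshape(-1, 1)
print(KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(kl))
```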
46

Localização de danos em estruturas isotrópicas com a utilização de aprendizado de máquina / Localization of damages in isotropic structures with the use of machine learning

Oliveira, Daniela Cabral de [UNESP] 28 June 2017
Este trabalho introduz uma nova metodologia de Monitoramento da Integridade de Estruturas (SHM, do inglês Structural Health Monitoring) utilizando algoritmos de aprendizado de máquina não-supervisionado para localização e detecção de dano. A abordagem foi testada em material isotrópico (placa de alumínio). Os dados experimentais foram cedidos por Rosa (2016). O banco de dados disponibilizado é abrangente e inclui medidas em diversas situações. Os transdutores piezelétricos foram colados na placa de alumínio com dimensões de 500 x 500 x 2 mm e atuam como sensores e atuadores ao mesmo tempo. Para manipulação dos dados, foram analisados os sinais definindo o primeiro pacote do sinal (first packet), considerando apenas o intervalo de tempo igual ao tempo da força de excitação; neste caso, não há interferência dos sinais refletidos nas bordas da estrutura. Os sinais são obtidos na situação sem dano (baseline) e, posteriormente, nas diversas situações de dano. Como método de avaliação do quanto o dano interfere em cada caminho, foram implementadas as seguintes métricas: pico máximo, valor médio quadrático (RMSD), correlação entre os sinais e normas H2 e H∞ entre os sinais baseline e sinais com dano. Após o cálculo das métricas para as diversas situações de dano, foi aplicado o algoritmo de aprendizado de máquina não-supervisionado K-Means, implementado no MATLAB e também testado no toolbox Weka. No K-Means há a necessidade da pré-determinação do número de clusters, o que pode dificultar sua utilização em situações reais. Fez-se necessária, então, a implementação de um algoritmo de aprendizado de máquina não-supervisionado baseado em propagação de afinidades, em que o número de clusters é definido pela matriz de similaridades. O algoritmo de propagação de afinidades foi aplicado a todas as métricas, separadamente para cada dano. / This work introduces a new Structural Health Monitoring (SHM) methodology that uses unsupervised machine learning algorithms to locate and detect damage. The approach was tested on an isotropic material, an aluminum plate. The experimental data were provided by Rosa (2016); the database is comprehensive and includes measurements in a variety of situations. The piezoelectric transducers were bonded to the 500 x 500 x 2 mm aluminum plate and act as sensors and actuators simultaneously. For data handling, only the first packet of each signal was analyzed, i.e., the time interval equal to the duration of the excitation force; in this case, there is no interference from signals reflected at the structure's boundaries. Signals are gathered in the undamaged situation (baseline) and then in several damage situations. To evaluate how much the damage affects each path, the following metrics were implemented: maximum peak, root-mean-square deviation (RMSD), correlation between signals, and the H2 and H∞ norms between baseline and damaged signals. The metrics were computed for numerous damage situations and evaluated with an unsupervised K-Means algorithm implemented in MATLAB and also tested in the Weka toolbox. However, K-Means requires the number of clusters to be specified in advance, which is a problem for practical applications. Therefore, an unsupervised algorithm based on affinity propagation, in which the number of clusters is determined by the similarity matrix of the data, was also implemented and run on each metric separately for each damage case.
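A sketch of one of the listed metrics, RMSD against the baseline, followed by affinity propagation clustering; the signals and noise levels are synthetic assumptions, and scikit-learn stands in for the MATLAB/Weka implementations the thesis used:

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

rng = np.random.default_rng(2)
t = np.linspace(0.0, 1e-4, 500)

# Baseline first-packet signal and four per-path "damaged" versions with
# two distinct noise levels (purely synthetic).
baseline = np.sin(2 * np.pi * 150e3 * t)
paths = [baseline + rng.normal(0, s, t.size) for s in (0.01, 0.012, 0.2, 0.22)]

# RMSD of each path relative to the baseline, one of the thesis metrics.
rmsd = np.array([[np.sqrt(np.mean((p - baseline) ** 2))] for p in paths])

# Affinity propagation infers the number of clusters from the
# similarity matrix instead of requiring it in advance.
print(AffinityPropagation(random_state=0).fit_predict(rmsd))
```

Affinity propagation needing no preset cluster count is exactly the property the thesis cites for preferring it over K-Means in practical settings.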
47

Sistema computacional de medidas de colorações humanas para exame médico de sudorese / Human coloring measures computer system for medical sweat test

Rodrigues, Lucas Cerqueira, 1988- 27 August 2018
Orientador: Marco Antonio Garcia de Carvalho / Dissertação (mestrado) - Universidade Estadual de Campinas, Faculdade de Tecnologia / Resumo: Na pesquisa médica, o exame de sudorese é utilizado para destacar as regiões do corpo onde o paciente transpira, sendo estas úteis para o médico identificar possíveis lesões no sistema nervoso simpático. Os estudos acerca deste exame apontam a inexistência de um processo de identificação automática das regiões do corpo. Neste projeto, utilizou-se o Kinect® para ajudar nesta solução. Este dispositivo é capaz de escanear objetos 3D e possui uma biblioteca para desenvolvimento de sistemas. Este trabalho tem o objetivo de construir um sistema computacional cujo propósito é desenvolver uma solução semi-automática para análise de imagens digitais provenientes de exames de sudorese. O sistema em foco permite classificar as regiões do corpo onde o paciente transpira, por intermédio de seu escaneamento 3D, utilizando o Kinect®, e gerar um relatório para o médico com as informações consolidadas de forma a realizar o diagnóstico com facilidade, rapidez e precisão. O projeto teve início em 2013, no laboratório IMAGELab da FT/UNICAMP em Limeira/SP, e contou com o apoio de uma das equipes do Hospital das Clínicas da USP de Ribeirão Preto/SP que realiza os estudos sobre o exame de sudorese iodo-amido. A contribuição do trabalho consistiu na construção do aplicativo, que utiliza o algoritmo de segmentação de imagem K-Means para segmentação das regiões sobre a superfície do paciente, além do desenvolvimento do sistema que inclui o Kinect®. A aplicação validou-se por meio de experimentos em pacientes reais / Abstract: In medical research, the sweat test is used to highlight the regions of the body where the patient sweats, which helps the doctor identify possible lesions of the sympathetic nervous system. Studies of this test point to the lack of an automatic process for identifying these body regions. In this project, the Kinect® device was used as part of the solution: created by Microsoft®, it can scan 3D objects and provides a library for systems development. This work aims to build a computer system that offers a semi-automatic solution for analyzing digital images from sweat tests. The system classifies the regions of the body where the patient sweats, via 3D scanning with the Kinect®, and generates a report for the doctor with the consolidated information so that a diagnosis can be made quickly, easily, and accurately. The project began in 2013 in the IMAGELab laboratory at FT/UNICAMP in Limeira/SP and had the support of a team from the USP Clinical Hospital in Ribeirão Preto/SP that studies the iodine-starch sweat test. The contribution of this work consisted of building the application, which uses the K-Means image segmentation algorithm to segment regions on the surface of the patient, and of developing the system around the Kinect®. The application was validated through experiments on real patients / Mestrado / Tecnologia e Inovação / Mestre em Tecnologia
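A minimal sketch of k-means segmentation of pixel colors, in the spirit of the region classification described; the synthetic image and the rule "darker cluster = sweat region" are assumptions of this toy example, not details of the actual system:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
# Synthetic stand-in for a scanned body image: light background with a
# darker iodine-starch patch where the patient sweats.
img = np.full((80, 80, 3), 200.0) + rng.normal(0, 5, (80, 80, 3))
img[20:40, 30:60] = 60.0

# k-means over pixel colors separates the image into two regions.
pixels = img.reshape(-1, 3)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(pixels)

# Assumption of this toy example: the darker cluster is the sweat region.
sweat = min(range(2), key=lambda k: pixels[labels == k].mean())
print("sweat pixels:", int((labels == sweat).sum()))
```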
48

Improving character recognition by thresholding natural images / Förbättra optisk teckeninläsning genom att segmentera naturliga bilder

Granlund, Oskar, Böhrnsen, Kai January 2017
Current state-of-the-art optical character recognition (OCR) algorithms are capable of extracting text from images under predefined conditions. OCR is extremely reliable for interpreting machine-written text with minimal distortion, but images taken in a natural scene are still challenging. In recent years, the topic of improving recognition rates in natural images has gained interest as more powerful handheld devices have come into use. The main problems faced in natural-image recognition are distortions such as uneven illumination, font textures, and complex backgrounds. Different preprocessing approaches to separate text from its background have been researched lately. In our study, we assess the improvement achieved by two of these preprocessing methods, k-means and Otsu, by comparing their results from an OCR algorithm. The study showed that the preprocessing made some improvement on special occasions but overall yielded worse accuracy than the unaltered images. / Dagens optisk teckeninläsnings (OCR) algoritmer är kapabla av att extrahera text från bilder inom fördefinierade förhållanden. De moderna metoderna har uppnått en hög träffsäkerhet för maskinskriven text med minimala förvrängningar, men bilder tagna i en naturlig scen är fortfarande svåra att hantera. De senaste åren har ett stort intresse för att förbättra tecken igenkännings algoritmerna uppstått, eftersom fler kraftfulla och handhållna enheter används. Det huvudsakliga problemet när det kommer till igenkänning i naturliga bilder är olika förvrängningar som infallande ljus, textens textur och komplicerade bakgrunder. Olika metoder för förbehandling och därmed separation av texten och dess bakgrund har studerats under den senaste tiden. I våran studie bedömer vi förbättringen som uppnås vid förbehandlingen med två metoder som kallas för k-means och Otsu genom att jämföra svaren från en OCR algoritm. Studien visar att Otsu och k-means kan förbättra träffsäkerheten i vissa förhållanden men generellt sett ger det ett sämre resultat än de oförändrade bilderna.
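A sketch of the two binarizations the study compares, Otsu (here via OpenCV) and k-means with k=2 on pixel intensities, applied to a synthetic image; the OCR stage itself is omitted:

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
# Synthetic stand-in for a natural-scene photo: noisy background with a
# brighter, text-like region.
img = rng.normal(90, 25, (60, 200)).clip(0, 255).astype(np.uint8)
img[20:40, 20:180] = 210

# Otsu picks one global threshold from the intensity histogram.
_, otsu_bin = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# k-means with k=2 on the raw intensities is the other binarization.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
    img.reshape(-1, 1).astype(float))
kmeans_bin = labels.reshape(img.shape).astype(np.uint8) * 255

# Either binary image would then be fed to the OCR engine.
print(float(otsu_bin.mean()), float(kmeans_bin.mean()))
```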
49

Thresholded K-means Algorithm for Image Segmentation

Girish, Deeptha S. January 2016
No description available.
50

設計與實作一個針對遊戲論壇的中文文章整合系統 / Design and Implementation of a Chinese Document Integration System for Game Forums

黃重鈞, Huang, Chung Chun Unknown Date
現今網路發達便利，人們資訊交換的方式更多元，取得資訊的方式，不再僅是透過新聞，透過論壇任何人都可以快速地、較沒有門檻地分享資訊。也因為這個特性造成資訊量暴增，就算透過搜尋引擎，使用者仍需要花費許多精力蒐集、過濾與處理特定的主題。本研究以巴哈姆特電玩資訊站─英雄聯盟哈拉討論板為例，期望可以為使用者提供一個全面且精要的遊戲角色描述，讓使用者至少對該角色有大概的認知。 本研究參考網路論壇探勘及新聞文件摘要系統，設計適用於論壇多篇文章的摘要系統。首先必須了解並分析論壇的特性，實驗如何從論壇挖掘出潛藏的資訊，並認識探勘論壇會遭遇的困難。根據前面的論壇分析再設計系統架構大致可分為三階段：1. 資料前處理：論壇文章與新聞文章不同，很難直接將名詞、動詞作為關鍵字，因此使用TF-IDF篩選出論壇文章中有代表性的詞彙，作為句子的向量空間維度。2. 分群：使用K-Means分群法分辨哪些句子是比較相似的，並將相似的句子分在同一群。 3. 句子挑選：根據句子的分群結果，依句子的關鍵字含量及TF-IDF選擇出最能代表文件集的句子。 我們發現實驗分析過程中可以看到一些有用的相關資訊，在論文的最後提出可能的改善方法，期望未來可以開發更好的論壇文章分類方式。 / With the spread of network infrastructure, forum users can share information quickly and easily. Users can retrieve information through search engines, but they still have difficulty handling the sheer number of articles, which is usually beyond human processing capacity. In this study, we design a tool to automate the retrieval of information from each topic in a Chinese game forum. We analyze the characteristics of the game forum and draw on English news summarization systems. Our method is divided into three phases. The first phase discovers keywords in the documents by TF-IDF rather than part of speech and builds a vector space model. The second phase represents sentences in that vector space and applies the K-means clustering algorithm to gather sentences with the same sense into the same cluster. The third phase weights sentences by two features, the keyword content of a sentence and its TF-IDF, and orders the sentences according to their weights. We conduct an experiment with data collected from the game forum and find useful information through it. We believe the developed techniques and the results of the analysis can be used to design a better system in the future.
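A toy sketch of the three-phase pipeline with English stand-in sentences; the real system processes Chinese forum text, which needs word segmentation before TF-IDF, and its sentence weighting also uses keyword counts, which this sketch reduces to total TF-IDF weight:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# English stand-ins for forum sentences about one game character.
sentences = [
    "this champion is strong in the early game",
    "early game pressure usually wins the lane",
    "the new patch nerfed his ultimate",
    "his ultimate does less damage after the patch",
]

# Phase 1: TF-IDF vector space over the sentences.
vec = TfidfVectorizer()
X = vec.fit_transform(sentences)

# Phase 2: k-means gathers similar sentences into the same cluster.
k = 2
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)

# Phase 3: keep the heaviest sentence of each cluster as its representative.
weights = np.asarray(X.sum(axis=1)).ravel()
for c in range(k):
    idx = np.where(labels == c)[0]
    print(sentences[idx[np.argmax(weights[idx])]])
```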
