471 |
The aCDOM spatial and temporal distribution analysis in Funil reservoir / Análise da distribuição espaço-temporal do aCDOM no reservatório de Funil
Martins, Sarah Cristina Araújo [UNESP] 03 August 2017
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) / Dissolved organic matter (DOM) is a water constituent that can serve as an indicator of water quality, since it has two sources: an allochthonous one, related to discharges of terrestrial material and linked to humic acids, and an autochthonous one, associated with river input or with production within the water body itself and linked to fulvic acids. Colored dissolved organic matter (CDOM) is the colored fraction of DOM and can be used as a proxy for DOM in inland waters. The Funil hydroelectric reservoir (FHR) was chosen as the study site. The general aim of this research was to identify and evaluate changes in the CDOM absorption coefficient (aCDOM) at the water surface over time (1995 to 2010), and to understand their relationship with land use and land cover (LULC) changes in the FHR watershed. To reach this goal, (i) LULC was mapped historically (1995 to 2010, at 5-year intervals) for change detection; (ii) a set of bio-optical models from the literature and a new empirical model were studied to estimate aCDOM from simulated reflectance (Rrs_simulated) for the Thematic Mapper (TM) sensor; (iii) the spatial and temporal distribution of aCDOM was obtained by applying a bio-optical model to TM/Landsat-5 imagery from 1995 to 2010; and (iv) the possible CDOM/DOM sources were analysed, together with the behavior and distribution of aCDOM in the FHR over time.
The first study parameterized the support vector machine (SVM) algorithm according to the characteristics of the study area for supervised LULC classification in the FHR watershed. Change detection on the resulting classification showed that the proposed parameterization enabled the SVM to differentiate large continuous classes, long narrow classes, and small non-continuous areas located inside a larger class. The classification scored well statistically: overall accuracy between 86% and 98% across the time series (the Portuguese text of the abstract gives 96% as the upper bound), producer's accuracy of 90%, user's accuracy above 86%, and a Kappa index between 86% and 91%. LULC in the study area remained relatively stable over the time series analysed.
The second study produced an empirical model that uses an alternative wavelength (485 nm) and band ratio (B4/B1) to estimate aCDOM from Rrs_simulated for TM/Landsat-5 (RMSE = 7%, Nash = 0.91). The model could identify even small variations in reflectance values from orbital data and differentiate subtle variations in aCDOM. Two distinct CDOM behavior patterns were identified for the FHR: one associated with LULC and rainfall/runoff, and another related to chlorophyll-a (Chl-a) during algal blooms. The two studies were formatted as scientific articles for this document. The first, on SVM parameterization, was published in Modeling Earth Systems and Environment (Springer, DOI 10.1007/s40808-016-0190-y). The second, on the historical distribution of aCDOM, is under revision for future submission.
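The two fit statistics quoted for the aCDOM model (RMSE in percent and the Nash index) can be computed as in the minimal sketch below. It assumes that "Nash" means the Nash-Sutcliffe efficiency and that the percentage RMSE is normalized by the range of the observations; the abstract states neither, and the function names and sample values are hypothetical.

```python
import math


def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def rmse_percent(observed, predicted):
    """RMSE as a percentage of the observed range (assumed normalization)."""
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return 100.0 * math.sqrt(mse) / (max(observed) - min(observed))


if __name__ == "__main__":
    # Made-up aCDOM values (m^-1) for illustration only, not data from the thesis.
    measured = [1.2, 1.5, 2.1, 2.8, 3.0]
    estimated = [1.1, 1.6, 2.0, 2.9, 3.2]
    print(nash_sutcliffe(measured, estimated), rmse_percent(measured, estimated))
```

Other normalizations (for example by the mean of the observations) are also in common use, so a reported percentage is only comparable across studies when the normalization is known.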
|
472 |
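Análise e classificação de imagens de lesões da pele por atributos de cor, forma e textura utilizando máquina de vetor de suporte / Analysis and classification of skin lesion images by color, shape and texture attributes using a support vector machine
Soares, Heliana Bezerra 22 February 2008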
Conselho Nacional de Desenvolvimento Científico e Tecnológico / Skin cancer is the most common of all cancers, and its rising incidence is due in part to people's behavior regarding sun exposure. In Brazil, non-melanoma skin cancer is the most frequent type in most regions. Dermatoscopy and videodermatoscopy are the main examinations for diagnosing dermatological diseases of the skin. The use of computational tools to assist or follow up the medical diagnosis of dermatological lesions is still seen as a very recent field. Several methods have been proposed for the automatic classification of skin pathologies from images. This work presents a new intelligent methodology for the analysis and classification of skin cancer images, based on digital image processing techniques for extracting color, shape and texture features, using the Wavelet Packet Transform (WPT) and the machine learning technique known as the Support Vector Machine (SVM). The WPT is applied to extract texture features from the images. It consists of a set of basis functions that represent the image in different frequency bands, each with a distinct resolution corresponding to its scale. The color features of the lesion, which depend on visual context and are influenced by the surrounding colors, are also computed, as are shape attributes obtained through Fourier descriptors. Classification is performed by the SVM, which is based on the structural risk minimization principle from statistical learning theory. The SVM aims to construct optimal hyperplanes with the largest margin of separation between classes. The resulting hyperplane is determined by a subset of the points of the classes, called support vectors. On the database used in this work, the method performed well, with an overall accuracy of 92.73% for melanoma and 86% for non-melanoma and benign lesions. The extracted descriptors combined with the SVM classifier produced a method capable of recognizing and classifying the skin lesions analysed.
|
473 |
"Processamento e análise de imagens para medição de vícios de refração ocular" / Image Processing and Analysis for Measuring Ocular Refraction Errors
Antonio Valerio Netto 18 August 2003
This work presents a computational system that uses Machine Learning (ML) techniques to assist in ophthalmological diagnosis. The system produces objective, automatic measurements of the main ocular refraction errors, namely astigmatism, hypermetropia and myopia, from functional images of the human eye acquired with a technique known as Hartmann-Shack (HS), or Shack-Hartmann (SH). Conventional image processing techniques are applied to these images to extract and frame the region of interest and to remove noise. Feature vectors are then extracted from the images with the Gabor wavelet transform and analysed by ML techniques that output a diagnosis of the refractive errors present in the imaged eye. The results indicate the potential of this approach for interpreting HS images, so that in the future other ocular problems may be detected and measured from the same images. In addition to implementing a novel approach for measuring refraction errors and introducing ML techniques into the analysis of ophthalmological images, this work investigates the use of Support Vector Machines (SVMs) and Artificial Neural Networks in Image Understanding systems. Developing the system made it possible to assess critically the suitability and limitations of these techniques for Image Understanding tasks in real-world problems.
|
474 |
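Desenvolvimento de um algoritmo morfológico para detecção e classificação de lesões em imagens de mamografia / Development of a morphological algorithm for the detection and classification of lesions in mammography images
LIMA, Sidney Marlon Lopes de 25 February 2016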
REUNI / According to the World Health Organization, breast cancer is the leading cause of cancer death among adult women worldwide. From a clinical point of view, mammography is still the most effective technology for diagnosing breast cancer, given how widely these images are used and interpreted. In the state of the art of lesion classification in mammograms, wavelets have produced the best classification rates when used as a preprocessing step that decomposes the original image into approximation and detail images (vertical, horizontal and diagonal), from which shape or texture attributes are extracted. This work proposes a Decomposition based on Morphological Approximations, applied to regions of interest in mammograms. The proposed method is a wavelet-inspired decomposition that employs nonlinear low-pass and high-pass filters based on openings and closings, which are in turn built from the classical morphological operators of erosion and dilation. Arithmetic approximations are proposed for these two classical operators, replacing the conditional branches present in Mathematical Morphology with computationally faster arithmetic operations of addition, subtraction and multiplication. The work compares the estimated execution time of the proposed arithmetic approximations with that of the classical morphological operations using Big-O notation, and also uses estimates based on a pipelined hardware architecture. In all estimates and real scenarios, the proposed morphological approximations are faster than classical morphology. Moreover, because they need no hardware unit for handling conditional branches in a pipelined environment, the proposed approximations are a cheaper solution that occupies less space, is more amenable to miniaturization, consumes less power, and reduces the number of Control Unit (CU) encodings. The Morphological Approximations are therefore superior to classical morphology in the main requirements for well-functioning hardware. For classification, the Decomposition based on Morphological Approximations reaches an average performance of 84.65% in distinguishing normal, benign and malignant cases. The classifiers are ELM neural networks and SVMs, with classes defined according to the American College of Radiology criteria. A total of 355 images of adipose breasts from the IRMA database were used, with 233 normal, 66 benign and 56 malignant cases. To handle the database, weights on the decision boundary of the neural networks were studied.
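The idea of replacing the comparisons inside erosion and dilation with arithmetic can be illustrated with a well-known exact identity, max(a, b) = (a + b + |a - b|) / 2. The sketch below is not the thesis's approximation, whose formulas the abstract does not give: it is a 1-D illustration with a flat structuring element of odd width, it treats the absolute value as a primitive operation, and all function names are hypothetical.

```python
def branch_free_max(a, b):
    # a + b + |a - b| is twice the larger value, so integer halving is exact.
    return (a + b + abs(a - b)) // 2


def branch_free_min(a, b):
    # a + b - |a - b| is twice the smaller value.
    return (a + b - abs(a - b)) // 2


def _sliding(signal, width, combine):
    """Fold `combine` over each window of odd `width`, replicating the borders."""
    half = width // 2
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    out = []
    for i in range(len(signal)):
        acc = padded[i]
        for k in range(1, width):
            acc = combine(acc, padded[i + k])
        out.append(acc)
    return out


def erode(signal, width=3):
    return _sliding(signal, width, branch_free_min)


def dilate(signal, width=3):
    return _sliding(signal, width, branch_free_max)


def opening(signal, width=3):
    # Erosion followed by dilation removes bright details narrower than the window.
    return dilate(erode(signal, width), width)


def closing(signal, width=3):
    # Dilation followed by erosion fills dark details narrower than the window.
    return erode(dilate(signal, width), width)
```

The loops still branch on their counters, but those branches do not depend on pixel values, which is the property that matters for the pipeline argument made in the abstract.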
|
475 |
PCA based dimensionality reduction of MRI images for training support vector machine to aid diagnosis of bipolar disorder / PCA baserad dimensionalitetsreduktion av MRI bilder för träning av stödvektormaskin till att stödja diagnostisering av bipolär sjukdom
Chen, Beichen, Chen, Amy Jinxin January 2019
This study investigates how dimensionality reduction of neuroimaging data before training support vector machines (SVMs) affects classification accuracy for bipolar disorder. Principal component analysis (PCA) is used for dimensionality reduction. The data are an open-source set of 19 bipolar and 31 control structural magnetic resonance imaging (sMRI) samples from the UCLA Consortium for Neuropsychiatric Phenomics LA5c Study, funded by the NIH Roadmap Initiative to foster breakthroughs in the development of novel treatments for neuropsychiatric disorders. The images underwent smoothing, feature extraction and PCA before being used as input to train SVMs. 3-fold cross-validation was used to tune a number of hyperparameters for linear, radial and polynomial kernels. Experiments investigated the performance of SVM models trained on 1 to 29 principal components (PCs). Several PC sets reached 100% accuracy in the final evaluation, the smallest being the first two principal components. The cumulative variance explained by the PCs used did not correlate with model performance. The choice of kernel and hyperparameters is critical, as the performance obtained can vary greatly. The results support previous findings that SVMs can help in diagnosing bipolar disorder, and suggest that PCA combined with SVMs may be appropriate for classifying neuroimaging data for other illnesses as well. Because the sample is small, future research on larger collaborative data sets is needed to validate the accuracies obtained.
|
476 |
Detection and Classification of Sparse Traffic Noise Events / Detektering och klassificering av bullerhändelser från gles trafik
Golshani, Kevin, Ekberg, Elias January 2023
Noise pollution is a major health hazard for people living in urban areas, and its effects on humans are a growing field of research. One of the major contributors to urban noise pollution is traffic. Noise simulations can be used to build noise maps for noise management action plans, but testing their accuracy requires real measurements, in this case noise measurements taken next to a road. The aim of this project is to test machine-learning-based methods in order to develop a robust way of detecting and classifying vehicle noise in sparse traffic conditions. The primary focus is detecting traffic noise events; the secondary focus is classifying what kind of vehicle produces the noise. The data come from sensors installed on a testbed on a street in southern Stockholm: a microphone that continuously measures the local noise environment, a radar that detects each passing vehicle, and a camera that also detects vehicles by capturing their license plates. Only sparse traffic noise is considered in this thesis, so the audio recordings used are those in which the radar detected only one vehicle in a 40-second window. This makes the data weakly labeled. The resulting detection method is a two-step process. First, the unsupervised learning method k-means is used to generate strong labels. Second, a supervised learning method, either random forest or support vector machine, uses the strong labels to classify audio features. The detection system for sparse traffic noise achieved satisfactory results. However, the unsupervised vehicle classification method produced inadequate results, and the clustering could not distinguish vehicle classes from the noise data.
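The first step of the two-step detection process, turning a weak label ("one vehicle somewhere in this 40-second window") into per-frame strong labels with k-means, can be sketched as follows. This is a minimal illustration only: it assumes a single scalar feature per frame (for example a frame energy) and two clusters, whereas the abstract does not say which audio features or how many clusters were used. The function name and sample values are hypothetical.

```python
def kmeans_two_clusters(values, iterations=20):
    """1-D k-means with k=2; returns 1 for the louder cluster, 0 for the quieter one."""
    low, high = min(values), max(values)  # initialize centroids at the extremes
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each frame joins the nearer centroid.
        labels = [1 if abs(v - high) < abs(v - low) else 0 for v in values]
        quiet = [v for v, lab in zip(values, labels) if lab == 0]
        loud = [v for v, lab in zip(values, labels) if lab == 1]
        if not quiet or not loud:
            break  # degenerate case: all frames fell into one cluster
        # Update step: move each centroid to the mean of its members.
        low, high = sum(quiet) / len(quiet), sum(loud) / len(loud)
    return labels


if __name__ == "__main__":
    # Made-up frame energies for one window: quiet, a vehicle pass-by, quiet again.
    frame_energy = [0.1, 0.2, 0.1, 0.9, 1.0, 0.8, 0.2, 0.1]
    print(kmeans_two_clusters(frame_energy))
```

The frames labeled 1 would then serve as positive training examples for the supervised classifier (random forest or SVM) in the second step, which is not shown here.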
|
477 |
運用財經文本情感分析於台灣電子類股價指數趨勢預測之研究 / Research of applying Sentimental Analysis on financial documents to predict Taiwan Electronic Sub-Index trend
劉羿廷 Unknown Date
電子工業為台灣最具競爭力之產業,使得電子類股在集中市場成交比重高達 69.49%,可見電子類股的波動足以對整個台股市場造成相當大的影響。而許多研究指出,網路上的文本訊息藉由社會網路的催化而快速傳遞,會對群眾情緒造成影響,進而影響股價波動,故對於投資者而言,如果能快速分析大量網路財經文本來推測投資大眾情緒進而預測股價走勢,即可提升獲利。然而,每天有近百篇的財經文本產生,傳統的人工抽樣分析方式效率不彰且過於耗力, 已不足以負荷此巨量資料。
過去文本情感分析的研究中已證實監督式學習方法可以透過簡單量化的方式達到良好的分類效果,但監督式學習方法所使用的訓練資料集須有事先定義好的已知類別,故其有無法預期未知類別的限制,造成無法判斷文本中可能存在的未知主題,所以本研究提出一套針對財經文本的混合監督式學習與非監督式學習之情感分析方法,透過非監督式學習將 2014 整年度的電子工業財經文本進行文本主題判別、情緒指數計算與情緒傾向標注。之後配合視覺化工具作趨勢線圖分析,找出具有領先指標特性之主題,接著再用監督式學習將其結合國際指標、總體經濟指標、台股指標、技術指標等,建立分類模型以預測台灣電子類股價指數走勢。
在實驗結果中,主題標注方面,本研究發現因文本數量遠大於議題詞數量造成 TFIDF 矩陣過於稀疏,使得 TFIDF-Kmeans 主題模型分類效果不佳;而文本具有多主題之特性造成 NPMI-Concor 分群之議題詞過於複雜不易歸納,然而LDA 主題模型基於所有主題被所有文章共享的特性,使得在字詞分群與主題分類準確度都優於 TFIDF-Kmeans 和 NPMI-Concor 主題模型,分類準確度高達 98%,故後續採用 LDA 主題模型進行主題標注。情緒傾向標注方面,證實本研 究擴充後的情感詞集比起 NTUSD 有更好的字詞極性判斷效果,計算出的情緒 指數之趨勢線也較投資人常用的 MACD 之趨勢線更符合電子類股價指數之趨 勢。此外,亦發現並非所有文本的情緒指數皆具有領先特性,僅企業營運主題與總體經濟主題之文本的情緒指數能提前反應電子類股價指數趨勢,故本研究用此二主題之文本的情緒指數來建立分類模型。
Next, by comparing the accuracy of a classification model that combines the sentiment index with technical indicators against a model that uses technical indicators alone, this study found that the former is 7% more accurate. A classification model that further incorporates indirect sentiment indicators reaches an accuracy of 71%, confirming that sentiment analysis can effectively improve the accuracy of trend prediction for the electronics sector stock index and thereby improve investors' returns. / The electronics industry is the most competitive industry in Taiwan, and its large volume can strongly influence the whole stock market. Much research shows that text documents on the Internet have a great effect on public emotion, and that public emotion can in turn affect stock prices. For investors, it is important to know how to analyze the latent emotion in text documents and then use this information to predict the stock trend. However, the traditional approach of analyzing text documents manually cannot cope with the large volume of financial text documents on the Internet.
In past sentiment analysis research, supervised methods have been shown to reach high accuracy, but they are limited in predicting future trends. This research proposes a solution that combines supervised and unsupervised methods to handle these large collections of financial text documents. First, we use an unsupervised method to identify the topic of each document, and then calculate a sentiment index to judge the document's emotional direction. Next, we produce trend line charts with visualization tools to find out which topics' document sentiment indices are leading indicators. Finally, we use a supervised method to integrate the sentiment index with 24 other indirect sentiment indices to build the prediction model.
According to the results, the LDA model performs better than the TFIDF-Kmeans model and the NPMI-Concor model because of the characteristics of the documents. In addition, the sentiment dictionary built in this study is more accurate than NTUSD at judging word polarity. The trend of the sentiment index is more similar to that of the Taiwan electronic sub-index (TE) than the MACD line is. We also find that the sentiment indices produced from documents about enterprise operations and macroeconomics are leading indicators, so we use these to build the prediction model.
Moreover, we found that the prediction model that includes the sentiment index performs better than the one that includes only technical indicators. As shown above, the sentiment index can make prediction of the Taiwan electronic sub-index trend more accurate and improve the return on investment.
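As a rough illustration of the dictionary-based sentiment index described in this abstract, the sketch below scores each document by the balance of positive and negative words and averages the scores per day. The word list, the scoring formula, and the function names are assumptions made for illustration; the abstract does not specify how the thesis computes its index.

```python
from collections import defaultdict

# Hypothetical toy polarity dictionary (not the lexicon built in the thesis).
POLARITY = {
    "growth": 1, "profit": 1, "record": 1,
    "loss": -1, "decline": -1, "lawsuit": -1,
}


def document_sentiment(tokens):
    """Return (pos - neg) / (pos + neg), or 0.0 when no polar word occurs."""
    pos = sum(1 for t in tokens if POLARITY.get(t, 0) > 0)
    neg = sum(1 for t in tokens if POLARITY.get(t, 0) < 0)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


def daily_sentiment_index(documents):
    """Average the document scores per date.

    documents: iterable of (date, tokens) pairs.
    Returns a dict mapping each date to its mean document sentiment.
    """
    scores = defaultdict(list)
    for date, tokens in documents:
        scores[date].append(document_sentiment(tokens))
    return {date: sum(vals) / len(vals) for date, vals in scores.items()}


docs = [
    ("2015-01-05", ["profit", "growth", "loss"]),
    ("2015-01-05", ["decline"]),
    ("2015-01-06", ["record", "profit"]),
]
index = daily_sentiment_index(docs)
```

A daily series of this kind could then be fed to a classifier alongside technical indicators, which is the comparison the abstract reports.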
|
478 |
Classification automatique de flux radiophoniques par Machines à Vecteurs de Support (Automatic classification of radio streams using Support Vector Machines) / Ramona, Mathieu 21 June 2010 (has links) (PDF)
We present a speech/music audio classification system that takes advantage of the excellent statistical properties of Support Vector Machines. This problem raises three questions: how to efficiently exploit SVMs, an inherently discriminative method, on a problem with more than two classes; how to characterize an audio signal in a relevant way; and how to handle the temporal aspect of the problem. We propose a hybrid multi-class classification system that draws on the one-versus-one and dendrogram-based approaches and allows the estimation of posterior probabilities. These probabilities are exploited by post-processing methods that take into account the interdependencies between neighboring frames. We thus propose a classification method that applies Hidden Markov Models (HMM) to the posterior probabilities, as well as an approach based on detecting breakpoints between segments with "homogeneous" acoustic content. Furthermore, since the audio signal is characterized by a large collection of audio descriptors, we propose new descriptor selection algorithms based on the recent Kernel Alignment criterion, a criterion that we also used for kernel selection in the classification process. The proposed algorithms are compared with the most effective state-of-the-art methods, to which they offer a relevant alternative in terms of computational and storage cost. The system built on these contributions took part in the ESTER 2 evaluation campaign, which we present along with our results.
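The idea of applying an HMM to per-frame posterior probabilities can be sketched with a small Viterbi decoder. This is only a minimal illustration under stated assumptions: the log posteriors are used directly as emission scores, the initial state distribution is uniform, and the "sticky" transition matrix controlled by the hypothetical `stay_prob` parameter is invented for the example. None of these choices is taken from the thesis.

```python
import math


def viterbi_smooth(posteriors, stay_prob=0.9):
    """Smooth frame-wise class decisions with a sticky-transition HMM.

    posteriors: list of per-frame lists of class probabilities (>= 2 classes).
    stay_prob:  assumed probability of remaining in the same class.
    Returns the most likely sequence of class indices.
    """
    n_classes = len(posteriors[0])
    switch_prob = (1.0 - stay_prob) / (n_classes - 1)
    log_trans = [
        [math.log(stay_prob if i == j else switch_prob) for j in range(n_classes)]
        for i in range(n_classes)
    ]
    eps = 1e-12  # avoids log(0) for zero posteriors

    # Initial scores: uniform prior, so only the first frame's posteriors matter.
    score = [math.log(p + eps) for p in posteriors[0]]
    backpointers = []
    for frame in posteriors[1:]:
        new_score, pointers = [], []
        for j in range(n_classes):
            best_i = max(range(n_classes), key=lambda i: score[i] + log_trans[i][j])
            pointers.append(best_i)
            new_score.append(
                score[best_i] + log_trans[best_i][j] + math.log(frame[j] + eps)
            )
        score = new_score
        backpointers.append(pointers)

    # Backtrack from the best final state.
    state = max(range(n_classes), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Class 0 = speech, class 1 = music (labels chosen for the example only).
frames = [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.9, 0.1], [0.85, 0.15]]
smoothed = viterbi_smooth(frames)
```

In this toy input the third frame would be labelled as class 1 by a frame-wise argmax, while the decoder keeps it in class 0 because a single-frame switch is penalized by the transition probabilities.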
|
479 |
Reconnaissance automatique des gestes de la langue française parlée complétée (Automatic recognition of French Cued Speech gestures) / Burger, Thomas 26 October 2007 (has links) (PDF)
LPC (Langue française Parlée Complétée, the French form of Cued Speech) is a complement to lip reading that facilitates communication for hearing-impaired people. In principle, it consists of making gestures with one hand placed beside the face to disambiguate the lip movements, which taken alone are insufficient for a perfect understanding of the message. The RNTS TELMA project aims to build a telephone terminal that allows hearing-impaired people to communicate by relying on LPC. Among the many functionalities this involves, it is necessary to recognize the LPC hand gesture and associate a meaning with it. The subject of this work is the video segmentation, analysis, and recognition of the gestures of an LPC coder in a communication situation. This calls on techniques from image segmentation, classification, gesture interpretation, and data fusion. To solve this gesture recognition problem, we proposed several original algorithms, including (1) an algorithm based on retinal persistence that categorizes images into target gestures and transition gestures, (2) an improvement of multi-class classification methods using SVMs or unary classifiers via evidence theory, together with a method for converting subjective probabilities into belief functions, and (3) a partial decision method based on a generalization of the Pignistic Transform, so as to allow uncertainty in the interpretation of ambiguous gestures.
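For readers unfamiliar with evidence theory, the classical Pignistic Transform turns a mass function over sets of hypotheses into a probability over single hypotheses by sharing each set's mass equally among its members. The sketch below shows only this classical form, not the generalization developed in the thesis; the gesture labels and the function name are hypothetical.

```python
def pignistic(masses):
    """Classical pignistic transform.

    masses: dict mapping frozenset of hypotheses -> belief mass.
    Returns a dict mapping each single hypothesis -> pignistic probability.
    """
    empty_mass = masses.get(frozenset(), 0.0)
    if empty_mass >= 1.0:
        raise ValueError("all mass is on the empty set; transform undefined")
    norm = 1.0 - empty_mass  # renormalize if some mass sits on the empty set

    betp = {}
    for focal_set, mass in masses.items():
        if not focal_set:
            continue
        share = mass / (len(focal_set) * norm)  # equal split among members
        for hypothesis in focal_set:
            betp[hypothesis] = betp.get(hypothesis, 0.0) + share
    return betp


# Hypothetical evidence about three candidate hand shapes.
evidence = {
    frozenset({"shape_a"}): 0.5,
    frozenset({"shape_a", "shape_b"}): 0.3,
    frozenset({"shape_a", "shape_b", "shape_c"}): 0.2,
}
probabilities = pignistic(evidence)
```

Because the transform always commits to single hypotheses, it discards the ambiguity carried by the larger sets, which is the limitation a partial decision method is meant to address.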
|
480 |
Bearing Diagnosis Using Fault Signal Enhancing Teqniques and Data-driven Classification / Lembke, Benjamin January 2019 (has links)
Rolling element bearings are a vital part of many rotating machines, including vehicles. A defective bearing can be a symptom of other problems in the machinery, and bearings have a high failure rate. Early detection of bearing defects can therefore help prevent malfunctions that could ultimately lead to a total breakdown. The thesis was carried out in collaboration with Scania, which wants a better understanding of how external sensors, such as accelerometers, can be used for condition monitoring in its gearboxes. Defective bearings create vibrations with specific frequencies, known as Bearing Characteristic Frequencies (BCF) [23]. A key component of the proposed method is the identification and extraction of these frequencies from vibration signals recorded by accelerometers mounted near the monitored bearing. Three solutions are proposed for automatic bearing fault detection. Two are based on data-driven classification using a set of machine learning methods called Support Vector Machines, and one uses only the computed characteristic frequencies of the considered bearing faults. Two types of features are developed as inputs to the data-driven classifiers. One is based on the extracted amplitudes of the BCF and the other on statistical properties of Intrinsic Mode Functions generated by an improved Empirical Mode Decomposition algorithm. To enhance the diagnostic information in the vibration signals, two pre-processing steps are proposed. Separation of the bearing signal from masking noise is done with the Cepstral Editing Procedure, which removes discrete frequencies from the raw vibration signal. Enhancement of the bearing signal is achieved by band-pass filtering and amplitude demodulation. The frequency band is selected by the band selection algorithms Kurtogram and Autogram.
The proposed methods are evaluated on two large public data sets for bearing fault classification using accelerometer data, and on a smaller data set collected from a Scania gearbox. The produced features achieved significant separation on both the public and the collected data. Manual detection of the induced defect on the outer race of the gearbox bearing was achieved. Due to the small amount of training data, the automatic solutions were tested only on the public data sets. The performance in isolating the correct bearing and fault mode among multiple bearings was also investigated. One of the best trade-offs achieved was a 76.39 % fault detection rate with an 8.33 % false alarm rate. Another was a 54.86 % fault detection rate with a 0 % false alarm rate.
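The Bearing Characteristic Frequencies mentioned in this abstract are commonly computed from the bearing geometry and the shaft speed with the standard kinematic formulas. The sketch below assumes a stationary outer race, a rotating inner race, and no slip; the function name, parameter names, and example dimensions are illustrative and not taken from the thesis.

```python
import math


def bearing_characteristic_frequencies(
    shaft_hz, n_rollers, roller_diam, pitch_diam, contact_angle_deg=0.0
):
    """Return the standard kinematic fault frequencies in Hz.

    Assumes the outer race is fixed and the inner race turns at shaft_hz.
    roller_diam and pitch_diam must use the same length unit.
    """
    ratio = (roller_diam / pitch_diam) * math.cos(math.radians(contact_angle_deg))
    return {
        # Fundamental train (cage) frequency
        "FTF": 0.5 * shaft_hz * (1 - ratio),
        # Ball pass frequency, outer race
        "BPFO": 0.5 * n_rollers * shaft_hz * (1 - ratio),
        # Ball pass frequency, inner race
        "BPFI": 0.5 * n_rollers * shaft_hz * (1 + ratio),
        # Ball spin frequency
        "BSF": 0.5 * (pitch_diam / roller_diam) * shaft_hz * (1 - ratio ** 2),
    }


# Hypothetical bearing: 8 rollers, 10 mm rollers on a 50 mm pitch circle, 10 Hz shaft.
bcf = bearing_characteristic_frequencies(
    shaft_hz=10.0, n_rollers=8, roller_diam=10.0, pitch_diam=50.0
)
```

A useful sanity check is that BPFO and BPFI always sum to the number of rollers times the shaft frequency. In practice, slip makes the measured frequencies deviate slightly from these values, so a detector would typically search a small band around each one.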
|