261

Otimização e análise das máquinas de vetores de suporte aplicadas à classificação de documentos. / Optimization and analysis of support vector machine applied to text classification.

Kinto, Eduardo Akira 17 June 2011 (has links)
Stored data analysis is essential to decision making in any business, but the data must be organized and easy to access. When the volume of data is very large, this task becomes computationally much harder, so efficient mechanisms for information analysis are essential. Artificial neural networks (ANNs), support vector machines (SVMs) and other algorithms are frequently used for this purpose. This work explores the sequential minimal optimization (SMO) algorithm, a training algorithm for the SVM, and modifies it to reduce training time while preserving classification (generalization) capacity. Two modifications are proposed: one to the training algorithm and one to the architecture. The first allows more than one Lagrange coefficient to be updated per cycle, by also updating support-vector candidates neighbouring the current working pair. Among SVM implementations, SMO is one of the fastest and least memory-consuming, because it avoids inverting the kernel matrix; that square matrix grows with the number of samples that become support vectors, and its inversion is one of the most time-consuming steps of other SVM solvers. The second modification reduces training time by subdividing the training set into ordered subsets along the dimension of highest entropy; unlike traditional decomposition approaches, samples are not repeatedly resubmitted to SVM training. Finally, the proposed SMO is applied to document (text) classification through a new approach: one-class classification built from binary classifiers. As in any document classification task, feature analysis is a key step, and a further contribution is presented here: pointwise total correlation is used to select the words that form the word-index vector.
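The baseline pairwise update that the dissertation's modified SMO builds on can be sketched as follows. This is a minimal, simplified version of SMO for a linear SVM, with a deterministic second-choice heuristic; the toy data, tolerance, and sweep cap are illustrative assumptions, and the author's multi-candidate update and entropy-ordered subdivision are not reproduced.

```python
import numpy as np

def smo_train(X, y, C=1.0, tol=1e-4, max_sweeps=200):
    """Simplified SMO: optimize the Lagrange multipliers two at a time,
    preserving the equality constraint sum(alpha * y) == 0."""
    n = len(y)
    K = X @ X.T                       # linear kernel matrix
    alpha, b = np.zeros(n), 0.0

    def f(i):                         # current decision value for sample i
        return (alpha * y) @ K[:, i] + b

    for _ in range(max_sweeps):
        changed = 0
        for i in range(n):
            Ei = f(i) - y[i]
            # first choice: i must violate the KKT conditions
            if not ((y[i] * Ei < -tol and alpha[i] < C) or
                    (y[i] * Ei > tol and alpha[i] > 0)):
                continue
            # second choice: the sample maximizing |Ei - Ej|
            errors = np.array([f(j) - y[j] for j in range(n)])
            gap = np.abs(Ei - errors)
            gap[i] = -1.0
            j = int(np.argmax(gap))
            Ej = errors[j]
            ai, aj = alpha[i], alpha[j]
            if y[i] != y[j]:
                L, H = max(0.0, aj - ai), min(C, C + aj - ai)
            else:
                L, H = max(0.0, ai + aj - C), min(C, ai + aj)
            eta = 2 * K[i, j] - K[i, i] - K[j, j]
            if L == H or eta >= 0:
                continue
            alpha[j] = np.clip(aj - y[j] * (Ei - Ej) / eta, L, H)
            if abs(alpha[j] - aj) < 1e-7:
                continue
            alpha[i] = ai + y[i] * y[j] * (aj - alpha[j])
            # update the bias so an unbounded multiplier sits on the margin
            b1 = b - Ei - y[i] * (alpha[i] - ai) * K[i, i] - y[j] * (alpha[j] - aj) * K[i, j]
            b2 = b - Ej - y[i] * (alpha[i] - ai) * K[i, j] - y[j] * (alpha[j] - aj) * K[j, j]
            b = b1 if 0 < alpha[i] < C else (b2 if 0 < alpha[j] < C else 0.5 * (b1 + b2))
            changed += 1
        if changed == 0:
            break
    w = (alpha * y) @ X               # primal weights (linear kernel only)
    return w, b, alpha

# toy linearly separable data, two samples per class
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w, b, alpha = smo_train(X, y)
```

Each pair update preserves the dual equality constraint, which is why SMO never needs to invert the kernel matrix.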
262

Optimization of identification of particle impacts using acoustic emission

Hedayetullah, Amin Mohammad January 2018 (has links)
Airborne or liquid-borne solid particle transport is a common phenomenon in many industrial applications. Solid particles transported under severe operating conditions, such as high flow velocity, threaten structural integrity through the wear caused by particle impacts on the structure. Previous research on applying acoustic emission (AE) to particle impact monitoring focused primarily on dry particle impacts on a dry target plate and/or wet particle impacts on a wet or dry target plate. For dry impacts on a dry plate, AE event energies calculated from recorded free-falling or airborne particle impact signals were correlated with particle size, concentration, drop height, and target material and thickness. For a given system, once calibrated for a specific particle type and operating condition, this technique may be sufficient. However, if more than one particle type is present, particularly with similar size, density and impact velocity, the calculated AE event energy is not unique to a specific particle type. For wet impacts on a dry or wet plate (either submerged or in a flow loop), AE event energy was related to particle size, concentration, target material, impact velocity and the angle between the nozzle and the plate. In those studies, the experimental arrangements and operating conditions either prevented bubble formation altogether, or any bubble events were at least an order of magnitude lower in amplitude than the sand particle impacts and therefore easy to identify. In reality, bubble formation can be comparable to particle impacts in AE amplitude in the process industries, for example during sand production in oil and gas transport from a reservoir. Current practice is to calibrate an installed AE monitoring system against a range of sand-free flow conditions.
In real-time monitoring of a specific calibrated flow, the flow-generated AE amplitude/energy is subtracted from the recorded AE amplitude/energy and the difference is attributed to sand particle impacts. However, if the flow condition changes, as it often does in the process industry, the calibration is no longer valid, and AE events from bubbles can be misinterpreted as sand particle impacts and vice versa. In this research, sand particles and glass beads with similar size, density and impact velocity were studied, dropped from 200 mm onto a small cylindrical stepped mild steel coupon serving as the target plate. Two identical broadband AE sensors recorded the signals, one at the centre and one 30 mm off-centre, on the side opposite the impacted surface. Signal analysis was carried out by evaluating seven standard AE parameters (amplitude, energy, rise time, duration, power spectral density (PSD), peak frequency of the PSD and spectral centroid) in the time and frequency domains, and time-frequency analysis was performed using the Gabor wavelet transform. Signal interpretation is made difficult by reflections, dispersion and mode conversions caused by the close proximity of the boundaries, so a new signal analysis parameter, the frequency band energy ratio, was proposed. Based on the coefficient of variation (Cv) of the frequency band AE energy ratios, this technique can distinguish between populations of two very similar groups (in size, mass and energy) of sand particles and glass beads impacting mild steel. To facilitate individual particle impact identification, further analysis used a support vector machine (SVM) classification algorithm on the seven standard AE parameters, evaluated in both the time and frequency domains. The available data set was segmented into a training set (80%) and a test set (20%).
The developed model was applied to the test data to evaluate its performance. The overall success rate of individually identifying each category (pencil lead break (PLB), glass bead and sand particle impacts) was 86% at S1 and 92% at S2. To study wet particle impacts on a wet target surface in the presence of bubbles, the target plate was sealed to a cylindrical perspex tube. Single and multiple sand particles were introduced with a constant-speed blower to impact the target surface under water loading, and the two sensor locations from the previous experiments were monitored. Frequency domain analysis showed that the characteristic frequencies of particle impacts are centred at 300-350 kHz and of bubble formation at 135-150 kHz. On this basis, two frequency bands, 100-200 kHz (E1) and 300-400 kHz (E3), and the frequency band energy ratio E3/E1 were identified as optimal for identifying particle impacts in the given system: E3/E1 > 1 was associated with particle impacts and E3/E1 < 1 with bubble formation. Applying these frequency band energy ratios and setting an amplitude threshold, an automatic event identification technique was developed for identifying sand particle impacts in the presence of bubbles. The optimal amplitude threshold is sensitive to the number of particles and the noise level: a high threshold of, say, 10% clearly identifies sand particle impacts but, in multi-particle tests, is likely to miss about 20% of the lower-impact-energy particles, whereas a threshold below 3% is likely to detect AE events with poor frequency content and misclassify the weakest events.
Optimal setting of the framework's parameters, such as the thresholds, frequency bands and AE energy ratios, should make it possible to identify sand particle impacts in the laboratory environment to within 10%. Once the optimal frequency bands and ratios have been identified, a further advantage of this technique is that no calibration of the signal levels is required.
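The frequency band energy ratio at the heart of the method can be sketched directly from a recorded event's power spectrum. The 2 MHz sampling rate and the synthetic sinusoidal "events" below are assumptions for illustration, not the thesis's experimental data; the 100-200 kHz and 300-400 kHz bands follow the E1/E3 definitions above.

```python
import numpy as np

def band_energy_ratio(signal, fs, band_num=(300e3, 400e3), band_den=(100e3, 200e3)):
    """E3/E1-style ratio: AE energy in band_num divided by AE energy
    in band_den, computed from the FFT power spectrum of one event."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)

    def band_energy(lo, hi):
        return power[(freqs >= lo) & (freqs < hi)].sum()

    return band_energy(*band_num) / band_energy(*band_den)

fs = 2_000_000                                 # assumed 2 MHz sampling rate
t = np.arange(4096) / fs
impact_like = np.sin(2 * np.pi * 325e3 * t)    # energy near 300-350 kHz
bubble_like = np.sin(2 * np.pi * 140e3 * t)    # energy near 135-150 kHz
```

Following the thesis's rule, a ratio above 1 would flag a particle impact and a ratio below 1 a bubble formation.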
264

A influência do contexto de discurso na segmentação automática das fases do gesto com aprendizado de máquina supervisionado / The influence of the speech context on the automatic segmentation of the phases of the gesture with supervised machine learning

Jallysson Miranda Rocha 27 April 2018 (has links)
Gestures are actions that form part of human communication. They commonly occur together with speech and can manifest either as an intentional act, such as using the hands to explain the shape of an object, or as a behavioural pattern, such as scratching the head or adjusting one's glasses. Gestures help the speaker construct the speech and help the listener understand the message being conveyed. Researchers from several areas are interested in how gestures relate to other elements of the linguistic system, whether to support studies in linguistics and psycholinguistics or to improve human-machine interaction. Among the different lines of study exploring this subject is the one that analyses gestures in terms of phases: preparation, pre-stroke hold, stroke, post-stroke hold, hold and retraction. Systems capable of automatically segmenting a gesture into its phases are therefore useful, and supervised machine learning techniques have already been applied to this problem with promising results. However, the analysis of gesture phases has an inherent difficulty: it is sensitive to changes in the context in which the gestures are performed. Although there are basic premises defining the manifestation pattern of each gesture phase, in different contexts these premises can vary, driving the automatic analysis to a high level of complexity. This is the problem addressed in this work, which studied, with the support of machine learning, the variability of the pattern inherent to each gesture phase when the phases are produced by the same individual but in different contexts of speech production. The speech contexts considered in this study are: storytelling, improvisation, scene description, interviews and lectures.
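Once a frame-level classifier has labelled each video frame with a phase, turning those labels into segments is a simple grouping step. A minimal sketch of that post-processing, with hypothetical frame labels drawn from the phase names above (the thesis's actual classifiers and features are not reproduced):

```python
def segment_phases(frame_labels):
    """Group consecutive identical frame labels into
    (phase, start_frame, end_frame) segments, end exclusive."""
    segments = []
    start = 0
    for i in range(1, len(frame_labels) + 1):
        # close the current run at the end, or when the label changes
        if i == len(frame_labels) or frame_labels[i] != frame_labels[start]:
            segments.append((frame_labels[start], start, i))
            start = i
    return segments

labels = ["rest", "preparation", "preparation", "stroke", "stroke",
          "post-stroke hold", "retraction", "retraction", "rest"]
segments = segment_phases(labels)
```

Each tuple gives one phase occurrence with its frame span, the unit on which segmentation accuracy is typically evaluated.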
265

Metodologia computacional para detecção automática de estrabismo em imagens digitais através do Teste de Hirschberg / A computational methodology for automatic detection of strabismus in digital images through Hirschberg's test

ALMEIDA, João Dallyson Sousa de 12 February 2010 (has links)
Strabismus is a pathology that affects about 4% of the population, causing aesthetic problems, reversible at any age, and irreversible sensory alterations that modify the vision mechanism. Hirschberg's test is one of the exams available to detect this pathology. Computer-aided detection and diagnosis systems have been used with relative success to support health professionals. Nevertheless, the increasing use of high-technology resources to support diagnosis and therapy in ophthalmology has not yet reached the strabismus subspecialty. This work therefore presents a methodology for automatically detecting strabismus in digital images through Hirschberg's test. The methodology is organized in four stages: locating the region of the eyes; precisely locating the eyes; locating the limbus and the brightness (corneal reflex); and identifying strabismus. It achieves 100% sensitivity, 91.3% specificity and 94% accuracy in identifying strabismus, demonstrating the effectiveness of the geostatistical functions used to describe eye texture and of the eye-alignment calculations in digital images acquired with Hirschberg's test.
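Hirschberg's test infers ocular deviation from the displacement of the corneal light reflex relative to the limbus centre; clinically, about 1 mm of displacement corresponds to roughly 22 prism dioptres (the Hirschberg ratio). A sketch of that final alignment calculation follows; the pixel scale and coordinates are made-up values, and the thesis's geostatistical texture stage is not reproduced.

```python
import math

HIRSCHBERG_RATIO_PD_PER_MM = 22.0   # approximate clinical value

def deviation_prism_diopters(limbus_center, reflex_center, mm_per_pixel):
    """Deviation magnitude from the offset between the detected limbus
    centre and the corneal light reflex, both in pixel coordinates."""
    dx = (reflex_center[0] - limbus_center[0]) * mm_per_pixel
    dy = (reflex_center[1] - limbus_center[1]) * mm_per_pixel
    return math.hypot(dx, dy) * HIRSCHBERG_RATIO_PD_PER_MM

# hypothetical detections: reflex displaced 20 px at 0.05 mm/px = 1 mm
deviation = deviation_prism_diopters((120, 80), (140, 80), mm_per_pixel=0.05)
```

Comparing the deviations of the two eyes (near zero for the fixating eye, large for the deviating one) is what allows strabismus to be flagged.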
266

Classificação de tecidos da mama em massa e não-massa usando índice de diversidade taxonômico e máquina de vetores de suporte / Classification of breast tissues as mass and non-mass using the taxonomic diversity index and support vector machine

OLIVEIRA, Fernando Soares Sérvulo de 20 February 2013 (has links)
Breast cancer is the second most common type of cancer in the world and is difficult to diagnose. Computer-aided detection and diagnosis systems have been used to assist health professionals by indicating suspicious regions that are hard for the human eye to perceive, thereby helping in the detection and diagnosis of cancer. This work proposes a methodology for discriminating and classifying regions extracted from the breast as mass or non-mass. The Digital Database for Screening Mammography (DDSM) is used to acquire the mammograms from which the mass and non-mass regions are extracted. The taxonomic diversity index (∆) and the taxonomic distinctness index (∆*), originally applied in ecology, are used to describe the texture of the regions of interest. Both indices are computed from phylogenetic trees, applied here to describe patterns in breast-image regions with two region-delimiting approaches for texture analysis: circles with rings, and internal with external masks. A support vector machine (SVM) classifies each region as mass or non-mass. The methodology achieves promising results, reaching an average accuracy of 99.67%.
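The two ecological indices have closed-form definitions that can be sketched directly. In this sketch the "species" abundances stand in for grey-level classes inside a region, and ω is a small hypothetical pairwise taxonomic-distance matrix; the thesis's phylogenetic-tree construction over image data is not reproduced.

```python
def taxonomic_indices(abundances, omega):
    """Taxonomic diversity (Delta) and taxonomic distinctness (Delta*):
    Delta  = sum_{i<j} w_ij x_i x_j / (n (n - 1) / 2)
    Delta* = sum_{i<j} w_ij x_i x_j / sum_{i<j} x_i x_j
    where x are class abundances, w pairwise distances, n = sum(x)."""
    m = len(abundances)
    n = sum(abundances)
    weighted = sum(omega[i][j] * abundances[i] * abundances[j]
                   for i in range(m) for j in range(i + 1, m))
    pairs = sum(abundances[i] * abundances[j]
                for i in range(m) for j in range(i + 1, m))
    delta = weighted / (n * (n - 1) / 2.0)
    delta_star = weighted / pairs
    return delta, delta_star

# hypothetical 3-class example with distances from a 2-level tree:
# classes 0 and 1 share a parent (distance 1); class 2 is farther (distance 2)
omega = [[0, 1, 2],
         [1, 0, 2],
         [2, 2, 0]]
delta, delta_star = taxonomic_indices([2, 2, 1], omega)
```

∆ weights path lengths by how often each pair of classes co-occurs, while ∆* normalizes away abundance so only the average taxonomic distance between distinct classes remains.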
267

CARACTERIZAÇÃO DE NÓDULOS PULMONARES SOLITÁRIOS UTILIZANDO ÍNDICE DE SIMPSON E MÁQUINA DE VETORES DE SUPORTE. / CHARACTERIZATION OF SOLITARY PULMONARY NODULES USING SIMPSON'S INDEX AND SUPPORT VECTOR MACHINE.

SILVA, Cleriston Araújo da 12 February 2009 (has links)
The diagnosis of lung nodules has long been pursued by researchers as a way to reduce the high worldwide mortality associated with lung cancer. Medical images, such as computerized tomography, have enabled the techniques used to evaluate exams and provide diagnoses to be deepened and improved. This work presents a methodology for diagnosing solitary pulmonary nodules that can support studies in related areas and assist specialists. The methodology was applied to two different image databases. Nodules were represented by geometry and texture features, the latter described through Simpson's index, a statistic used in spatial analysis and in ecology. These features were submitted to a support vector machine (SVM) classifier in two approaches: the traditional approach and a one-class approach. With the traditional SVM, sensitivity was 90%, specificity 96.67% and accuracy 95%; with the one-class SVM, sensitivity, specificity and accuracy were all 89.7%.
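Simpson's index itself is a one-liner, here applied to a hypothetical histogram of grey-level classes inside a nodule region; the thesis's spatial-analysis variant of the statistic is not reproduced.

```python
def simpson_index(counts):
    """Simpson's index D = sum n_i (n_i - 1) / (N (N - 1)):
    the probability that two individuals drawn without replacement
    belong to the same class. D near 1 means a homogeneous region."""
    total = sum(counts)
    return sum(c * (c - 1) for c in counts) / (total * (total - 1))

uniform = simpson_index([10, 10, 10, 10])   # diverse texture
single = simpson_index([40])                # homogeneous texture
```

Used as a texture descriptor, the index summarizes how concentrated the region's intensity distribution is, which is what lets the SVM separate nodule types.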
268

Aplicação de classificadores para determinação de conformidade de biodiesel / Attesting compliance of biodiesel quality using classification methods

LOPES, Marcus Vinicius de Sousa 26 July 2017 (has links)
The growing demand for energy and the limitations of oil reserves have driven the search for renewable and sustainable energy sources to replace, at least partially, fossil fuels. In recent decades biodiesel has become the main alternative to petroleum diesel. Its quality is assessed by parameters and specifications that vary by country or region, for example in Europe (EN 14214), the US (ASTM D6751) and Brazil (RANP 45/2014). Some of these parameters, such as viscosity, density, oxidative stability and iodine value, are intrinsically related to the fatty acid methyl ester (FAME) composition of the biodiesel, which makes it possible to relate their behaviour to carbon-chain length and the presence of unsaturation in the molecules. In this work, four direct classification methods (support vector machine, k-nearest neighbours, decision tree and artificial neural networks) were optimized and compared for classifying biodiesel samples according to their compliance with the viscosity, density, oxidative stability and iodine-value specifications, taking the FAME composition as input. Classification was carried out under the specifications of EN 14214, ASTM D6751 and RANP 45/2014. A comparison between these direct classification methods and empirical equations (indirect classification) favoured the direct methods, especially when the samples have property values very close to the specification limits. The k-NN and decision tree classifiers proved the best options for these properties, showing that such classifiers can be applied in practice to save time and financial and human resources.
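Of the four classifiers compared, k-NN is simple enough to sketch from scratch. The two-feature "FAME composition" vectors and compliance labels below are fabricated for illustration; the real inputs are full methyl-ester profiles and laboratory-measured compliance.

```python
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training samples
    (squared Euclidean distance in feature space)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(tx, x)), label)
        for tx, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# fabricated (saturated %, polyunsaturated %) -> viscosity compliance
train_X = [(80, 5), (75, 10), (70, 12), (30, 55), (25, 60), (20, 65)]
train_y = ["conform", "conform", "conform",
           "non-conform", "non-conform", "non-conform"]
pred = knn_predict(train_X, train_y, (78, 8))
```

Direct classification of this kind replaces the two-step route of first predicting the property value from composition and then thresholding it against the standard.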
269

Image analysis and representation for textile design classification

Jia, Wei January 2011 (has links)
A good image representation is vital for image comparison and classification; it may affect classification accuracy and efficiency. The purpose of this thesis was to explore novel and appropriate image representations. Another aim was to investigate these representations for image classification. Finally, novel features were examined for improving image classification accuracy. The images of interest to this thesis were textile design images. The motivation for analysing textile design images is to help designers browse images, fuel their creativity, and improve their design efficiency. In recent years, the bag-of-words model has been shown to be a good basis for image representation, and there have been many attempts to go beyond it. Bag-of-words models have been used frequently in the classification of image data, owing to their good performance and simplicity. “Words” in images can have different definitions and are obtained through the steps of feature detection, feature description, and codeword calculation. The model represents an image as an orderless collection of local features. However, discarding the spatial relationships of local features limits the power of this model. This thesis developed novel image representations, the bag of shapes and region label graphs models, both based on the bag-of-words model. In both models, an image was represented by a collection of segmented regions, and each region was described by shape descriptors. In the latter model, graphs were constructed to capture the spatial information between groups of segmented regions, and graph features were calculated using graph theory. Novel elements include the use of MRFs to extract printed designs and woven patterns from textile images, the utilisation of the extractions to form bag of shapes models, and the construction of region label graphs to capture spatial information. The extraction of textile designs was formulated as a pixel labelling problem.
Algorithms for MRF optimisation and re-estimation were described and evaluated. A method for quantitative evaluation was presented and used to compare the performance of MRFs optimised using alpha-expansion and iterated conditional modes (ICM), both with and without parameter re-estimation. The results were used in the formation of the bag of shapes and region label graphs models. The bag of shapes model was a collection of MRF-segmented regions, and the shape of each region was described with generic Fourier descriptors. Each image was represented as a bag of shapes. A simple yet competitive classification scheme based on nearest neighbour class-based matching was used. Classification performance was compared to that obtained when using bags of SIFT features. To capture spatial information, region label graphs were constructed to obtain graph features. Regions with the same label were treated as a group, and each group was associated uniquely with a vertex in an undirected, weighted graph. Each region group was represented as a bag of shape descriptors. Edges in the graph denoted either the extent to which the groups' regions were spatially adjacent or the dissimilarity of their respective bags of shapes. A series of unweighted graphs was obtained by removing edges in order of weight. Finally, an image was represented using its shape descriptors along with features derived from the chromatic numbers or domination numbers of the unweighted graphs and their complements. Linear SVM classifiers were used for classification. Experiments were conducted on data from Liberty Art Fabrics, which consisted of more than 10,000 complex images, mainly of printed textile designs and woven patterns. The experimental data were classified manually into seven classes by assigning each image a text descriptor based on content or design type. The seven classes were floral, paisley, stripe, leaf, geometric, spot, and check.
The results showed that reasonable and interesting regions were obtained from MRF segmentation, in which alpha-expansion with parameter re-estimation performed better than alpha-expansion without parameter re-estimation or ICM. This result was promising not only for textile CAD (Computer-Aided Design), for redesigning textile images, but also for image representation. It was also found that the bag of shapes model based on MRF segmentation achieved classification accuracy comparable to bags of SIFT features in the framework of nearest neighbour class-based matching. Finally, the results indicated that incorporating graph features extracted by constructing region label graphs improved classification accuracy compared to both the bag of shapes and bag of SIFT models.
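The graph-feature pipeline sketched in this abstract (remove edges in order of weight, then take colouring-based features of each resulting unweighted graph) can be illustrated in plain Python. The region groups, weights and tiny graph below are invented for illustration, and greedy colouring yields only an upper bound on the chromatic number, not its exact value:

```python
def greedy_colouring(vertices, edges):
    """Greedy colouring; the number of colours used is an
    upper bound on the graph's chromatic number."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    colour = {}
    for v in vertices:
        used = {colour[u] for u in adj[v] if u in colour}
        c = 0
        while c in used:
            c += 1
        colour[v] = c
    return max(colour.values()) + 1 if colour else 0

def graph_series(weighted_edges):
    """Yield the edge sets obtained by removing edges in order
    of decreasing weight, as described in the abstract."""
    edges = sorted(weighted_edges, key=lambda e: e[2], reverse=True)
    for i in range(len(edges) + 1):
        yield [(u, v) for u, v, _ in edges[i:]]

# Toy example: four region groups with invented adjacency weights.
V = ['floral', 'leaf', 'stripe', 'spot']
E = [('floral', 'leaf', 0.9), ('floral', 'stripe', 0.4),
     ('leaf', 'spot', 0.7), ('stripe', 'spot', 0.2)]

features = [greedy_colouring(V, g) for g in graph_series(E)]
print(features)  # → [2, 3, 2, 2, 1]
```

The resulting vector of colour counts (one per graph in the series) is the kind of feature that could be concatenated with shape descriptors and fed to a linear SVM, as the abstract describes.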
270

Feature Extraction and Dimensionality Reduction in Pattern Recognition and Their Application in Speech Recognition

Wang, Xuechuan, n/a January 2003 (has links)
Conventional pattern recognition systems have two components: feature analysis and pattern classification. Feature analysis is achieved in two steps: a parameter extraction step and a feature extraction step. In the parameter extraction step, information relevant to pattern classification is extracted from the input data in the form of a parameter vector. In the feature extraction step, the parameter vector is transformed into a feature vector. Feature extraction can be conducted independently or jointly with either parameter extraction or classification. Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) are two popular independent feature extraction algorithms. Both extract features by projecting the parameter vectors into a new feature space through a linear transformation matrix, but they optimize the transformation matrix with different intentions. PCA optimizes the transformation matrix by finding the largest variations in the original feature space. LDA pursues the largest ratio of between-class variation to within-class variation when projecting the original feature space onto a subspace. The drawback of independent feature extraction algorithms is that their optimization criteria differ from the classifier's minimum classification error criterion, which may cause inconsistency between the feature extraction and classification stages of a pattern recognizer and, consequently, degrade classifier performance. A direct way to overcome this problem is to conduct feature extraction and classification jointly with a consistent criterion. The Minimum Classification Error (MCE) training algorithm provides such an integrated framework. The MCE algorithm was first proposed for optimizing classifiers; it is a type of discriminative learning algorithm that achieves minimum classification error directly. The flexibility of the MCE framework makes it convenient for conducting feature extraction and classification jointly.
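The differing optimization criteria of PCA and LDA described above can be illustrated by scanning candidate projection directions in 2-D: PCA keeps the direction of largest total variance, while LDA keeps the direction with the largest ratio of between-class to within-class variation. The toy data below are invented for illustration (wide spread along x, class separation along y), so the two criteria pick nearly orthogonal directions:

```python
import math

def project(X, theta):
    """Project 2-D points onto the unit direction at angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [x * c + y * s for x, y in X]

def variance(z):
    m = sum(z) / len(z)
    return sum((v - m) ** 2 for v in z) / len(z)

def fisher_ratio(X1, X2, theta):
    """LDA-style criterion: between-class over within-class variation."""
    z1, z2 = project(X1, theta), project(X2, theta)
    m1, m2 = sum(z1) / len(z1), sum(z2) / len(z2)
    return (m1 - m2) ** 2 / (variance(z1) + variance(z2))

# Invented data: both classes spread widely along x; classes separated along y.
X1 = [(0.0, -0.1), (2.0, 0.1), (4.0, 0.1), (6.0, -0.1)]
X2 = [(0.0, 0.9), (2.0, 1.1), (4.0, 1.1), (6.0, 0.9)]

angles = [k * math.pi / 180 for k in range(180)]
pca_dir = max(angles, key=lambda t: variance(project(X1 + X2, t)))  # largest variance
lda_dir = max(angles, key=lambda t: fisher_ratio(X1, X2, t))        # best class separation

print(round(math.degrees(pca_dir)), round(math.degrees(lda_dir)))  # → 0 90
```

PCA follows the uninformative x-axis spread while LDA finds the y-axis direction that actually separates the classes, which is precisely the inconsistency with classification criteria that the abstract describes.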
Conventional feature extraction and pattern classification algorithms (LDA, PCA, the MCE training algorithm, the minimum distance classifier, the likelihood classifier and the Bayesian classifier) are linear algorithms. The advantage of linear algorithms is their simplicity and their ability to reduce feature dimensionality. However, they are limited in that the decision boundaries they generate are linear and have little computational flexibility. SVM is a more recently developed, integrated pattern classification algorithm with a non-linear formulation. It is based on the idea that classification functions that afford dot products can be computed efficiently in higher dimensional feature spaces: classes that are not linearly separable in the original parametric space may become linearly separable in the higher dimensional feature space. Because of this, SVM has the advantage that it can handle classes with complex non-linear decision boundaries. However, SVM is a highly integrated and closed pattern classification system, and it is very difficult to incorporate feature extraction into its framework; thus SVM is unable to conduct feature extraction tasks. This thesis investigates LDA and PCA for feature extraction and dimensionality reduction, and proposes the application of MCE training algorithms to joint feature extraction and classification tasks. A generalized MCE (GMCE) training algorithm is proposed to mend the shortcomings of the MCE training algorithm in joint feature extraction and classification tasks. SVM, as a non-linear pattern classification system, is also investigated in this thesis. A reduced-dimensional SVM (RDSVM) is proposed to enable SVM to conduct feature extraction and classification jointly. All of the investigated and proposed algorithms are first tested and compared on a number of small databases, such as the Deterding Vowels database, Fisher's IRIS database and the German GLASS database. They are then tested in a large-scale speech recognition experiment based on the TIMIT database.
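The kernel idea underlying SVM (dot products computed efficiently in a higher dimensional feature space, without ever mapping the points explicitly) can be checked numerically with a degree-2 polynomial kernel, whose explicit feature map is known in closed form. This is an illustrative sketch of the kernel trick, not the thesis's SVM:

```python
import math

def poly_kernel(x, y):
    """Degree-2 polynomial kernel: k(x, y) = (x . y)^2."""
    return sum(a * b for a, b in zip(x, y)) ** 2

def phi(x):
    """Explicit feature map for this kernel in 2-D:
    phi(x) = (x1^2, x2^2, sqrt(2) * x1 * x2)."""
    return (x[0] ** 2, x[1] ** 2, math.sqrt(2) * x[0] * x[1])

x, y = (1.0, 2.0), (3.0, 0.5)
implicit = poly_kernel(x, y)                           # no explicit mapping needed
explicit = sum(a * b for a, b in zip(phi(x), phi(y)))  # dot product in the 3-D space

print(implicit, abs(implicit - explicit) < 1e-9)  # → 16.0 True
```

The kernel evaluation costs one 2-D dot product, yet agrees exactly with the dot product in the 3-D feature space, which is why classes that are not linearly separable in the original space can be separated there at little extra cost.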
