291 |
Construction of a microcomputer system for signal separation using the ICA algorithm (Κατασκευή μικροϋπολογιστικού συστήματος διαχωρισμού σημάτων με τον αλγόριθμο ICA). Χονδρός, Παναγιώτης. 13 October 2013.
This thesis deals with the construction of a microcomputer system for signal separation. The separation of the signals is based on the theory of the Independent Component Analysis (ICA) technique. After the theory of the technique is presented, the ADuC 7026 microcontroller chosen for the implementation is introduced. The simulation software for the microcontroller is then presented, together with basic examples of its programming. Finally, three algorithms (two versions of FastICA and one version of InfoMax) are developed without the use of peripherals and simulated with the use of peripherals. These algorithms are evaluated with respect to their performance and conclusions are drawn.
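As a hedged illustration of the separation technique the abstract describes, the following is a minimal one-unit FastICA sketch on a synthetic two-channel mixture. The mixing coefficients, signal length, and kurtosis contrast are illustrative assumptions, not the thesis's embedded implementation:

```python
import math
import random

def fastica_one_unit(x1, x2, iters=50):
    """One-unit FastICA with the kurtosis contrast g(u) = u**3 on two mixed channels."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    x1 = [v - m1 for v in x1]                       # centre the observations
    x2 = [v - m2 for v in x2]
    # Whiten via the analytic eigendecomposition of the 2x2 covariance matrix.
    c11 = sum(v * v for v in x1) / n
    c22 = sum(v * v for v in x2) / n
    c12 = sum(a * b for a, b in zip(x1, x2)) / n    # assumed nonzero (generic mixing)
    tr, det = c11 + c22, c11 * c22 - c12 * c12
    gap = math.sqrt(max(tr * tr / 4.0 - det, 0.0))
    z = []
    for lam in (tr / 2.0 + gap, tr / 2.0 - gap):
        ex, ey = c12, lam - c11                     # eigenvector for eigenvalue lam
        norm = math.hypot(ex, ey)
        ux, uy = ex / norm, ey / norm
        s = 1.0 / math.sqrt(lam)
        z.append([s * (ux * a + uy * b) for a, b in zip(x1, x2)])
    # Fixed-point iteration: w <- E[z (w.z)^3] - 3w, then renormalise.
    w = (1.0, 0.0)
    for _ in range(iters):
        y = [w[0] * a + w[1] * b for a, b in zip(z[0], z[1])]
        wx = sum(a * v ** 3 for a, v in zip(z[0], y)) / n - 3.0 * w[0]
        wy = sum(b * v ** 3 for b, v in zip(z[1], y)) / n - 3.0 * w[1]
        norm = math.hypot(wx, wy)
        w = (wx / norm, wy / norm)
    return [w[0] * a + w[1] * b for a, b in zip(z[0], z[1])]

def corr(a, b):
    """Pearson correlation, used here only to check the recovered component."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    na = math.sqrt(sum((x - ma) ** 2 for x in a))
    nb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (na * nb)

# Demo: mix two independent (sub-Gaussian) sources and recover one of them.
random.seed(0)
s1 = [random.uniform(-1.0, 1.0) for _ in range(4000)]
s2 = [random.uniform(-1.0, 1.0) for _ in range(4000)]
mix1 = [a + 0.5 * b for a, b in zip(s1, s2)]
mix2 = [0.7 * a + b for a, b in zip(s1, s2)]
y = fastica_one_unit(mix1, mix2)
print(max(abs(corr(y, s1)), abs(corr(y, s2))))      # close to 1 when separation works
```

On a microcontroller target such as the ADuC 7026 the same iteration would be coded in C, but the numerical structure (centering, whitening, fixed-point update) is the one shown.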
292 |
Single-trial classification of an EEG-based brain-computer interface using the wavelet packet decomposition and cepstral analysis. Lodder, Shaun. 2009.
Thesis (MScEng (Electrical and Electronic Engineering)), University of Stellenbosch, 2009.
A Brain-Computer Interface (BCI) monitors brain activity using signals such as EEG, ECoG, and MEG, and attempts to bridge the gap between thoughts and actions by providing control of physical devices ranging from wheelchairs to computers. A crucial process in a BCI system is feature extraction, which helps to distinguish between different mental states, and many studies have been undertaken to find relevant information in a set of input signals.
This thesis investigated feature extraction from EEG signals using two different approaches. Wavelet packet decomposition was used to extract information from the signals in the frequency domain, and cepstral analysis was used to search for relevant information in the cepstral domain. A BCI was implemented to evaluate the two approaches, and three classification techniques were used to assess the effectiveness of each feature type.
Data containing two-class motor imagery was used for testing, and the BCI was compared with some of the other systems currently available. The results indicate that both approaches were effective in producing separable features and, with further work, can be used to classify trials in a paradigm that exploits motor imagery as a means of control.
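The frequency-domain features described above can be illustrated with the simplest wavelet packet filter pair. This is a sketch under assumed choices (Haar filters, two decomposition levels, per-band energies as features); the thesis's actual wavelet and depth may differ:

```python
import math

def haar_step(x):
    """One orthonormal Haar analysis step: half-band approximation and detail."""
    s = 1.0 / math.sqrt(2.0)
    approx = [s * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    detail = [s * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return approx, detail

def wavelet_packet(x, levels):
    """Full wavelet packet tree: every node is split, not only approximations."""
    nodes = [x]
    for _ in range(levels):
        nodes = [half for node in nodes for half in haar_step(node)]
    return nodes                      # 2**levels sub-bands

def band_energies(bands):
    """Per-sub-band energy, a common feature vector for classification."""
    return [sum(v * v for v in band) for band in bands]

# Demo: a two-tone signal decomposed into four sub-bands.
x = [math.sin(2 * math.pi * 7 * t / 64) + 0.5 * math.sin(2 * math.pi * 29 * t / 64)
     for t in range(64)]
bands = wavelet_packet(x, 2)
feats = band_energies(bands)
# The orthonormal Haar transform preserves total energy (Parseval), a handy check.
print(abs(sum(feats) - sum(v * v for v in x)) < 1e-9)   # True
```

The resulting energy vector is what a classifier would consume, one feature per frequency sub-band.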
293 |
Arabic text recognition of printed manuscripts: efficient recognition of off-line printed Arabic text using Hidden Markov Models, Bigram Statistical Language Model, and post-processing. Al-Muhtaseb, Husni Abdulghani. January 2010.
Arabic text recognition has not been researched as thoroughly as that of other natural languages, yet the need for automatic Arabic text recognition is clear. In addition to traditional applications such as postal address reading, cheque verification in banks, and office automation, there is large interest in searching scanned documents available on the internet and in searching handwritten manuscripts. Other possible applications include building digital libraries, recognizing text on digitized maps, recognizing vehicle license plates, serving as the first phase of text readers for visually impaired people, and understanding filled forms. This research work aims to contribute to current research in the field of optical character recognition (OCR) of printed Arabic text by developing novel techniques and schemes to advance the performance of state-of-the-art Arabic OCR systems. Statistical and analytical analysis of Arabic text was carried out to estimate the probabilities of occurrence of Arabic characters for use with hidden Markov models (HMMs) and other techniques. Since there is no publicly available dataset of printed Arabic text for recognition purposes, it was decided to create one. In addition, a minimal Arabic script is proposed. The proposed script contains all basic shapes of Arabic letters and provides an efficient representation of Arabic text in terms of effort and time. Based on the success of HMMs in speech and text recognition, their use for the automatic recognition of Arabic text was investigated. The HMM technique adapts to noise and font variations and does not require word or character segmentation of Arabic line images. In the feature extraction phase, experiments were conducted with a number of different features to investigate their suitability for HMMs. Finally, a novel set of features, which resulted in high recognition rates for different fonts, was selected.
The developed techniques do not need word or character segmentation before the classification phase, as segmentation is a byproduct of recognition. This appears to be the most advantageous feature of using HMMs for Arabic text, since segmentation tends to produce errors which are usually propagated to the classification phase. Eight different Arabic fonts were used in the classification phase. The recognition rates ranged from 98% to 99.9%, depending on the font. As far as we know, these are new results in their context. Moreover, the proposed technique could be used for other languages: a proof-of-concept experiment on English characters achieved a recognition rate of 98.9% using the same HMM setup, and the same techniques applied to Bangla characters achieved a recognition rate above 95%. The recognition of printed Arabic text with multiple fonts was also conducted using the same technique, with fonts categorized into different groups, and new high recognition results were achieved. To enhance the recognition rate further, a post-processing module was developed to correct the OCR output through character-level and word-level post-processing. The use of this module increased the recognition accuracy by more than 1%.
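The HMM classification at the core of such a recognizer can be illustrated with the standard Viterbi decoder. The two states, the observation symbols, and all probabilities below are toy assumptions, not the thesis's character models:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most probable hidden-state path for an observation sequence (max-product)."""
    # V[t][s] = (probability of the best path ending in s at time t, predecessor).
    V = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for t in range(1, len(obs)):
        row = {}
        for s in states:
            prev = max(states, key=lambda p: V[t - 1][p][0] * trans_p[p][s])
            row[s] = (V[t - 1][prev][0] * trans_p[prev][s] * emit_p[s][obs[t]], prev)
        V.append(row)
    # Backtrack from the best final state.
    last = max(states, key=lambda s: V[-1][s][0])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(V[t][path[-1]][1])
    return path[::-1]

# Hypothetical two-state model over quantized column features of a text line.
states = ("A", "B")
start = {"A": 0.6, "B": 0.4}
trans = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit = {"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.2, "y": 0.8}}
print(viterbi(("x", "x", "y"), states, start, trans, emit))   # ['A', 'A', 'B']
```

Because decoding recovers the state sequence for a whole line image at once, segmentation falls out of recognition, which is the property the abstract highlights.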
294 |
A cell level automated approach for quantifying antibody staining in immunohistochemistry images: a structural approach for quantifying antibody staining in colonic cancer spheroid images by integrating image processing and machine learning towards the implementation of computer aided scoring of cancer markers. Khorshed, Reema A. A. January 2013.
Immunohistochemically (IHC) stained images play a fundamental role in the pathologist's diagnosis and monitoring of cancer development. The manual process of assessing such images is a subjective, time-consuming process that typically relies on the visual ability and experience level of the pathologist. A novel and comprehensive system for the automated quantification of antibody inside stained cell nuclei in immunohistochemistry images is proposed and demonstrated in this research. The system is based on a cell-level approach, where each nucleus is individually analyzed to observe the effects of protein antibodies inside the nuclei. The system provides three main quantitative descriptions of stained nuclei. The first quantitative measurement automatically generates the total number of cell nuclei in an image. The second classifies nuclei as positively or negatively stained based on colour, morphological, and textural features extracted directly from each nucleus to provide discriminative characteristics of the different stain classes. The outputs of the first and second measures are used together to calculate the percentage of positive nuclei (PS). The third measure proposes a novel automated method for determining the staining intensity level of positive nuclei, known as the intensity score (IS): fine intensity features are observed and used to classify positive nuclei as low, intermediate, or high stained. Statistical methods were applied throughout the research to validate the system results against ground-truth pathology data. Experimental results demonstrate the effectiveness of the proposed approach and show high accuracy when compared with the ground-truth pathology data.
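The PS and IS measures can be sketched as simple aggregations over per-nucleus stain intensities. The thresholds and intensity values below are illustrative assumptions, not the classifier the thesis trains on colour, morphological, and textural features:

```python
def score_nuclei(stain_intensities, positive_thresh=0.2, low_thresh=0.4, high_thresh=0.7):
    """Count nuclei, compute the percentage of positives (PS), and bin positive
    nuclei into low / intermediate / high staining levels (an IS histogram)."""
    total = len(stain_intensities)
    positives = [v for v in stain_intensities if v >= positive_thresh]
    ps = 100.0 * len(positives) / total if total else 0.0
    histogram = {"low": 0, "intermediate": 0, "high": 0}
    for v in positives:
        if v >= high_thresh:
            histogram["high"] += 1
        elif v >= low_thresh:
            histogram["intermediate"] += 1
        else:
            histogram["low"] += 1
    return total, ps, histogram

# Hypothetical per-nucleus mean stain intensities from a segmented image.
intensities = [0.05, 0.10, 0.25, 0.45, 0.55, 0.75, 0.90, 0.15]
total, ps, hist = score_nuclei(intensities)
print(total, ps, hist)   # 8 62.5 {'low': 1, 'intermediate': 2, 'high': 2}
```

In the full system the per-nucleus values would come from the segmentation and feature-extraction stages rather than a fixed list.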
295 |
Efficient FPGA Architectures for Separable Filters and Logarithmic Multipliers and Automation of Fish Feature Extraction Using Gabor Filters. Joginipelly, Arjun Kumar. 13 August 2014.
Convolution and multiplication operations in the filtering process can be optimized by minimizing the resource utilization using Field Programmable Gate Arrays (FPGA) and separable filter kernels. An FPGA architecture for separable convolution is proposed to achieve reduction of on-chip resource utilization and external memory bandwidth for a given processing rate of the convolution unit.
Multiplication in the integer number system can be optimized in terms of resources, operation time, and power consumption by converting to the logarithmic domain. To achieve this, a method that alters the filter weights is proposed and implemented to reduce the approximation error. The results show significant error reduction compared with existing methods, thereby optimizing multiplication in terms of the metrics mentioned above.
Underwater video and still images are used by many programs within National Oceanic and Atmospheric Administration (NOAA) Fisheries with the objective of identifying, classifying, and quantifying living marine resources. These programs use underwater cameras to obtain video recordings for manual analysis, a process that is labour-intensive, time-consuming, and error-prone. An efficient solution to this problem is proposed which uses Gabor filters for feature extraction. The proposed method is implemented to identify two species of fish, namely Epinephelus morio and Ocyurus chrysurus. The results show a high detection rate with a minimal rate of false alarms.
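The saving exploited by the separable architecture can be checked in software: a kernel that is the outer product of two 1-D vectors gives the same result whether applied as one 2-D pass or as a row pass followed by a column pass, while needing kr + kc instead of kr * kc multiplies per pixel. A sketch, assuming zero padding and a small binomial kernel:

```python
def conv1d(seq, k):
    """1-D filtering with zero padding ('same' output size)."""
    r = len(k) // 2
    return [sum(k[j] * (seq[i + j - r] if 0 <= i + j - r < len(seq) else 0.0)
                for j in range(len(k)))
            for i in range(len(seq))]

def conv2d(img, k2):
    """Direct 2-D filtering with zero padding ('same' output size)."""
    rows, cols = len(img), len(img[0])
    kr, kc = len(k2), len(k2[0])
    rr, rc = kr // 2, kc // 2
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            acc = 0.0
            for i in range(kr):
                for j in range(kc):
                    yy, xx = y + i - rr, x + j - rc
                    if 0 <= yy < rows and 0 <= xx < cols:
                        acc += k2[i][j] * img[yy][xx]
            out[y][x] = acc
    return out

def separable_conv2d(img, kv, kh):
    """Row pass then column pass; equivalent when the 2-D kernel is kv outer kh."""
    rowpass = [conv1d(row, kh) for row in img]
    colpass = [conv1d(list(c), kv) for c in zip(*rowpass)]
    return [list(r) for r in zip(*colpass)]

k = [0.25, 0.5, 0.25]                      # 1-D binomial smoothing kernel
k2 = [[a * b for b in k] for a in k]       # separable 3x3 outer product
img = [[float((x * 7 + y * 3) % 11) for x in range(6)] for y in range(5)]
direct = conv2d(img, k2)
fast = separable_conv2d(img, k, k)
diff = max(abs(a - b) for ra, rb in zip(direct, fast) for a, b in zip(ra, rb))
print(diff < 1e-12)   # True
```

On an FPGA the same factorization is what reduces multiplier count and external memory bandwidth; this sketch only verifies the algebraic equivalence.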
296 |
Detection and characterization of geometric features on the rocky bodies of the solar system (Détection et caractérisation d'attributs géométriques sur les corps rocheux du système solaire). Christoff Vesselinova, Nicole. 19 December 2018.
One of the challenges of planetary science is determining the age of the surfaces of the different celestial bodies in the solar system, in order to understand their formation and evolution processes. One approach relies on analyzing the density and size of impact craters. Due to the huge quantity of data to process, automatic approaches have been proposed for detecting impact craters in order to facilitate this dating process. They generally use the color values from images or the elevation values from digital elevation models (DEMs). In this thesis, we propose a new approach for detecting crater rims. The main idea is to combine curvature analysis with classification based on a neural network. The approach has two main steps: first, each vertex of the mesh is labelled with the value of its minimal curvature; second, this curvature map is fed into a neural network to automatically detect the shapes of interest. The results show that shape detection is more effective when using a two-dimensional map based on the computation of discrete differential estimators than when using the elevation value at each vertex. This approach significantly reduces the number of false negatives compared with previous approaches based on topographic information only. The method is validated on DEMs of Mars, acquired by a laser altimeter aboard NASA's Mars Global Surveyor spacecraft and combined with a database of manually identified craters.
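The first step, labelling each vertex with its minimal curvature, can be sketched on a raster DEM using a finite-difference Hessian. This is a small-slope approximation of the principal curvatures and a stand-in for the discrete differential estimators the thesis uses on meshes:

```python
import math

def min_curvature_map(dem, h=1.0):
    """Label each interior grid vertex with its minimal principal curvature,
    approximated (small-slope assumption) by the smaller eigenvalue of the
    finite-difference Hessian of the elevation."""
    rows, cols = len(dem), len(dem[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            fxx = (dem[y][x + 1] - 2.0 * dem[y][x] + dem[y][x - 1]) / h ** 2
            fyy = (dem[y + 1][x] - 2.0 * dem[y][x] + dem[y - 1][x]) / h ** 2
            fxy = (dem[y + 1][x + 1] - dem[y + 1][x - 1]
                   - dem[y - 1][x + 1] + dem[y - 1][x - 1]) / (4.0 * h ** 2)
            gap = math.sqrt(((fxx - fyy) / 2.0) ** 2 + fxy ** 2)
            out[y][x] = (fxx + fyy) / 2.0 - gap    # smaller Hessian eigenvalue
    return out

# Synthetic bowl-shaped depression: z = x**2 + y**2 around the grid centre.
n = 11
dem = [[float((x - n // 2) ** 2 + (y - n // 2) ** 2) for x in range(n)] for y in range(n)]
kmap = min_curvature_map(dem)
print(kmap[n // 2][n // 2])   # 2.0 (both principal curvatures of this bowl)
```

The resulting curvature map is the two-dimensional input that would then be fed to the neural network classifier.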
297 |
Plankton image classification using multiple segmentations (Classificação de imagens de plâncton usando múltiplas segmentações). Fernandez, Mariela Atausinchi. 27 March 2017.
Plankton are microscopic organisms that form the basis of the food chain in aquatic ecosystems. They play an important role in the carbon cycle, as they are responsible for the absorption of carbon at the ocean surface. Detecting, estimating, and monitoring the distribution of plankton species are important activities for understanding the role of plankton and the consequences of changes in their environment. Part of this type of study is based on the analysis of water volumes by means of imaging techniques. Due to the large quantity of generated images, computational methods to assist the image-analysis process are in demand. In this work we address the problem of species identification. We follow the conventional pipeline consisting of target detection, segmentation (contour delineation), feature extraction, and classification steps. In the first part of this work we address the problem of choosing an appropriate segmentation algorithm. Since evaluating segmentation results is a subjective and time-consuming task, we propose a method to evaluate segmentation algorithms by evaluating the classification results at the end of the pipeline. Experiments with this method showed that distinct segmentation algorithms may be appropriate for identifying species of distinct classes. Therefore, in the second part of this work we propose a classification method that takes multiple segmentations into consideration. Specifically, multiple segmentations are computed and classifiers are trained individually for each segmentation; these are then combined to build the final classifier. Experimental results show that the accuracy obtained with the combined classifier is more than 2% higher than the accuracy obtained with classifiers using a fixed segmentation.
The proposed methods can be useful for building plankton identification systems that are able to adjust quickly to changes in the characteristics of the images.
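The combination of per-segmentation classifiers can be sketched with a simple majority vote. The nearest-centroid classifiers, the one-dimensional features, and the class names below are illustrative stand-ins for the classifiers actually trained:

```python
from collections import Counter

def nearest_centroid_fit(features, labels):
    """Train a nearest-centroid classifier: one mean vector per class."""
    sums, counts = {}, Counter(labels)
    for f, lab in zip(features, labels):
        acc = sums.setdefault(lab, [0.0] * len(f))
        for i, v in enumerate(f):
            acc[i] += v
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}

def nearest_centroid_predict(model, f):
    """Pick the class whose centroid is closest in squared Euclidean distance."""
    return min(model, key=lambda lab: sum((a - b) ** 2 for a, b in zip(model[lab], f)))

def combined_predict(models, per_seg_features):
    """Majority vote over classifiers trained on different segmentations."""
    votes = [nearest_centroid_predict(m, f) for m, f in zip(models, per_seg_features)]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical 1-D features extracted under two different segmentations.
seg_a = ([[0.1], [0.2], [0.9], [1.0]], ["cope", "cope", "diatom", "diatom"])
seg_b = ([[0.3], [0.1], [0.8], [0.7]], ["cope", "cope", "diatom", "diatom"])
models = [nearest_centroid_fit(*seg_a), nearest_centroid_fit(*seg_b)]
print(combined_predict(models, [[0.15], [0.2]]))   # cope
```

Each test image contributes one feature vector per segmentation, and the ensemble decision replaces any single fixed-segmentation classifier.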
298 |
Characterization of power quality transient events using intelligent systems and signal processing (Caracterização de eventos transitórios da qualidade da energia elétrica utilizando sistemas inteligentes e processamento de sinais). Vega García, Valdomiro. 12 December 2012.
Diagnosing events that affect power quality has become a worldwide concern, especially with respect to two important issues: the relative location of the event origin (RLEO) and the automatic cause classification of events (ACCE). The first is related to identifying the event source, i.e. whether it lies upstream or downstream of the power quality meter (PQM). The second can be subdivided into two groups, namely the classification of internal causes and of external causes. Internal causes are related to events produced by power system operation (connection or disconnection of feeders, power transformer inrush, capacitor switching, among others), and external causes are related to events produced by faults external to the power system (network contact with tree branches, animal contact, atmospheric discharges, among others). Both topics, RLEO and ACCE, are considered in this thesis. In order to classify events by internal or external causes, one must first determine whether an actual event has occurred, which requires the RLEO.
This makes use of a segmentation process applied to the voltage and current waveforms. Segmentation identifies the transient and stationary segments within the waveforms and also contributes to feature extraction for the different classification algorithms. Based on the above, this research proposes a methodology for diagnosing power quality events, focused on RLEO and ACCE. Different algorithms for segmentation, feature extraction, and classification were developed in a computational tool implemented in MATLAB®, which also preprocesses the voltage and current signals of a real database made available by a distribution company in São Paulo State. In addition, new RLEO algorithms showed satisfactory results when compared with algorithms published in the scientific literature. For the internal causes, two new indices were proposed to separate events produced by faults from those produced by transformer energization. New feature extraction algorithms are proposed, based on the energy of the decomposition coefficients of the wavelet transform as well as on the modified à trous algorithm. Two new vectors of energy descriptors are proposed, based on the first transient segment of the event. The classification of these events was carried out with a decision-rule induction algorithm (CN2), which generates easily implementable rules. All classification methods used in this thesis are rule-based, and their performance is assessed by means of the confusion matrix.
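The confusion-matrix evaluation mentioned at the end can be sketched directly; the three event classes and label sequences below are illustrative, not the thesis's data:

```python
def confusion_matrix(actual, predicted, classes):
    """Rows index the actual class; columns index the predicted class."""
    index = {c: i for i, c in enumerate(classes)}
    m = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        m[index[a]][index[p]] += 1
    return m

def per_class_recall(m):
    """Diagonal count over row total for each class."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(m)]

classes = ("fault", "inrush", "capacitor")
actual = ["fault", "fault", "inrush", "inrush", "capacitor", "capacitor"]
predicted = ["fault", "inrush", "inrush", "inrush", "capacitor", "fault"]
m = confusion_matrix(actual, predicted, classes)
print(m)                     # [[1, 1, 0], [0, 2, 0], [1, 0, 1]]
print(per_class_recall(m))   # [0.5, 1.0, 0.5]
```

Off-diagonal cells show exactly which event causes a rule set confuses, which is why the matrix is the natural report format for rule-based classifiers such as those induced by CN2.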
299 |
Medical image mining using shape features (Mineração de imagens médicas utilizando características de forma). Costa, Alceu Ferraz. 10 April 2012.
Medical image databases represent a valuable source of data from which potential knowledge can be extracted. Image mining can be applied to discover knowledge from these data in order to support Computer-Aided Diagnosis (CAD) systems. The typical set-up of a CAD system consists in extracting relevant visual features, in the form of image feature vectors, that are used as input to a classifier. Due to the semantic gap problem, which corresponds to the difference between human image perception and the features automatically extracted from the image, a challenging aspect of CAD is obtaining a set of features that succinctly and efficiently represents the visual content of medical images. To deal with this problem, a new feature extraction method entitled Fast Fractal Stack (FFS) was developed in this work. FFS extracts shape features from objects and structures, shape being a visual attribute that approximates the semantics expected by humans. Additionally, the Concept classification method was developed, which employs association rule mining for the task of image class prediction. The innovative aspect of Concept is its image representation algorithm, termed MFS-Map (Multi Feature Space Map), also developed in this work. MFS-Map employs clustering in different feature spaces to maximize the usefulness of the extracted features in the classification process. Experiments performed with pulmonary computed tomography and mammography images indicate that both the FFS and Concept methods can contribute to the improvement of CAD systems.
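The abstract does not spell out the FFS algorithm, but fractal shape features of the general family it builds on can be illustrated with a standard box-counting dimension estimate over contour points. This is a hedged stand-in, not FFS itself:

```python
import math

def box_counting_dimension(points, sizes):
    """Estimate fractal dimension as the slope of log N(s) against log(1/s),
    where N(s) is the number of grid boxes of side s hit by the point set."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = []
    for s in sizes:
        boxes = {(math.floor(x / s), math.floor(y / s)) for x, y in points}
        ys.append(math.log(len(boxes)))
    # Least-squares slope of ys against xs.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
            / sum((a - mx) ** 2 for a in xs))

# A straight contour segment should have dimension close to 1.
line = [(t / 10000.0, t / 10000.0) for t in range(10000)]
sizes = [1 / 2, 1 / 4, 1 / 8, 1 / 16, 1 / 32, 1 / 64]
d = box_counting_dimension(line, sizes)
print(round(d, 2))   # 1.0
```

A rougher, more convoluted contour yields a larger estimate, which is what makes fractal descriptors useful as compact shape features for classification.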
300 |
Content-based filtering aided by collaborative indexing methods (Filtragem baseada em conteúdo auxiliada por métodos de indexação colaborativa). D'Addio, Rafael Martins. 10 June 2015.
Recommender systems arose from the need to select and present relevant content to users according to their preferences. Among the many existing methods, content-based approaches make exclusive use of information inherent to the items. This information can be created through automatic or manual indexing techniques. While automatic approaches require greater computing resources and are limited to the specific task they perform, manual methods are expensive and error-prone.
On the other hand, with the expansion of the Web and the possibility for ordinary users to create new content and descriptions for different items and products, an alternative is to obtain this metadata created collaboratively by the users themselves. However, this information, especially reviews and comments, may contain noise, besides being unstructured. Thus, this study aims to develop methods for constructing item representations based on collaborative descriptions for a recommender system. It analyzes the impact that different feature extraction techniques, combined with sentiment analysis, have on the accuracy of the generated suggestions, evaluating the results in two recommendation scenarios: rating prediction and ranking generation. Among the analyzed techniques, the best one is observed to describe items more effectively, resulting in an improvement in the recommender system.
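Building item representations from collaborative reviews with sentiment analysis can be sketched with a toy lexicon; the lexicon entries, reviews, and item names below are all illustrative assumptions, not the thesis's techniques:

```python
import math

# Toy sentiment lexicon (illustrative; real systems use learned or curated lexicons).
LEXICON = {"great": 1.0, "sharp": 0.8, "fast": 0.6, "noisy": -0.7, "poor": -1.0}

def item_vector(reviews):
    """Sentiment-weighted term frequencies aggregated over an item's reviews."""
    vec = {term: 0.0 for term in LEXICON}
    for review in reviews:
        for word in review.lower().split():
            if word in vec:
                vec[word] += LEXICON[word]
    return vec

def cosine(u, v):
    """Cosine similarity between two vectors sharing the same keys."""
    dot = sum(u[t] * v[t] for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def recommend(profile, candidates):
    """Rank unseen items by similarity to the user's profile vector."""
    return max(candidates, key=lambda name: cosine(profile, candidates[name]))

items = {
    "camera_a": item_vector(["great sharp pictures", "fast and great"]),
    "camera_b": item_vector(["noisy sensor", "poor build poor lens"]),
    "camera_c": item_vector(["sharp lens", "fast autofocus"]),
}
liked = ["camera_a"]                       # items the user rated highly
profile = {t: sum(items[i][t] for i in liked) / len(liked) for t in LEXICON}
unseen = {k: v for k, v in items.items() if k not in liked}
print(recommend(profile, unseen))   # camera_c
```

The same vectors support both scenarios mentioned in the abstract: similarity scores can be thresholded for rating prediction or sorted for ranking generation.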