71 |
O algoritmo de aprendizado semi-supervisionado co-training e sua aplicação na rotulação de documentos / The semi-supervised learning algorithm co-training applied to label text documents
Matsubara, Edson Takashi 26 May 2004
In Machine Learning, the supervised approach usually requires a large number of labeled training examples to induce accurate classifiers. However, labeling is often performed manually, making this process costly and time-consuming. By contrast, unlabeled examples are often inexpensive and easier to obtain than labeled examples. This is especially true for text classification tasks involving on-line data sources, such as web pages, email and scientific papers. Text classification is of great practical importance given the massive volume of text available on-line. Semi-supervised learning, a relatively new area in Machine Learning, represents a blend of supervised and unsupervised learning, and has the potential to reduce the need for expensive labeled data when only a small set of labeled examples is available. This work describes the semi-supervised learning algorithm co-training, which requires the description of each example to be partitioned into two distinct views. The two views required by co-training can be easily obtained from textual documents through pre-processing. In this work, several extensions of the co-training algorithm have been implemented. Furthermore, a computational environment for text pre-processing, called PreTexT, has been implemented in order to apply co-training to text classification problems. Experimental results using co-training on three data sets are described. Two data sets are related to text classification and the third to web-page classification. The results, which range from excellent to poor, show that co-training, like other semi-supervised learning algorithms, is affected by modelling assumptions in a rather complicated way.
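The two-view idea in the abstract above can be illustrated with a minimal sketch. It is not the thesis implementation and is unrelated to PreTexT: the nearest-centroid classifier, the one-number-per-view features, and the names `centroid_fit`, `centroid_predict` and `co_train` are all assumptions made for illustration. In each round, a classifier trained on one view labels the unlabeled example it is most confident about, and that example joins the shared labeled set used by the other view.

```python
def centroid_fit(values, labels):
    """Toy classifier (assumption): per-class mean of a single numeric feature."""
    sums, counts = {}, {}
    for value, label in zip(values, labels):
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def centroid_predict(model, value):
    """Return (label, confidence); confidence is the margin to the runner-up class."""
    ranked = sorted(model, key=lambda label: abs(value - model[label]))
    best = ranked[0]
    if len(ranked) == 1:
        return best, 0.0
    margin = abs(value - model[ranked[1]]) - abs(value - model[best])
    return best, margin


def co_train(labeled, unlabeled, rounds=5, per_round=1):
    """labeled: list of ((view1, view2), label); unlabeled: list of (view1, view2)."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        for view in (0, 1):
            if not pool:
                return labeled
            # Train on the current labeled set, using only this view's feature.
            model = centroid_fit([x[view] for x, _ in labeled],
                                 [y for _, y in labeled])
            # Rank the pool by this view's confidence, most confident first.
            ranked = sorted(pool,
                            key=lambda x: centroid_predict(model, x[view])[1],
                            reverse=True)
            for example in ranked[:per_round]:
                label, _ = centroid_predict(model, example[view])
                labeled.append((example, label))
                pool.remove(example)
    return labeled
```

For text, the two views might be, for instance, two disjoint feature sets extracted from the same document; that choice is left open here.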
|
72 |
Twittersentimentanalys: Jämförelse av klassificeringsmodeller tränade på olika datamängder. / Twitter Sentiment Analysis: Comparison of classification models trained on different data sets.
Bandgren, Johannes, Selberg, Johan January 2018
Twitter is one of the most popular microblogs and is used to express thoughts and opinions on different topics. An area that has attracted much interest in recent years is Twitter sentiment analysis. Twitter sentiment analysis is about assessing what sentiment a Twitter post expresses, whether it expresses something positive or negative. Different methods can be used to perform Twitter sentiment analysis, some of which are better suited than others. The most common methods for Twitter sentiment analysis use machine learning.
The purpose of this study is to evaluate three classification algorithms in machine learning and how the labeling of a data set affects a classification model's ability to classify a Twitter post correctly for Twitter sentiment analysis. Naive Bayes, Support Vector Machine and Convolutional Neural Network are the classification algorithms that have been evaluated. For each classification algorithm, two classification models have been trained and tested on two separate data sets: Stanford Twitter Sentiment and SemEval. What separates the two data sets, in addition to the content of the Twitter posts, is the labeling method and the number of Twitter posts. The evaluation is based on the performance of the classification models on the respective data sets, their training time, and how complicated they were to implement.
The results show that all models trained and tested on SemEval achieved higher performance than those trained and tested on Stanford Twitter Sentiment. The Convolutional Neural Network models achieved the best results on both data sets. However, a Convolutional Neural Network is more complicated to implement and its training time is significantly longer than that of Naive Bayes and Support Vector Machine.
|
73 |
Análise de séries temporais fuzzy para previsão e identificação de padrões comportamentais dinâmicos / Fuzzy time series analysis for forecasting and identification of dynamic behavioral patterns
Santos, Fábio José Justo dos 30 April 2015
Previous issue date: 2015-04-30 / No funding received
The good results obtained by fuzzy approaches to time series (TS) analysis have contributed
significantly to the growth of the area. Although methods based on classic TS concepts and on
the more recent concepts of fuzzy time series (FTS) both give satisfactory results, there is a
lack of models that combine the two areas. In this context, the contributions of this thesis
are models for TS analysis that combine FTS concepts with statistical methods, aiming to
improve forecast accuracy and the identification of behavioral changes in the TS.
To allow a suitable fuzzy representation of the observed crisp values, the approaches
developed in this thesis are combined with a new proposal for data pre-processing.
The predicted value is calculated from a new smoothing technique combined with an extension
of the fuzzy logical relationships. This combination allows the computed value to give
different degrees of influence to the most recent and to the oldest behavior of the series.
When the model lacks the knowledge needed to calculate the predicted value, the concepts of
simple linear regression are combined with FTS concepts to identify the most recent trend in
the TS. The approach developed for behavioral analysis aims to identify changes in behavior
from the definition of prototypes that represent groups of TS and from the segmentation of
the series to be analyzed. In this new approach, the dissimilarity between a segment of a TS
and the corresponding interval of a given prototype is defined by the Fuzzy Dynamic Time
Warping (DTW) metric, weighted by a new smoothing technique applied to the distance matrix
between the observed data. The accuracy obtained by the forecast model not only demonstrates
the effectiveness of the developed approach, but also shows the evolution of the model
throughout the research and the importance of pre-processing in forecasting. The analysis of
the segmented TS satisfactorily identifies the behavioral changes of the series by
calculating the membership of the segments in the respective groups represented by the
prototypes.
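As background for the dissimilarity measure mentioned in the abstract above, the following is a minimal sketch of classic Dynamic Time Warping between two numeric sequences. It is an assumption-laden illustration only: it does not include the fuzzy extension or the smoothing-based weighting of the distance matrix that the thesis proposes, and the function name `dtw_distance` is hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two numeric sequences (absolute-difference cost).

    cost[i][j] holds the cheapest cumulative cost of aligning a[:i] with b[:j].
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(cost[i - 1][j],      # a advances, b repeats
                                     cost[i][j - 1],      # b advances, a repeats
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

A segment of a series could then be compared against the matching interval of each prototype and assigned to the closest one; how distances are turned into membership degrees is specific to the thesis and is not reproduced here.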
|
75 |
Multimodální registrace retinálních snímků z fundus kamery a OCT / Multimodal Registration of Fundus Camera and OCT Retinal Images
Běťák, Ondřej January 2012
This thesis deals with multimodal registration of retinal images from different scanning devices. Multimodal registration makes it possible to highlight features in retinal images that are important for detecting various types of eye disease (such as glaucoma, nerve fiber degradation, vessel degradation, etc.). The theoretical part makes up roughly the first half of the thesis and is followed by a practical part, which describes the procedures for different types of registration of images from a fundus camera, SLO and OCT. Registration of fundus and SLO images is performed using a spatial transformation. The thesis describes three different methods for registering SLO images with fundus camera images. The first and simplest is manual registration. The second is automatic registration based on correlation. The results, including a comparison of these two methods, are given in the conclusion. The third is semi-automatic registration, which uses the advantages of both previous methods and is therefore a compromise between registration speed and accuracy. Registration of fundus images and OCT B-scans is implemented with two different methods. The first is again based on correlation and the second on spatial transformation. All of these registration methods are also implemented in practice in the Matlab environment.
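Correlation-based registration, as mentioned in the abstract above, can be sketched as an exhaustive search for the shift that maximizes normalized cross-correlation. This is an illustrative Python sketch rather than the thesis's Matlab code, and it assumes a pure integer translation between two small grayscale images stored as nested lists; the names `ncc_at` and `register_translation` are hypothetical.

```python
import math


def ncc_at(fixed, moving, dy, dx):
    """Normalized cross-correlation over the overlap when fixed[y][x]
    is compared with moving[y - dy][x - dx]."""
    pairs = []
    for y in range(len(fixed)):
        for x in range(len(fixed[0])):
            my, mx = y - dy, x - dx
            if 0 <= my < len(moving) and 0 <= mx < len(moving[0]):
                pairs.append((fixed[y][x], moving[my][mx]))
    if not pairs:
        return -1.0
    mean_f = sum(f for f, _ in pairs) / len(pairs)
    mean_m = sum(m for _, m in pairs) / len(pairs)
    num = sum((f - mean_f) * (m - mean_m) for f, m in pairs)
    den = math.sqrt(sum((f - mean_f) ** 2 for f, _ in pairs)
                    * sum((m - mean_m) ** 2 for _, m in pairs))
    # A constant overlap carries no information; treat it as uncorrelated.
    return num / den if den > 0 else 0.0


def register_translation(fixed, moving, max_shift=2):
    """Return the (dy, dx) shift of `moving` that best matches `fixed`."""
    candidates = [(dy, dx)
                  for dy in range(-max_shift, max_shift + 1)
                  for dx in range(-max_shift, max_shift + 1)]
    return max(candidates, key=lambda shift: ncc_at(fixed, moving, *shift))
```

Real multimodal retinal images generally also differ in rotation, scale and intensity characteristics, so a translation-only search like this is only a starting point.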
|
76 |
Task Load Modelling for LTE Baseband Signal Processing with Artificial Neural Network Approach
Wang, Lu January 2014
This thesis investigates the development of an automatic or guided-automatic tool to predict hardware (HW) resource occupation, namely task load, with respect to the software (SW) application algorithm parameters in an LTE base station. For signal processing in an LTE base station, it is important to know how many HW resources will be used when a SW algorithm is applied on a specific platform. This information helps in understanding the system and platform better, which can facilitate a reasonable use of the available resources. Developing the tool is treated as building a mathematical model between HW task load and SW parameters, a process defined as function approximation. According to the universal approximation theorem, the problem can be solved by an intelligent method called artificial neural networks (ANNs). The theorem indicates that any continuous function can be approximated with a two-layer neural network as long as the activation function is suitable and there are enough hidden neurons. The thesis documents a workflow for building the model with the ANN method, as well as some research on data subset selection with mathematical methods, such as Partial Correlation and Sequential Searching, as a data pre-processing step for the ANN approach. To make the data selection method suitable for ANNs, a modification has been made to the Sequential Searching method, which gives a better result. The results show that it is possible to develop such a guided-automatic tool for prediction purposes in LTE baseband signal processing under specific precision constraints. Compared to other approaches, this model tool with an intelligent approach has a higher precision level and better adaptivity, meaning that it can be used in any part of the platform even though the transmission channels are different.
|
77 |
Určování poloh robotů Trilobot / Determination of Trilobot Robots Positions
Loyka, Tomáš January 2007
This master's thesis deals with machine vision and methods of image processing and analysis. The aim is to create an application that determines the relative positions of Trilobot robots in the laboratory.
|
78 |
Morphological Change Monitoring of Skin Lesions for Early Melanoma Detection
Dhinagar, Nikhil J. 01 October 2018
No description available.
|
79 |
Automated Image Pre-Processing for Optimized Text Extraction Using Reinforcement Learning and Genetic Algorithms
Rohoullah, Rahmat, Joakim, Månsson January 2023
This project aims to develop an automated image pre-processing chain to extract valuable information from appliance labels before recycling. The primary goal is to improve optical character recognition accuracy by addressing noise issues using reinforcement learning and an evolutionary algorithm. Python was selected as the primary programming language for this project due to its extensive support for machine learning and computer vision libraries. Different techniques are implemented to enhance text extraction from labels. Binary Robust Invariant Scalable Keypoints (BRISK) are used to straighten labels and separate the label from the background. You Only Look Once version 8x (YOLOv8x) is then used to extract the regions containing the text of interest. The dataset for the reinforcement learning model and the genetic algorithm is created using BRISK with YOLOv8x. The results showed that pre-processing images in the dataset, provided through BRISK and YOLOv8x, does not affect text extraction accuracy, as suggested by reinforcement learning and evolutionary algorithms.
|
80 |
Enhancing Drone Spectra Classification: A Study on Data-Adaptive Pre-processing and Efficient Hardware Deployment
Del Gaizo, Dario January 2023
Focusing on the problem of Drone vs. Unknown classification based on radar frequency-amplitude spectra using Deep Learning (DL), especially 1-Dimensional Convolutional Neural Networks (1D-CNNs), this thesis aims to reduce the current gap in research on adequate pre-processing techniques for hardware deployment. The primary challenge tackled in this work is determining a pipeline that facilitates industrial deployment while maintaining high classification metrics. After a comprehensive review of existing research on radar signal classification and the application of DL techniques in this domain, the technical background of signal processing is described to provide a practical scenario in which the solutions could be implemented. A thorough description of technical constraints, such as Field Programmable Gate Array (FPGA) data type requirements, accompanies the entire project and justifies the need for a learning-based pre-processing technique for highly skewed distributions. The results demonstrate that data-adaptive pre-processing eases hardware deployment and maintains high classification metrics, while other techniques contribute to noise and information loss. In conclusion, this thesis contributes to the field of radar frequency-amplitude spectra classification by identifying effective methods to support efficient hardware deployment of 1D-CNNs without sacrificing performance. This work lays the foundation for future studies in the field of DL for real-world signal processing applications.
|