31 |
Využití umělé inteligence ve vibrodiagnostice / Utilization of artificial intelligence in vibrodiagnostics. Dočekalová, Petra. January 2021 (has links)
The diploma thesis deals with machine learning, expert systems, fuzzy logic, genetic algorithms, neural networks and chaos theory, which fall into the category of artificial intelligence. The aim of this work is to describe and implement three different classification methods with which the data set will be processed, and then to evaluate the success of the classification, including visualization of the results. The GNU Octave software environment was chosen for the implementation for licensing reasons. Three different classification methods are used so that the processed data can be compared with each other.
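The Octave implementation is not reproduced in the record, but the comparison loop the abstract describes can be sketched in Python; the data, the two classifiers and the accuracy metric below are assumptions for illustration, not the thesis's own choices:

```python
# Minimal sketch: comparing two classifiers on one data set, in the spirit
# of the thesis (which compares three methods in GNU Octave).
import math

def knn_predict(train, labels, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], x))
    votes = [labels[i] for i in order[:k]]
    return max(set(votes), key=votes.count)

def centroid_predict(train, labels, x):
    """Classify x by the nearest class centroid."""
    cents = {}
    for c in set(labels):
        pts = [p for p, l in zip(train, labels) if l == c]
        cents[c] = [sum(col) / len(pts) for col in zip(*pts)]
    return min(cents, key=lambda c: math.dist(cents[c], x))

# invented toy data: two well-separated classes in 2-D
train = [(0.0, 0.1), (0.2, 0.0), (1.0, 1.1), (0.9, 1.0)]
labels = ["A", "A", "B", "B"]
test = [((0.1, 0.0), "A"), ((1.0, 0.9), "B")]

for name, clf in [("k-NN", lambda x: knn_predict(train, labels, x)),
                  ("centroid", lambda x: centroid_predict(train, labels, x))]:
    acc = sum(clf(x) == y for x, y in test) / len(test)
    print(name, acc)
```

The same loop extends to a third method and to per-class visualization, which is the comparison the abstract calls for.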
|
32 |
Webový portál pro správu a klasifikaci informací z distribuovaných zdrojů / Web Application for Managing and Classifying Information from Distributed Sources. Vrána, Pavel. January 2011 (has links)
This master's thesis deals with data mining techniques and the classification of data into specified categories. The goal of this thesis is to implement a web portal for the administration and classification of data from distributed sources. To achieve this goal, it is necessary to test different methods and find the most appropriate one for the classification of web articles. Based on the results obtained, an automated application will be developed for downloading and classifying data from different sources, ultimately able to substitute for a user who would otherwise process all the tasks manually.
|
33 |
Rozpoznávání SPZ / License Plate Recognition. Trkal, Ondřej. January 2016 (has links)
The thesis deals with the analysis and design of a system for automatic localization and recognition of license plates. The input images come from different sources and contain large scene and weather variations. The aim was to create a system able to find the license plate in an image and recognize its alphanumeric characters. The work focuses on the analysis and implementation of localization and optical character recognition methods. Five localization methods, one of them the author's own, are compared, as are three classifiers for optical character recognition. The localization and OCR methods are tested on real data and evaluated according to the calculated evaluation parameters. The work also contains a sensitivity analysis of the proposed system.
|
34 |
Automatická klasifikace spánkových fází z polysomnografických dat / Automatic sleep scoring using polysomnographic data. Vávrová, Eva. January 2016 (has links)
The thesis is focused on the analysis of polysomnographic signals based on the extraction of chosen parameters in the time, frequency and time-frequency domains. The parameters are acquired from 30-second segments of EEG, EMG and EOG signals recorded during different sleep stages. The parameters used for automatic classification of sleep stages are selected according to statistical analysis. The classification is realized by artificial neural networks, a k-NN classifier and linear discriminant analysis. A program with a graphical user interface was created in Matlab.
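The epoching and feature-extraction step can be illustrated with a short sketch; the thesis works in Matlab with real recordings and a larger feature set, so the sampling rate, the synthetic signal and the two time-domain features below are all assumptions:

```python
# Illustrative only: slice a recording into 30-second epochs and extract
# two time-domain features (mean absolute value, variance) per epoch.
import math

FS = 100            # assumed sampling rate in Hz
EPOCH = 30 * FS     # samples per 30-second scoring epoch

def epoch_features(signal):
    """Return (mean absolute value, variance) for each complete epoch."""
    feats = []
    for start in range(0, len(signal) - EPOCH + 1, EPOCH):
        seg = signal[start:start + EPOCH]
        mean = sum(seg) / len(seg)
        mav = sum(abs(s) for s in seg) / len(seg)
        var = sum((s - mean) ** 2 for s in seg) / len(seg)
        feats.append((mav, var))
    return feats

# synthetic "EEG": 30 s of low-amplitude activity, then 30 s of high-amplitude
signal = [0.1 * math.sin(2 * math.pi * 10 * t / FS) for t in range(EPOCH)]
signal += [1.0 * math.sin(2 * math.pi * 2 * t / FS) for t in range(EPOCH)]
feats = epoch_features(signal)
print(len(feats))   # one feature tuple per 30-second epoch
```

Feature vectors like these, computed per epoch and per channel, are what the statistical selection and the classifiers then operate on.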
|
35 |
Metody klasifikace www stránek / Methods for Classification of WWW Pages. Svoboda, Pavel. January 2009 (has links)
The main goal of this master's thesis was to study the main principles of classification methods. The basic principles of the knowledge discovery process and data mining, and the use of the external CSSBox library, are described. Special attention was paid to the implementation of a "k-nearest neighbors" classification method. The first objective of this work was to create training and testing data described by n attributes. The second objective was to perform an experimental analysis to determine a good value of k, the number of neighbors.
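The second objective, choosing k experimentally, can be sketched as a leave-one-out search over candidate values; the toy data, labels and candidate list below are assumptions, not the thesis's data:

```python
# Hedged sketch: pick k for k-NN by leave-one-out accuracy on labelled data.
import math

def knn_predict(train, labels, x, k):
    """Majority vote among the k training points nearest to x."""
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], x))
    votes = [labels[i] for i in order[:k]]
    return max(set(votes), key=votes.count)

def loo_accuracy(data, labels, k):
    """Classify each point with all the others as the training set."""
    hits = 0
    for i, (x, y) in enumerate(zip(data, labels)):
        rest = data[:i] + data[i + 1:]
        rest_y = labels[:i] + labels[i + 1:]
        hits += knn_predict(rest, rest_y, x, k) == y
    return hits / len(data)

# invented feature vectors for two page classes
data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (1.1, 0.9), (0.9, 1.1)]
labels = ["www", "www", "www", "other", "other", "other"]
best_k = max([1, 3, 5], key=lambda k: loo_accuracy(data, labels, k))
print("best k:", best_k)
```

With real page data, the same loop over candidate k values and a held-out test set gives the experimental analysis the abstract describes.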
|
36 |
Toward an application of machine learning for predicting foreign trade in services – a pilot study for Statistics Sweden. Unnebäck, Tea. January 2023 (has links)
The objective of this thesis is to investigate the possibility of using machine learning at Statistics Sweden within the Foreign Trade in Services (FTS) statistics to predict the likelihood that a unit conducts foreign trade in services. The FTS survey is a sample survey for which there is no natural frame to sample from. Therefore, prior to sampling, a frame is manually constructed each year, starting with a register of all Swedish companies and agencies and, in a rule-based manner, narrowing it down to contain only the units classified as likely to trade in services during the year to come. An automatic procedure that would enable reliable predictions is requested. To this end, three different machine learning methods have been analyzed: two rule-based methods (random forest and extreme gradient boosting) and one distance-based method (k nearest neighbors). The models arising from these methods are trained and tested on historically sampled units for which it is known whether they traded or not. The results indicate that the two rule-based methods perform well in classifying likely traders: the random forest model is better at finding traders, while the extreme gradient boosting model is better at finding non-traders. The results also reveal interesting patterns when different metrics for the models are studied. Furthermore, when training the rule-based models, the year in which the training data was sampled needs to be taken into account. This entails that cross-validation with random folds should not be used, but rather grouped cross-validation based on year. By including a feature that mirrors the state of the economy, the model can adapt its rules accordingly, meaning that the rules learned on training data can be extended to years beyond the training data. Based on the observed results, the final recommendation is to further develop and investigate the performance of the random forest model.
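The grouped cross-validation the abstract recommends can be sketched without any ML library: each fold holds out one whole sampling year, so no year leaks between training and test data. The records below are invented:

```python
# Sketch of year-grouped cross-validation: folds are whole sampling years,
# never random rows (the thesis's point about random folds).
records = [
    {"year": 2019, "trader": 1}, {"year": 2019, "trader": 0},
    {"year": 2020, "trader": 1}, {"year": 2020, "trader": 1},
    {"year": 2021, "trader": 0}, {"year": 2021, "trader": 0},
]

def grouped_folds(rows, key):
    """Yield (group, train, test) splits; each test fold is one whole group."""
    for g in sorted({r[key] for r in rows}):
        test = [r for r in rows if r[key] == g]
        train = [r for r in rows if r[key] != g]
        yield g, train, test

for year, train, test in grouped_folds(records, "year"):
    # no leakage: the held-out year never appears in the training fold
    assert not any(r["year"] == year for r in train)
    print(year, len(train), len(test))
```

The same splitting is what scikit-learn's `GroupKFold` provides when the group labels are the sampling years.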
|
37 |
Detekce logopedických vad v řeči / Detection of Logopaedic Defects in Speech. Pešek, Milan. January 2009 (has links)
The thesis deals with the design and implementation of software for the detection of logopaedic defects in speech. Due to the need for early detection of logopaedic defects, this software is aimed at child speakers. The introductory part describes the theory of speech production, its modelling for numerical processing, phonetics, logopaedics and the basic logopaedic defects of speech. The methods used for feature extraction, for segmentation of words into speech sounds and for classification of the features into either a correct or an incorrect pronunciation class are also described. The next part of the thesis presents the results of testing the selected methods. MFCC and PLP features are extracted for the recognition of speech defects. The segmentation of words into speech sounds is performed with the Differential Function method. The extracted features of a sound are classified into a correct or an incorrect pronunciation class with one of the tested pattern recognition methods: k-NN, SVM, ANN and GMM.
|
38 |
[en] TEXT CATEGORIZATION: CASE STUDY: PATENT'S APPLICATION DOCUMENTS IN PORTUGUESE / [pt] CATEGORIZAÇÃO DE TEXTOS: ESTUDO DE CASO: DOCUMENTOS DE PEDIDOS DE PATENTE NO IDIOMA PORTUGUÊS. NEIDE DE OLIVEIRA GOMES. 08 January 2015 (has links)
[en] Nowadays, text categorizers built with machine learning techniques have achieved good results, making automatic text categorization viable. The purpose of this study was the definition of several models for the categorization of patent application documents in Portuguese. For this setting, a committee of 6 (six) models was proposed, using a variety of techniques. The text base consisted of 1157 (one thousand one hundred fifty-seven) patent application abstracts, deposited at INPI by national applicants and distributed across several categories. Among the models proposed for the processing step of text categorization, we highlight the one developed for Method 01, the k-Nearest-Neighbor (k-NN), a model also used for patent categorization in the English language. For the other models, methods that are not traditional in the patent environment were selected. Four models use algorithms in which the categories are represented by centroid vectors. One model explores the High Order Bit (HOB) technique together with the k-NN algorithm, with k equal to the number of training documents. For the pre-processing step, two stemming algorithms were implemented, Porter's stemmer and StemmerPortuguese, both with modifications of the original; stopword removal and the treatment of compound terms were also applied in this step. For the indexing step, the term-weighting scheme of modified term frequency versus inverse document frequency (TF-IDF) was mainly used. The similarity or distance measures employed were: cosine; Jaccard; DICE; Similarity Measure; HOB. The results were obtained with relevance-prediction and ranking techniques. Among the methods implemented in this work, we highlight the traditional k-NN, which obtained good results, although it demands much computational time.
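The core pipeline the abstract describes, TF-IDF weighting followed by cosine-similarity k-NN, can be sketched as follows; the toy "abstracts", categories and tokenization are invented for illustration and do not reflect the thesis's corpus or its modified TF-IDF variant:

```python
# Hedged sketch: TF-IDF term weighting plus cosine-similarity k-NN
# for assigning a category to a short text.
import math
from collections import Counter

def build_index(docs):
    """Document frequency of each term, plus corpus size, for IDF."""
    df = Counter(t for d in docs for t in set(d.split()))
    return df, len(docs)

def vectorise(doc, df, n):
    """TF-IDF vector as a sparse dict; terms unseen in the corpus are dropped."""
    tf = Counter(doc.split())
    return {t: tf[t] * math.log(n / df[t]) for t in tf if t in df}

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = ["motor combustao interna", "motor eletrico rotor",
        "farmaco antiviral dose", "composto farmaco sintese"]
cats = ["mechanics", "mechanics", "chemistry", "chemistry"]
df, n = build_index(docs)
vecs = [vectorise(d, df, n) for d in docs]

def knn_category(text, k=1):
    """Category of the k most cosine-similar training documents."""
    q = vectorise(text, df, n)
    best = sorted(range(n), key=lambda i: -cosine(q, vecs[i]))[:k]
    votes = [cats[i] for i in best]
    return max(set(votes), key=votes.count)

print(knn_category("novo farmaco antiviral"))
```

Swapping the cosine for Jaccard or DICE, or replacing the vote with centroid vectors per category, yields the other committee members the abstract lists.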
|