Global ETD Search

1	Advanced Text Analytics and Machine Learning Approach for Document Classification Anne, Chaitanya 19 May 2017
Text classification supports information extraction and retrieval, and it is an important step in managing the vast and growing number of records held in digital form. This thesis addresses the problem of classifying patent documents into fifteen categories, some of which overlap for practical reasons. To develop the classification model with machine learning techniques, useful features were extracted from the given documents. The features are used to classify patent documents and to generate useful tag words. The overall objective of this work is to systematize NASA’s patent management by developing a set of automated tools that help NASA manage and market its portfolio of intellectual property (IP) and make relevant IP easier for users to discover. We identified an array of applicable methods: k-Nearest Neighbors (kNN), two variations of the Support Vector Machine (SVM) algorithm, and two tree-based classification algorithms, Random Forest and J48. The major research steps consist of filtering techniques for variable selection, information gain and feature correlation analysis, and training and testing potential models using effective classifiers. Further, the obstacles associated with imbalanced data were mitigated by adding synthetic data where appropriate, which resulted in a superior SVM-based model.
Computational Engineering Computer and Systems Architecture
2	Machine Learning for Beam Based Mobility Optimization in NR Ekman, Björn January 2017
One option for enabling mobility between 5G nodes is to use a set of area-fixed reference beams in the downlink direction from each node. To save power, these reference beams should be turned on only on demand, i.e. only when a mobile needs them. A User Equipment (UE) moving out of a beam's coverage requires a switch from one beam to another, preferably without having to turn on all possible beams to find out which one is best. This thesis investigates how to transform the beam selection problem into a format suitable for machine learning and how good such solutions are compared to baseline models. The baseline models considered were beam overlap and average Reference Signal Received Power (RSRP), both of which build beam-to-beam maps. The thesis emphasizes handovers between nodes and finding the beam with the highest RSRP. Beam-hit-rate and RSRP-difference (selected minus best) were the key performance indicators and were compared for different numbers of activated beams. The problem was modeled both as a Multiple Output Regression (MOR) problem and as a Multi-Class Classification (MCC) problem. Both can be solved with the random forest model, which was the learning model of choice in this work. An Ericsson simulator was used to simulate and collect data from a seven-site scenario with 40 UEs. The primary features available were the current serving beam index and its RSRP. Additional features, such as position and distance, were suggested, though many ended up being limited either by the simulated scenario or by the cost of acquiring the feature in a real-world scenario. Using primary features only, the learned models performed no better than the baseline models. Adding distance improved performance considerably, beating the baseline models, but still leaving room for further improvement.
machine learning 5G random forest multi-class classification multi-target regression Communication Systems
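The two key performance indicators named in the abstract above, beam-hit-rate and RSRP-difference (selected minus best), can be illustrated with a minimal sketch. The data layout, the function name `beam_kpis`, and the reading that the "selected" beam is the strongest among the activated ones are assumptions made for illustration; the thesis may define them differently.

```python
from typing import Sequence, Tuple


def beam_kpis(
    rsrp_dbm: Sequence[Sequence[float]],
    activated: Sequence[Sequence[int]],
) -> Tuple[float, float]:
    """Return (beam-hit-rate, mean RSRP-difference in dB).

    rsrp_dbm[i][b] is the RSRP of candidate beam b for sample i
    (hypothetical layout). activated[i] lists the beam indices that
    were switched on for sample i.
    """
    if not rsrp_dbm or len(rsrp_dbm) != len(activated):
        raise ValueError("need one non-empty activation set per sample")

    hits = 0
    diff_sum = 0.0
    for row, beams in zip(rsrp_dbm, activated):
        best = max(row)                         # best beam overall
        selected = max(row[b] for b in beams)   # best beam among activated
        if selected == best:
            hits += 1
        diff_sum += selected - best             # selected minus best, <= 0

    n = len(rsrp_dbm)
    return hits / n, diff_sum / n


# Two samples, three candidate beams each (made-up values).
rsrp = [[-90.0, -80.0, -100.0], [-70.0, -75.0, -85.0]]
hit_rate, mean_diff = beam_kpis(rsrp, activated=[[0], [0, 2]])
```

Activating more beams per sample can only raise the hit rate and move the RSRP-difference toward zero, which is why the abstract compares the indicators for different numbers of activated beams.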
3	Applying Discriminant Functions with One-Class SVMs for Multi-Class Classification Lee, Zhi-Ying 09 August 2007
AdaBoost.M1 has been successfully applied to improve the accuracy of a learning algorithm for multi-class classification problems. However, it assumes that the accuracy of each base classifier is better than 1/2, which may be hard to achieve in practice for a multi-class problem. A new algorithm called AdaBoost.MK, which only requires base classifiers to be better than random guessing (1/k for k classes), is therefore designed. Early SVM-based multi-class classification algorithms work by splitting the original problem into a set of two-class sub-problems, and their time and space requirements are very demanding. To achieve low time and space complexity, we develop a base classifier that integrates one-class SVMs with discriminant functions. In this study, a hybrid method that integrates AdaBoost.MK with one-class SVMs and improved discriminant functions as the base classifiers is proposed to solve multi-class classification problems. Experimental results on data sets from UCI and Statlog show that the proposed approach outperforms many popular multi-class algorithms, including support vector clustering and AdaBoost.M1 with one-class SVMs as the base classifiers.
AdaBoost.M1 multi-class classification One-class SVM Discriminant function Support vector clustering
4	Multi-Class Imbalanced Learning for Time Series Problem : An Industrial Case Study Andersson, Melanie January 2020
Classification problems with multiple classes and imbalanced sample sizes present challenges beyond those of binary classification problems. Methods have been proposed to handle imbalanced learning; however, most of them are designed specifically for binary classification. Multi-class imbalance imposes additional challenges in time series classification problems such as weather classification. In this thesis, we introduce, apply and evaluate a new algorithm for handling multi-class imbalanced problems involving time series data. The proposed algorithm is designed to handle both multi-class imbalance and time series classification, and it is inspired by the Imbalanced Fuzzy-Rough Ordered Weighted Average Nearest Neighbor Classification algorithm. Its feasibility is studied through an empirical evaluation on a telecom use case at Ericsson, Sweden, where data from commercial microwave links is used for weather classification. The proposed algorithm is compared to the model currently used at Ericsson, a one-dimensional convolutional neural network, and to three other deep learning models. The empirical evaluation indicates that the performance of the proposed algorithm for weather classification is comparable to that of the current solution, and that these two are the best-performing models in the study.
Time Series Imbalanced Learning Weather Classification Microwave Link Multi-Class Classification Engineering and Technology
5	Evaluating machine learning strategies for classification of large-scale Kubernetes cluster logs Sarika, Pawan January 2022
Kubernetes is a free, open-source container orchestration system for deploying and managing Docker containers that host microservices. Its cluster logs are extremely helpful in determining the root cause of a failure. However, as systems become more complex, locating failures becomes more difficult and time-consuming. This study aims to identify classification algorithms that classify the given log data accurately while requiring fewer computational resources. Because the data is quite large, we begin with expert-based feature selection to reduce the data size. TF-IDF feature extraction is then performed, and finally we compare five classification algorithms (SVM, KNN, random forest, gradient boosting and MLP) using several metrics. The results show that random forest produces good accuracy while requiring fewer computational resources than the other algorithms.
Kubernetes logs feature selection feature extraction multi-class classification Computational cost Computer Sciences
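The TF-IDF feature extraction step mentioned in the abstract above can be sketched in a few lines of plain Python. This uses one common weighting variant (relative term frequency times log(N / document frequency)); the study does not say which variant or library it used, and the log lines below are invented for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Compute TF-IDF weights for tokenized documents.

    tf  = term count / document length
    idf = log(number of documents / documents containing the term)
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))

    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append(
            {
                term: (count / len(doc)) * math.log(n_docs / df[term])
                for term, count in counts.items()
            }
        )
    return vectors


# Hypothetical log lines, tokenized on whitespace.
logs = [
    "pod restarted after oom kill".split(),
    "pod scheduled on node".split(),
    "node not ready".split(),
]
weights = tfidf(logs)
```

Terms that occur in every document get weight zero, so boilerplate tokens common to all log lines contribute nothing to the feature vectors passed on to a classifier.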
6	Multi-class recognition using pair-wise classifiers / Daugelio klasių atpažinimas naudojant klasifikatorius poroms Kybartas, Rimantas 01 October 2010
There are plenty of solutions for the task of multi-class recognition, but their recommendations do not always agree. Most of them are based on empirical experiments, and the statistical properties of the data are often not considered. Questions therefore arise about which method should be used and when, and how reliable any chosen method is for solving a multi-class recognition task. This dissertation analyzes two-stage multi-class decision methods. In the first stage, pair-wise classifiers are built, which are better able to exploit the statistical properties of the data. In the second stage, a fusion rule combines the first-stage results to produce the final classification decision. The research pays particular attention to the complexity of pair-wise classifiers, the training data size and the precision of method quality estimation. That precision depends heavily on the data and on the number of experiments performed (data permutation, division into training and testing data). It is shown that the declared superiority of some published algorithms over known ones is not reliable because of the low precision of estimation. A detailed comparison of well-known multi-class classification methods is performed, and a new pair-wise classifier fusion method, based on a similar method used in multi-class classifier fusion, is presented. Recommendations for designers of multi-class classification tasks are provided. Methods are proposed that reduce the classification error by adjusting the pair-wise classifier fusion so that the algorithm is not...
Informatics Multi-class classification Single layer perceptron Pair-wise classification
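The two-stage scheme described in the abstract above (pair-wise classifiers first, then a fusion rule) can be sketched as follows. This is only an illustration under stated assumptions: a nearest-centroid rule stands in for the pair-wise classifier (the keywords suggest the dissertation works with single layer perceptrons), and plain majority voting stands in for the fusion rule, which is not the new fusion method the dissertation proposes.

```python
from collections import Counter
from itertools import combinations


def centroid(points):
    """Mean vector of a non-empty list of equal-length points."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_pairwise(samples, labels):
    """Stage 1: build one two-class classifier per pair of classes."""
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    centroids = {c: centroid(pts) for c, pts in by_class.items()}
    return [
        (a, b, centroids[a], centroids[b])
        for a, b in combinations(sorted(centroids), 2)
    ]


def predict(pairwise, x):
    """Stage 2: fuse the pair-wise decisions by majority vote."""
    votes = Counter()
    for a, b, centre_a, centre_b in pairwise:
        winner = a if sq_dist(x, centre_a) <= sq_dist(x, centre_b) else b
        votes[winner] += 1
    return votes.most_common(1)[0][0]


# Made-up two-dimensional data with three classes.
samples = [(0, 0), (0, 1), (5, 5), (6, 5), (0, 9), (1, 10)]
labels = ["a", "a", "b", "b", "c", "c"]
model = train_pairwise(samples, labels)
prediction = predict(model, (5, 4))
```

With k classes this trains k(k-1)/2 pair-wise classifiers, which is one reason the abstract singles out the complexity of pair-wise classifiers and the training data size as issues.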
8	Review of Large-Scale Coordinate Descent Algorithms for Multi-class Classification with Memory Constraints Jovanovich, Aleksandar 03 June 2013
No description available.
Computer Science Information Systems Information Science Statistics Classification Multi-class Classification Large-Scale Coordinate Descent Memory Constraints
9	Evolutionary Learning of Boosted Features for Visual Inspection Automation Zhang, Meng 01 March 2018
Feature extraction is one of the major challenges in object recognition. Features extracted from one type of object cannot always be used directly for a different type of object, which limits the performance of feature extraction. An automatic feature learning algorithm could therefore be a big advantage for an object recognition algorithm. This research first introduces several improvements to a fully automatic feature construction method called Evolution COnstructed Feature (ECO-Feature). These improvements construct more robust features and make the training process more efficient than in the original version. The main weakness of the original ECO-Feature algorithm is that it is designed only for binary classification and cannot be applied directly to multi-class cases. We also observe that recognition performance depends heavily on the size of the feature pool from which features can be selected and on the ability to select the best features. For these reasons, we have developed an enhanced evolutionary learning method for multi-class object classification, called Evolutionary Learning of Boosted Features (ECO-Boost). ECO-Boost is an efficient evolutionary learning algorithm that automatically constructs highly discriminative image features from the training images for multi-class image classification. The method constructs image features that are often overlooked by humans and is robust to minor image distortion and geometric transformations. We evaluate the algorithm on several visual inspection datasets, including specialty crops, fruits and road surface conditions. Results from extensive experiments confirm that ECO-Boost performs comparably to other methods and achieves a good balance between accuracy and simplicity for real-time multi-class object classification applications. It is a hardware-friendly algorithm that can be optimized for implementation in an FPGA for real-time embedded visual inspection applications.
ECO-Boost evolutionary computation feature construction multi-class classification object recognition boosting visual inspection quality grading Electrical and Computer Engineering Engineering
10	[en] FUZZY RULES EXTRACTION FROM SUPPORT VECTOR MACHINES (SVM) FOR MULTI-CLASS CLASSIFICATION / [pt] EXTRAÇÃO DE REGRAS FUZZY PARA MÁQUINAS DE VETOR SUPORTE (SVM) PARA CLASSIFICAÇÃO EM MÚLTIPLAS CLASSES ADRIANA DA COSTA FERREIRA CHAVES 25 October 2006
This work proposes a new method for extracting fuzzy rules from support vector machines (SVMs) trained to solve classification problems. SVMs are learning systems based on statistical learning theory and generalize well on real data sets. They have been applied successfully to a wide variety of problems. However, SVMs, like neural networks, generate a black-box model, i.e., a model that does not explain the process by which its output is obtained. Some methods to reduce or eliminate this limitation have already been developed for the binary classification case, but they are restricted to extracting symbolic rules, i.e., rules whose antecedents contain functions or intervals. The interpretability of symbolic rules is still limited. Hence, to increase the interpretability of the generated knowledge, we propose a technique for extracting fuzzy rules from a trained SVM. Moreover, the proposed model was developed for classification in multiple classes, which had not been addressed until now. The fuzzy rules obtained have the form: if x1 belongs to the fuzzy set C1, x2 belongs to the fuzzy set C2, ..., xn belongs to the fuzzy set Cn, then the point x = (x1, ..., xn) belongs to class A. To test the model, detailed case studies were carried out on four databases: Iris, Wine, Bupa Liver Disorders and Wisconsin Breast Cancer. The coverage of the rules produced by this method in the tests was very good, reaching 100% in the Iris case. After the rules were generated, they were evaluated using two criteria: coverage and fuzzy accuracy. In addition to these tests, the performance of the multi-class classification methods used in this work was compared.
[en] FUZZY RULES [en] EXTRACTION OF RULES [en] MULTI-CLASS CLASSIFICATION [en] SVM
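The rule format quoted in the abstract above (if x1 belongs to fuzzy set C1, ..., xn belongs to fuzzy set Cn, then x belongs to class A) can be made concrete with a small sketch of how such rules might be evaluated. Everything specific here is an assumption for illustration: the triangular membership functions, the use of the minimum as the fuzzy AND, the rule values, and the function names. The abstract does not describe how the rules are extracted from the SVM, and this sketch does not attempt that step.

```python
def triangular(a, b, c):
    """Return a triangular membership function (requires a < b < c).

    Membership is 0 outside (a, c) and rises linearly to 1 at the peak b.
    """
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)
    return mu


def firing_strength(rule, x):
    """Degree to which x satisfies all antecedents (minimum as fuzzy AND)."""
    return min(mu(xi) for mu, xi in zip(rule["antecedents"], x))


def classify(rules, x):
    """Return the label of the strongest rule, or None if no rule covers x."""
    best = max(rules, key=lambda rule: firing_strength(rule, x))
    return best["label"] if firing_strength(best, x) > 0 else None


# Two hypothetical rules over two features.
rules = [
    {"antecedents": [triangular(0.0, 1.5, 3.0), triangular(0.0, 0.3, 1.0)],
     "label": "A"},
    {"antecedents": [triangular(2.5, 4.5, 6.5), triangular(0.8, 1.4, 2.0)],
     "label": "B"},
]
label = classify(rules, (1.5, 0.3))
```

A point for which every rule has zero firing strength is left unclassified here, which is one way to read the coverage criterion mentioned in the abstract: the share of points that at least one rule covers.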
