Global ETD Search

31	Multi-class Classification Methods Utilizing Mahalanobis Taguchi System And A Re-sampling Approach For Imbalanced Data Sets Ayhan, Dilber 01 April 2009 (has links) (PDF) Classification approaches are used in many areas to identify or estimate the classes to which different observations belong. In this thesis, the Mahalanobis Taguchi System (MTS) classification approach is analyzed and further improved for multi-class classification problems. MTS tries to identify significant variables and classifies a new observation based on its Mahalanobis distance (MD). In this study, first, sample size problems, which are encountered mostly in small data sets, and multicollinearity problems, which constitute some of the limitations of MTS, are analyzed, and a re-sampling approach is explored as a solution. Our re-sampling approach, which only works for data sets with two classes, is a combination of over-sampling and under-sampling. Over-sampling is based on SMOTE, which generates synthetic observations between the nearest neighbors of observations in the minority class. In addition, MTS models are used to test the performance of several re-sampling parameters, for which the most appropriate values are sought specific to each case. In the second part, multi-class classification methods with MTS are developed. One algorithm, Feature Weighted Multi-class MTS-I (FWMMTS-I), is inspired by the descent feature weighted MD. It relaxes the assumption that the variables contribute equally when the MDs are added up. This lets noisy variables be represented with weights close to zero so that they do not mask the other variables. As a second multi-class classification algorithm, the original MTS method is extended to multi-class problems; this extension is called Multi-class MTS (MMTS). In addition, an approach comparable to that of Su and Hsiao (2009), which also considers weights of variables, is studied with a modification in MD calculation. 
It is named Feature Weighted Multi-class MTS-II (FWMMTS-II). The methods are compared on eight different multi-class data sets using a 5-fold stratified cross validation approach. Results show that FWMMTS-I is as accurate as MMTS, and both are better than FWMMTS-II. Interestingly, the Mahalanobis Distance Classifier (MDC), which uses all the variables directly in the classification model, performed equally well on the studied data sets.
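As a rough illustration of the Mahalanobis Distance Classifier (MDC) mentioned in this abstract, the sketch below assigns an observation to the class whose mean is nearest in Mahalanobis distance. It is not the thesis's MTS, MMTS or FWMMTS code: the restriction to two features and the names `fit_class`, `mahalanobis_sq` and `classify` are assumptions made to keep the example free of dependencies.

```python
from statistics import mean


def fit_class(points):
    """Return (mean vector, inverse covariance) for the 2-D points of one class."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    n = len(points)
    # Sample covariance entries (divisor n - 1).
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det == 0:
        # Typical causes: too few samples or perfectly collinear variables.
        raise ValueError("singular covariance matrix")
    inv = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (mx, my), inv


def mahalanobis_sq(x, model):
    """Squared Mahalanobis distance of point x from a fitted class model."""
    (mx, my), inv = model
    dx, dy = x[0] - mx, x[1] - my
    return (dx * (inv[0][0] * dx + inv[0][1] * dy)
            + dy * (inv[1][0] * dx + inv[1][1] * dy))


def classify(x, models):
    """Pick the label whose class model gives the smallest distance."""
    return min(models, key=lambda label: mahalanobis_sq(x, models[label]))


models = {
    "a": fit_class([(0, 0), (1, 0), (0, 1), (1, 1)]),
    "b": fit_class([(5, 5), (6, 5), (5, 6), (6, 6)]),
}
label = classify((0.4, 0.6), models)
```

The singular-covariance check is where the sample size and multicollinearity problems described in the abstract would surface in a plain distance-based classifier.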
32	Efficient system design: stability and flexibility Tekin, Salih 21 January 2011 (has links) This thesis is concerned with queueing models where demand is allowed to exceed the system capacity, and also with the capacity sizing and pricing problem for heterogeneous products and resources under demand uncertainty. Our aim is to improve productivity and profitability. In the first part of the thesis, we consider the dynamic assignment of servers to tasks in queueing networks where demand may exceed the capacity for service. The objective is to maximize the system throughput. We use fluid limit analysis to show that several quantities of interest, namely the maximum possible throughput, the maximum throughput for a given arrival rate, the minimum arrival rate that will yield a desired feasible throughput, and the optimal allocations of servers to classes for a given arrival rate and desired throughput, can be computed by solving linear programming problems. We develop generalized round robin policies for assigning servers to classes for a given arrival rate and desired throughput, and show that our policies achieve the desired throughput as long as this throughput is feasible for the arrival rate. We conclude with numerical examples that illustrate the points discussed and provide insights into the system behavior when the arrival rate deviates from the one the system is designed for. In the second part of the thesis, we consider the effects of inspection and repair stations on the production capacity and product quality in a serial line with possible inspection and repair following each operation. We consider multiple defect types and allow for possible inspection errors that are defect dependent. We construct a profit function that takes into account inspection, repair, and goodwill costs, as well as the capacity of each station. Then we compare the profitability of different inspection plans and discuss how to identify the optimal inspection plan. 
Finally, in the third part of the thesis, we consider the capacity and pricing decisions made by a monopolistic firm producing two heterogeneous products under demand uncertainty. The objective is to maximize profit. Our model incorporates dedicated and flexible resources, product substitutability, and processing rates that may depend on the product and on the resource type. We provide the optimum prices and production quantities as functions of resource capacities and demand intercepts. We also show that investment in flexible capacity is only desirable when it is optimal to invest in dedicated capacities for both products, and obtain upper bounds for the costs of the dedicated capacities that need to be satisfied for investment in the flexible resource. We conclude with numerical examples that illustrate the points discussed and provide insights into how the optimal capacities and expected production quantities, prices, and profit depend on various model parameters. Stability Multi-class queueing networks Maximum throughput Fluid model Capacity sizing Flexible servers Inspection location Queueing theory Industrial productivity Queuing networks (Data transmission)
33	Uma abordagem de predição estruturada baseada no modelo perceptron (A structured prediction approach based on the Perceptron model) Coelho, Maurício Archanjo Nunes 25 June 2015 (has links) CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / The theory of supervised learning has advanced significantly in recent decades. Several methods are widely used to solve a variety of problems, for example: expert systems for true/false answers; the Perceptron model for class separation; Support Vector Machines (SVMs) and the Incremental Margin Algorithm (IMA), which aim to increase the margin of separation; their multi-class versions; and artificial neural networks, which accept relatively complex inputs. But how can we solve tasks that require answers as complex as the questions? Such answers may consist of several interrelated decisions that must be weighed one by one to arrive at a satisfactory and globally consistent solution. As will be seen throughout the thesis, there are problems of considerable interest that have these requirements. One question that naturally arises is the need to deal with the combinatorial explosion of possible solutions. 
One alternative is to construct models that compactly capture certain structural properties of the problem: sequential correlations, temporal and spatial constraints, etc. These structured models include, among others, graphical models such as Markov networks, and combinatorial optimization problems such as weighted matchings, graph cuts, and data clustering with similarity and correlation patterns. This thesis formulates, presents and discusses efficient online strategies for structured prediction based on the class-separation principle derived from the Perceptron model, and defines a set of supervised learning algorithms that are efficient compared to other approaches. Two experimental applications are also carried out and described: inferring the costs of the features relevant to searches on various maps, and inferring the generating parameters of Markov graphs. These applications are practical in nature and emphasize the importance of the proposed approach. CNPQ::CIENCIAS EXATAS E DA TERRA Machine Learning Perceptron Multi-class Path Planning Prediction of Structured Data Markov Graphs
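The class-separation principle that this abstract builds on can be sketched with a plain multi-class Perceptron: predict the label with the highest score and, on a mistake, move the true label's weights toward the input and the predicted label's weights away from it. This is a generic textbook sketch, not the structured algorithms of the thesis; the function names and the toy data are assumptions.

```python
def predict(weights, x):
    """Return the label whose weight vector gives the highest score for x."""
    scores = {
        label: sum(w_i * x_i for w_i, x_i in zip(w, x))
        for label, w in weights.items()
    }
    # Iterating over sorted labels makes tie-breaking deterministic.
    return max(sorted(scores), key=scores.get)


def train(data, labels, n_features, epochs=10):
    """Online multi-class Perceptron: update only when a prediction is wrong."""
    weights = {label: [0.0] * n_features for label in labels}
    for _ in range(epochs):
        mistakes = 0
        for x, y in data:
            y_hat = predict(weights, x)
            if y_hat != y:
                mistakes += 1
                for i, x_i in enumerate(x):
                    weights[y][i] += x_i      # pull the true class toward x
                    weights[y_hat][i] -= x_i  # push the wrong class away
        if mistakes == 0:  # every example is classified correctly
            break
    return weights


data = [((1, 0, 0), "a"), ((0, 1, 0), "b"), ((0, 0, 1), "c")]
weights = train(data, labels=["a", "b", "c"], n_features=3)
```

In structured prediction the `max` over a handful of labels is replaced by a search over a combinatorial output space (a path, a matching, a graph cut), which is where the combinatorial explosion mentioned in the abstract has to be handled.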
34	SCUT-DS: Methodologies for Learning in Imbalanced Data Streams Olaitan, Olubukola January 2018 (has links) The automation of most of our activities has led to the continuous production of data that arrive in the form of fast-arriving streams. In a supervised learning setting, instances in these streams are labeled as belonging to a particular class. When the number of classes in the data stream is more than two, such a data stream is referred to as a multi-class data stream. Multi-class imbalanced data stream describes the situation where the instance distribution of the classes is skewed, such that instances of some classes occur more frequently than others. Classes with the frequently occurring instances are referred to as the majority classes, while the classes with instances that occur less frequently are denoted as the minority classes. Classification algorithms, or supervised learning techniques, use historic instances to build models, which are then used to predict the classes of unseen instances. Multi-class imbalanced data stream classification poses a great challenge to classical classification algorithms. This is due to the fact that traditional algorithms are usually biased towards the majority classes, since they have more examples of the majority classes when building the model. These traditional algorithms yield low predictive accuracy rates for the minority instances and need to be augmented, often with some form of sampling, in order to improve their overall performances. In the literature, in both static and streaming environments, most studies focus on the binary class imbalance problem. Furthermore, research in multi-class imbalance in the data stream environment is limited. A number of researchers have proceeded by transforming a multi-class imbalanced setting into multiple binary class problems. However, such a transformation does not allow the stream to be studied in the original form and may introduce bias. 
The research conducted in this thesis aims to address this research gap by proposing a novel online learning methodology that combines oversampling of the minority classes with cluster-based majority class under-sampling, without decomposing the data stream into multiple binary sets. Rather, sampling involves continuously selecting a balanced number of instances across all classes for model building. Our focus is on improving the rate of correctly predicting instances of the minority classes in multi-class imbalanced data streams, through the introduction of the Synthetic Minority Over-sampling Technique (SMOTE) and Cluster-based Under-sampling - Data Streams (SCUT-DS) methodologies. In this work, we dynamically balance the classes by utilizing a windowing mechanism during the incremental sampling process. Our SCUT-DS algorithms are evaluated using six different types of classification techniques, followed by comparing their results against a state-of-the-art algorithm. Our contributions are tested using both synthetic and real data sets. The experimental results show that the approaches developed in this thesis yield high prediction rates of minority instances as contained in the multiple minority classes within a non-evolving stream. Multi-class Imbalanced Learning Imbalanced data sets Data streams Classification Imbalanced Learning Sampling Cluster-based Under-sampling Synthetic Oversampling Augmenting Minority Examples Online Learning SMOTE-based Oversampling
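The SMOTE step named in this abstract creates synthetic minority instances by interpolating between a minority instance and one of its nearest minority neighbours. The sketch below shows only that interpolation idea on a static list; it is not SCUT-DS, it omits the windowing and cluster-based under-sampling, and the function name, parameters and sample points are assumptions.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points between minority points and their neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority points to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (squared Euclidean distance).
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Pick a random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote(minority, n_new=5)
```

Because every synthetic point lies on a segment between two existing minority points, the new instances stay inside the region the minority class already occupies.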
35	Melanoma Diagnostics Using Fully Convolutional Networks on Whole Slide Images Phillips, Adon January 2017 (has links) Semantic segmentation as an approach to recognizing and localizing objects within an image is a major research area in computer vision. Now that convolutional neural networks are being increasingly used for such tasks, there have been many improvements in grand challenge results, and many new research opportunities in previously untenable areas. Using fully convolutional networks, we have developed a semantic segmentation pipeline for the identification of melanocytic tumor regions, epidermis, and dermis layers in whole slide microscopy images of cutaneous melanoma or cutaneous metastatic melanoma. This pipeline includes processes for annotating and preparing a dataset from the output of a tissue slide scanner to the patch-based training and inference by an artificial neural network. We have curated a large dataset of 50 whole slide images containing cutaneous melanoma or cutaneous metastatic melanoma that are fully annotated at 40× objective resolution by an expert pathologist. We will publish the source images of this dataset online. We also present two new FCN architectures that fuse multiple deconvolutional strides, combining coarse and fine predictions to improve accuracy over similar networks without multi-stride information. Our results show that the system performs better than our comparators. We include inference results on thousands of patches from four whole slide images, reassembling them into whole slide segmentation masks to demonstrate how our system generalizes on novel cases. Machine Learning Medical Imaging Deep Learning Melanoma Whole Slide Images Digital Pathology Convolutional Neural Network Cancer Semantic Segmentation Multi-Class Segmentation
36	Shape Detection in Images Using Machine Learning Devlin, Axel January 2021 (has links) The report examines how to implement a support vector machine (SVM) that can classify different shapes in images with the help of the OpenCV library in Python. This is done by calculating scale-invariant features. The scale-invariant features examined are simple features and Hu moments. These features, together with their labels, are fed to the SVM for training. The SVM should then be able to distinguish between different shapes based on their scale-invariant features. The report also examines which of Hu moments and simple features gives the best results when classifying shapes in images. Finally, the report looks at earlier research in the area and at reports covering different ways of extracting shapes from images. Multi-class classification SVM support vector machine supervised learning machine learning Computer Sciences
37	Approaches based on tree-structures classifiers to protein fold prediction Mauricio-Sanchez, David, de Andrade Lopes, Alneu, higuihara Juarez Pedro Nelson 08 1900 (has links) The full text of this work is not available in the UPC Academic Repository due to restrictions of the publisher where it was published. / Protein fold recognition is an important task in biology. Different machine learning methods, such as multiclass classifiers, one-vs-all and ensemble nested dichotomies, have been applied to this task and, in most cases, multiclass approaches were used. In this paper, we compare classifiers organized in tree structures to classify folds. We used a benchmark dataset containing 125 features to predict folds, comparing different supervised methods and achieving 54% accuracy. An approach related to a tree structure of classifiers obtained better results than a hierarchical approach. / Peer reviewed Learning systems Protein folding Proteins Trees (mathematics) Benchmark datasets Hierarchical approach Machine learning methods Multi-class classifier Nested dichotomies Protein fold recognition Supervised methods Tree structures
38	LAND COVER/USE CHANGE AND CHANGE PATTERN DETECTION USING RADAR AND OPTICAL IMAGES : AN INSTANCE OF URBAN ENVIRONMENT Bhogendra Mishra 24 September 2014 (has links) Kyoto University / 0048 / New-system course doctorate / Doctor of Engineering / 甲第18556号 / 工博第3917号 / 新制||工||1602 (University Library) / 31456 / Kyoto University Graduate School of Engineering, Department of Civil and Earth Resources Engineering / (Chief examiner) Professor 田村正行, Associate Professor 須﨑純一, Professor 小池克明 / Conforms to Article 4, Paragraph 1 of the Degree Regulations / Doctor of Philosophy (Engineering) / Kyoto University / DFAM Land cover change detection Normalized difference ratio (NDR) Thresholding Region growing algorithm Polarimetric information fusion Automatic multi-class change detection SAR and optical information fusion 500
39	[en] REDUCING TEACHER-STUDENT INTERACTIONS BETWEEN TWO NEURAL NETWORKS / [pt] REDUZINDO AS INTERAÇÕES PROFESSOR-ALUNO ENTRE DUAS REDES NEURAIS GUSTAVO MADEIRA KRIEGER 11 October 2019 (has links) [en] Propagation of knowledge is one of the pillars of human evolution. Our discoveries are all based on preexisting knowledge, build upon it, and then become the foundation for the next generation of learning. In the field of artificial intelligence, there is an interest in replicating this aspect of human nature in machines. By creating a first model and training it on the original data, another model can be created and learn from it instead of having to learn everything from scratch. If this method is proven to be reliable, it will allow many changes in the way that we approach machine learning, especially by allowing different models to work together. 
This relation between models is nicknamed the Teacher-Student relation. This work describes the development of two separate models and their ability to learn using incomplete data and each other. The experiments presented here show the results of this training and the different methods used in the pursuit of an optimal scenario where such a learning process is viable for future use. [en] MACHINE LEARNING [en] KNOWLEDGE DISTILLATION [en] MULTILAYER PERCEPTRON [en] MULTI-CLASS CLASSIFICATION
40	Classifying Portable Electronic Devices using Device Specifications : A Comparison of Machine Learning Techniques Westerholm, Ludvig January 2024 (has links) In this project, we explored the usage of machine learning in classifying portable electronic devices. The primary objective was to identify devices such as laptops, smartphones, and tablets, based on their physical and technical specification. These specifications, sourced from the Pricerunner price comparison website, contain height, Wi-Fi standard, and screen resolution. We aggregated this information into a dataset and split it into a training set and a testing set. To achieve the classification of devices, we trained four popular machine learning models: Random Forest (RF), Logistic Regression (LR), k-Nearest Neighbor (kNN), and Fully Connected Network (FCN). We then compared the performance of these models. The evaluation metrics used to compare performance included precision, recall, F1-score, accuracy, and training time. The RF model achieved the highest overall accuracy of 95.4% on the original dataset. The FCN, applied to a dataset processed with standardization followed by Principal Component Analysis (PCA), reached an accuracy of 92.7%, the best within this specific subset. LR excelled in a few class-specific metrics, while kNN performed notably well relative to its training time. The RF model was the clear winner on the original dataset, while the kNN model was a strong contender on the PCA-processed dataset due to its significantly faster training time compared to the FCN. In conclusion, the RF was the best-performing model on the original dataset, the FCN showed impressive results on the standardized and PCA-processed dataset, and the kNN model, with its highest macro precision and rapid training time, also demonstrated competitive performance. 
Supervised Machine Learning Random Forest Logistic Regression k-Nearest Neighbor Neural Networks Classification Multi-class Classification Device Recognition Computer and Information Sciences
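The evaluation metrics listed in this abstract (precision, recall, F1-score, accuracy, and the macro-averaged variants) can be computed from true and predicted labels as sketched below. The function name and the six example predictions are made up for illustration and are not results from the thesis.

```python
def macro_scores(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        # Macro averaging gives every class the same weight, whatever its size.
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,
    }


# Hypothetical labels, for illustration only.
y_true = ["laptop", "laptop", "phone", "phone", "tablet", "tablet"]
y_pred = ["laptop", "phone", "phone", "phone", "tablet", "laptop"]
scores = macro_scores(y_true, y_pred)
```

Macro averaging is why a model can lead on macro precision without leading on overall accuracy: a small class counts as much as a large one in the average.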