  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

Sparse Multiclass And Multi-Label Classifier Design For Faster Inference

Bapat, Tanuja 12 1900 (has links) (PDF)
Many real-world problems, such as handwritten digit recognition or semantic scene classification, are treated as multiclass or multi-label classification problems. Solutions to these problems using support vector machines (SVMs) are well studied in the literature. In this work, we focus on building sparse max-margin classifiers for multiclass and multi-label classification. A sparse representation of the resulting classifier is important for both efficient training and fast inference, especially when the training and test sets are large. Very few of the existing multiclass and multi-label classification algorithms directly control the sparsity of the designed classifiers. Further, these algorithms were not found to be scalable. Motivated by this, we propose new formulations for sparse multiclass and multi-label classifier design and give efficient algorithms to solve them. The formulation for sparse multi-label classification also incorporates prior knowledge of label correlations. In both cases, the classification model is designed using a common set of basis vectors across all classes. These basis vectors are greedily added to an initially empty model to approximate the target function. The sparsity of the classifier is controlled by a user-defined parameter, $d_{\max}$, the maximum number of common basis vectors. The computational complexity of these algorithms for multiclass and multi-label classifier design is $O(l k^2 d_{\max}^2)$, where $l$ is the number of training examples and $k$ is the number of classes. The inference time for the proposed multiclass and multi-label classifiers is $O(k d_{\max})$. Numerical experiments on various real-world benchmark datasets demonstrate that the proposed algorithms yield sparse classifiers that require fewer basis vectors than state-of-the-art algorithms to attain the same generalization performance. A very small value of $d_{\max}$ results in a significant reduction in inference time. Thus, the proposed algorithms provide useful alternatives to the existing algorithms for sparse multiclass and multi-label classifier design.
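The $O(k d_{\max})$ inference cost quoted above can be illustrated with a minimal sketch. It assumes a kernel expansion over the shared basis vectors and a Gaussian kernel; the kernel choice, the function names and the toy weights are all hypothetical and are not taken from the thesis.

```python
import math

def rbf_kernel(x, b, gamma=0.5):
    """Gaussian (RBF) kernel; the kernel choice is an assumption."""
    sq_dist = sum((xi - bi) ** 2 for xi, bi in zip(x, b))
    return math.exp(-gamma * sq_dist)

def predict(x, basis, weights, gamma=0.5):
    """Return the index of the highest-scoring class.

    basis   : list of d_max basis vectors shared by all classes
    weights : k rows, each holding d_max coefficients for one class
    """
    # d_max kernel evaluations, computed once and reused by every class.
    k_vals = [rbf_kernel(x, b, gamma) for b in basis]
    # k * d_max multiply-adds: one dot product per class.
    scores = [sum(w * kv for w, kv in zip(row, k_vals)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy model with d_max = 2 basis vectors and k = 2 classes (made-up values).
basis = [[0.0, 0.0], [1.0, 1.0]]
weights = [[1.0, -1.0], [-1.0, 1.0]]
label_near_first = predict([0.1, 0.0], basis, weights)
label_near_second = predict([0.9, 1.0], basis, weights)
```

Because the kernel values are shared across classes, adding a class costs only one more dot product of length $d_{\max}$, which is why a small $d_{\max}$ translates directly into faster inference.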
22

Geração automática de laudos médicos para o diagnóstico de epilepsia por meio do processamento de eletroencefalogramas utilizando aprendizado de máquina / Automatic Generation of Medical Reports for Epilepsy Diagnosis through Electroencephalogram Processing using Machine Learning

Oliva, Jefferson Tales 05 December 2018 (has links)
Epilepsy, whose seizures result from temporary electrical disturbances in the brain, is the fourth most common neurological disorder, affecting approximately 50 million people. The disease can be diagnosed by means of electroencephalograms (EEG), which are of great importance for the diagnosis of brain diseases. The information considered relevant in these exams is described in medical reports, which are stored in order to maintain the patient's clinical history and to assist medical experts in future procedures, such as identifying the patterns of particular diseases. However, the growing volume of stored medical data makes manual analysis unfeasible. Another difficulty in EEG analysis is the variability of expert opinions on the same observed pattern, which can make brain diseases harder to diagnose. Moreover, EEG exams may contain relevant patterns that are difficult to notice even for experienced professionals. Similarly, reports may lack information and/or contain typographical errors because experts fill them in hurriedly. Therefore, in this work, the automatic generation of medical report (AutoGenMR) method was developed in order to assist medical experts in the diagnosis of epilepsy and in decision making. The method is applied in two phases: (1) classifier building by machine learning methods and (2) automatic generation of textual reports. AutoGenMR was evaluated experimentally in two case studies, each using a publicly and freely available EEG database. Both evaluations used the same experimental settings for feature extraction and classifier building (apart from the fact that one classification problem is multiclass and the other binary). In the first case study, the predictive models correctly generated, on average, 89% of the report expressions. In the second experiment, on average, 76% of the report sentences were generated correctly. The results of both studies are therefore considered promising, indicating that AutoGenMR can assist medical experts in identifying patterns related to epileptic events, in generating standardized textual reports, and in decision-making processes.
23

Flutuações do choque no processo de Hammersley / Shock fluctuations in Hammersley process

Souza, Marcio Watanabe Alves de 30 September 2013 (has links)
We prove results on the fluctuations of particle fluxes and of tagged particles in the multiclass Hammersley process. The methods of proof are robust and formulated so that they apply to other processes; in particular, all the proofs can be adapted to the multiclass totally asymmetric simple exclusion process (multiclass TASEP) and its respective last passage percolation model. The main theorems obtained are a central limit theorem for the shock, its diffusion coefficient, and an exact formula for the variance of the flux of $N$-th class particles, $N > 1$, in a stationary (equilibrium) version of the multiclass process.
24

Smart task logging : Prediction of tasks for timesheets with machine learning

Bengtsson, Emil, Mattsson, Emil January 2018 (has links)
Every day, most people use applications and services that rely on machine learning in some way, often without knowing it. Examples include Google’s search engine, Netflix’s recommendations, and Spotify’s music tips. Machine learning needs data to work, and often a large amount of it. Roughly 2.5 quintillion bytes of data are created every day in the modern information society. This huge amount of data can be used to make applications and systems smarter and more automated. Time logging systems today are usually not smart, since their users must still enter data manually. This bachelor thesis explores the possibility of applying machine learning to task logging systems to make them smarter and more automated. The machine learning algorithm used to predict the user’s task is multiclass logistic regression, which is categorical. When a small amount of training data was used in the machine learning process, the task predictions had a success rate of about 91%.
25

Flutuações do choque no processo de Hammersley / Shock fluctuations in Hammersley process

Marcio Watanabe Alves de Souza 30 September 2013 (has links)
We prove results on the fluctuations of particle fluxes and of tagged particles in the multiclass Hammersley process. The methods of proof are robust and formulated so that they apply to other processes; in particular, all the proofs can be adapted to the multiclass totally asymmetric simple exclusion process (multiclass TASEP) and its respective last passage percolation model. The main theorems obtained are a central limit theorem for the shock, its diffusion coefficient, and an exact formula for the variance of the flux of $N$-th class particles, $N > 1$, in a stationary (equilibrium) version of the multiclass process.
26

Exploring Deep Learning Frameworks for Multiclass Segmentation of 4D Cardiac Computed Tomography / Utforskning av djupinlärningsmetoder för 4D segmentering av hjärtat från datortomografi

Janurberg, Norman, Luksitch, Christian January 2021 (has links)
By combining computed tomography data with computational fluid dynamics, the cardiac hemodynamics of a patient can be assessed for diagnosis and treatment of cardiac disease. The advantage of computed tomography over other medical imaging modalities is its capability of producing detailed, high-resolution images containing geometric measurements relevant to the simulation of cardiac blood flow. To extract these geometries, 4D cardiac computed tomography (CT) data have been segmented using two deep learning frameworks that combine methods which have previously shown success in other research. The aim of this thesis work was to develop and evaluate a deep-learning-based technique to segment the left ventricle, ascending aorta, left atrium, left atrial appendage and the proximal pulmonary vein inlets. Both frameworks studied use a 2D multi-axis implementation to segment a single CT volume by examining it in three perpendicular planes, while one of them also employs a 3D binary model to extract and crop the foreground from the surrounding background. Both frameworks determine a segmentation prediction by reconstructing three volumes after 2D segmentation in each plane and combining their probabilities in an ensemble for a 3D output. The results show that the two frameworks perform similarly and are both able to segment 3D CT data properly. While the framework that examines 2D slices of full-size volumes produces an overall higher Dice score, it is less successful than the cropping framework at segmenting the smaller left atrial appendage. Since the full-size 2D slices also contain background information in each slice, this is believed to be the main reason for the better segmentation performance, whereas the cropping framework provides a higher proportion of each foreground label, making it easier for the model to identify smaller structures. Both frameworks show success for use in 3D cardiac CT segmentation, and with further research and tuning of each network, even better results can be achieved.
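The ensemble step and the Dice score mentioned in this abstract can be sketched as follows. The abstract does not say how the three per-plane probabilities are combined, so plain averaging is assumed here; the function names and the four-voxel toy data are hypothetical.

```python
def ensemble_labels(prob_axial, prob_coronal, prob_sagittal):
    """Average per-voxel class probabilities from three views, then take the argmax.

    Each argument is a list of voxels; each voxel is a list of class probabilities.
    Averaging is an assumed combination rule, not one confirmed by the thesis.
    """
    labels = []
    for pa, pc, ps in zip(prob_axial, prob_coronal, prob_sagittal):
        mean = [(a + c + s) / 3.0 for a, c, s in zip(pa, pc, ps)]
        labels.append(max(range(len(mean)), key=mean.__getitem__))
    return labels

def dice(pred, truth, label):
    """Dice score 2|P ∩ T| / (|P| + |T|) for a single label."""
    p = [x == label for x in pred]
    t = [x == label for x in truth]
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    total = sum(p) + sum(t)
    # By convention, two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total

# Four voxels, two classes (0 = background, 1 = foreground); values are made up.
axial = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.6, 0.4]]
coronal = [[0.8, 0.2], [0.3, 0.7], [0.6, 0.4], [0.7, 0.3]]
sagittal = [[0.7, 0.3], [0.2, 0.8], [0.1, 0.9], [0.1, 0.9]]
pred = ensemble_labels(axial, coronal, sagittal)
truth = [0, 1, 1, 0]
foreground_dice = dice(pred, truth, label=1)
```

In the last voxel two of the three views favour background, yet the confident third view tips the averaged probability towards foreground; this is one way soft averaging differs from a hard majority vote.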
27

Comparing Weak and Strong Annotation Strategies for Multiple Instance Learning in Digital Pathology / Jämförelse av svaga och starka annoteringsstrategier för flerinstansinlärning i digital patologi

Ciallella, Alice January 2022 (has links)
Prostate cancer is the second most diagnosed cancer worldwide. It is diagnosed through visual inspection of biopsy tissue by a pathologist, who assigns a score that doctors use to decide on the treatment. However, the scoring system, the Gleason score, is affected by high inter- and intra-observer variability, lack of standardization, and overestimation. Therefore, there is a need for new solutions that can reduce these issues and provide a more accurate diagnosis. Nowadays, high-resolution digital images of biopsy tissues can be obtained and stored. The availability of such images, called Whole Slide Images (WSI), allows machine learning and deep learning models to be used to assist pathologists in diagnosing prostate cancer. Multiple-Instance Learning (MIL) has been shown to reach very promising results in digital pathology and in binary classification of prostate cancer slides. However, such models require large datasets to ensure good performance. This project investigates the use of small sets of strongly annotated images to create new large datasets for training a MIL model. To evaluate the performance of this approach, the standard dataset is used to obtain baselines for both binary and multiclass classification tasks. For multiclass classification, the International Society of Urological Pathology (ISUP) score is used, which is derived from the Gleason score. The dataset used is the publicly available PANDA. In this project, only the slides from Radboud University Medical Center are used, which comprise 5160 images. The MIL model chosen is the publicly available Clustering-constrained Attention Multiple instance learning (CLAM) model. The standard approach reaches a Cohen’s kappa (κ) of 0.78 and 0.59 for binary and multiclass classification, respectively. To evaluate the new approach, large datasets are created starting from different set sizes. Using 500 images, the model reaches a κ of 0.72 and 0.38, respectively. While the results of the two approaches are comparable for binary classification, the new approach is not beneficial for multiclass classification tasks.
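For readers unfamiliar with the metric reported above, here is a minimal sketch of unweighted Cohen's kappa, $\kappa = (p_o - p_e) / (1 - p_e)$. The abstract does not state whether a weighted variant was used, so the unweighted form is an assumption, and the labels below are made up.

```python
from collections import Counter

def cohens_kappa(y_true, y_pred):
    """Unweighted Cohen's kappa between two label sequences of equal length."""
    n = len(y_true)
    # Observed agreement: fraction of items on which both labelings agree.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement from the marginal label frequencies of each labeling.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both labelings use a single identical label
    return (observed - expected) / (1.0 - expected)

kappa_partial = cohens_kappa([0, 0, 1, 1], [0, 0, 1, 0])
kappa_perfect = cohens_kappa([0, 1, 2], [0, 1, 2])
```

In the first call three of four labels agree (p_o = 0.75) while chance agreement is 0.5, so kappa is 0.5; correcting for chance is what makes the metric stricter than plain accuracy.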
28

Interpretable Binary and Multiclass Prediction Models for Insolvencies and Credit Ratings

Obermann, Lennart 10 May 2016 (has links)
Insolvency prediction and credit rating are important tasks in the financial industry and serve to assess the creditworthiness of companies. One way to approach this field is machine learning, in which prediction models are built from example data. Methods from this area are advantageous because they can be automated. This makes human expertise unnecessary in most cases and thereby offers a higher degree of objectivity. However, these approaches are not perfect either and therefore cannot fully replace human expertise. They do lend themselves as decision aids and can be used as such by experts, which is why interpretable models are desirable. Unfortunately, only a few learning algorithms yield interpretable models. In addition, some tasks, such as rating, are often multiclass problems. Multiclass classification is frequently achieved through meta-algorithms that train several binary algorithms. Most of the commonly used meta-algorithms, however, eliminate any interpretability that may be present. In this dissertation we investigate the predictive accuracy of interpretable models compared with non-interpretable models for insolvency prediction and ratings. We use disjunctive normal forms and decision trees with thresholds on financial ratios as interpretable models. As non-interpretable models we use random forests, artificial neural networks and support vector machines. In addition, we developed our own learning algorithm, Thresholder, which generates disjunctive normal forms and interpretable multiclass models. For the task of insolvency prediction we show that interpretable models are not inferior to non-interpretable models. To this end, a first case study uses a database employed in practice, containing the annual financial statements of 5152 companies, to measure the predictive accuracy of all the models mentioned above. In a second case study on rating prediction we demonstrate that interpretable models are even superior to non-interpretable ones. The predictive accuracy of all models is determined on three datasets used in practice, each of which has three rating classes. In the case studies we compare different interpretable approaches with respect to their model sizes and the form of interpretability. We present exemplary models based on the respective datasets and offer approaches to interpreting them. Our results show that interpretable, threshold-based models are appropriate for classification problems in the financial industry. In this domain they are not inferior to more complex models such as support vector machines. Our algorithm Thresholder produces the smallest models while its predictive accuracy remains comparable to that of the other interpretable models. In our rating case study the interpretable models deliver markedly better results than in the insolvency prediction case study (see above). A possible explanation for these results is the fact that ratings, unlike insolvencies, are man-made. That is, ratings rest on decisions by people, who think in interpretable rules, e.g. logical combinations of thresholds. We therefore assume that interpretable models fit these problems and recognise and reproduce these interpretable rules.
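To make the notion of an interpretable threshold model concrete, the sketch below evaluates a rule in disjunctive normal form (an OR of ANDs) over financial ratios. It is only an illustration of the model class: the ratio names, thresholds and rule are invented, and nothing here reproduces the Thresholder algorithm or its learned models.

```python
def dnf_predict(ratios, clauses):
    """Evaluate a disjunctive normal form over threshold tests.

    ratios  : dict mapping a ratio name to its value for one company
    clauses : list of conjunctions; each conjunction is a list of
              (ratio_name, operator, threshold) with operator "<=" or ">"
    """
    def holds(name, op, threshold):
        value = ratios[name]
        return value <= threshold if op == "<=" else value > threshold

    # The rule fires if at least one conjunction has all of its tests satisfied.
    return any(all(holds(*literal) for literal in clause) for clause in clauses)

# Hypothetical rule: flag as at risk if
# (equity_ratio <= 0.10 AND liquidity <= 0.50) OR (return_on_assets <= -0.20).
at_risk_rule = [
    [("equity_ratio", "<=", 0.10), ("liquidity", "<=", 0.50)],
    [("return_on_assets", "<=", -0.20)],
]
healthy = {"equity_ratio": 0.35, "liquidity": 1.2, "return_on_assets": 0.05}
distressed = {"equity_ratio": 0.05, "liquidity": 0.3, "return_on_assets": 0.01}
healthy_flag = dnf_predict(healthy, at_risk_rule)
distressed_flag = dnf_predict(distressed, at_risk_rule)
```

A model of this shape can be read directly by a credit analyst, which is the property the abstract contrasts with random forests, neural networks and support vector machines.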
29

Alguns processos relacionados a modelos de fluxo de tráfego / Some processes related with traffic flow models.

Souza, Marcio Watanabe Alves de 20 February 2009 (has links)
In the present work we study some interacting particle systems that can be seen as simple models of traffic flow, namely the Hammersley-Aldous-Diaconis process and the exclusion process. We explore their representations as growth models in the plane. Emphasis is given to cases with more than one kind of particle, to multiclass processes, and to their relations with queueing models. Analogies between the models are used to prove the results. Finally, we give a new proof for the calculation of the rescaled asymptotic variance of the flux of second-class particles in the multiclass Hammersley process in equilibrium.
30

"Investigação de estratégias para a geração de máquinas de vetores de suporte multiclasses" / Investigation of strategies for the generation of multiclass support vector machines

Lorena, Ana Carolina 16 February 2006 (has links)
Several problems involve the classification of data into categories, also called classes. Given a dataset whose classes are known, Machine Learning (ML) algorithms can be employed to induce a classifier able to predict the class of new data from the same domain, thus performing the desired discrimination. Among the several ML techniques applied to classification problems, Support Vector Machines (SVMs) are known for their high generalization ability. They were originally conceived for problems with only two classes, also called binary problems. However, several problems require the discrimination of examples into more than two categories or classes. This thesis investigates and proposes strategies for generalizing SVMs to problems with more than two classes, known as multiclass problems. The focus of this work is on strategies that decompose the original multiclass problem into multiple binary subproblems, whose outputs are then combined to obtain the final classification. The proposed strategies aim to adapt the decompositions to each multiclass application considered, using information about the performance obtained in its solution or extracted from its data. The implemented algorithms were evaluated on general datasets and on real applications from the Bioinformatics domain. The results obtained open several possibilities for future work. Among the benefits observed are simpler decompositions, which require fewer binary classifiers in the multiclass solution.
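The decomposition idea described in this abstract can be illustrated with the classic one-vs-one scheme, which trains one binary classifier per pair of classes and combines their outputs by majority vote. This is a generic sketch, not one of the strategies proposed in the thesis; to stay self-contained it uses a nearest-centroid rule as a stand-in for each binary SVM, and all names and data are hypothetical.

```python
from collections import Counter
from itertools import combinations

def sq_dist(u, v):
    """Squared Euclidean distance between two points."""
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v))

def train_pairwise(points, labels):
    """Build one binary classifier per class pair (one-vs-one).

    Each classifier is a nearest-centroid rule standing in for a binary SVM.
    """
    centroids = {}
    for c in sorted(set(labels)):
        members = [p for p, l in zip(points, labels) if l == c]
        centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return [(a, b, centroids[a], centroids[b])
            for a, b in combinations(sorted(centroids), 2)]

def predict(x, pairwise):
    """Let every binary classifier vote and return the class with most votes."""
    votes = Counter()
    for a, b, centroid_a, centroid_b in pairwise:
        winner = a if sq_dist(x, centroid_a) <= sq_dist(x, centroid_b) else b
        votes[winner] += 1
    return votes.most_common(1)[0][0]

points = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1]]
labels = ["a", "a", "b", "b", "c", "c"]
pairwise = train_pairwise(points, labels)
first = predict([0.2, 0.5], pairwise)
second = predict([9.0, 1.0], pairwise)
```

With k classes this scheme needs k(k-1)/2 binary classifiers, which is the kind of cost that motivates the search for simpler decompositions mentioned at the end of the abstract.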
