  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
741

Accelerating university-industry collaborations with MLOps : A case study about the cooperation of Aimo and the Linnaeus University

Pistor, Nico January 2023 (has links)
Many machine learning models never reach production applications, as several challenges must be solved to develop and deploy ML models. Manual reimplementation and heterogeneous environments increase the effort required to develop an ML model or improve an existing one, considerably slowing down the overall process. Furthermore, a model must be constantly monitored to ensure high-quality predictions and to avoid possible drifts or biases. MLOps processes address these challenges and streamline the development and deployment process by covering the whole life cycle of ML models. Although the research area of MLOps, which applies DevOps principles to ML models, is relatively new, several researchers have already developed abstract MLOps process models. Research on cases with multiple collaboration partners, however, is rare. This research project aims to develop an MLOps process for cases involving multiple collaboration partners. Hence, a single-case study is conducted on the cooperation of Aimo and Linnaeus University (LNU). Aimo requires ML models for their application and collaborates with LNU regarding this demand. LNU develops ML models based on the provided data, which Aimo afterward integrates into their application. This case is analyzed in depth to identify challenges and the current process. These results are required to elaborate a suitable MLOps process for the case, one that also considers the handover of artifacts between the collaboration partners. This process is derived from already existing general MLOps process models. It is also instantiated to provide benefit to the case and to evaluate the feasibility of the MLOps process. Required components are identified, and existing MLOps tools are collected and compared, leading to the selection of suitable tools for the case. A project template is implemented and applied to an ML model project of the case to show feasibility. As a result, this research project provides a concrete MLOps process.
Besides that, several artifacts were elaborated, such as a project template for ML models in which the selected toolset is applied. These results mainly fit the analyzed case. Nevertheless, several findings are also generalizable, such as the identified challenges. The compared alternatives and the general method applied to elaborate an MLOps process can also be transferred to other settings. This is also the case for several artifacts of this project, such as the tool comparison table and the process applied to select suitable tools. This case study shows that it is possible to set up MLOps processes with a high maturity level in situations where multiple cooperation partners are involved and artifacts need to be transferred among them.
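The artifact handover between partners that the process covers can be sketched in code. A minimal, hypothetical illustration (names like `package`, `validate`, and the quality threshold are invented for this sketch, not taken from the thesis): the producing side bundles a model with metadata, and the consuming side checks integrity and an agreed quality gate before integration.

```python
import hashlib
from dataclasses import dataclass

@dataclass
class ModelArtifact:
    """Metadata handed over alongside a trained model file."""
    name: str
    version: str
    metrics: dict          # validation scores agreed on by both partners
    data_snapshot: str     # identifier of the training data used
    checksum: str          # integrity check of the serialized model

def package(name, version, metrics, data_snapshot, model_bytes):
    """Producer side (university): bundle model bytes with metadata."""
    checksum = hashlib.sha256(model_bytes).hexdigest()
    return ModelArtifact(name, version, metrics, data_snapshot, checksum), model_bytes

def validate(artifact, model_bytes, min_accuracy=0.8):
    """Consumer side (company): verify integrity and the agreed quality gate."""
    if hashlib.sha256(model_bytes).hexdigest() != artifact.checksum:
        raise ValueError("artifact corrupted in transfer")
    if artifact.metrics.get("accuracy", 0.0) < min_accuracy:
        raise ValueError("model below agreed quality threshold")
    return True

model_bytes = b"\x00serialized-model"
art, blob = package("demo-model", "1.2.0", {"accuracy": 0.91}, "data-2023-04", model_bytes)
assert validate(art, blob)
print(art.version, art.checksum[:8])
```

In a real MLOps setup this metadata would live in a model registry rather than a dataclass, but the handover contract it expresses is the same.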
742

Fine-Grained Bayesian Zero-Shot Object Recognition

Sarkhan Badirli (11820785) 03 January 2022 (has links)
Building machine learning algorithms to recognize objects in real-world tasks is a very challenging problem. With an increasing number of classes, it becomes very costly and impractical to collect samples for all classes to obtain exhaustive data to train the model. This limited-labeled-data bottleneck is even more pronounced for fine-grained object classes, where some classes may lack any labeled representatives in the training data. A robust algorithm in this realistic scenario must classify samples from well-represented classes as well as handle samples of unknown origin. In this thesis, we break down this difficult task into more manageable sub-problems and methodically explore novel solutions to address each component in sequential order.

We begin with the zero-shot learning (ZSL) scenario, where classes lacking any labeled images in the training data, i.e., unseen classes, are assumed to have some semantic descriptions associated with them. The ZSL paradigm is motivated by analogy to humans' learning process: we can recognize new categories by just knowing some semantic descriptions of them, without ever seeing any instances from these categories. We develop a novel hierarchical Bayesian classifier for the ZSL task. The two-layer architecture of the model is specifically designed to exploit the implicit hierarchy present among classes, particularly evident in fine-grained datasets. In the proposed method, latent classes define the class hierarchy in the image space, and semantic information is used to build the Bayesian hierarchy around these meta-classes. Our Bayesian model imposes local priors on semantically similar classes that share the same meta-class to realize knowledge transfer. We finally derive posterior predictive distributions to reconcile information about local and global priors and then blend them with the data likelihood for the final likelihood calculation. With its closed-form solution, our two-layer hierarchical classifier proves fast to train and flexible enough to model both fine- and coarse-grained datasets. In particular, for challenging fine-grained datasets the proposed model can leverage the large number of seen classes to its advantage for better local prior estimation without sacrificing seen-class accuracy.

Side information plays a critical role in ZSL, and ZSL models rest on a strong assumption that the side information is strongly correlated with image features. Our model uses side information only to build the hierarchy; thus, no explicit correlation with image features is assumed. This in turn makes the Bayesian model very resilient to various side information sources, as long as they are discriminative enough to define the class hierarchy.

When dealing with thousands of classes, it becomes very difficult to obtain semantic descriptions for fine-grained classes. For example, in species classification, where classes display very similar morphological traits, it is impractical if not impossible to derive characteristic visual attributes that can distinguish thousands of classes. Moreover, it would be unrealistic to assume that an exhaustive list of visual attributes characterizing all object classes, both seen and unseen, can be determined based only on seen classes. We propose DNA as side information to overcome this obstacle and perform fine-grained zero-shot species classification. We demonstrate that 658-base-pair DNA barcodes can serve as a robust source of side information for a newly compiled insect dataset with more than a thousand classes. The results are further validated on the well-known CUB dataset, on which DNA attributes prove to be as competitive as word vectors. Our proposed Bayesian classifier delivers state-of-the-art results on both datasets while using DNA as side information.

The traditional ZSL framework, however, is not quite suitable for scalable species identification and discovery. For example, insects are one of the largest groups of the animal kingdom, with an estimated 5.5 million species, yet only 20% of them are described. We extend traditional ZSL into a more practical framework in which no explicit side information is available for unseen classes. We transform our Bayesian model to utilize the taxonomical hierarchy of species to perform insect identification at scale. Our approach is the first to combine two different data modalities, namely image and DNA information, to perform insect identification with more than a thousand classes. Our algorithm not only classifies known species with an impressive 97% accuracy but also identifies unknown species and classifies them to their true genus with 81% accuracy.

Our approach has the ability to address major societal issues related to climate change, such as changing insect distributions, and to help measure biodiversity across the world. We believe this work can pave the way for more precise and, more importantly, scalable monitoring of biodiversity, and can become instrumental in offering objective measures of the impacts of the recent changes our planet has been going through.
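The meta-class prior idea can be illustrated with a toy sketch (not the thesis's actual Bayesian model; a nearest-mean rule stands in for the posterior predictive, and all numbers are invented): seen classes are grouped into meta-classes, and an unseen class inherits its estimated mean from the meta-class its side information assigns it to.

```python
import numpy as np

# Toy setup: 4 seen classes with image-feature means; classes 0/1 and 2/3
# form two meta-classes (a hierarchy the side information would define).
seen_means = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 10.0], [11.0, 10.0]])
meta_of_seen = np.array([0, 0, 1, 1])

# Local prior of each meta-class = mean of its member classes.
meta_means = np.stack([seen_means[meta_of_seen == m].mean(axis=0) for m in (0, 1)])

# An unseen class is placed under meta-class 1 by its side information, so
# its class mean is estimated from that local prior (knowledge transfer).
unseen_mean = meta_means[1]

def classify(x):
    """Nearest-mean stand-in for the posterior predictive rule; index 4 is unseen."""
    means = np.vstack([seen_means, unseen_mean])
    return int(np.argmin(np.linalg.norm(means - x, axis=1)))

print(classify(np.array([0.1, 0.0])))    # → 0 (a seen class)
print(classify(np.array([10.5, 10.0])))  # → 4 (the unseen class)
```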
743

The automatic recognition of emotions in speech

Manamela, Phuti, John January 2020 (has links)
Thesis (M.Sc. (Computer Science)) -- University of Limpopo, 2020 / Speech emotion recognition (SER) refers to technology that enables machines to detect and recognise human emotions from spoken phrases. In the literature, numerous attempts have been made to develop systems that can recognise human emotions from the voice; however, not much work has been done in the context of South African indigenous languages. The aim of this study was to develop an SER system that can classify and recognise six basic human emotions (i.e., sadness, fear, anger, disgust, happiness, and neutral) from speech spoken in the Sepedi language (one of South Africa's official languages). One of the major challenges encountered in this study was the lack of a proper corpus of emotional speech. Therefore, three different Sepedi emotional speech corpora consisting of acted speech data were developed. These include a Recorded-Sepedi corpus collected from recruited native speakers (9 participants), a TV-broadcast corpus collected from professional Sepedi actors, and an Extended-Sepedi corpus, which is a combination of the Recorded-Sepedi and TV-broadcast emotional speech corpora. Features were extracted from the speech corpora and a data file was constructed. This file was used to train four machine learning (ML) algorithms (i.e., SVM, KNN, MLP, and Auto-WEKA) using 10-fold cross-validation. Three experiments were then performed on the developed speech corpora and the performance of the algorithms was compared. The best results were achieved when Auto-WEKA was applied in all the experiments. We might have expected good results for the TV-broadcast speech corpus, since it was collected from professional actors; however, the results showed otherwise. From the findings of this study, one can conclude that there are no precise or exact techniques for the development of SER systems; it is a matter of experimenting and finding the best technique for the study at hand.
The study has also highlighted the scarcity of SER resources for South African indigenous languages. The quality of the dataset plays a vital role in the performance of SER systems. / National Research Foundation (NRF) and Telkom Center of Excellence (CoE)
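The evaluation protocol described above can be sketched as follows, with a nearest-centroid classifier standing in for the SVM/KNN/MLP/Auto-WEKA models and synthetic two-dimensional features standing in for the extracted speech features (both are assumptions of this sketch, not details from the study):

```python
import random

def ten_fold_indices(n, seed=0):
    """Split n sample indices into 10 folds, as in the evaluation above."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::10] for i in range(10)]

def nearest_centroid(train_X, train_y, x):
    """Stand-in classifier: predict the label of the closest class centroid."""
    centroids = {}
    for label in set(train_y):
        rows = [train_X[i] for i in range(len(train_y)) if train_y[i] == label]
        centroids[label] = [sum(c) / len(rows) for c in zip(*rows)]
    return min(centroids, key=lambda l: sum((a - b) ** 2 for a, b in zip(centroids[l], x)))

def cross_validate(X, y, classify):
    """10-fold accuracy for any classifier function."""
    folds, correct = ten_fold_indices(len(X)), 0
    for f, test in enumerate(folds):
        train = [i for g, fold in enumerate(folds) if g != f for i in fold]
        tX, ty = [X[i] for i in train], [y[i] for i in train]
        correct += sum(classify(tX, ty, X[i]) == y[i] for i in test)
    return correct / len(X)

# Toy "acoustic features" for two easily separable emotions.
rng = random.Random(0)
X = [[rng.uniform(0.1, 0.3), 0.2] for _ in range(40)] + \
    [[rng.uniform(0.8, 1.0), 0.9] for _ in range(40)]
y = ["sad"] * 40 + ["angry"] * 40
print(round(cross_validate(X, y, nearest_centroid), 2))  # → 1.0 on this toy data
```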
744

Using Instance-Level Meta-Information to Facilitate a More Principled Approach to Machine Learning

Smith, Michael Reed 01 April 2015 (has links) (PDF)
As the capability for capturing and storing data increases and becomes more ubiquitous, an increasing number of organizations are looking to use machine learning techniques as a means of understanding and leveraging their data. However, the success of applying machine learning techniques depends on the learning algorithm selected, the hyperparameters provided to it, and the data supplied to it. Even among machine learning experts, selecting an appropriate learning algorithm, setting its associated hyperparameters, and preprocessing the data can be challenging tasks, and are generally left to the expertise of an experienced practitioner, intuition, trial and error, or another heuristic approach. This dissertation proposes a more principled approach to understanding how the learning algorithm, the hyperparameters, and the data interact with each other, to facilitate a data-driven approach to applying machine learning techniques. Specifically, this dissertation examines the properties of the training data and proposes techniques to integrate this information into the learning process and into preprocessing the training set. It also proposes techniques and tools to address selecting a learning algorithm and setting its hyperparameters. This dissertation is comprised of a collection of papers that address understanding the data used in machine learning and the relationship between the data, the performance of a learning algorithm, and the learning algorithm's associated hyperparameter settings. Contributions of this dissertation include:
* Instance hardness, which examines how difficult an instance is to classify correctly.
* Hardness measures that characterize properties of why an instance may be misclassified.
* Several techniques for integrating instance hardness into the learning process. These techniques demonstrate the importance of considering each instance individually, rather than performing a global optimization that considers all instances equally.
* Large-scale examinations of the investigated techniques, covering a large number of data sets and learning algorithms. This provides more robust results that are less likely to be affected by noise.
* The Machine Learning Results Repository, a repository for storing the results of machine learning experiments at the instance level (the prediction for each instance is stored). This allows many data-set-level measures to be calculated, such as accuracy, precision, or recall. These results can be used to better understand the interaction between the data, the learning algorithms, and the associated hyperparameters. Further, the repository is designed to be a tool for the community, where data can be downloaded and uploaded to follow the development of machine learning algorithms and applications.
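A minimal sketch of the instance-hardness idea (not the dissertation's exact formulation): estimate hardness as the fraction of a small pool of classifiers, here k-NN with several values of k, that misclassify an instance when it is held out.

```python
from collections import Counter

def knn_predict(X, y, x, k):
    """k-NN by squared Euclidean distance."""
    order = sorted(range(len(X)), key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], x)))
    return Counter(y[i] for i in order[:k]).most_common(1)[0][0]

def instance_hardness(X, y, ks=(1, 3, 5)):
    """Fraction of the classifier pool that misclassifies each held-out instance."""
    hardness = []
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        misses = sum(knn_predict(rest_X, rest_y, X[i], k) != y[i] for k in ks)
        hardness.append(misses / len(ks))
    return hardness

# Two clean clusters plus one mislabeled point, which should be hardest.
X = [[0.0], [0.1], [0.2], [1.0], [1.1], [0.15]]
y = ["a", "a", "a", "b", "b", "b"]   # last point sits inside cluster "a"
h = instance_hardness(X, y)
print(h.index(max(h)))  # → 5 (the mislabeled point)
```

Such per-instance scores are what make it possible to filter or reweight hard instances during training, rather than treating all instances equally.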
745

Using XAI Tools to Detect Harmful Bias in ML Models

Virtanen, Klaus January 2022 (has links)
In the past decade, machine learning (ML) models have become far more powerful and are increasingly being used in many important contexts. At the same time, ML models have become more complex and harder to understand on their own, which has motivated interest in explainable AI (XAI), a field concerned with ensuring that ML and other AI systems can be understood by human users and practitioners. One aspect of XAI is the development of "explainers", tools that take a more complex system (here: an ML model) and generate a simpler but sufficiently accurate model of this system, either globally or locally, to yield insight into the behaviour of the original system. As ML models have become more complex and prevalent, concerns that they may embody and perpetuate harmful social biases have also risen, with XAI being one proposed tool for bias detection. This paper investigates the ability of two explainers, LIME and SHAP, which explain the predictions of potentially complex models by way of locally faithful linear models, to detect harmful social bias (here in the form of the influence of the racial makeup of a neighbourhood on property values) in a simple experiment involving two kinds of ML models, linear regression and an ensemble method, trained on the well-known Boston housing dataset. The results show that LIME and SHAP appear to be helpful in bias detection, while also revealing an instance where the explanations do not quite reflect the workings of the model, while still yielding accurate insight into the predictions the model makes.
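The local-surrogate idea behind LIME can be sketched in a few lines (this is not the LIME library itself, and the "price model" with its injected bias on feature 1 is invented for illustration): sample around an instance, weight the samples by proximity, and fit a weighted linear model whose coefficients serve as the explanation.

```python
import numpy as np

rng = np.random.default_rng(0)

def black_box(X):
    """Toy 'price model' with a deliberate bias on feature 1,
    which plays the role of the sensitive neighbourhood attribute."""
    return 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5

def local_linear_explanation(f, x0, n=500, width=0.5):
    """LIME-style sketch: perturb around x0, weight by proximity, and fit a
    weighted linear model whose coefficients 'explain' f near x0."""
    X = x0 + rng.normal(scale=width, size=(n, x0.size))
    w = np.exp(-np.sum((X - x0) ** 2, axis=1) / (2 * width ** 2))
    A = np.hstack([X, np.ones((n, 1))])            # add intercept column
    W = np.diag(w)
    coef, *_ = np.linalg.lstsq(A.T @ W @ A, A.T @ W @ f(X), rcond=None)
    return coef[:-1]                               # per-feature attributions

x0 = np.array([0.3, 0.7])
attr = local_linear_explanation(black_box, x0)
print(np.round(attr, 2))  # recovers the injected bias on feature 1
```

A large negative attribution on the sensitive feature is exactly the kind of signal the thesis uses LIME and SHAP to look for.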
746

Clustering SQL-queries using unsupervised machine learning

Schmidt, Thomas January 2022 (has links)
Decerno has created a business system that utilizes Microsoft's Entity Framework (EF), an object-database mapper that can automatically generate SQL queries from code written in C#. Some of these queries have started to display a significant increase in query response time, which requires further examination. The generated queries can vary in length from 3 to around 2,500 tokens, which makes it difficult to get an overview of which types of queries are consistently slow. This thesis examines the possibility of using neural networks based on the transformer model in conjunction with an autoencoder to create feature-rich embeddings from the SQL queries. The networks presented in this thesis are tasked with capturing the semantics of the SQL queries such that semantically similar queries are mapped close to one another in the latent feature space. To investigate the impact of embedding dimension, several transformer-based networks are constructed that calculate embeddings of varying dimensionality. The dimensionality-reduction algorithm UMAP is applied to the higher-dimensional embeddings to enable the clustering algorithm DBSCAN to be applied successfully. The results show that unsupervised machine learning can be used to create feature-rich embeddings from SQL queries, but that higher-dimensional embeddings are required, as the models that encoded the SQL queries into embeddings of 5 dimensions or fewer did not yield satisfactory results. Thus, some sort of dimensionality-reduction algorithm is required when using the method proposed in this thesis. Furthermore, the results did not indicate any correlation between semantic similarity and average response times.
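As a much simpler baseline than the learned transformer embeddings used in the thesis, queries can be grouped by token overlap; this sketch (the tokenizer and threshold are assumptions of the sketch) shows the kind of structural similarity the embeddings are meant to capture semantically.

```python
import re

def tokens(sql):
    """Crude SQL tokenizer; the thesis uses learned embeddings instead."""
    return set(re.findall(r"[A-Za-z_]+|\d+|[^\sA-Za-z_\d]", sql.upper()))

def jaccard(a, b):
    return len(a & b) / len(a | b)

def cluster(queries, threshold=0.6):
    """Greedy single-link grouping: a query joins the first cluster whose
    representative is similar enough, else starts a new one."""
    clusters = []
    for q in queries:
        t = tokens(q)
        for c in clusters:
            if jaccard(t, tokens(c[0])) >= threshold:
                c.append(q)
                break
        else:
            clusters.append([q])
    return clusters

queries = [
    "SELECT id, name FROM users WHERE id = 1",
    "SELECT id, name FROM users WHERE id = 42",
    "DELETE FROM sessions WHERE expires < 100",
]
print(len(cluster(queries)))  # → 2
```

Token overlap treats literals and structure alike, which is exactly the limitation that motivates learning semantic embeddings in the thesis.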
747

Unsupervised machine learning for detecting bots in online competitions

Saari, Lukas, Mårtensson, Emil January 2016 (has links)
Digital marketing is a fast-growing market, and its actors are constantly looking for innovative and new ways of marketing. In this paper, an actor on this market called Adoveo is studied. Their specialization and value proposition is to include a competition element in their advertisement campaigns, giving participants the possibility to win a prize. What can turn out to be problematic is that the prizes are not always awarded to human contestants, instead going to bots that can participate in the competitions unreasonably many times and with unreasonably good results.
The purpose of this paper is to try to separate bots from human contestants using the data provided by Adoveo. To that end, two unsupervised machine learning algorithms were implemented to cluster the data points: Gaussian Mixture Model and K-Means. The result was an uninterpretable cluster structure from which no reliable identification of bot-like and human-like behaviour could be made. The reason behind this was twofold: the design of the competitions and a lack of decisive attributes in the data. Recommendations were provided for how both of these issues could be rectified. Finally, an analysis was provided of the business value of bot-securing competitions and the value it gives to the company. The analysis showed that the business value of bot-securing the competitions would be substantial, because it would give a competitive advantage against competitors and also improve business with advertisers and consumers.
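The K-Means side of the experiment can be sketched as follows (a deterministic toy with invented participant features; the real data lacked attributes this clean, which is precisely why the clustering was inconclusive):

```python
import random

def kmeans2(points, iters=20):
    """Two-cluster k-means with deterministic extreme-point initialization;
    the thesis also tried a Gaussian Mixture Model."""
    centers = [list(min(points)), list(max(points))]
    for _ in range(iters):
        groups = [[], []]
        for p in points:
            d = [sum((a - b) ** 2 for a, b in zip(c, p)) for c in centers]
            groups[d.index(min(d))].append(p)
        centers = [[sum(v) / len(g) for v in zip(*g)] if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers, groups

# Toy features per participant: (entries submitted, best score).
rng = random.Random(1)
humans = [(rng.uniform(1, 5), rng.uniform(0.2, 0.6)) for _ in range(30)]
bots = [(rng.uniform(50, 80), rng.uniform(0.9, 1.0)) for _ in range(5)]
centers, groups = kmeans2(humans + bots)
print(sorted(len(g) for g in groups))  # → [5, 30]
```

With features this decisive, the bot cluster separates cleanly; the paper's recommendation to log more decisive attributes aims at making the real data look more like this toy.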
748

Efficient Machine Teaching Frameworks for Natural Language Processing

Karamanolakis, Ioannis January 2022 (has links)
The past decade has seen tremendous growth in potential applications of language technologies in our daily lives due to increasing data, computational resources, and user interfaces. An important step to support emerging applications is the development of algorithms for processing the rich variety of human-generated text and extracting relevant information. Machine learning, especially deep learning, has seen increasing success on various text benchmarks. However, while standard benchmarks have static tasks with expensive human-labeled data, real-world applications are characterized by dynamic task specifications and limited resources for data labeling, thus making it challenging to transfer the success of supervised machine learning to the real world. To deploy language technologies at scale, it is crucial to develop alternative techniques for teaching machines beyond data labeling. In this dissertation, we address this data labeling bottleneck by studying and presenting resource-efficient frameworks for teaching machine learning models to solve language tasks across diverse domains and languages. Our goal is to (i) support emerging real-world problems without the expensive requirement of large-scale manual data labeling; and (ii) assist humans in teaching machines via more flexible types of interaction. Towards this goal, we describe our collaborations with experts across domains (including public health, earth sciences, news, and e-commerce) to integrate weakly-supervised neural networks into operational systems, and we present efficient machine teaching frameworks that leverage flexible forms of declarative knowledge as supervision: coarse labels, large hierarchical taxonomies, seed words, bilingual word translations, and general labeling rules. 
First, we present two neural network architectures that we designed to leverage weak supervision in the form of coarse labels and hierarchical taxonomies, respectively, and highlight their successful integration into operational systems. Our Hierarchical Sigmoid Attention Network (HSAN) learns to highlight important sentences of potentially long documents without sentence-level supervision by instead using coarse-grained supervision at the document level. HSAN improves over previous weakly supervised learning approaches across sentiment classification benchmarks and has been deployed to help inspections in health departments for the discovery of foodborne illness outbreaks. We also present TXtract, a neural network that extracts attributes for e-commerce products from thousands of diverse categories without using manually labeled data for each category, by instead considering category relationships in a hierarchical taxonomy. TXtract is a core component of Amazon’s AutoKnow, a system that collects knowledge facts for over 10K product categories, and serves such information to Amazon search and product detail pages. Second, we present architecture-agnostic machine teaching frameworks that we applied across domains, languages, and tasks. Our weakly-supervised co-training framework can train any type of text classifier using just a small number of class-indicative seed words and unlabeled data. In contrast to previous work that uses seed words to initialize embedding layers, our iterative seed word distillation (ISWD) method leverages the predictive power of seed words as supervision signals and shows strong performance improvements for aspect detection in reviews across domains and languages.
We further demonstrate the cross-lingual transfer abilities of our co-training approach via cross-lingual teacher-student (CLTS), a method for training document classifiers across diverse languages using labeled documents only in English and a limited budget for bilingual translations. Not all classification tasks, however, can be effectively addressed using human supervision in the form of seed words. To capture a broader variety of tasks, we present weakly-supervised self-training (ASTRA), a weakly-supervised learning framework for training a classifier using more general labeling rules in addition to labeled and unlabeled data. As a complete set of accurate rules may be hard to obtain all in one shot, we further present an interactive framework that assists human annotators by automatically suggesting candidate labeling rules. In conclusion, this thesis demonstrates the benefits of teaching machines with different types of interaction than the standard data labeling paradigm and shows promising results for new applications across domains and languages. To facilitate future research, we publish our code implementations and design new challenging benchmarks with various types of supervision. We believe that our proposed frameworks and experimental findings will influence research and will enable new applications of language technologies without the costly requirement of large manually labeled datasets.
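The seed-word teacher-student idea can be sketched in miniature (an illustrative toy in the spirit of the framework, not the ISWD implementation; the seed sets and documents are invented): a teacher labels documents by seed-word matching, and a student trained on those pseudo-labels learns to use non-seed words as well.

```python
from collections import Counter

SEEDS = {"sports": {"game", "team"}, "finance": {"stock", "bank"}}

def teacher(doc):
    """Label a document by seed-word matching; abstain when no seed appears."""
    hits = {c: len(SEEDS[c] & set(doc.split())) for c in SEEDS}
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else None

def train_student(docs):
    """Student: per-class word counts (a tiny naive-Bayes-like model) fit on
    the teacher's pseudo-labels, so it can exploit non-seed words too."""
    counts = {c: Counter() for c in SEEDS}
    for d in docs:
        label = teacher(d)
        if label:
            counts[label].update(d.split())
    return counts

def student_predict(counts, doc):
    def score(c):
        total = sum(counts[c].values())
        return sum((counts[c][w] + 1) / (total + 1) for w in doc.split())
    return max(SEEDS, key=score)

docs = ["the team won the game", "game tonight with the team",
        "stock prices fell", "the bank raised rates on every stock"]
model = train_student(docs)
print(student_predict(model, "prices and rates fell"))  # → finance
```

The teacher abstains on "prices and rates fell" since no seed word appears, yet the student classifies it correctly from words that co-occurred with seeds, which is the generalization the co-training framework relies on.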
749

Accelerating Structural Design and Optimization using Machine Learning

Singh, Karanpreet 13 January 2020 (has links)
Machine learning techniques promise to greatly accelerate structural design and optimization. In this thesis, deep learning and active learning techniques are applied to different non-convex structural optimization problems. Standard optimization methods based on Finite Element Analysis (FEA) for aircraft panels with bio-inspired curvilinear stiffeners are computationally expensive. The main reason for employing many of these standard optimization methods is the ease of their integration with FEA. However, each optimization requires multiple computationally expensive FEA evaluations, making their use impractical at times. To accelerate optimization, the use of Deep Neural Networks (DNNs) is proposed to approximate the FEA buckling response. The results show that DNNs obtained an accuracy of 95% for evaluating the buckling load, and the DNN accelerated the optimization by a factor of nearly 200. The presented work demonstrates the potential of DNN-based machine learning algorithms for accelerating the optimization of bio-inspired curvilinearly stiffened panels. However, the approach has drawbacks: it is specific to similar structural design problems and requires large datasets for DNN training. An adaptive machine learning technique called active learning is used in this thesis to accelerate the evolutionary optimization of complex structures. The active learner helps the Genetic Algorithm (GA) by predicting whether a candidate design will satisfy the required constraints. The approach does not need a surrogate model trained prior to the optimization; instead, the active learner adaptively improves its own accuracy during the optimization, saving FEA evaluations. The results show that the approach has the potential to reduce the total number of required FEA evaluations by more than 50%. Lastly, machine learning is used to make recommendations for modeling choices when analyzing a structure with FEA.
Decisions about the selection of appropriate modeling techniques are usually based on an analyst's judgement, drawing on knowledge and intuition from past experience. The machine learning-based approach provides recommendations within seconds, thus saving significant computational resources while making accurate design choices. / Doctor of Philosophy / This thesis presents an innovative application of artificial intelligence (AI) techniques for designing aircraft structures. An important objective for the aerospace industry is to design robust and fuel-efficient aerospace structures. State-of-the-art research in the literature suggests that the structure of future aircraft could mimic organic cellular structures. However, the design of these new panels with arbitrary structures is computationally expensive. For instance, applying the standard optimization methods currently used for aerospace structures to design an aircraft can take anywhere from a few days to months. The presented research demonstrates the potential of AI for accelerating the optimization of aircraft structures. This will provide an efficient way for aircraft designers to design futuristic fuel-efficient aircraft, which will have a positive impact on the environment and the world.
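The active-learning idea can be sketched as follows (a toy with a geometric stand-in for the FEA feasibility check; the trust-radius rule is an invented simplification of the learner's uncertainty estimate): the surrogate only pays for the expensive evaluation when it is unsure, and it learns from every evaluation it does pay for.

```python
import random

def expensive_feasibility(x):
    """Stand-in for an FEA buckling check: feasible if inside a disc."""
    return x[0] ** 2 + x[1] ** 2 <= 1.0

class ActiveSurrogate:
    """1-NN surrogate that only calls the expensive check when its nearest
    labeled point is too far away to trust, then learns from the result."""
    def __init__(self, trust_radius=0.15):
        self.memory = []               # (point, feasible) pairs
        self.trust_radius = trust_radius
        self.expensive_calls = 0

    def feasible(self, x):
        if self.memory:
            p, label = min(self.memory,
                           key=lambda m: (m[0][0] - x[0]) ** 2 + (m[0][1] - x[1]) ** 2)
            if (p[0] - x[0]) ** 2 + (p[1] - x[1]) ** 2 <= self.trust_radius ** 2:
                return label           # confident: skip the expensive evaluation
        label = expensive_feasibility(x)
        self.expensive_calls += 1
        self.memory.append((x, label))
        return label

rng = random.Random(0)
surrogate = ActiveSurrogate()
candidates = [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(500)]
verdicts = [surrogate.feasible(c) for c in candidates]
saved = 1 - surrogate.expensive_calls / len(candidates)
print(f"expensive evaluations saved: {saved:.0%}")
```

A GA would call `surrogate.feasible` for each candidate design; the fraction of evaluations saved grows as the surrogate's memory covers more of the design space.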
750

A General Framework for Model Adaptation to Meet Practical Constraints in Computer Vision

Huang, Shiyuan January 2024 (has links)
Recent advances in deep learning models have shown impressive capabilities in various computer vision tasks, which encourages the integration of these models into real-world vision systems such as smart devices. This integration presents new challenges as models need to meet complex real-world requirements. This thesis is dedicated to building practical deep learning models, where we focus on two main challenges in vision systems: data efficiency and variability. We address these issues by providing a general model adaptation framework that extends models with practical capabilities. In the first part of the thesis, we explore model adaptation approaches for efficient representation. We illustrate the benefits of different types of efficient data representations, including compressed video modalities from video codecs, low-bit features, and sparsified frames and texts. By using such efficient representations, system complexity such as data storage, processing, and computation can be greatly reduced. We systematically study various methods to extract, learn, and utilize these representations, presenting new methods to adapt machine learning models for them. The proposed methods include a compressed-domain video recognition model with a coarse-to-fine distillation training strategy, a task-specific feature compression framework for low-bit video-and-language understanding, and a learnable token sparsification approach for sparsifying human-interpretable video inputs. We demonstrate new perspectives on representing vision data in a more practical and efficient way in various applications. The second part of the thesis focuses on open environment challenges, where we explore model adaptation for new, unseen classes and domains. We examine the practical limitations in current recognition models, and introduce various methods to empower models in addressing open recognition scenarios.
This includes a negative envisioning framework for managing new classes and outliers, and a multi-domain translation approach for dealing with unseen domain data. Our study shows a promising trajectory towards models exhibiting the capability to navigate through diverse data environments in real-world applications.
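The low-bit feature idea can be illustrated with a uniform quantization sketch (an assumption-laden toy, not the thesis's task-specific compression framework): floats are mapped to a small number of integer levels, trading a bounded reconstruction error for a large reduction in storage.

```python
import numpy as np

def quantize(features, bits=4):
    """Uniform low-bit quantization: map floats to 2**bits integer levels.
    Codes are stored in uint8 here; real systems pack two 4-bit codes per byte."""
    lo, hi = features.min(), features.max()
    levels = 2 ** bits - 1
    codes = np.round((features - lo) / (hi - lo) * levels).astype(np.uint8)
    return codes, lo, hi

def dequantize(codes, lo, hi, bits=4):
    levels = 2 ** bits - 1
    return codes.astype(np.float32) / levels * (hi - lo) + lo

rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16)).astype(np.float32)   # toy frame features
codes, lo, hi = quantize(feats)
recon = dequantize(codes, lo, hi)
err = float(np.abs(recon - feats).max())
ratio = 32 / 4                                        # float32 bits vs 4-bit codes
print(f"{ratio:.0f}x smaller, max abs error {err:.3f}")
```

Task-specific compression, as in the thesis, goes further by learning which information the downstream task actually needs, rather than quantizing all dimensions uniformly.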
