221

Investigating a Supervised Learning and IMU Fusion Approach for Enhancing Bluetooth Anchors / Att förbättra Bluetooth-ankare med hjälp av övervakad inlärning och IMU

Mahrous, Wael, Joseph, Adam January 2024 (has links)
Modern indoor positioning systems encounter challenges inherent to indoor environments. Signal changes can stem from various factors such as object movement, signal propagation, or an obstructed line of sight. This thesis explores a supervised machine learning approach that integrates Bluetooth Low Energy (BLE) and inertial sensor data to achieve consistent angle and distance estimations. The method relies on BLE angle estimations and signal strength alongside additional sensor data from an Inertial Measurement Unit (IMU). Relevant features are extracted, and a supervised learning model is trained and then validated on tests in familiar environments. The model is then gradually introduced to more unfamiliar test environments, and its performance is evaluated and compared accordingly. This thesis project was conducted at the u-blox office and presents a comprehensive methodology utilizing their existing hardware. Several extensive experiments were conducted, refining both data collection procedures and experimental setups. This iterative approach facilitated the improvement of the supervised learning model, resulting in a proposed model architecture based on transformers and convolutional layers. The provided methodology encompasses the entire process, from data collection to the evaluation of the proposed supervised learning model, enabling direct comparisons with existing angle estimation solutions employed at u-blox. The results of these comparisons demonstrate more accurate outcomes than existing solutions when validated in familiar environments. However, performance gradually declines when the model is introduced to a new environment and encounters a wider range of signal conditions than it was trained on. Distance estimates are then compared with those of the path-loss propagation equation, which gives distance as a function of measured signal strength, showing an overall improvement.
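The path-loss baseline mentioned above converts a measured signal strength into a distance estimate. A minimal sketch of the standard log-distance path-loss model follows; the reference power and path-loss exponent are hypothetical calibration values, not the thesis's actual parameters:

```python
import numpy as np

def pathloss_distance(rssi_dbm, ref_power_dbm=-59.0, n=2.0):
    """Distance (m) from RSSI via the log-distance path-loss model.

    ref_power_dbm: RSSI measured at a 1 m reference distance (hypothetical value).
    n: path-loss exponent (about 2 in free space, larger indoors).
    """
    return 10.0 ** ((ref_power_dbm - rssi_dbm) / (10.0 * n))

print(pathloss_distance(np.array([-59.0, -69.0, -79.0])))  # ~1 m, ~3.2 m, ~10 m
```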
222

Leveraging Infrared Imaging with Machine Learning for Phenotypic Profiling

Liu, Xinwen January 2024 (has links)
Phenotypic profiling systematically maps and analyzes observable traits (phenotypes) exhibited by cells, tissues, organisms or systems in response to various conditions, including chemical, genetic and disease perturbations. This approach seeks to comprehensively understand the functional consequences of perturbations on biological systems, thereby informing diverse research areas such as drug discovery, disease modeling, functional genomics and systems biology. Corresponding techniques should capture high-dimensional features to distinguish phenotypes affected by different conditions. Current methods mainly include fluorescence imaging, mass spectrometry and omics technologies, coupled with computational analysis, to quantify diverse features such as morphology, metabolism and gene expression in response to perturbations. Yet they face challenges of high costs, complicated operation and strong batch effects. Vibrational imaging offers an alternative for phenotypic profiling, providing a sensitive, cost-effective and easily operated approach to capture the biochemical fingerprint of phenotypes. Among vibrational imaging techniques, infrared (IR) imaging has the further advantages of high throughput, fast imaging speed and full spectral coverage compared with Raman imaging. However, current biomedical applications of IR imaging mainly concentrate on "digital disease pathology", which uses label-free IR imaging with machine learning for tissue pathology classification and disease diagnosis. This thesis contributes the first comprehensive study of using IR imaging for phenotypic profiling, focusing on three key areas. First, IR-active vibrational probes are systematically designed to enhance metabolic specificity, thereby enriching measured features and improving sensitivity and specificity for phenotype discrimination. Second, experimental workflows are established for phenotypic profiling using IR imaging across biological samples at various levels, including cellular, tissue and organ, in response to drug and disease perturbations. Lastly, complete data analysis pipelines are developed, including data preprocessing, statistical analysis and machine learning methods, with additional algorithmic developments for analyzing and mapping phenotypes. Chapter 1 lays the groundwork by delving into the theory of IR spectroscopy and the instrumentation of IR imaging, establishing a foundation for subsequent studies. Chapter 2 discusses the principles of popular machine learning methods applied in IR imaging, including supervised learning, unsupervised learning and deep learning, providing the algorithmic backbone for later chapters. Additionally, it provides an overview of existing biomedical applications using label-free IR imaging combined with machine learning, facilitating a deeper understanding of the current research landscape and the focal points of IR imaging in traditional biomedical studies. Chapters 3-5 focus on applying IR imaging coupled with machine learning to the novel application of phenotypic profiling. Chapter 3 explores the design and development of IR-active vibrational probes for IR imaging. Three types of vibrational probes (azide, 13C-based and deuterium-based) are introduced to study the dynamic metabolic activities of proteins, lipids and carbohydrates in cells, small organisms and mice for the first time. The developed probes largely improve the metabolic specificity of IR imaging, enhancing its sensitivity towards different phenotypes. Chapter 4 studies the combination of IR imaging, heavy-water labeling and unsupervised learning for tissue metabolic profiling, which provides a novel method to map a metabolic tissue atlas in complex mammalian systems. In particular, cell type-, tissue- and organ-specific metabolic profiles are identified with spatial information in situ. In addition, the method captures metabolic changes during brain development and characterizes the intratumor metabolic heterogeneity of glioblastoma, showing great promise for disease modeling. Chapter 5 develops Vibrational Painting (VIBRANT), a method using IR imaging, multiplexed vibrational probes and supervised learning for cellular phenotypic profiling of drug perturbations. Three IR-active vibrational probes were designed to measure distinct essential metabolic activities in human cancer cells. More than 20,000 single-cell drug responses were collected, corresponding to 23 drug treatments. Supervised learning is used to accurately predict drug mechanism of action at the single-cell level with minimal batch effects. We further designed an algorithm to discover drug candidates with novel mechanisms of action and to evaluate drug combinations. Overall, VIBRANT has demonstrated great potential across multiple areas of phenotypic drug screening.
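As a concrete illustration of the kind of pipeline Chapter 5 describes, the sketch below classifies drug treatment from single-cell IR spectra with standard preprocessing, dimensionality reduction and a linear classifier. The data shapes and model choices are hypothetical stand-ins, not the thesis's actual pipeline:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.random((20000, 400))     # stand-in: 20,000 cells x 400 spectral channels
y = rng.integers(0, 23, 20000)   # stand-in: 23 drug-treatment labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
model = make_pipeline(StandardScaler(),                 # per-channel normalization
                      PCA(n_components=50),             # compress correlated channels
                      LogisticRegression(max_iter=1000))
model.fit(X_tr, y_tr)
print("accuracy:", model.score(X_te, y_te))             # ~chance on random stand-ins
```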
223

Μελέτη και σχεδίαση συστήματος ανάλυσης εικόνας κατατμημένου σπερματικού DNA με χρήση τεχνικών υπολογιστικής νοημοσύνης / Study and design of an image analysis system for sperm DNA fragmentation using computational intelligence techniques

Αλμπάνη, Ελένη 13 July 2010 (has links)
Medical studies have shown that male infertility is directly connected with the presence of fragmented DNA in the sperm nucleus. The disturbances in sperm concentration, motility, ejaculate volume and morphology observed in a semen analysis have the presence of fragmented DNA as their underlying cause. The experimental embryology and histology laboratory of the Athens Medical School uses the TUNEL assay (terminal deoxynucleotidyl transferase-mediated dUTP nick end labeling) to mark the ends of each DNA fragment with a color different from that used for the rest of the DNA. The result of processing the spermatozoa on a slide is a set of blue fluorescing spermatozoa, with red possibly appearing in the nucleus where fragmented DNA is present. The greater the degree of fragmentation, the more red is visible, the more pathological the spermatozoon, and the less capable it is of fertilization. The TUNEL assay is followed by imaging of the slide with a high-resolution, high-sensitivity camera suitable for fluorescence applications. The obtained images are then processed with dedicated software, as proposed in "Automatic Analysis of TUNEL assay Microscope Images" by Kontaxakis et al. at the 2007 IEEE International Symposium on Signal Processing and Information Technology. The processing yields image segmentation and classification of the depicted objects into three groups: (a) solitary spermatozoa, (b) overlapping spermatozoa and (c) debris such as leukocytes or sperm fragments. For each solitary spermatozoon, the red and blue pixels are then counted, quantifying the extent of DNA fragmentation of each cell. The aim of this thesis is first the study and then the design and implementation of a system that, taking into account the data from the image analysis together with data known from the semen analysis, such as sperm motility and concentration, and using computational intelligence techniques, is trained to automatically classify patients according to the overall degree of fragmentation of their DNA. Finally, it estimates a threshold, or a range of values, above which a patient is characterized as infertile. The ultimate goal is for this entire procedure to become a routine test in laboratories dealing with male infertility and artificial insemination, protecting couples from pointless attempts at artificial insemination that are harmful to the woman's health.
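The per-cell quantification step reduces to counting red versus blue pixels. A minimal sketch, with hypothetical color thresholds standing in for the system's actual segmentation output:

```python
import numpy as np

def fragmentation_fraction(cell_rgb):
    """Fraction of red (TUNEL-positive) pixels among all stained pixels of one cell.

    cell_rgb: (H, W, 3) uint8 image of a single solitary spermatozoon.
    The channel thresholds below are hypothetical illustration values.
    """
    r = cell_rgb[..., 0].astype(float)
    b = cell_rgb[..., 2].astype(float)
    red = (r > 100) & (r > 1.5 * b)    # pixels dominated by the red (fragmentation) stain
    blue = (b > 100) & (b > 1.5 * r)   # pixels dominated by the blue counterstain
    stained = red.sum() + blue.sum()
    return red.sum() / stained if stained else 0.0
```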
224

Bayes Optimal Feature Selection for Supervised Learning

Saneem Ahmed, C G January 2014 (has links) (PDF)
The problem of feature selection is critical in several areas of machine learning and data analysis, such as cancer classification using gene expression data and text categorization. In this work, we consider feature selection for supervised learning problems, where one wishes to select a small set of features that facilitates learning a good prediction model in the reduced feature space. Our interest is primarily in filter methods, which select features independently of the learning algorithm to be used and are generally faster to implement than other types of feature selection algorithms. Many common filter methods for feature selection make use of information-theoretic criteria, such as those based on mutual information, to guide their search process. However, even in simple binary classification problems, mutual information based methods do not always select the best set of features in terms of the Bayes error. In this thesis, we develop a general approach for selecting a set of features that directly aims to minimize the Bayes error in the reduced feature space with respect to the loss or performance measure of interest. We show that the mutual information based criterion is a special case of our setting when the loss function of interest is the logarithmic loss for class probability estimation. We give a greedy forward algorithm for approximately optimizing this criterion and demonstrate its application to several supervised learning problems, including binary classification (with 0-1 error, cost-sensitive error, and F-measure), binary class probability estimation (with logarithmic loss), bipartite ranking (with pairwise disagreement loss), and multiclass classification (with multiclass 0-1 error). Our experiments suggest that the proposed approach is competitive with several state-of-the-art methods.
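The greedy forward step can be illustrated as follows: at each iteration, add the feature whose inclusion most reduces an estimate of the target loss in the reduced feature space. The sketch below uses the cross-validated log-loss of a plug-in classifier as that estimate (the case that recovers the mutual-information criterion); it is an illustration, not the thesis's exact estimator:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def greedy_forward_select(X, y, k):
    """Greedily pick k features minimizing estimated log-loss (a proxy for Bayes error)."""
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(k):
        def est_loss(j):
            cols = selected + [j]
            scores = cross_val_score(LogisticRegression(max_iter=1000),
                                     X[:, cols], y, scoring="neg_log_loss", cv=3)
            return -scores.mean()                  # cross-validated log-loss
        best = min(remaining, key=est_loss)        # feature with lowest estimated loss
        selected.append(best)
        remaining.remove(best)
    return selected
```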
225

Online Unsupervised Domain Adaptation / Online-övervakad domänanpassning

Panagiotakopoulos, Theodoros January 2022 (has links)
Deep Learning models have seen great application in demanding tasks such as machine translation and autonomous driving. However, building such models has proved challenging, both from a computational perspective and due to the plethora of annotated data they require. Moreover, when challenged with new situations or data distributions (the target domain), such models may perform inadequately. Examples include transitioning from one city to another, different weather situations, or changes in sunlight. Unsupervised Domain Adaptation (UDA) exploits easily accessible unlabelled data to adapt models to new conditions or data distributions. Inspired by the fact that environmental changes happen gradually, we focus on Online UDA. Instead of directly adjusting a model to a demanding condition, we constantly perform minor adaptations to every slight change in the data, creating a soft transition from the current domain to the target one. To perform gradual adaptation, we applied state-of-the-art semantic segmentation approaches to increasing rain intensities (25, 50, 75, 100, and 200 mm of rain). We demonstrate that deep learning models can adapt substantially better to hard domains when exploiting intermediate ones. Moreover, we introduce a model-switching mechanism that allows adjusting back to the source domain, after adaptation, without dropping performance.
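The gradual-adaptation loop can be sketched as follows; the adaptation step and domain ordering are schematic assumptions, not the thesis's exact implementation:

```python
import copy

def online_adapt(model, domain_loaders, adapt_step):
    """Adapt through intermediate domains rather than jumping straight to the target.

    domain_loaders: unlabeled dataloaders ordered by increasing severity
                    (e.g. 25, 50, 75, 100, 200 mm of rain).
    adapt_step:     one small unsupervised update, e.g. self-training on pseudo-labels.
    """
    source_model = copy.deepcopy(model)   # kept so we can switch back after adaptation
    for loader in domain_loaders:
        for batch in loader:
            adapt_step(model, batch)      # minor adaptation to each slight change
    return model, source_model
```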
226

The Role of Data in Projected Quantum Kernels: The Higgs Boson Discrimination / Datans roll i projicerade kvantkärnor: Higgs Boson-diskriminering

Di Marcantonio, Francesco January 2022 (has links)
The development of quantum machine learning is bridging the way to fault-tolerant quantum computation by providing algorithms that run on current noisy intermediate-scale quantum devices. However, it is difficult to find use cases where quantum computers exceed their classical counterparts. The high-energy physics community is experiencing rapid growth in the amount of data physicists need to collect, store, and analyze as ever more complex experiments are conceived. Our work approaches the study of a particle physics event involving the Higgs boson from a quantum machine learning perspective. We compare the quantum support vector machine with the best classical kernel method, grounding our study in a new theoretical framework based on metrics that observe three different aspects: the geometry between the classical and quantum learning spaces, the dimensionality of the feature space, and the complexity of the ML models. We exploit these metrics as a compass in the parameter space because of their predictive power: we can exclude those areas where we do not expect any advantage from quantum models and guide our study toward the best parameter configurations. Indeed, choosing the number of qubits in a quantum circuit and the number of datapoints in a dataset has so far been left to trial and error. We observe, in a vast parameter region, that the classical RBF kernel model overtakes the performance of the devised quantum kernels. We include in this study the projected quantum kernel, a kernel able to reduce the expressivity of the traditional fidelity quantum kernel by projecting its quantum state back to an approximate classical representation through measurements of local quantum systems. The Higgs dataset proves to be low-dimensional in the quantum feature space, meaning that the selected quantum encoding is not expressive enough for the dataset under study. Nonetheless, optimizing the parameters of all the proposed kernels, classical and quantum, revealed a quantum advantage for the projected kernel, which classifies the Higgs boson events well and surpasses the classical ML model.
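One common form of the projected quantum kernel (following Huang et al., "Power of data in quantum machine learning") measures each qubit locally and compares the resulting reduced density matrices classically. A sketch, assuming the one-qubit reduced density matrices have already been computed on a simulator or device:

```python
import numpy as np

def projected_kernel(rdms_x, rdms_y, gamma=1.0):
    """Projected quantum kernel from one-qubit reduced density matrices.

    rdms_x, rdms_y: arrays of shape (n_qubits, 2, 2), the local density
    matrices of the encoded states for two datapoints.
    """
    # Sum of squared Frobenius norms of the per-qubit differences.
    sq_dist = np.sum(np.abs(rdms_x - rdms_y) ** 2)
    return np.exp(-gamma * sq_dist)

# Toy check: identical states give kernel value 1.
rho = np.tile(np.eye(2, dtype=complex) / 2, (4, 1, 1))
print(projected_kernel(rho, rho))  # 1.0
```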
227

Semi-Supervised Domain Adaptation for Semantic Segmentation with Consistency Regularization : A learning framework under scarce dense labels / Semi-Superviced Domain Adaption för semantisk segmentering med konsistensregularisering : Ett nytt tillvägagångsätt för lärande under brist på täta etiketter

Morales Brotons, Daniel January 2023 (has links)
Learning from unlabeled data is a topic of critical significance in machine learning, as the large datasets required to train ever-growing models are costly and impractical to annotate. Semi-Supervised Learning (SSL) methods aim to learn from a few labels and a large unlabeled dataset. In another approach, Domain Adaptation (DA) leverages data from a similar source domain to train a model for a target domain. This thesis focuses on Semi-Supervised Domain Adaptation (SSDA) for the dense task of semantic segmentation, where labels are particularly costly to obtain. SSDA has not received much attention yet, even though it has great potential and represents a realistic scenario. The few existing SSDA methods for semantic segmentation reuse ideas from Unsupervised DA, despite the differences between the two settings. This thesis proposes a new semantic segmentation framework designed particularly for the SSDA setting. The approach followed was to forgo domain alignment and focus instead on enhancing the clusterability of target-domain features, an idea from SSL. The method is based on consistency regularization, combined with pixel contrastive learning and self-training. The proposed framework is found to be effective not only in SSDA, but also in SSL. Ultimately, a unified solution for SSL and SSDA semantic segmentation is presented. Experiments were conducted on the target dataset of Cityscapes and the source dataset of GTA5. The proposed method is competitive in both SSL and SSDA, and sets a new state of the art for SSDA, achieving 65.6% mIoU (+4.4) on Cityscapes with 100 labeled samples. This thesis has an immediate impact on practical applications by proposing a new best-performing framework for the under-explored setting of SSDA. Furthermore, it also contributes towards the more ambitious goal of designing a unified solution for learning from unlabeled data.
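Consistency regularization for segmentation is commonly implemented by pseudo-labeling a weakly augmented view and enforcing agreement on a strongly augmented view. A minimal sketch of that general recipe (not the thesis's exact loss):

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, weak_img, strong_img, conf_thresh=0.95):
    """Pixel-wise consistency: pseudo-labels from the weak view supervise the strong view."""
    with torch.no_grad():
        probs = torch.softmax(model(weak_img), dim=1)   # (B, C, H, W) class probabilities
        conf, pseudo = probs.max(dim=1)                 # per-pixel confidence and label
    logits_strong = model(strong_img)
    loss = F.cross_entropy(logits_strong, pseudo, reduction="none")
    mask = (conf >= conf_thresh).float()                # train only on confident pixels
    return (loss * mask).mean()
```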
228

[pt] APRENDIZADO SEMI E AUTO-SUPERVISIONADO APLICADO À CLASSIFICAÇÃO MULTI-LABEL DE IMAGENS DE INSPEÇÕES SUBMARINAS / [en] SEMI AND SELF-SUPERVISED LEARNING APPLIED TO THE MULTI-LABEL CLASSIFICATION OF UNDERWATER INSPECTION IMAGE

AMANDA LUCAS PEREIRA 11 July 2023 (has links)
[en] The offshore segment of oil production is the main national producer of this commodity. In this context, underwater inspections are crucial for the preventive maintenance of equipment that remains in the ocean environment for its entire useful life. From the image and sensor data collected in these inspections, experts are able to prevent and repair damage. This process is deeply complex, time-consuming and costly, as specialized professionals have to watch hours of video while staying attentive to details. In this scenario, the present work explores the use of image classification models designed to help experts find the events of interest in underwater inspection videos. These models can be embedded in the ROV or on the platform to perform real-time inference, which can speed up the ROV, reducing inspection time and greatly reducing inspection costs. However, there are challenges inherent to the problem of classifying underwater inspection images, such as: balanced labeled data are expensive and scarce; there is noise among the data; intraclass variance is high; and the physical characteristics of the water give the captured images certain specific properties. Therefore, traditional supervised models may not be able to fulfill the task. Motivated by these challenges, we seek to solve the underwater image classification problem using models that require less supervision during their training. In this work, we explore DINO (Self-DIstillation with NO labels, self-supervised) and a new multi-label version proposed for PAWS (Predicting View Assignments With Support Samples, semi-supervised), which we call mPAWS (multi-label PAWS). The models are evaluated based on their performance as feature extractors for training a simple classifier formed by a single dense layer. In the experiments carried out, for the same architecture, a performance was obtained that exceeds the f1-score of the supervised equivalent by 2.7 percent.
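The evaluation protocol described above, a frozen self-supervised backbone with a single dense layer on top, can be sketched as follows; the architecture details and multi-label head are illustrative assumptions:

```python
import torch
import torch.nn as nn

def train_probe(backbone, loader, feat_dim, n_labels, epochs=10):
    """Multi-label probe: one dense layer trained on frozen backbone features."""
    backbone.eval()                                    # features are not fine-tuned
    head = nn.Linear(feat_dim, n_labels)
    opt = torch.optim.Adam(head.parameters(), lr=1e-3)
    loss_fn = nn.BCEWithLogitsLoss()                   # one sigmoid per label
    for _ in range(epochs):
        for images, labels in loader:                  # labels: multi-hot (B, n_labels)
            with torch.no_grad():
                feats = backbone(images)
            loss = loss_fn(head(feats), labels.float())
            opt.zero_grad(); loss.backward(); opt.step()
    return head
```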
229

Automating debugging through data mining / Automatisering av felsökning genom data mining

Thun, Julia, Kadouri, Rebin January 2017 (has links)
Contemporary technological systems generate massive quantities of log messages. These messages can be stored, searched and visualized efficiently using log management and analysis tools. The analysis of log messages offers insights into system behavior such as performance, server status and execution faults in web applications. iStone AB wants to explore the possibility of automating their debugging process. Since iStone does most of its debugging manually, it takes time to find errors within the system. The aim was therefore to find different solutions to reduce the time it takes to debug. An analysis of log messages within access and console logs was made, so that the most appropriate data mining techniques for iStone's system could be chosen. Data mining algorithms and log management and analysis tools were compared. The result of the comparisons showed that the ELK Stack, as well as a mixture of Eclat and a hybrid algorithm (Eclat and Apriori), were the most appropriate choices. To demonstrate their feasibility, the ELK Stack and Eclat were implemented. The produced results show that data mining and the use of a platform for log analysis can facilitate debugging and reduce the time it takes.
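Eclat mines frequent itemsets by keeping a vertical layout: each item maps to the set of transaction ids (tid-list) containing it, and candidate itemsets are grown by intersecting tid-lists. A compact sketch, applied here to hypothetical tokenized log messages:

```python
from collections import defaultdict

def eclat(transactions, min_support=2):
    """Frequent itemsets via depth-first tid-list intersection (Eclat)."""
    tidlists = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidlists[item].add(tid)            # vertical layout: item -> transaction ids

    def recurse(prefix, candidates, out):
        for i, (item, tids) in enumerate(candidates):
            if len(tids) >= min_support:
                itemset = prefix | {item}
                out[frozenset(itemset)] = len(tids)
                # Extend only with items after this one to avoid duplicate itemsets.
                rest = [(it, tids & t) for it, t in candidates[i + 1:]]
                recurse(itemset, rest, out)
        return out

    return recurse(frozenset(), sorted(tidlists.items()), {})

logs = [{"ERROR", "db-timeout"}, {"ERROR", "db-timeout"}, {"WARN", "cache-miss"}]
print(eclat(logs))  # e.g. {frozenset({'ERROR'}): 2, frozenset({'ERROR', 'db-timeout'}): 2, ...}
```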
230

Machine Learning for Automatic Annotation and Recognition of Demographic Characteristics in Facial Images / Maskininlärning för Automatisk Annotering och Igenkänning av Demografiska Egenskaper hos Ansiktsbilder

Gustavsson Roth, Ludvig, Rimér Högberg, Camilla January 2024 (has links)
The recent increase in the widespread use of facial recognition technologies has accelerated the utilization of demographic information extracted from facial features, yet this is accompanied by ethical concerns. It is therefore crucial to ensure that algorithms such as the face recognition algorithms employed in legal proceedings are equitable and thoroughly documented across diverse populations. Accurate classification of demographic traits is therefore essential for enabling a comprehensive understanding of other algorithms. This thesis explores how classical machine learning algorithms compare to deep-learning models in predicting sex, age and skin color. It concludes that the more compute-heavy deep-learning models, whose best performers achieved an MCC of 0.99, 0.48 and 0.85 for sex, age and skin color respectively, significantly outperform their classical machine learning counterparts, which achieved an MCC of 0.57, 0.22 and 0.54 at best. Having established that the deep-learning models are superior, further methods were employed to try to improve the deep-learning results: semi-supervised learning, a multi-characteristic classifier, sex-specific age classifiers, and using tightly cropped facial images instead of upper-body images. Throughout all deep-learning experiments, a state-of-the-art vision transformer and a convolutional neural network were compared. While the two architectures performed remarkably alike, a slight edge was seen for the convolutional neural network. The results further show that using cropped facial images generally improves model performance and that more specialized models achieve modest improvements compared to their less specialized counterparts. Semi-supervised learning showed potential to improve the models slightly further. The predictive performances achieved in this thesis indicate that the deep-learning models can reliably predict demographic features close to, or surpassing, human level.
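The Matthews correlation coefficient (MCC) reported above ranges from -1 to 1, with 0 corresponding to chance-level prediction, and is robust to class imbalance; for multiclass labels such as age groups it generalizes via the confusion matrix. A small example with hypothetical labels:

```python
from sklearn.metrics import matthews_corrcoef

# Hypothetical ground-truth and predicted class labels (e.g. three skin-color classes).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
print(matthews_corrcoef(y_true, y_pred))  # ~0.78: strong but imperfect agreement
```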
