  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
241

Handling Domain Shift in 3D Point Cloud Perception

Saltori, Cristiano 10 April 2024 (has links)
This thesis addresses the problem of domain shift in 3D point cloud perception. Recent decades have seen tremendous progress in within-domain training and testing, but the performance of perception models degrades when a model is trained on a source domain and tested on a target domain drawn from a different data distribution: a change of sensor or geo-location can cause a harmful drop in model performance. While solutions exist for image perception, the problem remains open for point clouds. The focus of this thesis is the study and design of solutions for mitigating domain shift in 3D point cloud perception. We identify several settings that differ in the level of target supervision and in the availability of source data, conduct a thorough study of each, and introduce a new method to address domain shift in each configuration. In particular, we study three novel settings in domain adaptation and domain generalization and propose five new methods for mitigating domain shift in 3D point cloud perception. Our methods are used by the research community, and at the time of writing some of the proposed approaches hold state-of-the-art results. In conclusion, this thesis provides a valuable contribution to the computer vision community, setting the groundwork for future work in cross-domain conditions.
242

A Study on the Use of Unsupervised, Supervised, and Semi-supervised Modeling for Jamming Detection and Classification in Unmanned Aerial Vehicles

Margaux Camille Marie Catafort--Silva (18477354) 02 May 2024 (has links)
In this work, unsupervised machine learning is first studied for detecting and classifying jamming attacks targeting unmanned aerial vehicles (UAVs) operating in the 2.4 GHz band. Three scenarios are developed with a dataset of samples extracted from meticulous experimental routines, using various unsupervised learning algorithms: K-means, density-based spatial clustering of applications with noise (DBSCAN), agglomerative clustering (AGG), and the Gaussian mixture model (GMM). These routines characterize attack scenarios entailing barrage (BA), single-tone (ST), successive-pulse (SP), and protocol-aware (PA) jamming in three different settings. In the first setting, all nine features extracted from the original dataset are used. In the second, Spearman correlation is applied to reduce the number of features. In the third, principal component analysis (PCA) reduces the dimensionality of the dataset to minimize complexity. The algorithms are compared using homogeneity, completeness, V-measure, adjusted mutual information (AMI), and adjusted Rand index (ARI). The optimum model scored 1.00, 0.949, 0.791, 0.722, and 0.791 on these metrics, respectively, allowing the four jamming types to be detected and classified with an acceptable degree of confidence.

Second, in a separate study, supervised learning (random forest modeling) is developed to perform a binary classification that accurately separates samples into two classes: clean and jamming. Following this supervised classification, two-class and three-class unsupervised learning is implemented considering three of the four jamming types (BA, ST, and SP), again using the four algorithms above. This study is intended to facilitate visualization of each algorithm's performance; for example, AGG achieves a homogeneity of 1.0, a completeness of 0.950, a V-measure of 0.713, an ARI of 0.557, and an AMI of 0.713, while GMM yields 1, 0.771, 0.645, 0.536, and 0.644, respectively. Lastly, to improve classification, semi-supervised learning is adopted in place of unsupervised learning with the same algorithms and dataset. In this case, GMM achieves 1, 0.688, 0.688, 0.786, and 0.688, whereas DBSCAN achieves 0, 0.036, 0.028, 0.018, and 0.028 for homogeneity, completeness, V-measure, ARI, and AMI, respectively. Overall, these approaches are developed as methods for jamming classification, addressing the challenge of identifying newly introduced samples.
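The clustering-and-scoring loop described above can be sketched with scikit-learn; the two-dimensional synthetic features and the four stand-in classes below are illustrative assumptions, not the thesis dataset or its nine RF features:

```python
# Sketch: cluster synthetic "signal feature" vectors with K-means and a GMM,
# then score them with the same external metrics used in the study.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture
from sklearn.metrics import (homogeneity_score, completeness_score,
                             v_measure_score, adjusted_mutual_info_score,
                             adjusted_rand_score)

rng = np.random.default_rng(0)
# Four well-separated synthetic classes standing in for BA, ST, SP, PA.
centers = np.array([[0, 0], [5, 0], [0, 5], [5, 5]], dtype=float)
X = np.vstack([c + rng.normal(scale=0.4, size=(100, 2)) for c in centers])
y_true = np.repeat(np.arange(4), 100)

for name, model in [("K-means", KMeans(n_clusters=4, n_init=10, random_state=0)),
                    ("GMM", GaussianMixture(n_components=4, random_state=0))]:
    y_pred = model.fit_predict(X)
    scores = (homogeneity_score(y_true, y_pred),
              completeness_score(y_true, y_pred),
              v_measure_score(y_true, y_pred),
              adjusted_mutual_info_score(y_true, y_pred),
              adjusted_rand_score(y_true, y_pred))
    print(name, ["%.3f" % s for s in scores])
```

On clean synthetic clusters like these, all five scores approach 1.0; the interest in the study lies in how the scores degrade on real jamming signatures.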
243

Segmentation in Tomography Data: Exploring Data Augmentation for Supervised and Unsupervised Voxel Classification with Neural Networks

Wagner, Franz 23 September 2024 (has links)
Computed Tomography (CT) imaging provides invaluable insight into the internal structures of objects and organisms, which is critical for applications ranging from materials science to medical diagnostics. In CT data, an object is represented by a 3D reconstruction generated by combining multiple 2D X-ray images taken from various angles around the object. Each voxel (volumetric pixel) within the reconstructed volume represents a small cubic element, allowing for detailed spatial representation. To extract meaningful information from CT imaging data and facilitate analysis and interpretation, accurate segmentation of internal structures is essential. This can be challenging, however, due to various artifacts introduced by the physics of a CT scan and the properties of the object being imaged. This dissertation addresses the challenge with deep learning techniques. Specifically, Convolutional Neural Networks (CNNs) are used for segmentation, but they face the problem of limited training data. Data scarcity is addressed by data augmentation: the unsupervised generation of synthetic training data and the use of 2D and 3D augmentation methods. A combination of these strategies streamlines segmentation in voxel data and effectively mitigates data scarcity. Essentially, the work aims to simplify the training of CNNs using minimal or no labeled data. To make the results of this thesis more accessible, two user-friendly software solutions, unpAIred and AiSeg, have been developed; these platforms support the generation of training data, data augmentation, and the training, analysis, and application of CNNs.

This cumulative work first examines simple but efficient conventional data augmentation methods, such as radiometric and geometric image manipulations, which are already widely used in the literature but are usually applied randomly and in no specific order. The primary focus of the first paper is to investigate this practice and to develop both online and offline data augmentation pipelines that allow these operations to be sequenced systematically. Offline augmentation augments training data stored on a drive, while online augmentation is performed dynamically at runtime, just before images are fed to the CNN. It is shown that randomly applied data augmentation is inferior to the new pipelines. A careful comparison of 3D CNNs is then performed to identify optimal models for specific segmentation tasks, such as carbon and pore segmentation in CT scans of Carbon Reinforced Concrete (CRC). Through an evaluation of eight 3D CNN models on six datasets, tailored recommendations are provided for selecting the most effective model based on dataset characteristics. The analysis highlights the consistent performance of the 3D U-Net and its residual variant, which excel at roving (a bundle of carbon fibers) and pore segmentation tasks. Building on the augmentation pipelines and the results of the 3D CNN comparison, the pipelines are extended to 3D, specifically targeting the segmentation of carbon in CT scans of CRC. A comparative analysis of different 3D augmentation strategies, including offline and online variants, provides insight into their effectiveness: while offline augmentation produces fewer artifacts, it can only segment rovings already present in the training data, whereas online augmentation is essential for effectively segmenting the different types of rovings contained in CT scans. However, constraints such as the limited diversity of the dataset, and overly aggressive augmentation that produced segmentation artifacts, require further work to address data scarcity.

Recognizing the need for a larger and more diverse dataset, this thesis extends the three former papers by introducing deep-learning-based augmentation with a Generative Adversarial Network (GAN), Contrastive Unpaired Translation (CUT), for synthetic training data generation. By combining the GAN with the augmentation pipelines, semi-supervised and unsupervised end-to-end training methods are introduced, and the successful generation of training data for 2D pore segmentation is demonstrated. Challenges remain in achieving a stable 3D CUT implementation, which warrants further research and development. In summary, this dissertation addresses the challenges of accurate CT data segmentation in materials science through deep learning techniques and novel 2D and 3D online and offline augmentation pipelines. By evaluating different 3D CNN models, tailored recommendations for specific segmentation tasks are provided, and the exploration of deep-learning-based augmentation using CUT shows promising results in generating synthetic training data. Future work will include a stable implementation of a 3D CUT version, the exploration of new model architectures, and the development of sub-voxel accurate segmentation techniques.
These have the potential for significant advances in segmentation in tomography data.
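The systematic-sequencing idea behind the offline and online pipelines can be sketched as follows; the two stages (geometric, radiometric) and their parameters are assumptions for illustration, not the unpAIred/AiSeg implementation:

```python
# Sketch: a fixed, systematically ordered augmentation pipeline applied
# offline (precomputed, as if written to disk) or online (lazily, per image).
import numpy as np

def geometric(img, rng):
    """Geometric stage: random horizontal flip and 90-degree rotation."""
    if rng.random() < 0.5:
        img = np.flip(img, axis=1)
    return np.rot90(img, k=int(rng.integers(0, 4)))

def radiometric(img, rng):
    """Radiometric stage: random gamma adjustment on [0, 1] intensities."""
    gamma = rng.uniform(0.7, 1.3)
    return np.clip(img, 0.0, 1.0) ** gamma

PIPELINE = [geometric, radiometric]  # fixed stage order, not a random order

def augment(img, rng):
    for stage in PIPELINE:
        img = stage(img, rng)
    return img

def offline_augment(images, n_copies, seed=0):
    """Precompute augmented copies up front."""
    rng = np.random.default_rng(seed)
    return [augment(im, rng) for im in images for _ in range(n_copies)]

def online_augment(images, seed=0):
    """Yield augmented images at runtime, just before the CNN sees them."""
    rng = np.random.default_rng(seed)
    for im in images:
        yield augment(im, rng)

images = [np.random.default_rng(i).random((32, 32)) for i in range(4)]
offline = offline_augment(images, n_copies=2)
online = list(online_augment(images))
print(len(offline), len(online))  # 8 4
```

The design point is that offline augmentation fixes a finite augmented set in advance, while online augmentation keeps drawing fresh variations every epoch, which is what lets it generalize to roving types absent from the stored data.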
244

A MONTE CARLO APPROACH TO MULTISCALE MODELING OF GRANULAR GAS OF NON-SPHERICAL PARTICLES

Muhammed Anifowose Gbolasere (20322738) 10 January 2025 (has links)
Granular flows of non-spherical particles are common in nature and in industrial processes, and computational methods are employed to simulate such systems. However, the current state-of-the-art simulation methods have two primary drawbacks: the high computational cost of the Discrete Element Method (DEM), which restricts it to small-scale systems, and the reliance of the Two-Fluid Model (TFM) on empirical correlations that cannot be reliably extrapolated to different systems. Moreover, owing to statistical limitations and the lack of a physics-based continuum description, progress in non-spherical particle flow dynamics through the study of higher-order moments and transport coefficients is virtually infeasible. To address these challenges, a Direct Simulation Monte Carlo (DSMC) model is developed to simulate a granular gas of spherocylinders with varying aspect ratios. In this work, a 3D classical trajectory calculation (CTC) code is developed to generate pairwise collision datasets. In addition, the Gaussian mixture model, an unsupervised machine learning technique, is used to construct the complex probability distributions required by the DSMC model. The model is then implemented and validated against exact solutions derived from equivalent DEM simulations. It shows high accuracy on both the macroscopic and microscopic scales and is more than 50 times faster than the DEM, and the distribution functions of energies and velocities are extracted over time. Following the methodology presented, the approach can easily be adjusted to accommodate different particle shapes.
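The role of the Gaussian mixture model in this workflow can be illustrated with a small sketch: fit a mixture to a complicated empirical distribution once, then draw new outcomes from it inside a DSMC-style loop. The bimodal target below is synthetic, standing in for CTC collision data:

```python
# Sketch: model a complicated one-dimensional distribution with a GMM and
# sample from it, as a DSMC solver would when drawing collision outcomes.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# Synthetic stand-in for a post-collision quantity with two modes.
data = np.concatenate([rng.normal(-2.0, 0.5, 4000),
                       rng.normal(1.5, 0.8, 6000)]).reshape(-1, 1)

# Fit the mixture once, offline, to the "collision dataset" ...
gmm = GaussianMixture(n_components=2, random_state=0).fit(data)

# ... then draw fresh samples cheaply inside the simulation loop.
samples, _ = gmm.sample(5000)
print(round(float(samples.mean()), 2), round(float(data.mean()), 2))
```

Sampling from the fitted mixture is far cheaper than rerunning trajectory calculations per collision, which is the source of the speedup over DEM that the abstract reports.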
245

Slowness and sparseness for unsupervised learning of spatial and object codes from naturalistic data

Franzius, Mathias 27 June 2008 (has links)
This thesis introduces a hierarchical model for unsupervised learning from naturalistic video sequences. The model is based on the learning principles of slowness and sparseness, for which different approaches and implementations are discussed. A variety of neuron classes in the hippocampal formation of rodents and primates codes for different aspects of the space surrounding the animal, including place cells, head direction cells, spatial view cells, and grid cells. In the main part of this thesis, video sequences from a virtual-reality environment are used to train the hierarchical model, which reproduces the behavior of most known hippocampal neuron types coding for space. The type of representation generated by the model is determined mostly by the movement statistics of the simulated animal. The model is not limited to spatial coding: an application to invariant object recognition is described, in which artificial clusters of spheres or rendered fish are presented to the model. The resulting representations allow simple extraction of the identity of the presented object as well as of its position and viewing angle.
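The slowness principle underlying the model can be illustrated by linear Slow Feature Analysis; the thesis's hierarchical model is nonlinear and trained on video, whereas this toy merely recovers a slowly varying source from a linear mixture of a slow and a fast signal:

```python
# Sketch of linear SFA: whiten the observed signals, then take the whitened
# direction whose temporal derivative has minimal variance (the slowest one).
import numpy as np

t = np.arange(2000)
s_slow = np.sin(2 * np.pi * t / 500.0)   # slowly varying source
s_fast = np.sin(2 * np.pi * t / 11.0)    # quickly varying source
S = np.stack([s_slow, s_fast], axis=1)
A = np.array([[1.0, 0.5], [0.3, 1.0]])   # arbitrary mixing matrix
X = S @ A.T                              # observed mixtures

X = X - X.mean(axis=0)
d, E = np.linalg.eigh(np.cov(X.T))
W = E @ np.diag(1.0 / np.sqrt(d)) @ E.T  # symmetric whitening matrix
Z = X @ W                                # whitened signals

dZ = np.diff(Z, axis=0)
dd, Ed = np.linalg.eigh(np.cov(dZ.T))    # eigenvalues ascending
y = Z @ Ed[:, 0]                         # slowest feature

corr = np.corrcoef(y, s_slow)[0, 1]
print(round(abs(corr), 3))  # close to 1: the slow source is recovered
```

Stacking such slowness-optimizing stages with nonlinear expansions, and training on visual input, is what yields the place-cell-like and related codes described above.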
246

Training deep convolutional architectures for vision

Desjardins, Guillaume 08 1900 (has links)
High-level vision tasks such as generic object recognition remain out of reach for modern Artificial Intelligence systems. A promising approach involves learning algorithms, such as the Artificial Neural Network (ANN), which automatically learn to extract useful features for the task at hand. For ANNs, however, this represents a difficult optimization problem. Deep Belief Networks (DBN) have thus been proposed as a way to guide the discovery of intermediate representations, through a greedy unsupervised training of stacked Restricted Boltzmann Machines (RBM). The articles presented herein represent contributions to this field of research. The first article introduces the convolutional RBM. By mimicking local receptive fields and tying the parameters of hidden units within the same feature map, we considerably reduce the number of parameters to learn and enforce local, shift-equivariant feature detectors. This translates to better likelihood scores, compared to RBMs trained on small image patches. In the second article, recent discoveries in neuroscience motivate an investigation into the impact of higher-order units on visual classification, along with the evaluation of a novel activation function. We show that ANNs with quadratic units using the softsign activation function achieve better generalization error across several tasks. Finally, the third article takes a critical look at recently proposed RBM training algorithms. We show that Contrastive Divergence (CD) and Persistent CD are brittle, in that they require the energy landscape to be smooth in order for their negative chain to mix well. PCD with fast weights addresses the issue by performing small model perturbations, but may result in spurious samples. We propose using simulated tempering to draw negative samples. This leads to better generative models and increased robustness to various hyperparameters.
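The Contrastive Divergence procedure discussed in the third article can be sketched in a few lines. Below is a minimal CD-1 update for a tiny binary RBM in plain Python; it illustrates the general algorithm only, not any implementation from the thesis, and the training patterns and dimensions are invented for the example.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class RBM:
    """Tiny binary RBM trained with CD-1 (illustrative dimensions only)."""
    def __init__(self, n_visible, n_hidden, lr=0.1):
        self.W = [[random.gauss(0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b = [0.0] * n_visible  # visible biases
        self.c = [0.0] * n_hidden   # hidden biases
        self.lr = lr

    def hidden_probs(self, v):
        return [sigmoid(self.c[j] + sum(v[i] * self.W[i][j]
                for i in range(len(v)))) for j in range(len(self.c))]

    def visible_probs(self, h):
        return [sigmoid(self.b[i] + sum(h[j] * self.W[i][j]
                for j in range(len(h)))) for i in range(len(self.b))]

    @staticmethod
    def sample(probs):
        return [1 if random.random() < p else 0 for p in probs]

    def cd1_update(self, v0):
        # Positive phase: hidden statistics conditioned on the data.
        ph0 = self.hidden_probs(v0)
        h0 = self.sample(ph0)
        # Negative phase: a single Gibbs step (the "CD-1" approximation;
        # PCD would instead keep a persistent negative chain between updates).
        v1 = self.sample(self.visible_probs(h0))
        ph1 = self.hidden_probs(v1)
        # Gradient step on the CD approximation to the log-likelihood gradient.
        for i in range(len(v0)):
            for j in range(len(ph0)):
                self.W[i][j] += self.lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        for i in range(len(v0)):
            self.b[i] += self.lr * (v0[i] - v1[i])
        for j in range(len(ph0)):
            self.c[j] += self.lr * (ph0[j] - ph1[j])

# Train on two repeated binary patterns.
data = [[1, 1, 0, 0], [0, 0, 1, 1]]
rbm = RBM(n_visible=4, n_hidden=2)
for epoch in range(500):
    for v in data:
        rbm.cd1_update(v)

# After training, reconstructions should resemble the inputs.
recon = rbm.visible_probs(rbm.sample(rbm.hidden_probs([1, 1, 0, 0])))
print([round(p, 2) for p in recon])
```

The brittleness criticized in the article shows up here as a dependence of the negative chain (`v1`) on local mixing: once the energy surface develops deep wells, a single Gibbs step no longer explores the model distribution.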
247

Apprentissage de représentations sur-complètes par entraînement d’auto-encodeurs

Lajoie, Isabelle 12 1900 (has links)
Progress in the machine learning domain allows computational systems to address more and more complex tasks associated with vision, audio signal or natural language processing. Among the existing models, we find the Artificial Neural Network (ANN), whose popularity increased suddenly with the recent breakthrough of Hinton et al. [22], which consists in using Restricted Boltzmann Machines (RBM) to perform an unsupervised, layer-by-layer pre-training initialization of a Deep Belief Network (DBN), enabling the subsequent successful supervised training of such an architecture. Since this discovery, researchers have studied the efficiency of other similar pre-training strategies such as the stacking of traditional auto-encoders (SAE) [5, 38] and the stacking of denoising auto-encoders (SDAE) [44]. This is the context in which the present study started. 
After a brief introduction of the basic machine learning principles and of the pre-training methods used until now with RBM, AE and DAE modules, we performed a series of experiments to deepen our understanding of pre-training with SDAE, explored its different properties and explored variations on the DAE algorithm as alternative strategies to initialize deep networks. We evaluated the sensitivity to the noise level, and the influence of the number of layers and the number of hidden units on the generalization error obtained with SDAE. We experimented with other noise types and saw improved performance on the supervised task with the use of pepper-and-salt noise (PS) or Gaussian noise (GS), noise types that are better justified than the one used until now, masking noise (MN). Moreover, modifying the algorithm by imposing an emphasis on the reconstruction of the corrupted components during the unsupervised training of each DAE showed encouraging performance improvements. Our work also revealed that the DAE was capable of learning, on natural images, filters similar to those found in V1 cells of the visual cortex, which are in essence edge detectors. In addition, we were able to verify that the learned representations of the SDAE are very good features to feed to a linear or Gaussian support vector machine (SVM), considerably enhancing its generalization performance. Also, we observed that, like the DBN, and unlike the SAE, the SDAE had the potential to be used as a good generative model. As well, we opened the door to novel pre-training strategies and discovered the potential of one of them: the stacking of renoising auto-encoders (SRAE).
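The three corruption processes compared in this abstract (MN, PS, GS) can be sketched as follows. This is a minimal illustration in plain Python, not code from the thesis; the input vector is invented for the example.

```python
import random

random.seed(0)

def masking_noise(x, p):
    """MN: set a random fraction p of components to zero."""
    return [0.0 if random.random() < p else xi for xi in x]

def salt_and_pepper(x, p):
    """PS: force a random fraction p of components to the extreme
    values 0 or 1, each with equal probability."""
    return [(1.0 if random.random() < 0.5 else 0.0)
            if random.random() < p else xi for xi in x]

def gaussian_noise(x, sigma):
    """GS: add isotropic Gaussian noise to every component."""
    return [xi + random.gauss(0.0, sigma) for xi in x]

x = [0.2, 0.8, 0.5, 0.9]
print(masking_noise(x, 0.5))
print(salt_and_pepper(x, 0.5))
print(gaussian_noise(x, 0.1))
```

In each case the DAE is trained to reconstruct the clean input `x` from its corrupted version, so the choice of corruption process determines which regularities the hidden layer is forced to capture.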
248

Moranapho : apprentissage non supervisé de la morphologie d'une langue par généralisation de relations analogiques

Lavallée, Jean-François 08 1900 (has links)
Recently, we have witnessed a growing interest in applying the concept of formal analogy to unsupervised morphology acquisition. The attractiveness of this concept lies in its parallels with the mental process involved in the creation of new words based on morphological relations existing in the language. However, the use of formal analogy remains marginal, partly due to its high computational cost. In this document, we present Moranapho, a graph-based system founded on the concept of formal analogy. Our participation in the 2009 Morpho Challenge (Kurimo:10) and our subsequent experiments demonstrate that the performance of Moranapho compares favorably to the state of the art. We also studied the influence of some of its components on the quality of the morphological analyses produced. 
Finally, we discuss our findings based on well-established theories in the field of linguistics. This allows us to provide some predictions on the successes and failures of our system when applied to languages other than those tested in our experiments.
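The notion of formal analogy underlying Moranapho can be illustrated in a heavily simplified form. The sketch below verifies only suffix-substitution analogies of the form walk : walked :: talk : talked; true formal analogy is considerably more general (and more expensive), so treat this as a toy illustration rather than the system's actual algorithm.

```python
def common_prefix_len(a, b):
    """Length of the longest shared prefix of two strings."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n

def suffix_rule(a, b):
    """Describe the transformation a -> b as (removed suffix, added suffix)."""
    k = common_prefix_len(a, b)
    return a[k:], b[k:]

def analogy_holds(a, b, c, d):
    """True if a : b :: c : d under the same suffix substitution."""
    return suffix_rule(a, b) == suffix_rule(c, d)

print(analogy_holds("walk", "walked", "talk", "talked"))  # True
print(analogy_holds("walk", "walked", "run", "ran"))      # False
```

A morphology learner built on this idea collects many such quadruples over a vocabulary and generalizes the shared transformations into morphological relations, which hints at why naive enumeration of candidate quadruples is computationally costly.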
249

Understanding deep architectures and the effect of unsupervised pre-training

Erhan, Dumitru 10 1900 (has links)
This thesis studies a class of algorithms called deep architectures. We argue that models based on a shallow composition of local features are not appropriate for the set of real-world functions and datasets of interest to us, namely data with many factors of variation. Modelling such functions and datasets is important if we hope to create an intelligent agent that can learn from complicated data. Deep architectures are hypothesized to be a step in the right direction, as they are compositions of nonlinearities and can learn compact distributed representations of data with many factors of variation. Training fully-connected artificial neural networks, the most common form of a deep architecture, was not possible before Hinton (2006) showed that one can use stacks of unsupervised Restricted Boltzmann Machines to initialize or pre-train a supervised multi-layer network. This breakthrough has been influential, as the basic idea of using unsupervised learning to improve generalization in deep networks has been reproduced in a multitude of other settings and models. In this thesis, we cast deep learning ideas and techniques as defining a special kind of inductive bias, defined not only by the kind of functions that are eventually represented by such deep models, but also by the learning process commonly used for them. 
This work is a study of the reasons why this class of functions generalizes well, the situations where they should work well, and the qualitative statements one can make about such functions. This thesis is thus an attempt to understand why deep architectures work. In the first article, we study how well our intuitions about the need for deep models correspond to functions that they can actually model well. In the second article, we perform an in-depth study of why unsupervised pre-training helps deep learning and explore a variety of hypotheses that give us an intuition for the dynamics of learning in such architectures. Finally, in the third article, we seek to understand, qualitatively speaking, what a deep architecture models. Our visualization approach enables us to understand the representations and invariances modelled and learned by the deeper layers.
250

Using unsupervised machine learning for fault identification in virtual machines

Schneider, C. January 2015 (has links)
Self-healing systems promise operating cost reductions in large-scale computing environments through the automated detection of, and recovery from, faults. At present, however, there appears to be little empirical evidence comparing the different approaches, or demonstrating that such implementations reduce costs. This thesis compares previous and current self-healing approaches before demonstrating a new, unsupervised approach that combines artificial neural networks with performance tests to perform fault identification in an automated fashion, i.e. the correct and accurate determination of which computer features are associated with a given performance test failure. Several key contributions are made in the course of this research, including an analysis of the different types of self-healing approaches based on their contextual use, a baseline for future comparisons between self-healing frameworks that use artificial neural networks, and a successful, automated fault identification in cloud infrastructure, more specifically in virtual machines. This approach uses three established machine learning techniques: Naïve Bayes, Baum-Welch, and Contrastive Divergence learning. The latter minimises human interaction beyond previous implementations by producing a list of potential root causes (i.e. fault hypotheses) in decreasing order of likelihood, which brings the state of the art one step closer toward fully self-healing systems. This thesis also examines the impact that different types of faults have on their respective identification. This helps in understanding the validity of the data being presented and how the field is progressing, whilst examining the differences in identification between emulated thread crashes and errant user changes, a contribution believed to be unique to this research. 
Lastly, future research avenues and conclusions in automated fault identification are described, along with lessons learned throughout this endeavor. These include the progression of artificial neural networks, how learning algorithms are being developed and understood, and possibilities for automatically generating feature locality data.
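The idea of ranking fault hypotheses in decreasing order of likelihood can be sketched with a naive-Bayes score. The symptom names, conditional probabilities, and priors below are invented for illustration and do not come from the thesis, which uses its own feature set and learned models.

```python
import math

# Hypothetical conditional probabilities P(symptom | fault); illustrative only.
FAULT_MODELS = {
    "memory_leak":     {"high_mem": 0.9, "high_cpu": 0.3, "slow_io": 0.2},
    "thread_crash":    {"high_mem": 0.2, "high_cpu": 0.1, "slow_io": 0.1},
    "disk_saturation": {"high_mem": 0.2, "high_cpu": 0.3, "slow_io": 0.9},
}
PRIOR = {"memory_leak": 0.4, "thread_crash": 0.3, "disk_saturation": 0.3}

def rank_faults(observed):
    """Rank fault hypotheses by naive-Bayes log score given observed symptoms."""
    scores = {}
    for fault, model in FAULT_MODELS.items():
        score = math.log(PRIOR[fault])
        for symptom, present in observed.items():
            p = model[symptom]
            score += math.log(p if present else 1.0 - p)
        scores[fault] = score
    # Most likely root cause first.
    return sorted(scores, key=scores.get, reverse=True)

print(rank_faults({"high_mem": True, "high_cpu": False, "slow_io": False}))
```

Presenting an ordered list of hypotheses, rather than a single verdict, is what lets an operator (or an automated remediation step) work down the list, which is the sense in which such ranking reduces human interaction.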
