421 |
Developing a User-Independent Deep Learning-Based Biomechanical Gait Analysis System Using Full Body Kinematics and Electromyography. Avdan, Goksu. 01 August 2024 (has links) (PDF)
Motion capture (mocap) systems integrated with force plates and electromyography (EMG) collect detailed kinematic and kinetic data on subjects, including stride length, width, cadence, speed, and other spatiotemporal parameters. These systems allow clinicians and researchers to analyze movements, both cyclic (e.g., walking, running) and non-cyclic (e.g., jumping, falling), which is crucial for understanding movement patterns and identifying abnormalities. Clinical gait analysis, a key application, focuses on detecting musculoskeletal issues and walking impairments. While essential for diagnosing gait disorders and planning interventions, clinical gait analysis faces challenges such as noise, outliers, and marker occlusion in optical motion tracking data, requiring complex post-processing. Additionally, the measurement of ground reaction forces (GRFs) and moments (GRMs) is limited by the restricted number of force plates. There are also challenges in EMG data collection, such as finding optimal maximum voluntary contraction (MVC) positions and developing nonlinear normalization techniques to replace traditional methods. To address these challenges, this research aims to develop an AI-driven gait analysis system that is cost-effective, user-independent, and relies solely on kinematic and EMG data for real-time analysis. The system is specifically designed to assess musculoskeletal characteristics in individuals with special needs, walking disabilities, or injuries, for whom measuring MVC levels is impractical or unsafe. The research has four main objectives: (1) standardize MVC positions for four lower limb muscles, (2) develop an alternative EMG normalization technique using nonlinear data analysis, (3) create an unsupervised framework using transformers for missing marker recovery without perfect ground-truth data, and (4) generate GRFs, GRMs, and joint moments (JMs) from lower limb kinematics using a 1D-CNN, improving accuracy and efficiency with transfer learning, without requiring force plates.
While addressing these challenges, the proposed system aims to minimize user interaction, reduce pre- and post-processing, and lower costs for researchers and clinicians. The designed tool will integrate with existing optical marker-based mocap systems, providing greater flexibility and usability. In educational settings, it will offer students hands-on experience with advanced gait analysis techniques. Economically, widespread adoption of the tool in research and clinical settings will reduce data collection and analysis costs, making advanced gait analysis more accessible. Additionally, this tool can be applied to other fields, such as precision manufacturing, security, and predictive maintenance, where analyzing motion and sensor data can predict failures. Consequently, this research will significantly advance the field of human movement analysis, increasing the volume and quality of research using optical marker-based mocap systems.
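As context for objective (2), the conventional MVC normalization that the proposed nonlinear technique would replace can be sketched in a few lines. The function name and toy signals below are illustrative assumptions, not taken from the thesis:

```python
import numpy as np

def mvc_normalize(emg_envelope, mvc_trial, eps=1e-12):
    """Express an EMG envelope as a fraction of the peak activation
    observed during a maximum voluntary contraction (MVC) trial."""
    mvc_peak = np.max(np.abs(mvc_trial))
    return np.abs(emg_envelope) / (mvc_peak + eps)

# Toy signals: a gait-cycle envelope peaking near 1.0 and an MVC trial
# whose peak amplitude is 2.0, so peak activation is about 50 %MVC.
envelope = np.abs(np.sin(np.linspace(0.0, np.pi, 100)))
mvc = 2.0 * np.ones(50)
normalized = mvc_normalize(envelope, mvc)
print(round(float(normalized.max()), 2))  # -> 0.5
```

The thesis argues that this scheme breaks down when safe MVC trials cannot be collected, which is what motivates an MVC-free, nonlinear alternative.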
|
422 |
DeepARG+ - A Computational Pipeline for the Prediction of Antibiotic Resistance. Kulkarni, Rutwik Shashank. 16 June 2021 (has links)
The global spread of antibiotic resistance warrants concerted surveillance in the clinic and in the environment. The widespread use of metagenomics for various studies has led to the generation of a large amount of sequencing data. Next-generation sequencing of microbial communities provides an opportunity for proactive detection of emerging antibiotic resistance genes (ARGs) from such data, but there are a limited number of pipelines that enable the identification of novel ARGs belonging to diverse antibiotic classes at present. Therefore, there is a need for the development of computational pipelines that can identify these putative novel ARGs. Such pipelines should be scalable, accessible and have good performance.
To address this problem, we develop a new method for predicting novel ARGs from genomic or metagenomic sequences, leveraging known ARGs of different resistance categories. Our method takes into account the physicochemical properties that are intrinsic to different ARG families. Traditionally, new ARGs are predicted by aligning query sequences against existing ARG reference databases and computing sequence similarity, which can be very time-consuming. Here we introduce an alignment-free, deep learning-based prediction method that incorporates both the primary protein sequences of ARGs and their physicochemical properties.
We compare our method with existing pipelines, including the hidden Markov model-based Resfams and fARGene, the sequence alignment and machine learning-based DeepARG-LS, and homology modelling-based Pairwise Comparative Modelling. We also use our model to detect novel ARGs from various environments, including human gut, soil, activated sludge, and influent samples collected from a wastewater treatment plant. Results show that our method achieves greater accuracy than existing models for the prediction of ARGs and enables the detection of putative novel ARGs, providing promising targets for experimental characterization to the scientific community. / Master of Science / Various bacteria contain genes that allow them to survive and grow even after the application of antibiotics. Such genes are called antibiotic resistance genes (ARGs). Each ARG has properties that make it resistant to a particular class of antibiotics, called the resistance class/category of the gene. Antimicrobial resistance (AMR) is one of the biggest challenges to public health in recent times. It has been projected that a large number of deaths might occur due to AMR in the future. Therefore, there is a need for monitoring AMR in various environments. Currently developed methods use a sequence's similarity with an existing database as a feature for ARG prediction. Some tools also use the 3D structure of proteins as a feature for ARG prediction. In this thesis, we develop a tool that incorporates both the sequence similarity and the structural information of proteins for ARG prediction. The structural information is encoded with physicochemical properties (such as hydrophobicity and molecular weight) of the amino acids. Our results show the efficacy of the pipeline in various environments. Results also show that our method achieves greater accuracy than existing models for the prediction of ARGs from metagenomic data. It also enables the detection of putative novel ARGs, providing promising targets for experimental characterization to the scientific community.
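A minimal sketch of the kind of alignment-free encoding described above: each residue is mapped to physicochemical descriptors (Kyte-Doolittle hydrophobicity and approximate molecular weight here) rather than being compared against a reference database. The property table is abbreviated and illustrative; the pipeline's actual feature set may differ:

```python
import numpy as np

# Abbreviated per-residue table: (Kyte-Doolittle hydrophobicity,
# approximate free amino acid molecular weight in Da).
PROPS = {
    "A": (1.8, 89.1),   "R": (-4.5, 174.2), "N": (-3.5, 132.1),
    "D": (-3.5, 133.1), "C": (2.5, 121.2),  "G": (-0.4, 75.1),
    "K": (-3.9, 146.2), "L": (3.8, 131.2),
}

def encode(seq):
    """Map a protein sequence to an (L, 2) physicochemical feature
    matrix, the alignment-free input a downstream CNN could consume."""
    return np.array([PROPS[aa] for aa in seq], dtype=float)

x = encode("ARNDC")
print(x.shape)  # -> (5, 2)
```

Because the encoding never consults a database, its cost grows only with sequence length, avoiding the alignment bottleneck noted above.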
|
423 |
Deep Learning-Based Image Analysis for Microwell Assay. Biörck, Jonatan; Staniszewski, Maciej. January 2024 (has links)
This thesis investigates the performance of deep learning models, specifically ResNet-50 and TransUNet, in semantic image segmentation on microwell images containing tumor and natural killer (NK) cells. The main goal is to examine the effect of using only bright-field data (1-channel) as input instead of both fluorescent and bright-field data (4-channel); this is of interest since fluorescent imaging can damage the cells being analyzed. Network performance is measured by Intersection over Union (IoU); the networks were trained and evaluated using manually annotated data from Onfelt Lab. TransUNet consistently outperformed ResNet-50 for both the 4-channel and 1-channel data. Moreover, the 4-channel input generally resulted in a better IoU than using only the bright-field channel. Furthermore, a significant decline in performance is observed when the networks are tested on the control data: the overall IoU of the best-performing 4-channel model dropped from 86.2% to 73.9%, and that of the best-performing 1-channel model from 83.8% to 70.8%.
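The IoU metric quoted above has a direct definition; a minimal sketch for integer label masks (the class IDs and masks are illustrative):

```python
import numpy as np

def iou(pred, target, cls):
    """Intersection over Union for one class, given integer label masks."""
    p, t = (pred == cls), (target == cls)
    union = np.logical_or(p, t).sum()
    return np.logical_and(p, t).sum() / union if union else float("nan")

pred = np.array([[0, 1],
                 [1, 1]])
target = np.array([[0, 1],
                   [0, 1]])
print(iou(pred, target, cls=1))  # 2 intersecting pixels / 3 in union
```

An overall IoU such as the percentages reported above is typically obtained by averaging per-class (or per-image) IoU values over the test set.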
|
424 |
Ocean Rain Detection and Wind Retrieval Through Deep Learning Architectures on Advanced Scatterometer Data. McKinney, Matthew Yoshinori Otani. 18 June 2024 (has links) (PDF)
The Advanced Scatterometer (ASCAT) is a satellite-based remote sensing instrument designed to measure wind speed and direction over the Earth's oceans. This thesis aims to expand and improve the capabilities of ASCAT by adding rain detection and advancing wind retrieval. This expansion also serves as evidence that Artificial Intelligence (AI) techniques can learn both novel and traditional methods in remote sensing. I apply semantic segmentation to ASCAT measurements to detect rain over the oceans, enhancing the ability to monitor global precipitation. I use two common neural network architectures and train them on measurements from the Tropical Rainfall Measuring Mission (TRMM) collocated with ASCAT measurements. I apply the same semantic segmentation techniques to wind retrieval in order to create a machine learning model that acts as an inverse Geophysical Model Function (GMF). I use three common neural network architectures and train the models on ASCAT data collocated with European Centre for Medium-Range Weather Forecasts (ECMWF) wind vector data. I successfully extend the capabilities of ASCAT to detect rainfall over Earth's oceans and to retrieve wind vectors without a GMF or Maximum Likelihood Estimation (MLE).
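Training on TRMM or ECMWF data collocated with ASCAT measurements presupposes a matching step between the two observation grids. A crude sketch using planar degree distance follows; real collocation would use great-circle distance and a time window, and the threshold here is an arbitrary assumption:

```python
import numpy as np

def collocate(scat_latlon, rain_latlon, max_deg=0.25):
    """For each scatterometer cell, return the index of the nearest rain
    observation within max_deg degrees, or -1 if none qualifies."""
    idx = np.full(len(scat_latlon), -1)
    for i, p in enumerate(scat_latlon):
        d = np.linalg.norm(rain_latlon - p, axis=1)
        j = int(np.argmin(d))
        if d[j] <= max_deg:
            idx[i] = j
    return idx

scat = np.array([[10.0, 140.0], [12.0, 141.0]])  # (lat, lon) wind cells
rain = np.array([[10.1, 140.1], [30.0, 90.0]])   # (lat, lon) rain cells
matches = collocate(scat, rain)
print(matches)  # first cell matches rain[0]; second has no neighbour
```

Only the matched pairs would then form the (input, label) examples for the segmentation networks.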
|
425 |
BEYOND LOCAL NEIGHBORHOODS: LEVERAGING INFORMATIVE NODES FOR IMPROVED GRAPH NEURAL NETWORKS PERFORMANCE. Liang, Peiyu. 12 1900 (has links)
Many real-world datasets, such as those from social and scientific domains, can be represented as graphs, where entities are depicted as nodes and their relationships as edges. To analyze the properties of individual entities (node classification) or the community as a whole (graph classification), graph neural networks (GNNs) serve as a powerful tool. Most GNNs utilize a message-passing scheme to aggregate information from neighboring nodes. This localized aggregation allows the network to learn representations that incorporate the context of each node, thereby enhancing its ability to capture complex local structures and relationships.
Despite their success, many GNNs heavily rely on local 1-hop neighborhood information and a stacked architecture of K layers. This dependency can result in poor handling of long-range dependencies and lead to issues like information over-squashing. Consequently, there is a pressing need for advanced methodologies that can systematically aggregate more informative nodes beyond the default graph structure to achieve more accurate classification results.
In this thesis, we highlight the challenges of information over-squashing and the limited capacity of existing GNNs to capture long-range dependencies, and we address these issues through informative node selection and end-to-end learning strategies in three approaches. Our first approach, Two-view GNNs with an adaptive view-wise structure learning strategy, posits that more informative nodes should have proximal node representations within a graph structure constructed on such attributes. We reconstruct a new graph structure based on the proximity of node representations and simultaneously learn a graph object from both the newly constructed and default graph structures for relationship reasoning. Additionally, we employ an adaptive strategy that learns inter-structure relationships based on classifier performance. While this approach achieves more accurate classifications, it is still limited by relying on only one or two graph structures. Our second approach, Cauchy-smoothing GCN (CauchyGCN), uses the default graph structure but regards as more informative those nodes that are closely embedded in the embedding space. CauchyGCN develops a new layer-wise message-passing scheme that follows the properties of the Cauchy distribution, preserving smoothness between closely embedded nodes while penalizing distant 1-hop neighbors less severely. This approach shows competitive results compared to other advancements. From our first two approaches, we observe that (1) understanding the graph requires learning from multiple perspectives, and informative nodes may reside beyond the default graph structure, and (2) preserving smoothness among informative nodes is beneficial for effective learning. Our third approach, the Topological-induced Graph Transformer (TOPGT), defines the additional useful graph structures as topological structures and leverages a self-attention mechanism to assess the importance of closely embedded nodes. This approach achieves state-of-the-art performance compared to existing methods in the domain.
Finally, we summarize our contributions through the three approaches that address the challenges highlighted in this thesis. Additionally, we discuss potential future work on exploring and utilizing informative node information beyond local neighborhoods, aiming to develop large pre-trained GNNs capable of tackling various downstream tasks across different domains. / Computer and Information Science
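CauchyGCN's message-passing scheme is described above only qualitatively; a toy sketch of neighbor-aggregation weights under a Cauchy-style kernel follows. The kernel form and the gamma parameter are our illustrative assumptions, not the thesis' exact formulation:

```python
import numpy as np

def cauchy_weights(h_center, h_neighbors, gamma=1.0):
    """Aggregation weights from a Cauchy-style kernel: closely embedded
    neighbours get high weight, while the heavy tail penalizes distant
    1-hop neighbours less severely than a Gaussian kernel would."""
    d2 = np.sum((h_neighbors - h_center) ** 2, axis=1)
    w = 1.0 / (1.0 + d2 / gamma ** 2)
    return w / w.sum()

h = np.zeros(2)                  # embedding of the centre node
nbrs = np.array([[0.1, 0.0],     # a closely embedded neighbour
                 [3.0, 0.0]])    # a distant 1-hop neighbour
w = cauchy_weights(h, nbrs)
print(w[0] > w[1])  # -> True
```

The heavy tail of the Cauchy kernel is what keeps the distant neighbor's weight non-negligible, matching the "penalized less severely" behavior described above.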
|
426 |
Traffic Forecasting Applications Using Crowdsourced Traffic Reports and Deep Learning. Alammari, Ali. 05 1900 (has links)
Intelligent transportation systems (ITS) are essential tools for traffic planning, analysis, and forecasting that can utilize the huge amount of traffic data available nowadays. In this work, we aggregated detailed traffic flow sensor data, Waze reports, OpenStreetMap (OSM) features, and weather data from the California Bay Area over 6 months. Using that data, we studied three novel ITS applications using convolutional neural networks (CNNs) and recurrent neural networks (RNNs). The first experiment is an analysis of the relation between roadway shapes and accident occurrence, where results show that the speed limit and number of lanes are significant predictors of major accidents on highways. The second experiment presents a novel method for forecasting congestion severity using crowdsourced data only (Waze, OSM, and weather), without the need for traffic sensor data. The third experiment studies the improvement of traffic flow forecasting using accidents, number of lanes, weather, and time-related features, where results show significant performance improvements when the additional features were used.
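The forecasting experiments above imply framing traffic series as supervised input-target pairs for the CNNs/RNNs; a minimal, generic sketch of that framing follows (window sizes and the stand-in series are illustrative):

```python
import numpy as np

def make_windows(series, lag, horizon=1):
    """Frame a time series as supervised pairs: `lag` past readings as
    input, the value `horizon` steps ahead as the forecasting target."""
    X, y = [], []
    for t in range(lag, len(series) - horizon + 1):
        X.append(series[t - lag:t])
        y.append(series[t + horizon - 1])
    return np.array(X), np.array(y)

flow = np.arange(10.0)          # stand-in for traffic flow readings
X, y = make_windows(flow, lag=3)
print(X.shape, y.shape)  # -> (7, 3) (7,)
```

Additional features such as accident counts or weather can then be appended to each window as extra input channels.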
|
427 |
An Investigation of Scale Factor in Deep Networks for Scene Recognition. Qiao, Zhinan. 05 1900 (has links)
Is there a significant difference in the design of deep networks for the tasks of classifying object-centric images and scenery images? How should networks be designed to extract the most representative features for scene recognition? To answer these questions, we design studies that examine the scales and richness of image features for scenery image recognition. Three methods are proposed that integrate the scale factor into deep networks and reveal fundamental network design strategies. In our first attempt to integrate scale factors into the deep network, we proposed a method that aggregates both the context and multi-scale object information of scene images by constructing a multi-scale pyramid. In our design, integration of object-centric multi-scale networks achieved a performance boost of 9.8%; integration of object- and scene-centric models obtained an accuracy improvement of 5.9% compared with single scene-centric models. We also brought an attention scheme to the deep network and proposed a Scale Attentive Network (SANet). The SANet streamlines the multi-scale scene recognition pipeline, learns comprehensive scene features at various scales and locations, addresses the inter-dependency among scales, and further assists feature re-calibration as well as the aggregation process. The proposed network achieved a Top-1 accuracy increase of 1.83% on the Places365-Standard dataset with only 0.12% additional parameters and 0.24% additional GFLOPs using ResNet-50 as the backbone. We further brought the scale factor implicitly into network backbone design by proposing a Deep-Narrow Network and a Dilated Pooling module. The Deep-Narrow architecture increases the depth of the network while decreasing its width, exploiting a variety of receptive fields by stacking more layers. The Dilated Pooling module expands the pooling scope and makes use of multi-scale features in the pooling operation. By embedding the Dilated Pooling module into the Deep-Narrow Network, we obtained a Top-1 accuracy boost of 0.40% using less than half of the GFLOPs and parameters compared to the benchmark ResNet-50.
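The idea of expanding the pooling scope can be illustrated with a toy 2D max pooling whose window samples inputs `dilation` positions apart. This sketch is our reading of the dilated-pooling idea, not the thesis' exact module:

```python
import numpy as np

def dilated_max_pool(x, k=2, dilation=2):
    """2D max pooling whose k x k window samples inputs `dilation`
    apart, expanding the pooling scope without extra parameters."""
    span = dilation * (k - 1) + 1
    out = np.empty((x.shape[0] - span + 1, x.shape[1] - span + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = x[i:i + span:dilation, j:j + span:dilation].max()
    return out

x = np.arange(16.0).reshape(4, 4)
pooled = dilated_max_pool(x)
print(pooled.shape)  # -> (2, 2)
```

With dilation = 1 this reduces to ordinary max pooling; larger dilations let the same 2 x 2 window cover a wider spatial extent.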
|
428 |
Integrating Multiple Deep Learning Models to Classify Disaster Scene Videos. Li, Yuan. 12 1900 (has links)
Recently, disaster scene description and indexing challenges have attracted the attention of researchers. In this dissertation, we solve a disaster-related multi-labeling task using the newly developed Low Altitude Disaster Imagery dataset. In the first task, we summarize video content by selecting a set of key frames, generated through inter-frame differences, to represent each video sequence. Key frame extraction of disaster-related video clips is a powerful tool that efficiently converts video data into image-level data, reduces the requirements on the extraction environment, and broadens the applicable environments. In the second task, we propose a novel application of deep learning methods to low-altitude disaster video feature recognition. Supervised deep learning approaches are effective at recognizing disaster-related features via foreground object detection and background classification. Validated on the dataset, our model generalizes well; we improved performance by optimizing the YOLOv3 model and combining it with ResNet-50. The combined models proved more efficient and effective than those in prior published works. In the third task, we optimize whole-scene labeling classification by pruning the lightweight MobileNetV3 model, which shows superior generalizability and allows disaster feature recognition from a disaster-related dataset to be accomplished efficiently to assist disaster recovery.
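The inter-frame difference key frame extraction can be sketched directly; the threshold and toy frames below are illustrative assumptions:

```python
import numpy as np

def key_frames(frames, threshold):
    """Select key frames by inter-frame difference: keep a frame when
    its mean absolute difference from the last kept frame exceeds the
    threshold, converting video data into image-level data."""
    kept = [0]
    for i in range(1, len(frames)):
        if np.mean(np.abs(frames[i] - frames[kept[-1]])) > threshold:
            kept.append(i)
    return kept

static = np.zeros((4, 8, 8))         # four near-identical frames
change = 50.0 * np.ones((1, 8, 8))   # one abrupt scene change
video = np.concatenate([static, change])
print(key_frames(video, threshold=10.0))  # -> [0, 4]
```

Only the kept frames then need to pass through the heavier detection and classification models, which is where the efficiency gain comes from.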
|
429 |
Multihazard-Expositionsmodellierung mit multimodalen Geobilddaten und Deep Learning / Multi-hazard exposure modeling with multimodal geo-image data and deep learning. Aravena Pelizari, Patrick. January 2025 (has links) (PDF)
Aufgrund der fortschreitenden Prozesse des Bevölkerungswachstums, der Urbanisierung und des Klimawandels sind weltweit erheblich mehr Menschen und Sachwerte Naturgefahren ausgesetzt. Essenziell für eine wirksame Risikoreduktion und ein effektives Katastrophenmanagement sind aktuelle Expositionsmodelle mit detaillierten, räumlich verorteten Informationen über die gebaute Umwelt und deren Vulnerabilität. Diese Daten sind jedoch oft nur unzureichend verfügbar. Gleichzeitig sind die Anforderungen an das Expositionsmodell hinsichtlich des thematischen Informationsgehalts und der räumlichen Auflösung für eine konsistente Vulnerabilitätsbewertung im Kontext multipler Naturgefahren hoch, da i) unterschiedliche Gebäudeattribute die Vulnerabilität gegenüber unterschiedlichen Naturgefahren bedingen, ii) unterschiedliche Naturgefahren auf unterschiedlichen räumlichen Skalen und mit unterschiedlicher räumlicher Variabilität auftreten.
Georeferenzierte bildgebende Sensordaten sind heute eine essenzielle Quelle für die automatisierte Gewinnung räumlicher Informationen. Die zunehmende Verfügbarkeit von Datenerfassungsinitiativen (Remote- und In-situ-Sensing) sowie Social Media und die Fortschritte in der künstlichen Intelligenz haben diese Entwicklung stark vorangetrieben. Das übergeordnete Ziel dieser Dissertation ist es, auf der Grundlage heterogener Geobilddaten und aktueller Techniken des Deep Learning (DL) Methoden aufzuzeigen, die eine effiziente, räumlich hoch aufgelöste sowie großflächige multikriterielle Charakterisierung der gebauten Umwelt für die Multirisikoanalyse ermöglichen. Dieses Ziel wird anhand von drei Teilstudien adressiert. Testgebiet ist die erdbebengefährdete Millionenmetropole Santiago de Chile.
Die erste Studie untersucht das Potenzial von Convolutional Neural Networks (CNN) und Street-Level-Bilddaten (SLI) für die automatisierte Erfassung vulnerabilitätsrelevanter Gebäudecharakteristika. Der verfolgte Ansatz beinhaltet einen hierarchischen Workflow, um die heterogenen SLI anwendungsorientiert zu akquirieren und strukturieren. Es werden dem Stand der Forschung entsprechende CNN eingesetzt, um i) den Seismischen Gebäudestrukturtyp (SBST), ii) das laterale Last abtragende System (LLRS; Lateral Load Resisting System) und iii) die Gebäudehöhe abzuleiten. Diese Attribute reflektieren die tragende Struktur eines Gebäudes und damit seine Widerstandsfähigkeit gegenüber den durch Naturgefahrenereignisse einwirkenden Kräften. Die experimentellen Ergebnisse zeigen für alle Klassifikationsaufgaben Genauigkeiten jenseits von 𝜅 = 0,81. Dies unterstreicht das Potenzial von SLI und DL für die In-situ-Gebäudedatenerfassung zur großräumigen Bewertung von Naturgefahrenrisiken.
Die zweite Studie zielt auf eine synergistische und effiziente multikriterielle Charakterisierung von Gebäuden für die Multihazard-Risikobewertung ab. Deep Multitask Learning (MTL) wird eingesetzt, um diese Anforderung zu erfüllen. Der entwickelte Ansatz verbessert die Prädiktionsgenauigkeit bei multiplen Bildklassifikationsproblemen durch die Berücksichtigung Task-übergreifender Interdependenzen. Ein intermediär überwachtes Hard Parameter Sharing CNN gibt Task-weise prädizierte Interimsklassenwahrscheinlichkeiten aus. Interdependenzen werden dann auf zwei Arten erfasst: i) durch das direkte Anhängen der Klassenwahrscheinlichkeiten an den Bildmerkmalsvektor (Multitask Stacking) und ii) durch deren Weiterleitung an rekurrente neuronale Netze (Gated Recurrent Units), um explizit Interdependenzrepräsentationen zu lernen (Interdependency Representation Learning). Die MTL-Architektur wird für die Klassifikation von Gebäuden nach fünf Zielvariablen angewandt: Höhe, LLRS-Material, SBST, Dachform und Blockposition. Die Klassifikationsgenauigkeiten der neuen MTL-Ansätze übertreffen sowohl Single Task Learning als auch klassisches Hard Parameter Sharing MTL. Bereits hohe initial geschätzte Generalisierungsfähigkeiten konnten mit akkumulierten Task-spezifischen Residuen von mehr als +6 % 𝜅 deutlich gesteigert werden und erreichten mittlere Task-Genauigkeiten von bis zu 88,43 % Overall accuracy und 84,49 % 𝜅. Hinsichtlich des Trainingszeitaufwands erweisen sich die vorgeschlagenen MTL-Methoden als sehr effizient.
Die dritte Studie fokussiert auf das Potenzial der Integration heterogener multimodaler Geobilddaten – SLI, sehr hoch aufgelöste optische Fernerkundungsdaten sowie ein normalisiertes digitales Oberflächenmodell – für die flächenhafte Charakterisierung naturgefahrenexponierter Gebäude. Es wird eine objektbasierte Multi-Input-/Multi-Output-DL-Methodik vorgestellt, die eine multikriterielle Gebäudeklassifikation durch die synergistische Fusion der multisensoralen Daten ermöglicht. Um das Problem partiell fehlender SLI zu adressieren, wird der transformerbasierte SLI Spatial Context Encoder verwendet. Dieser nutzt die räumlichen Korrelationen zwischen physisch-strukturellen Gebäudeattributen, um die semantischen Informationen der vorhandenen SLI flächendeckend zugänglich zu machen. Für die Integration der multimodalen Informationen wird die Task-wise Modality Attention (TMA) Fusion vorgeschlagen. Diese optimiert die Merkmalsrepräsentationen für die einzelnen Inferenz-Tasks separat, nach deren spezifischen Anforderungen. Auf dieser Grundlage wird unter Berücksichtigung der beiden Datensituationen SLI verfügbar und SLI fehlend ein umfassender experimenteller Kreuzvergleich der Generalisierungsfähigkeiten durchgeführt und der Mehrwert der unterschiedlichen Modalitäten, ihrer Kombinationen sowie der TMA-Datenfusionsmethode evaluiert. Die Ergebnisse verdeutlichen den hohen semantischen Mehrwert von SLI sowie der abgeleiteten räumlich-kontextuellen Information für die Erfassung physisch-struktureller Gebäudecharakteristika. Zudem zeigen sie, dass alle Modalitätskombinationen positive Synergien bieten. Dabei erzielt die TMA-Fusion durchweg höhere mittlere Task-Genauigkeiten als die Benchmark-Methoden. Die genauesten Modelle werden für die Ableitung eines räumlich kontinuierlichen Expositionsmodells angewandt.
Die vorgestellte Methodik ermöglicht die automatisierte, großräumige Erfassung vulnerabilitätsbedingender Gebäudeattribute mit einzigartiger räumlicher und thematischer Auflösung. Diese Detailtiefe ist entscheidend für eine konsistente Bewertung der Multihazard-Vulnerabilität und damit für ein erfolgreiches Risiko- und Katastrophenmanagement. Die Ergebnisse dieser Dissertation liefern fundierte Einblicke in das Potenzial von multimodalen Geobilddaten und DL zur effizienten Bereitstellung von Expositionsinformationen. / The ongoing processes of population growth, urbanization, and climate change have led to a drastic increase in the number of people and assets exposed to natural hazards worldwide. For effective risk reduction and disaster management, up-to-date exposure models with detailed, spatially localized information on the built environment and its vulnerability are essential. However, such data are often insufficiently available. At the same time, holistic vulnerability assessments across multiple natural hazards place high demands on exposure models in terms of thematic information and spatial resolution, as i) different building attributes may affect vulnerability to different hazards, and ii) natural hazards may differ in spatial scale and exhibit distinct spatial variabilities.
Today, geo-referenced imaging sensor data are an essential source for the automated extraction of spatial information. This has been largely driven by the ever-increasing availability of data collection initiatives – both remote and in-situ sensing – along with social media and advancements in data analysis methods, particularly in the field of artificial intelligence. The overarching goal of this dissertation is to demonstrate methods based on heterogeneous geospatial image data and current deep learning (DL) techniques that enable efficient, high spatial resolution, large-scale multicriteria characterization of the built environment for multi-risk analysis. This objective is addressed through three sub-studies, with the test area being the earthquake-prone metropolis of Santiago, Chile.
The first study investigates the potential of Convolutional Neural Networks (CNN) and Street-Level Imagery (SLI) for the automated collection of vulnerability-related building characteristics. The approach involves a hierarchical workflow to acquire and structure heterogeneous SLI in an application-oriented manner. State-of-the-art CNN are used to derive: i) the Seismic Building Structural Type (SBST), ii) the Lateral Load Resisting System (LLRS) material, and iii) building height. These attributes reflect a building’s load-bearing structure and, consequently, its resistance to forces exerted by natural hazard events. The experimental results show classification accuracies above 𝜅 = 0.81 for all tasks, underlining the high potential of SLI and DL for in situ building data collection for natural hazard risk assessment at large spatial scales.
The second study aims at a synergistic and efficient multi-criteria characterization of buildings for multi-hazard risk assessment. Deep Multitask Learning (MTL) is used to address this challenge. The proposed deep MTL architecture enhances prediction accuracy in multiple image classification tasks by accounting for cross-task interdependencies. These interdependencies are inferred based on task-wise interim class label probability predictions from an intermediately supervised hard parameter sharing CNN: i) by directly stacking label probability sequences to the image feature vector (i.e., multitask stacking), and ii) by passing probability sequences to recurrent neural networks (Gated Recurrent Units) to explicitly learn cross-task interdependency representations (i.e., interdependency representation learning). The MTL architecture is applied to classify buildings according to five target variables: height, LLRS material, SBST, roof shape, and block position. The classification accuracies of the new MTL approaches outperform both single-task learning and classical hard-parameter sharing MTL. Already high initial estimated generalization capabilities can be significantly increased with accumulated task-specific residuals of more than +6 % 𝜅, achieving mean cross-task accuracy values of up to 88.43 % overall accuracy and 84.49 % 𝜅. In terms of training time, the proposed MTL methods also prove very efficient.
The third study explores the potential of integrating heterogeneous multimodal geospatial image data – SLI, very high-resolution optical remote sensing data, and a normalized digital surface model – for large-scale characterization of buildings exposed to natural hazards. An object-based multi-Input-/multi-Output DL methodology is presented, enabling the synergistic fusion of multi-sensor data for multi-criteria building classification. To address the issue of partially missing SLI data, the transformer-based SLI Spatial Context Encoder is proposed. This model leverages spatial correlations between physical-structural building attributes to make the semantic information of available SLI widely accessible. Task-wise Modality Attention (TMA) fusion is proposed to integrate the multimodal information. TMA optimizes feature representations for individual inference tasks separately, according to their specific requirements. Considering the two data scenarios – SLI available and SLI missing – a comprehensive experimental cross-comparison of generalization capabilities is conducted to evaluate the added value of the different modalities, their combinations, and TMA fusion. The results emphasize the high semantic value of SLI and the derived spatio-contextual representations for capturing physical-structural building characteristics. Additionally, they demonstrate that all modality combinations offer positive synergies. TMA fusion consistently outperforms benchmark methods. The most accurate models are subsequently applied to derive a spatially continuous exposure model.
The presented methodology enables the automated, large-scale collection of vulnerability-related building attributes with an unprecedented combination of spatial and thematic resolution. This level of detail is essential for a consistent assessment of multi-hazard vulnerability and, consequently, for effective risk and disaster management. The results of this dissertation offer in-depth insights into the potential of multimodal geo-image data and DL for the efficient provision of exposure information.
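The multitask stacking step of the second study has a simple core: task-wise interim class probabilities are appended to the shared image feature vector so that later task heads can exploit cross-task interdependencies. A toy sketch follows; the feature size and the two task heads are illustrative assumptions:

```python
import numpy as np

def multitask_stack(features, interim_probs):
    """Multitask stacking: concatenate task-wise interim class-probability
    vectors onto the shared image feature vector, so later task heads can
    exploit cross-task interdependencies."""
    return np.concatenate([features] + list(interim_probs))

feat = np.zeros(128)                  # stand-in for shared CNN features
probs = [np.array([0.7, 0.2, 0.1]),   # e.g. interim height probabilities
         np.array([0.5, 0.5])]        # e.g. interim roof-shape probabilities
stacked = multitask_stack(feat, probs)
print(stacked.shape)  # -> (133,)
```

The alternative route described above instead feeds the probability sequences through Gated Recurrent Units to learn explicit interdependency representations rather than concatenating them directly.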
|
430 |
Apprentissage profond pour l'analyse de l'EEG continu / Deep learning for continuous EEG analysis. Sors, Arnaud. 27 February 2018 (has links)
Ces travaux de recherche visent à développer des méthodes d’apprentissage automatique pour l’analyse de l’électroencéphalogramme (EEG) continu. L’EEG continu est une modalité avantageuse pour l’évaluation fonctionnelle des états cérébraux en réanimation ou pour d’autres applications. Cependant son utilisation aujourd’hui demeure plus restreinte qu’elle ne pourrait l’être, car dans la plupart des cas l’interprétation est effectuée visuellement par des spécialistes. Les sous-parties de ce travail s’articulent autour de l’évaluation pronostique du coma post-anoxique, choisie comme application pilote. Un petit nombre d’enregistrements de longue durée a été réalisé, et des enregistrements existants ont été récupérés au CHU Grenoble. Nous commençons par valider l’efficacité des réseaux de neurones profonds pour l’analyse EEG d’échantillons bruts. Nous choisissons à cet effet de travailler sur la classification de stades de sommeil. Nous utilisons un réseau de neurones convolutionnel adapté pour l’EEG que nous entrainons et évaluons sur le jeu de données SHHS (Sleep Heart Health Study). Cela constitue le premier système neuronal à cette échelle (5000 patients) pour l’analyse du sommeil. Les performances de classification atteignent ou dépassent l’état de l’art. En utilisation réelle, pour la plupart des applications cliniques le défi principal est le manque d’annotations adéquates sur les patterns EEG ou sur de courts segments de données (et la difficulté d’en établir). Les annotations disponibles sont généralement de haut niveau (par exemple, le devenir clinique) et sont donc peu nombreuses. Nous recherchons comment apprendre des représentations compactes de séquences EEG de façon non-supervisée/semi-supervisée. Le domaine de l’apprentissage non supervisé est encore jeune.
Pour se comparer aux travaux existants nous commençons avec des données de type image, et investiguons l’utilisation de réseaux adversaires génératifs (GANs) pour l’apprentissage adversaire non-supervisé de représentations. La qualité et la stabilité de différentes variantes sont évaluées. Nous appliquons ensuite un GAN de Wasserstein avec pénalité sur les gradients à la génération de séquences EEG. Le système, entrainé sur des séquences mono-piste de patients en coma post anoxique, est capable de générer des séquences réalistes. Nous développons et discutons aussi des idées originales pour l’apprentissage de représentations en alignant des distributions dans l’espace de sortie du réseau représentatif.Pour finir, les signaux EEG multipistes ont des spécificités qu’il est souhaitable de prendre en compte dans les architectures de caractérisation. Chaque échantillon d’EEG est un mélange instantané des activités d’un certain nombre de sources. Partant de ce constat nous proposons un système d’analyse composé d’un sous-système d’analyse spatiale suivi d’un sous-système d’analyse temporelle. Le sous-système d’analyse spatiale est une extension de méthodes de séparation de sources construite à l’aide de couches neuronales avec des poids adaptatifs pour la recombinaison des pistes, c’est à dire que ces poids ne sont pas appris mais dépendent de caractéristiques du signal d’entrée. Nous montrons que cette architecture peut apprendre à réaliser une analyse en composantes indépendantes, si elle est entrainée sur une mesure de non-gaussianité. Pour l’analyse temporelle, des réseaux convolutionnels classiques utilisés séparément sur les pistes recombinées peuvent être utilisés. / The objective of this research is to explore and develop machine learning methods for the analysis of continuous electroencephalogram (EEG). Continuous EEG is an interesting modality for functional evaluation of cerebral state in the intensive care unit and beyond. 
Today its clinical use remains more limited than it could be, because interpretation is still mostly performed visually by trained experts. In this work we develop automated analysis tools based on deep neural models. The subparts of this work center on post-anoxic coma prognostication, chosen as a pilot application. A small number of long-duration recordings were made, and available existing data were gathered from CHU Grenoble. Different components of a semi-supervised architecture addressing the application are designed, developed, and validated on surrogate tasks. First, we validate the effectiveness of deep neural networks for EEG analysis from raw samples. For this we choose the supervised task of sleep stage classification from single-channel EEG. We use a convolutional neural network adapted for EEG, which we train and evaluate on the SHHS (Sleep Heart Health Study) dataset. This constitutes the first neural sleep scoring system at this scale (5,000 patients). Classification performance reaches or surpasses the state of the art. In practice, for most clinical applications, the main challenge is the lack of suitable annotations on patterns or short EEG segments (and the difficulty of establishing them). Available annotations are high-level (for example, clinical outcome) and therefore scarce. We investigate how to learn compact EEG representations in an unsupervised/semi-supervised manner. The field of unsupervised learning with deep neural networks is still young. To compare with existing work, we start with image data and investigate the use of generative adversarial networks (GANs) for unsupervised adversarial representation learning. The quality and stability of different variants are evaluated. We then apply a gradient-penalized Wasserstein GAN to EEG sequence generation. The system, trained on single-channel sequences from post-anoxic coma patients, is able to generate realistic synthetic sequences.
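The gradient-penalized Wasserstein critic objective mentioned above can be sketched numerically. The following minimal NumPy illustration uses a toy fully-connected critic in place of the convolutional one; all shapes, sizes, and names here are illustrative assumptions, not the thesis's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy critic: a tiny one-hidden-layer network scoring single-channel
# windows of 64 samples (hypothetical stand-in for a conv critic).
W1 = rng.normal(0, 0.1, (16, 64))
b1 = np.zeros(16)
w2 = rng.normal(0, 0.1, 16)

def critic(x):
    """Score a batch of sequences x of shape (batch, 64)."""
    h = np.tanh(x @ W1.T + b1)
    return h @ w2

def critic_grad(x):
    """Analytic gradient of the critic score w.r.t. its input."""
    h = np.tanh(x @ W1.T + b1)        # (batch, 16)
    dh = (1.0 - h ** 2) * w2          # chain rule through tanh
    return dh @ W1                    # (batch, 64)

def wgan_gp_loss(real, fake, lam=10.0):
    """Critic loss: E[f(fake)] - E[f(real)] + lam * E[(||grad f|| - 1)^2],
    with the penalty evaluated at random interpolates of real and fake."""
    eps = rng.uniform(size=(real.shape[0], 1))
    x_hat = eps * real + (1.0 - eps) * fake
    grad_norm = np.linalg.norm(critic_grad(x_hat), axis=1)
    gp = np.mean((grad_norm - 1.0) ** 2)
    return np.mean(critic(fake)) - np.mean(critic(real)) + lam * gp

real = rng.normal(size=(8, 64))   # stand-in for real EEG windows
fake = rng.normal(size=(8, 64))   # stand-in for generator output
loss = wgan_gp_loss(real, fake)
print(loss)
```

The penalty drives the critic's input gradient toward unit norm on segments between real and generated sequences, which is what keeps Wasserstein training stable without weight clipping.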
We also explore and discuss original ideas for learning representations by matching distributions in the output space of representation networks. Finally, multichannel EEG signals have specificities that should be accounted for in characterization architectures. Each EEG sample is an instantaneous mixture of the activities of a number of sources. Based on this observation, we propose an analysis system made of a spatial analysis subsystem followed by a temporal analysis subsystem. The spatial analysis subsystem is an extension of source separation methods, built from neural layers with adaptive recombination weights, i.e. weights that are not learned but computed from features of the input. We show that this architecture learns to perform Independent Component Analysis if it is trained on a measure of non-Gaussianity. For temporal analysis, standard (shared) convolutional neural networks applied to the separate recombined channels can be used.
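The claim that a non-Gaussianity measure can serve as a training signal for source separation rests on a central-limit effect: instantaneous mixtures are closer to Gaussian than the underlying sources. A minimal NumPy sketch with excess kurtosis as the measure (the source distributions and mixing matrix are hypothetical, chosen only to illustrate the effect):

```python
import numpy as np

rng = np.random.default_rng(1)

def excess_kurtosis(x):
    """Sample excess kurtosis: 0 for a Gaussian, positive for spiky
    (super-Gaussian) signals such as many EEG source activities."""
    x = (x - x.mean()) / x.std()
    return np.mean(x ** 4) - 3.0

# Two independent non-Gaussian "sources" and an instantaneous mixture,
# mirroring the model where each EEG channel mixes several sources.
n = 200_000
s1 = rng.laplace(size=n)             # super-Gaussian source
s2 = rng.uniform(-1, 1, size=n)      # sub-Gaussian source
A = np.array([[0.8, 0.6],            # hypothetical mixing matrix
              [0.4, 0.9]])
x1, x2 = A @ np.vstack([s1, s2])

# Mixing pulls the distribution toward Gaussian, so |excess kurtosis|
# drops; maximizing it over recombination weights therefore steers the
# spatial subsystem back toward the independent components.
print(abs(excess_kurtosis(s1)), abs(excess_kurtosis(x1)))
```

This is the same objective family used by classical FastICA-style methods; in the proposed architecture it is used as a loss for the layers that compute the recombination weights.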
|