401 |
ADVANCED TRANSFER LEARNING IN DOMAINS WITH LOW-QUALITY TEMPORAL DATA AND SCARCE LABELS
Abdel Hai, Ameen, 0000-0001-5173-5291 12 1900
Numerous high-impact applications involve predictive modeling of real-world data, ranging from hospital readmission prediction for enhanced patient care to event detection in power systems for grid stabilization. Developing performant machine learning models requires extensive high-quality training data, ample labeled samples, and training and testing datasets drawn from identical distributions. However, these requirements may be impractical in applications where obtaining labeled data is expensive or challenging, where data quality is low, or where covariate or concept shifts occur. Our emphasis was on devising transfer learning methods to address the inherent challenges across two distinct applications.
We delved into a notably challenging transfer learning application: predicting hospital readmission risk from electronic health record (EHR) data to identify patients who may benefit from extra care. Readmission models based on EHR data can be compromised by quality variations due to manual data input methods. Using high-quality EHR data from a different hospital system to enhance prediction at a target hospital with traditional approaches might bias the dataset if the distributions of the source and target data differ. To address this, we introduce an Early Readmission Risk Temporal Deep Adaptation Network, ERR-TDAN, for cross-domain knowledge transfer. A model developed using target data from an urban academic hospital was enhanced by transferring knowledge from high-quality source data. Given the success of our method in learning from data sourced from multiple hospital systems with different distributions, we further addressed the challenge and infeasibility of developing hospital-specific readmission risk prediction models using data from individual hospital systems. To this end, extending the previous method, we introduce an Early Readmission Risk Domain Generalization Network, ERR-DGN, which generalizes across multiple EHR data sources and adapts to previously unseen test domains.
In another challenging application, we addressed event detection in electrical grids, which are highly non-linear systems with spatiotemporal dependencies, using high-volume field-recorded data from multiple Phasor Measurement Units (PMUs). Existing historical event logs created manually do not correlate well with the corresponding PMU measurements because labels are scarce and temporally imprecise. Extending event logs to a more complete set of labeled events is very costly and often infeasible. We focused on a transfer learning method tailored for event detection from PMU data to reduce the need for additional manual labeling. To demonstrate feasibility, we tested our approach on large datasets collected from the Western and Eastern Interconnections of the U.S.A. by reusing a small set of carefully selected labeled PMU data from one power system to detect events in another.
Experimental findings suggest that the proposed knowledge transfer methods for healthcare and power system applications have the potential to effectively address the identified challenges and limitations. Evaluation of the proposed readmission models shows that readmission risk predictions can be enhanced when leveraging higher-quality EHR data from a different site, and when models are trained on data from multiple sites and subsequently applied to a novel hospital site. Moreover, label scarcity in power systems can be addressed by a transfer learning method in conjunction with a semi-supervised algorithm that is capable of detecting events from minimal labeled instances. / Computer and Information Science
|
402 |
End-To-End Text Detection Using Deep Learning
Ibrahim, Ahmed Sobhy Elnady 19 December 2017
Text detection in the wild is the problem of locating text in images of everyday scenes. It is a challenging problem due to the complexity of everyday scenes. This problem is of great importance for many trending applications, such as self-driving cars.
Previous research in text detection has been dominated by multi-stage sequential approaches which suffer from many limitations including error propagation from one stage to the next.
Another line of work is the use of deep learning techniques. Some of the deep methods used for text detection are box detection models and fully convolutional models. Box detection models suffer from the nature of the annotations, which may be too coarse to provide detailed supervision. Fully convolutional models learn to generate pixel-wise maps that represent the location of text instances in the input image. These models suffer from the inability to create accurate word-level annotations without heavy post-processing.
To overcome these problems, we propose a novel end-to-end system based on a mix of novel deep learning techniques. The proposed system consists of an attention model, based on a new deep architecture proposed in this dissertation, followed by a deep network based on Faster-RCNN. The attention model produces a high-resolution map that indicates likely locations of text instances. A novel aspect of the system is an early fusion step that merges the attention map directly with the input image prior to word-box prediction. This approach suppresses but does not eliminate contextual information from consideration. Progressively larger models were trained in three separate phases. The resulting system has demonstrated an ability to detect text under difficult conditions related to illumination, resolution, and legibility.
The system has exceeded the state of the art on the ICDAR 2013 and COCO-Text benchmarks with F-measure values of 0.875 and 0.533, respectively. / Ph. D. / Text detection and recognition in the wild is the problem of locating and reading text in images of everyday scenes. Text detection refers to finding the bounding boxes that describe the location of text areas in an input image, while text recognition describes the problem of generating a transcript out of the detected text areas. Recognition can be viewed as simply Optical Character Recognition (OCR). OCR is an old problem where the developed models are considered mature. Text detection and recognition are challenging problems due to the complexity of everyday scenes, compared to the simpler problem of recognizing text in scanned documents. This problem is of great importance to many trending applications that need to locate and read text in the wild, such as self-driving cars. Researchers tend to focus on the text detection problem only, due to the maturity of research related to text recognition. Previous research in text detection has been dominated by multi-stage sequential approaches. Those methods suffer from many limitations including, but not limited to, error propagation from the earlier stages to the later stages of the pipeline. Another line of work is the use of deep learning techniques. Deep learning is the state of the art in machine learning. It has demonstrated great success in many domains, including computer vision. Some of the deep methods used for text detection are box detection models and fully convolutional models. Box detection models learn to generate bounding box coordinates for text instances that exist in the input image. Box detection models suffer from the nature of the annotations, which may be too coarse to provide detailed supervision. Fully convolutional models learn to generate pixel-wise maps that represent the location of text instances in the input image. These models suffer from the inability to create accurate word-level annotations without heavy post-processing. To overcome these problems, we propose a novel end-to-end system based on a mix of novel deep learning techniques. The proposed system consists of an attention model followed by a network based on Faster-RCNN that has been conditioned to generate word-box predictions. The attention model produces a high-resolution map that indicates likely locations of text instances. A novel aspect of the system is an early fusion step that merges the attention map directly with the input image prior to word-box prediction. This approach suppresses but does not eliminate contextual information from consideration, and avoids the common problem of discarding small text regions. To facilitate training of the end-to-end system, progressively larger models were trained in three separate phases. The resulting system has demonstrated an ability to detect text under difficult conditions related to illumination, resolution, and legibility. The system has exceeded the state of the art on the well-known ICDAR 2013 and COCO-Text benchmarks. For the former, the system has produced results with an F-measure value of 0.875. For the more challenging COCO-Text dataset, the system has shown a dramatic increase in performance with an F-measure value of 0.533, as compared to previously reported values in the range of 0.33 to 0.37. In order to build a powerful system, we introduced a novel deep learning architecture that achieved impressive performance on standard benchmarks. This architecture has been used as a backbone for the proposed attention model. A description of the proposed end-to-end system, as well as the implementation steps, will be detailed in the following sections.
|
403 |
Computational extended depth of field fluorescence microscopy in miniaturized and tabletop platforms
Greene, Joseph 10 September 2024
Fluorescence microscopy has become an indispensable technology for advancing fundamental neuroscience by recovering labeled neural structures with high resolution. To enable these studies, the field has adopted low-cost widefield 1-photon epi-fluorescence microscopes to image fixed samples and miniaturized head-mounted miniscopes to monitor neural activity in freely behaving animals. However, fluorescence imaging platforms face a number of challenges, such as a limited depth of field (DoF), lack of optical sectioning, and susceptibility to scattering and aberrations, which compromise image quality and signal fidelity. As a result, neural studies are often constrained to a shallow volume near the surface of the sample and are limited by high noise and background.
To overcome these challenges, this thesis introduces two novel frameworks that combine pupil engineering with computational imaging to push the performance of miniaturized and tabletop fluorescence neural imaging platforms. These strategies will directly optimize and integrate custom phase elements on the often-vacant pupil plane to enable the encoding of extended fluorescence signals by designing a point spread function (PSF) that exhibits an extended depth of field (EDoF) in scattering media. Next, these strategies will use tailored post processing algorithms to recover that extended information from the resulting images. As a result, this strategy allows for the recovery of sources in an extended neural volume without compromising the optical resolution or imaging speed on the underlying platform.
First, this thesis introduces EDoF-Miniscope, a miniaturized neural imaging platform which utilizes a novel physics-informed genetic algorithm to optimize a lightweight binary diffractive optical element (DOE) on the pupil plane. By integrating the binary DOE into a prototype platform, EDoF-Miniscope is able to achieve a 2.8x extension in the DoF between twin imaging foci in neural samples. To enable the recovery of the extended sources, this thesis utilizes a straightforward post-processing filter, which can recover neuronal signals with an SBR down to 1.08. Overall, this framework introduces a generalizable, compact and lightweight solution for augmenting miniscopes with a computational EDoF.
Next, I improve upon the proposed framework by designing a flexible 1-photon widefield tabletop platform, entitled EDoF-Tabletop, that exhibits comparable field-of-view (FoV, FoV = 0.6x0.6mm), numerical aperture (NA, NA = 0.5) and aberrations to a miniscope. This platform utilizes a spatial light modulator (SLM) on the pupil plane to rapidly deploy optimized pupil phase profiles without the need for manufacturing, aligning and integrating miniaturized optics. EDoF-Tabletop incorporates a deep optics pipeline, which utilizes novel physical modeling, initialization and training strategies to simultaneously and reliably learn user-defined EDoFs and a reconstruction using synthetic-only data. As a result, EDoF-Tabletop is able to encode and recover signals from EDoFs up to 140 microns deep in neural samples and 400 microns deep in non-scattering samples.
By combining pupil engineering with computational imaging, EDoF-Miniscope and EDoF-Tabletop showcase the potential to enhance neural imaging platforms by extracting information from extended volumes in the brain. By focusing on flexible optimization algorithms and rapid prototyping capabilities, the advancements introduced in this thesis promise broader utility across fluorescence microscopy, where capturing detailed information from complex biological samples is essential for advancing scientific understanding.
|
404 |
The Automated Prediction of Solar Flares from SDO Images Using Deep Learning
Abed, Ali K., Qahwaji, Rami S.R., Abed, A. 21 March 2021
In the last few years, there has been growing interest in near-real-time solar data processing, especially for space weather applications. This is due to space weather impacts on both space-borne and ground-based systems, and industries, which subsequently impact our lives. In the current study, a deep learning approach is used to establish an automated hybrid computer system for short-term forecasting, based on the complexity level of the sunspot group in SDO/HMI Intensitygram images. The suggested system generates forecasts of solar flare occurrences within the following 24 h. The input data for the proposed system are SDO/HMI full-disk Intensitygram images and SDO/HMI full-disk magnetogram images. The system output is a “Flare or Non-Flare” prediction of daily flare occurrences (C, M, and X classes). This system integrates an image processing system, which automatically detects sunspot groups on SDO/HMI Intensitygram images using active-region data extracted from SDO/HMI magnetogram images (presented by Colak and Qahwaji, 2008), with deep learning to generate these forecasts. Our deep learning-based system is designed to analyze sunspot groups on the solar disk to predict whether a given sunspot group is capable of releasing a significant flare. The system introduced in this work is called ASAP_Deep. The deep learning model used in our system integrates a Convolutional Neural Network (CNN) and a Softmax classifier to extract special features from the sunspot group images detected from SDO/HMI (Intensitygram and magnetogram) images. Furthermore, a CNN training scheme based on the integration of a back-propagation algorithm and a mini-batch AdaGrad optimization method is suggested for weight updates and to modify learning rates, respectively. The images of the sunspot regions are cropped automatically by the imaging system and processed using deep learning rules to provide near-real-time predictions.
The major results of this study are as follows. Firstly, the ASAP_Deep system builds on the ASAP system introduced in Colak and Qahwaji (2009) but improves the system with an updated deep learning-based prediction capability. Secondly, we successfully apply a CNN to the sunspot group image without any pre-processing or feature extraction. Thirdly, our system results are considerably better, especially for the false alarm ratio (FAR); this reduces the losses resulting from the protection measures applied by companies. Also, the proposed system achieves relatively high scores for True Skill Statistics (TSS) and Heidke Skill Score (HSS).
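The three verification scores named above are standard functions of a 2x2 flare / no-flare contingency table. The following is a minimal sketch using the textbook definitions; the function name `skill_scores` and the example counts are illustrative assumptions and are not taken from the study.

```python
def skill_scores(tp, fp, fn, tn):
    """Return (FAR, TSS, HSS) for a 2x2 flare / no-flare contingency table.

    tp: flares forecast and observed     fp: flares forecast but not observed
    fn: flares observed but not forecast tn: quiet days correctly forecast
    """
    # False alarm ratio: fraction of "flare" forecasts that were wrong.
    far = fp / (tp + fp)
    # True Skill Statistic: hit rate minus false alarm rate.
    tss = tp / (tp + fn) - fp / (fp + tn)
    # Heidke Skill Score: improvement over the agreement expected by chance.
    hss = 2 * (tp * tn - fn * fp) / (
        (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    )
    return far, tss, hss


# Hypothetical counts, chosen only to exercise the formulas.
far, tss, hss = skill_scores(tp=40, fp=10, fn=10, tn=140)
```

A lower FAR is better, while TSS and HSS both reach 1 for a perfect forecast, which is why the abstract reports them together.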
|
405 |
Deep Autofocusing for Digital Pathology Whole Slide Imaging
Li, Qiang January 2024
The quality of clinical pathology is a critical index for evaluating a nation's healthcare level. Recently developed digital pathology techniques have the capability to transform pathological slides into digital whole slide images (WSI). This transformation facilitates data storage, online transmission, real-time viewing, and remote consultations, significantly elevating clinical diagnosis. The effectiveness and efficiency of digital pathology imaging often hinge on the precision and speed of autofocusing.
However, achieving autofocusing of pathological images presents challenges under constraints including uneven focus distribution and limited Depth of Field (DoF).
Current autofocusing methods, such as those relying on image stacks, need to use more time and resources for capturing and processing images. Moreover, autofocusing based on reflective hardware systems, despite its efficiency, incurs significant hardware costs and suffers from a lack of system compatibility. Finally, machine learning-based autofocusing can circumvent repetitive mechanical movements and camera shots. However, a simplistic end-to-end implementation that does not account for the imaging process falls short of delivering satisfactory focus prediction and in-focus image restoration.
In this thesis, we present three distinct autofocusing techniques for defocus pathology images:
(1) Aberration-aware Focal Distance Prediction leverages the asymmetric effects of optical aberrations, making it ideal for focus prediction within focus map scenarios;
(2) Dual-shot Deep Autofocusing with a Fixed Offset Prior is designed to merge two images taken at different defocus distances with fixed positions, ensuring heightened accuracy in in-focus image restoration for fast offline situations;
(3) Semi-blind Deep Restoration of Defocus Images utilizes multi-task joint prediction guided by PSF, enabling high-efficiency, single-pass scanning for offline in-focus image restoration. / Thesis / Doctor of Philosophy (PhD)
|
406 |
Think outside the Black Box: Model-Agnostic Deep Learning with Domain Knowledge
Kobs, Konstantin January 2024
Deep Learning (DL) models are trained on a downstream task by feeding (potentially preprocessed) input data through a trainable Neural Network (NN) and updating its parameters to minimize the loss function between the predicted and the desired output. While this general framework has mainly remained unchanged over the years, the architectures of the trainable models have greatly evolved. Even though it is undoubtedly important to choose the right architecture, we argue that it is also beneficial to develop methods that address other components of the training process. We hypothesize that utilizing domain knowledge can be helpful to improve DL models in terms of performance and/or efficiency. Such model-agnostic methods can be applied to any existing or future architecture. Furthermore, the black box nature of DL models motivates the development of techniques to understand their inner workings. Considering the rapid advancement of DL architectures, it is again crucial to develop model-agnostic methods.
In this thesis, we explore six principles that incorporate domain knowledge to understand or improve models. They are applied either on the input or output side of the trainable model. Each principle is applied to at least two DL tasks, leading to task-specific implementations. To understand DL models, we propose to use Generated Input Data coming from a controllable generation process requiring knowledge about the data properties. This way, we can understand the model’s behavior by analyzing how it changes when one specific high-level input feature changes in the generated data. On the output side, Gradient-Based Attribution methods create a gradient at the end of the NN and then propagate it back to the input, indicating which low-level input features have a large influence on the model’s prediction. The resulting input features can be interpreted by humans using domain knowledge.
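To make the idea of gradient-based attribution concrete, here is a minimal sketch for a single-layer logistic model, where the gradient of the prediction with respect to the input can be written in closed form. The model, its weights, and the names `predict` and `input_gradient` are illustrative assumptions; the thesis applies the principle to deep networks, where the gradient would be obtained by backpropagation.

```python
import numpy as np


def predict(x, w, b):
    """Sigmoid output of a one-layer model (stand-in for a trained NN)."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))


def input_gradient(x, w, b):
    """Gradient of the prediction with respect to each input feature.

    Chain rule: d sigmoid(z)/dz = p * (1 - p), and dz/dx = w.
    """
    p = predict(x, w, b)
    return p * (1.0 - p) * w


# Hypothetical weights and input, for illustration only.
w = np.array([2.0, -0.5, 0.0])
b = 0.1
x = np.array([0.3, 1.2, -0.7])

# Saliency: magnitude of the gradient per input feature.
saliency = np.abs(input_gradient(x, w, b))
# Features ordered from most to least influential on the prediction.
ranking = np.argsort(-saliency)
```

In this toy case the third feature has zero weight, so its saliency is zero; a domain expert would then judge whether the highly ranked features are the ones the model ought to rely on.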
To improve the trainable model in terms of downstream performance, data and compute efficiency, or robustness to unwanted features, we explore principles that each address one of the training components besides the trainable model. Input Masking and Augmentation directly modifies the training input data, integrating knowledge about the data and its impact on the model’s output. We also explore the use of Feature Extraction using Pretrained Multimodal Models which can be seen as a beneficial preprocessing step to extract useful features. When no training data is available for the downstream task, using such features and domain knowledge expressed in other modalities can result in a Zero-Shot Learning (ZSL) setting, completely eliminating the trainable model. The Weak Label Generation principle produces new desired outputs using knowledge about the labels, giving either a good pretraining or even exclusive training dataset to solve the downstream task. Finally, improving and choosing the right Loss Function is another principle we explore in this thesis. Here, we enrich existing loss functions with knowledge about label interactions or utilize and combine multiple task-specific loss functions in a multitask setting.
We apply the principles to classification, regression, and representation tasks as well as to image and text modalities. We propose, apply, and evaluate existing and novel methods to understand and improve the model. Overall, this thesis introduces and evaluates methods that complement the development and choice of DL model architectures. / Deep learning (DL) models are trained by passing potentially preprocessed input data through a trainable neural network (NN) and updating its parameters to minimize the loss function between the prediction and the desired output. While this general procedure has hardly changed, the NN architectures used have evolved considerably. Even though the choice of architecture for the task is undoubtedly important, in this work we propose developing methods for other components of the training process. We hypothesize that using domain knowledge can help improve DL models in terms of their performance and/or efficiency. Such model-agnostic methods are then applicable to any existing or future NN architecture. The black-box nature of DL models also motivates the development of methods that contribute to understanding how these models work. Given the rapid development of architectures, it is important to develop model-agnostic methods.
In this work, we investigate six principles that use domain knowledge to understand or improve models. They are applied to training components at the input or output of the model. Each principle is then applied to at least two DL tasks, leading to task-specific implementations. To understand DL models, we use input data generated in a controlled manner, which requires knowledge about the data properties. This way, we can understand the model's behavior by observing how the output changes when abstracted input features change. We also investigate gradient-based attribution methods, which apply a gradient at the output of the NN and propagate it back to the input. Input features with a large influence on the model prediction can thus be identified and interpreted by humans with domain knowledge.
To improve models (in terms of result quality, data and compute efficiency, or robustness to unwanted inputs), we investigate principles that each concern one training component besides the trainable model. Masking and augmenting input data directly modifies the training data, integrating knowledge about its influence on the model output. Using pretrained multimodal models for feature extraction can be seen as a preprocessing step. When training data is missing, the features and domain knowledge in other modalities can, in a zero-shot setting, eliminate the trainable model entirely. The weak label generation principle produces new desired outputs based on knowledge about the labels, leading to a pretraining or exclusive training dataset. Finally, improving and selecting the loss function is another principle investigated. Here, we enrich existing loss functions with knowledge about label interactions or combine multiple task-specific loss functions in a multitask approach.
We apply the principles to classification, regression, and representation tasks as well as to image and text modalities. We present existing and new methods, apply them, and evaluate them for understanding and improving DL models, complementing the development and selection of DL model architectures.
|
407 |
Optimizing Neural Network Models for Healthcare and Federated LearningVerardo, Giacomo January 2024 (has links)
Neural networks (NN) have demonstrated considerable capabilities in tackling tasks in a diverse set of fields, including natural language processing, image classification, and regression. In recent years, the amount of available data to train Deep Learning (DL) models has increased tremendously, thus requiring larger and larger models to learn the underlying patterns in the data. Inference time, communication cost in the distributed case, required storage resources, and computational capabilities have increased proportionally to the model's size, thus making NNs less suitable for two cases: i) tasks requiring low inference time (e.g., real-time monitoring) and ii) training on low-powered devices. These two cases, which have become crucial in the last decade due to the pervasiveness of low-powered devices and NN models, are addressed in this licentiate thesis. As the first contribution, we analyze the distributed case with multiple low-powered devices in a federated scenario. Cross-device Federated Learning (FL) is a branch of Machine Learning (ML) where multiple participants train a common global model without sharing data in a centralized location. In this thesis, a novel technique named Coded Federated Dropout (CFD) is proposed to carefully split the global model into sub-models, thus increasing communication efficiency and reducing the burden on the devices with only a slight increase in training time. We showcase our results for an example image classification task. As the second contribution, we consider the anomaly detection task on Electrocardiogram (ECG) recordings and show that including prior knowledge in NN models drastically reduces model size, inference time, and storage resources for multiple state-of-the-art NNs. In particular, this thesis focuses on autoencoders (AEs), a subclass of NNs suitable for anomaly detection. I propose a novel approach, called FMM-Head, which incorporates basic knowledge of the ECG waveform shape into an AE. 
The evaluation shows that we improve the AUROC of baseline models while guaranteeing inference times under 100 ms, thus enabling real-time monitoring of ECG recordings from hospitalized patients. Finally, several potential future works are presented. The inclusion of prior knowledge can be further exploited in the ECG Imaging (ECGI) case, where hundreds of ECG sensors are used to reconstruct the 3D electrical activity of the heart. For ECGI, the reduction in the number of sensors employed (i.e., the input space) is also beneficial in terms of reducing model size. Moreover, this thesis advocates additional techniques to integrate ECG anomaly detection in a distributed and federated case. / Neural networks (NN) have shown a good ability to tackle tasks in a variety of fields, including Natural Language Processing (NLP), image classification, and regression. In recent years, the amount of data available to train Deep Learning (DL) models has increased enormously, requiring larger and larger models to learn the underlying patterns in the data. Inference time and communication cost in the distributed case, required storage resources, and computational capacity have increased in proportion to model size, making NNs less suitable in two cases: (i) tasks that require fast inference (e.g., real-time monitoring) and (ii) use on less powerful devices. These two cases, which have become more common over the last decade due to the prevalence of less powerful devices and NN models, are addressed in this licentiate thesis. As the first contribution, we analyze the distributed case with multiple low-powered devices in a federated scenario. Cross-device Federated Learning (FL) is a branch of Machine Learning (ML) in which multiple participants train a common global model without sharing data in a centralized location.
In this thesis, a new technique, Coded Federated Dropout (CFD), is proposed that splits the global model into sub-models, increasing communication efficiency while reducing the burden on the devices. This is achieved with only a small increase in training time. We report our results for an example image classification task. As the second contribution, we consider the anomaly detection task on electrocardiogram (ECG) recordings and show that including prior knowledge in NN models drastically reduces model size, inference time, and storage resources for several modern NNs. In particular, this thesis focuses on Autoencoders (AEs), a subset of NNs suitable for anomaly detection. A new method, called FMM-Head, is proposed, which incorporates basic knowledge of the ECG waveform into an AE. The evaluation shows that we improve the area under the curve (AUROC) of baseline models while guaranteeing inference times under 100 ms, enabling real-time monitoring of ECG recordings from hospitalized patients. Finally, several potential future extensions are presented. The inclusion of prior knowledge can be further exploited in the case of ECG Imaging (ECGI), where hundreds of ECG sensors are used to reconstruct the 3D electrical activity of the heart. For ECGI, reducing the number of sensors used (i.e., the input space) is also beneficial for reducing model size. In addition, this thesis advocates additional techniques for integrating ECG anomaly detection in distributed and federated settings. / The research leading to this thesis is based upon work supported by the King Abdullah University of Science and Technology (KAUST) Office of Research Administration (ORA) under Award No. ORA-CRG2021-4699
|
408 |
Development Of A Deep Learning Algorithm Using Electromyography (EMG) And Acceleration To Monitor Upper Extremity Behavior With Application To Individuals Post-Stroke
Dodd, Nathan 01 June 2024
Stroke is a chronic illness which often impairs survivors for extended periods of time,
leaving the individual limited in motor function. The ability to perform daily activities
(ADL) is closely linked to motor recovery following a stroke. The objective of
this work is to employ surface electromyography (sEMG) gathered through a novel,
wearable armband sensor to monitor and quantify ADL performance. The first contribution
of this work seeks to develop a relationship between sEMG and and grip
aperture, a metric tied to the success of post-stroke individuals’ functional independence.
The second contribution of this work aims to develop a deep learning model
to classify reach-to-grasp (RTG) movements in the home setting using continuous EMG and acceleration
data. In contribution one, ten non-disabled participants (10M, 22.5 ± 0.5 years)
were recruited. We performed a correlation analysis between aperture and peak EMG
value, as well as a one-way nonparametric analysis to determine the effect of cylinder diameter
on aperture. In contribution two, one non-disabled participant was instructed to
wash a set of dishes. The EMG and acceleration data collected were input into a recurrent
neural network (RNN) machine learning model to classify movement patterns.
The first contribution’s analysis demonstrated a strong positive correlation between
aperture and peak EMG value, as well as a statistically significant effect of diameter
(p < 0.001). The RNN model built in contribution two demonstrated high capability
at classifying movement at 94% accuracy and an F1-score of 86%. These results
demonstrate promising feasibility for long-term, in-home classification of daily tasks.
Future applications of this approach should consider extending the procedure to
include post-stroke individuals, as this could offer valuable insight into motor recovery
within the home setting.
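
The gap between the reported accuracy and F1-score can be illustrated with a small worked example. The sketch below assumes a binary "movement / no movement" setting and uses hypothetical confusion-matrix counts that are not taken from the thesis; the thesis may well compute F1 differently (for example, averaged over several movement classes). The function name `binary_metrics` is likewise an assumption made for illustration.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, f1) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators so the function never divides by zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, precision, recall, f1


# Hypothetical counts (NOT from the thesis): the negative class dominates,
# so accuracy is lifted by the many true negatives while F1 is not.
acc, prec, rec, f1 = binary_metrics(tp=30, fp=5, fn=5, tn=60)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
```

Because F1 ignores true negatives, it typically sits below accuracy when the positive class is the minority, which is one plausible reading of the two figures reported above.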
|
409 |
Intelligently Leveraging Multi-Channel Image Processing Neural Networks for Multi-View Co-Channel Signal DetectionKoppikar, Nidhi Nitin 19 August 2024 (has links)
The evolution of technology and gadgets has led to a significant increase in the number of transmitted signals, making RF sensing more complex than ever. Challenges such as signal interference and the lack of prior information about all signal parameters further complicate the task. To address these challenges, researchers have explored machine learning and deep learning approaches to generalize solutions for real-world sensing problems. In this thesis, we focus on two key issues in RF signal detection using deep learning. First, we tackle the problem of increasing signal detection coverage by utilizing multi-resolution eigengram images derived from a bank of channelizers. These channelizers, varying in size, are adept at sensing different types of signals, such as short-duration or low-bandwidth signals. Channelizer deconfliction is a known challenge in RF machine learning (RFML). We use YOLO, a deep learning algorithm, to deconflict the outputs from different channelizers to avoid overreporting. YOLO's ability to handle three channels makes it well suited to our study, as we also use three channelizers. While our approach is not dependent on YOLO, it provides a good testing ground for this study. Second, to address signal overlap, we utilize an eigengram image capturing the overlap region between signals. By overlaying this eigengram onto the original, we create a composite image highlighting the overlap. We train another YOLO model using two channels, one for each eigengram, enabling detection even with over 50 percent overlap. This work is versatile and promising, extending to other signal visualizations. It has significant potential for wireless industry applications and sets the stage for further RFML research. / Master of Science / Due to the exponential growth in Radio Frequency (RF) signals over the last few decades, brought about by the proliferation of gadgets, signal detection has become more complex than ever. To address these complexities in signal sensing, adopting a dynamic approach that is not reliant on specific parameters or thresholds is essential. RF approaches using deep learning show great promise in tackling these challenges. Deep learning is the branch of machine learning based on artificial neural networks. An artificial neural network uses layers of interconnected nodes called neurons that work together to process and learn from the input data. The first part of this thesis addresses increasing signal coverage by leveraging different signal perspectives, each capturing unique characteristics. By combining these perspectives into a dataset, we train a deep learning model that incorporates the strengths of each view, resulting in maximum detection coverage. The novelty lies in innovative data preprocessing techniques and using YOLO to deconflict signal views with up to three channelizers. In the second part, we focus on detecting overlapped or occluded signals. We utilize a new dimension of information describing interference regions between signals. By integrating this overlap perspective, we enhance the dataset to identify instances of extensive signal overlap and determine their regions of coverage. This enhancement enables the deep learning network to identify patterns and effectively detect highly overlapped or completely occluded signals.
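
The idea of feeding one view per image channel can be sketched as follows. This is a minimal illustration and not the thesis's preprocessing pipeline: it assumes each channelizer's eigengram has already been computed and resampled to a common height and width, represents images as plain nested lists, and uses a hypothetical helper name, `stack_channels`.

```python
def stack_channels(views):
    """Stack equally sized 2D grids into one H x W x C image (nested lists)."""
    height, width = len(views[0]), len(views[0][0])
    for view in views:
        if len(view) != height or any(len(row) != width for row in view):
            raise ValueError("all views must share the same height and width")
    # Each pixel becomes a list holding one value per view (channel).
    return [[[view[r][c] for view in views] for c in range(width)]
            for r in range(height)]


# Toy 2 x 2 placeholders standing in for three channelizer outputs
# (assumed values, purely illustrative).
coarse = [[0.0, 0.1], [0.2, 0.3]]
medium = [[0.4, 0.5], [0.6, 0.7]]
fine = [[0.8, 0.9], [1.0, 1.1]]

image = stack_channels([coarse, medium, fine])
print(image[0][1])  # the three channel values at row 0, column 1
```

The same helper covers the two-channel case described for overlap detection by passing two views (the original eigengram and the overlap eigengram) instead of three.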
|
410 |
Deep Learning Using Vision And LiDAR For Global Robot LocalizationGowling, Brett E 01 May 2024 (has links) (PDF)
As the field of mobile robotics rapidly expands, precise understanding of a robot’s position and orientation becomes critical for autonomous navigation and efficient task performance. In this thesis, we present a snapshot-based global localization machine learning model for a mobile robot, the e-puck, in a simulated environment. Our model uses multimodal data from the robot’s on-board cameras and LiDAR sensor to predict both position and orientation. In an effort to minimize localization error, we explore different sensor configurations by varying the number of cameras and LiDAR layers used. Additionally, we investigate the performance benefits of different multimodal fusion strategies while leveraging the EfficientNet CNN architecture as our model’s foundation. Data collection and testing are conducted using Webots simulation software, and our results show that, when tested in a 12 m x 12 m simulated apartment environment, our model achieves positional accuracy within 0.2 m for each of the x and y coordinates and orientation accuracy within 2°, all without the need for sequential data history. Our results demonstrate the potential for accurate global localization of mobile robots in simulated environments without the need for existing maps or temporal data.
|