1 |
High performance Deep Learning based Digital Pre-distorters for RF Power Amplifiers
Kudupudi, Rajesh (25 January 2022)
In this work, we present several deep learning-based digital pre-distorters (DPDs) and compare how well they improve the linearity of highly non-linear power amplifiers (PAs). The simulation results show that BiLSTM-based DPDs perform best in terms of linearity. We also compare two methodologies, direct learning and indirect learning, for developing deep learning-based DPD (DL-DPD) models and evaluate their improvement on the linearity of PAs. We carry out a theoretical analysis of the differences between these training methodologies and verify their performance with simulation results on class-AB and class-F⁻¹ PAs. The simulation results show that both learning methods improve linearity by more than 12 dB and 11 dB for class-AB and class-F⁻¹ PAs, respectively, with indirect learning DL-DPD offering marginally better performance. Moreover, we compare the DL-DPD with memory polynomial models and show that the former gives a significant improvement over the latter. Furthermore, we discuss the advantages of exploiting a BiLSTM-based neural network architecture for designing direct/indirect DPDs. We demonstrate that a BiLSTM DPD can be used to pre-distort signals of any size without a drop in linearity. Finally, based on these insights, we develop a frequency-domain loss that further increases the linearity of the PA. / Master of Science / Wireless communication devices have fundamentally changed the way we interact with people. This has increased users' reliance on communication devices and significantly grown the need for higher data rates and faster internet speeds. But one major obstacle to increasing data rates lies inside the transmitter chain (antenna): the power amplifier, which distorts signals at these higher powers. This distortion reduces the efficiency and reliability of communication systems, greatly decreasing the quality of communication. So, we developed a high-performance DPD using deep learning to combat this issue. In this work, we compare different deep learning-based DPDs and analyze which offers better performance. We also contrast two training methodologies for learning these DL-DPDs, theoretically and in simulation, to determine which method yields better-performing DPDs. We run these experiments on two different types of power amplifiers and on signals of any length. We also design a new loss function whose optimization leads to better DL-DPDs.
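The memory polynomial baseline mentioned in the abstract above is commonly written as y[n] = Σ_k Σ_m a[k][m] · x[n−m] · |x[n−m]|^k. The following is a minimal sketch of that forward model only, not the thesis's implementation; the function name, the coefficient values, and the test signal are all hypothetical and chosen purely for illustration.

```python
def memory_polynomial(x, coeffs):
    """Apply a memory polynomial to complex baseband samples x.

    coeffs[k][m] multiplies x[n-m] * |x[n-m]|**k, so row k = 0 is the
    linear term and column m is the memory tap (delay in samples).
    """
    out = []
    for n in range(len(x)):
        acc = 0j
        for k, row in enumerate(coeffs):
            for m, a in enumerate(row):
                if n - m >= 0:  # samples before the start are treated as zero
                    s = x[n - m]
                    acc += a * s * abs(s) ** k
        out.append(acc)
    return out


# Hypothetical coefficients: linear gain with one memory tap,
# plus a mild third-order compression term.
coeffs = [
    [1.0 + 0j, 0.05 + 0j],  # k = 0: linear term, taps m = 0 and m = 1
    [0j, 0j],               # k = 1: unused in this example
    [-0.1 + 0j, 0j],        # k = 2: third-order term, tap m = 0 only
]
signal = [0.5 + 0j, 1.0 + 0j, 0.25 + 0.25j]
distorted = memory_polynomial(signal, coeffs)
```

The same structure can serve either as a simple PA model or as a pre-distorter; in the indirect learning setup, the coefficients would typically be fitted by least squares on measured PA input/output pairs, which is omitted here.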
|
2 |
Synthesizing Realistic Data for Vision Based Drone-to-Drone Detection
Yellapantula, Sudha Ravali (15 July 2019)
In this thesis, we aimed to build a robust UAV (drone) detection algorithm with which one drone could detect another drone in flight. Though this was a straightforward object detection problem, the biggest challenge we faced was the limited number of drone images available for training. To address this issue, we used Generative Adversarial Networks, CycleGAN to be precise, to generate realistic-looking fake images that were indistinguishable from real data. CycleGAN is a classic example of an image-to-image translation technique, and we applied it to our situation, where synthetic images from one domain were transformed into another domain containing real data. The model, once trained, was capable of generating realistic-looking images from synthetic data without the presence of real images. Following this, we employed a state-of-the-art object detection model, YOLO (You Only Look Once), to build a drone detection model that was trained on the generated images. Finally, this model's performance was evaluated on different datasets. / Master of Science / In recent years, technologies like Deep Learning and Machine Learning have seen many rapid developments. Among their many applications, object detection is one of the most widely used and well-established problems. In our thesis, we deal with a scenario where we have a swarm of drones and our aim is for one drone to recognize another drone in its field of vision. As there was no drone image dataset readily available, we explored different ways of generating realistic data to address this issue. Finally, we proposed a solution that generates realistic images using Deep Learning techniques, trained an object detection model on them, and evaluated how well it performed against other models.
|
3 |
Artificial intelligence system for continuous affect estimation from naturalistic human expressions
Abd Gaus, Yona Falinie (January 2018)
Automatic affect estimation from human expression has been acknowledged as an active research topic in the computer vision community. Most reported affect recognition systems, however, only consider subjects performing well-defined acted expressions under very controlled conditions, so they are not robust enough for real-life recognition tasks with subject variation, acoustic surroundings and illumination change. In this thesis, an artificial intelligence system is proposed to continuously (represented along a continuum, e.g., from -1 to +1) estimate affect behaviour in terms of latent dimensions (e.g., arousal and valence) from naturalistic human expressions. To tackle these issues, feature representation and machine learning strategies are addressed. In feature representation, human expression is represented by modalities such as audio, video, physiological signals and text. Hand-crafted features are extracted from each modality per frame, in order to match the consecutive affect labels. However, the extracted features may be missing information due to several factors such as background noise or lighting conditions. The Haar Wavelet Transform is employed to determine whether a noise cancellation mechanism in feature space should be considered in the design of the affect estimation system. Besides hand-crafted features, deep learning features are also analysed layer-wise, across convolutional and fully connected layers. Convolutional Neural Networks such as AlexNet, VGGFace and ResNet have been selected as the deep learning architectures for feature extraction from facial expression images. Then, a multimodal fusion scheme is applied by fusing deep learning features and hand-crafted features together to improve performance. In machine learning strategies, a two-stage regression approach is introduced. In the first stage, baseline regression methods such as Support Vector Regression are applied to estimate each affect dimension at each time step. Then, in the second stage, a subsequent model such as a Time Delay Neural Network, Long Short-Term Memory or Kalman Filter is proposed to model the temporal relationships between consecutive estimates of each affect dimension. In doing so, the temporal information employed by the subsequent model is not biased by the high variability present in consecutive frames and, at the same time, the network can exploit the slowly changing emotional dynamics more efficiently. Following the two-stage regression approach for unimodal affect analysis, the fusion of information from different modalities is elaborated. Continuous emotion recognition in the wild is addressed by investigating mathematical modelling for each emotion dimension. Linear Regression, Exponent Weighted Decision Fusion and Multi-Gene Genetic Programming are implemented to quantify the relationship between modalities. In summary, the research work presented in this thesis reveals a fundamental approach to automatically and continuously estimating affect values from naturalistic human expression. The proposed system, which consists of feature smoothing, deep learning features, a two-stage regression framework and fusion using mathematical equations between modalities, is demonstrated. It offers a strong basis for the development of artificial intelligence systems for continuous affect estimation and, more broadly, for building real-time emotion recognition systems for human-computer interaction.
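The second stage of the two-stage regression described above can be illustrated with the simplest of the models it names, a Kalman filter. The sketch below is only an assumption about how such a stage might look: it uses a scalar random-walk state model, and the function name and both variance parameters are hypothetical values, not settings taken from the thesis.

```python
def kalman_smooth(estimates, process_var=1e-3, measurement_var=1e-1):
    """Smooth per-frame affect estimates with a scalar Kalman filter.

    Assumes the true affect value follows a random walk (slow drift) and
    that each first-stage estimate is a noisy measurement of it.
    """
    state = estimates[0]  # initialise from the first frame
    var = 1.0             # initial uncertainty about the state
    smoothed = []
    for z in estimates:
        var += process_var                    # predict: state may have drifted
        gain = var / (var + measurement_var)  # how much to trust the new frame
        state += gain * (z - state)           # update towards the measurement
        var *= 1.0 - gain                     # uncertainty shrinks after update
        smoothed.append(state)
    return smoothed


# Hypothetical first-stage arousal estimates in [-1, +1] for six frames.
frame_estimates = [0.10, 0.45, 0.05, 0.40, 0.15, 0.35]
smoothed_estimates = kalman_smooth(frame_estimates)
```

A small `process_var` relative to `measurement_var` encodes the assumption that emotion changes slowly compared with frame-to-frame noise, which is the property the abstract says the second stage exploits.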
|
4 |
Multiscale Modeling with Meshfree Methods
Xu, Wentao (January 2023)
Multiscale modeling has become an important tool in material mechanics because material behavior can exhibit varied properties across different length scales. The use of multiscale modeling is essential for accurately capturing these characteristics and predicting material behavior. Mesh-free methods have also been gaining attention in recent years due to their innate ability to handle complex geometries and large deformations. These methods provide greater flexibility and efficiency in modeling complex material behavior, especially for problems involving discontinuities, such as fractures and cracks. Moreover, mesh-free methods can be easily extended to multiple lengths and time scales, making them particularly suitable for multiscale modeling.
The thesis focuses on two specific problems of multiscale modeling with mesh-free methods. The first problem is the atomistically informed constitutive model for the study of high-pressure induced densification of silica glass. Molecular Dynamics (MD) simulations are carried out to study the atomistic-level responses of fused silica under different pressure and strain-rate levels. Based on the data obtained from the MD simulations, a novel continuum-based multiplicative hyper-elasto-plasticity model that accounts for the anomalous densification behavior is developed and then parameterized using polynomial regression and deep learning techniques. To incorporate dynamic damage evolution, a plasticity-damage variable that controls the shrinkage of the yield surface is introduced and integrated into the elasto-plasticity model. The resulting coupled elasto-plasticity-damage model is reformulated as a non-ordinary state-based peridynamics (NOSB-PD) model for the computational efficiency of impact simulations. The developed peridynamics (PD) model reproduces coarse-scale quantities of interest found in MD simulations and can simulate at a component level. Finally, the proposed atomistically informed multiplicative hyper-elasto-plasticity-damage model has been validated against the limited available experimental results for hyper-velocity impact of projectiles on silica glass targets.
The second problem addressed in the thesis involves the upscaling approach for multi-porosity media, analyzed using the so-called MultiSPH method, which is a sequential SPH (Smoothed Particle Hydrodynamics) solver across multiple scales. Multi-porosity media are commonly found in natural and industrial materials, and their behavior is not easily captured with traditional numerical methods. The upscaling approach presented in the thesis is demonstrated on a porous medium consisting of three scales. It involves using SPH methods to characterize the behavior of individual pores at the microscopic scale and then using a homogenization technique to upscale to the mesoscopic and macroscopic levels. The accuracy of the MultiSPH approach is confirmed by comparing the results with analytical solutions for simple microstructures, as well as detailed single-scale SPH simulations and experimental data for more complex microstructures.
|
5 |
Deep Self-Modeling for Robotic Systems
Kwiatkowski, Robert (January 2022)
Just as self-awareness is important to higher-level human cognition, the ability to self-model is important to performing complex behaviors. I demonstrate that the power of these self-models grows with the complexity of the problems being solved, and thus provides the framework for higher-level cognition. I demonstrate that self-models can be used to effectively control agents and to improve on existing control algorithms, allowing agents to perform complex tasks. I further investigate new ways in which these self-models can be learned and applied to increase their efficacy and improve their ability to generalize across tasks and bodies. Finally, I demonstrate the overall power of these self-models to allow complex tasks to be completed with little data across a variety of bodies and using a number of algorithms.
|
6 |
Deep Learning for Enhancing Precision Medicine
Oh, Min (07 June 2021)
Most medical treatments have been developed aiming at the best-on-average efficacy for large populations, resulting in treatments successful for some patients but not for others. This calls for precision medicine, which tailors medical treatment to individual patients. Omics data holds comprehensive genetic information on individual variability at the molecular level and hence the potential to be translated into personalized therapy. However, attempts to transform omics data-driven insights into clinically actionable models for individual patients have been limited. Meanwhile, advances in deep learning, one of the most promising branches of artificial intelligence, have produced unprecedented performance in various fields. Although several deep learning-based methods have been proposed to predict individual phenotypes, they have not established the state of the practice, due to the instability of selected or learned features derived from extremely high-dimensional data with low sample sizes, which often results in overfitted models with high variance. To overcome the limitations of omics data, recent advances in deep learning models, including representation learning models, generative models, and interpretable models, can be considered. The goal of the proposed work is to develop deep learning models that can overcome the limitations of omics data to enhance the prediction of personalized medical decisions. To achieve this, three key challenges should be addressed: 1) effectively reducing the dimensions of omics data, 2) systematically augmenting omics data, and 3) improving the interpretability of omics data. / Doctor of Philosophy / Most medical treatments have been developed aiming at the best-on-average efficacy for large populations, resulting in treatments successful for some patients but not for others. This calls for precision medicine, which tailors medical treatment to individual patients. 
Biological data such as DNA sequences and snapshots of genetic activities hold comprehensive information on individual variability and hence the potential to accelerate personalized therapy. However, the attempts to transform data-driven insights into clinical models for individual patients have been limited. Meanwhile, advances in deep learning, one of the most promising branches of artificial intelligence, have produced unprecedented performance in various fields. Although several deep learning-based methods have been proposed to predict individual treatment or outcome, they have not established the state of the practice, due to the complexity of biological data and limited availability, which often result in overfitted models that may work on training data but not on test data or unseen data. To overcome the limitation of biological data, recent advances in deep learning models, including representation learning models, generative models, and interpretable models, can be considered. The goal of the proposed work is to develop deep learning models that can overcome the limitation of omics data to enhance the prediction of personalized medical decisions. To achieve this, three key challenges should be addressed: 1) effectively reducing the complexity of biological data, 2) generating realistic biological data, and 3) improving the interpretability of biological data.
|
7 |
Towards Naturalistic Exoskeleton Glove Control for Rehabilitation and Assistance
Chauhan, Raghuraj Jitendra (11 January 2020)
This thesis presents both a control scheme for naturalistic control of an exoskeleton glove and a glove design. Exoskeleton development has been focused primarily on design, improving soft actuator and cable-driven systems, with only limited focus on intelligent control. There is a need for control that is not limited to position or force reference signals and is user-driven. By implementing a motion amplification controller to increase weak movements of an impaired individual, a finger joint trajectory can be observed and used to predict their grasping intention. The motion amplification functions off of a virtual dynamical system that safely enforces the range of motion of the finger joints and ensures stability. Three grasp prediction algorithms are developed with improved levels of accuracy: regression, trajectory, and deep learning based. These algorithms were tested on published finger joint trajectories. The fusion of the amplification and prediction could be used to achieve naturalistic, user-guided control of an exoskeleton glove. The key to accomplishing this is series elastic actuators to move the finger joints, thereby allowing the wearer to deflect against the glove and inform the controller of their intention. These actuators are used to move the fingers in a nine degree of freedom exoskeleton that is capable of achieving all the grasps used most frequently in daily life. The controllers and exoskeleton presented here are the basis for improved exoskeleton glove control that can be used to assist or rehabilitate impaired individuals. / Master of Science / Millions of Americans report difficulty holding small or even lightweight objects. In many of these cases, their difficulty stems from a condition such as a stroke or arthritis, requiring either rehabilitation or assistance. 
For both treatments, exoskeleton gloves are a potential solution; however, widespread deployment of exoskeletons in the treatment of hand conditions requires significant advancement. Towards that end, the research community has devoted itself to improving the design of exoskeletons. Systems that use soft actuation or are driven by artificial tendons have merit in that they are comfortable to the wearer, but lack the rigidity required for monitoring the state of the hand and controlling it. Electromyography sensors are also a commonly explored technology for determining motion intention; however, only primitive conclusions can be drawn when using these sensors on the muscles that control the human hand. This thesis proposes a system that does not rely on soft actuation but rather a deflectable exoskeleton that can be used in rehabilitation or assistance. By using series elastic actuators to move the exoskeleton, the wearer of the glove can exert their influence over the machine. Additionally, more intelligent control is needed in the exoskeleton. The approach taken here is twofold. First, a motion amplification controller increases the finger movements of the wearer. Second, the amplified motion is processed using machine learning algorithms to predict what type of grasp the user is attempting. The controller would then be able to fuse the two, the amplification and prediction, to control the glove naturalistically.
|
8 |
Applying Natural Language Processing and Deep Learning Techniques for Raga Recognition in Indian Classical Music
Peri, Deepthi (27 August 2020)
In Indian Classical Music (ICM), the Raga is a musical piece's melodic framework. It encompasses the characteristics of a scale, a mode, and a tune, with none of them fully describing it, rendering the Raga a unique concept in ICM. The Raga provides musicians with a melodic fabric, within which all compositions and improvisations must take place. Identifying and categorizing the Raga is challenging due to its dynamism and complex structure as well as the polyphonic nature of ICM. Hence, Raga recognition, that is, identifying the constituent Raga in an audio file, has become an important problem in music informatics with several known prior approaches. Advancing the state of the art in Raga recognition paves the way to improving other Music Information Retrieval tasks in ICM, including transcribing notes automatically, recommending music, and organizing large databases.
This thesis presents a novel melodic pattern-based approach to recognizing Ragas by representing this task as a document classification problem, solved by applying a deep learning technique. A digital audio excerpt is hierarchically processed and split into subsequences and gamaka sequences to mimic a textual document structure, so our model can learn the resulting tonal and temporal sequence patterns using a Recurrent Neural Network. Although we train and test on these smaller sequences, we predict the Raga for the entire audio excerpt, with an accuracy of 90.3% for the Carnatic Music Dataset and 95.6% for the Hindustani Music Dataset, thus outperforming prior approaches in Raga recognition. / Master of Science / In Indian Classical Music (ICM), the Raga is a musical piece's melodic framework. The Raga is a unique concept in ICM, not fully described by any of the fundamental concepts of Western classical music. The Raga provides musicians with a melodic fabric, within which all compositions and improvisations must take place. Raga recognition refers to identifying the constituent Raga in an audio file, a challenging and important problem with several known prior approaches and applications in Music Information Retrieval. This thesis presents a novel approach to recognizing Ragas by representing this task as a document classification problem, solved by applying a deep learning technique. A digital audio excerpt is processed into a textual document structure, from which the constituent Raga is learned. Based on the evaluation with third-party datasets, our recognition approach achieves high accuracy, thus outperforming prior approaches.
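The abstract above describes training and testing on short subsequences while reporting a single Raga per excerpt. It does not say how the subsequence predictions are combined, so the sketch below assumes a plain majority vote. The helper names, the fixed non-overlapping window, and the toy `classify` function (standing in for the trained recurrent model) are all hypothetical.

```python
from collections import Counter


def split_into_subsequences(sequence, length):
    """Cut a pitch sequence into non-overlapping windows of a fixed length.

    A trailing remainder shorter than `length` is dropped.
    """
    return [
        sequence[i:i + length]
        for i in range(0, len(sequence) - length + 1, length)
    ]


def predict_excerpt(sequence, length, classify):
    """Label a whole excerpt by majority vote over its subsequence labels."""
    subsequences = split_into_subsequences(sequence, length)
    if not subsequences:
        raise ValueError("sequence is shorter than one subsequence")
    votes = Counter(classify(sub) for sub in subsequences)
    return votes.most_common(1)[0][0]


def toy_classify(sub):
    """Toy stand-in for a trained model: label by mean pitch value."""
    return "raga_A" if sum(sub) / len(sub) < 5 else "raga_B"


pitches = [1, 2, 3, 2, 3, 4, 8, 9, 9]
label = predict_excerpt(pitches, 3, toy_classify)
```

Here the three windows vote `raga_A`, `raga_A`, `raga_B`, so the excerpt is labelled `raga_A`. A real system could instead average class probabilities across subsequences; the vote is used here only because it is the simplest aggregation to show.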
|
9 |
End-To-End Text Detection Using Deep Learning
Ibrahim, Ahmed Sobhy Elnady (19 December 2017)
Text detection in the wild is the problem of locating text in images of everyday scenes. It is a challenging problem due to the complexity of everyday scenes. This problem is of great importance for many trending applications, such as self-driving cars.
Previous research in text detection has been dominated by multi-stage sequential approaches which suffer from many limitations including error propagation from one stage to the next.
Another line of work is the use of deep learning techniques. Some of the deep methods used for text detection are box detection models and fully convolutional models. Box detection models suffer from the nature of the annotations, which may be too coarse to provide detailed supervision. Fully convolutional models learn to generate pixel-wise maps that represent the location of text instances in the input image. These models suffer from the inability to create accurate word level annotations without heavy post processing.
To overcome these aforementioned problems we propose a novel end-to-end system based on a mix of novel deep learning techniques. The proposed system consists of an attention model, based on a new deep architecture proposed in this dissertation, followed by a deep network based on Faster-RCNN. The attention model produces a high-resolution map that indicates likely locations of text instances. A novel aspect of the system is an early fusion step that merges the attention map directly with the input image prior to word-box prediction. This approach suppresses but does not eliminate contextual information from consideration. Progressively larger models were trained in 3 separate phases. The resulting system has demonstrated an ability to detect text under difficult conditions related to illumination, resolution, and legibility.
The system has exceeded the state of the art on the ICDAR 2013 and COCO-Text benchmarks with F-measure values of 0.875 and 0.533, respectively. / Ph. D. / Text detection and recognition in the wild is the problem of locating and reading text in images of everyday scenes. Text detection refers to finding the bounding boxes that describe the location of text areas in an input image, while text recognition describes the problem of generating a transcript out of the detected text areas. Recognition can be viewed as simply Optical Character Recognition (OCR). OCR is an old problem where the developed models are considered mature. Text detection and recognition are challenging problems due to the complexity of everyday scenes, compared to the simpler problem of recognizing text in scanned documents. This problem possesses a great importance to many trending applications that need to locate and read text in the wild, such as self-driving cars. Researchers tend to focus on the text detection problem only due to the maturity of research related to text recognition. Previous research in text detection has been dominated by multi-stage sequential approaches. Those methods suffer from many limitations including, but not limited to, error propagation from the earlier stages to the later stages of the pipeline. Another line of work is the use of deep learning techniques. Deep learning is the state of the art in machine learning. It has demonstrated great success in many domains, including computer vision. Some of the deep methods used for text detection are box detection models and fully convolutional models. Box detection models learn to generate bounding box coordinates for text instances that exist in the input image. Box detection models suffer from the nature of the annotations, which may be too coarse to provide detailed supervision. Fully convolutional models learn to generate pixel-wise maps that represent the location of text instances in the input image. 
These models suffer from the inability to create accurate word-level annotations without heavy post-processing. To overcome these problems, we propose a novel end-to-end system based on a mix of novel deep learning techniques. The proposed system consists of an attention model followed by a network based on Faster-RCNN that has been conditioned to generate word-box predictions. The attention model produces a high-resolution map that indicates likely locations of text instances. A novel aspect of the system is an early fusion step that merges the attention map directly with the input image prior to word-box prediction. This approach suppresses but does not eliminate contextual information from consideration, and avoids the common problem of discarding small text regions. To facilitate training of the end-to-end system, progressively larger models were trained in 3 separate phases. The resulting system has demonstrated an ability to detect text under difficult conditions related to illumination, resolution, and legibility. The system has exceeded the state of the art on the well-known ICDAR 2013 and COCO-Text benchmarks. For the former, the system has produced results with an F-measure value of 0.875. For the more challenging COCO-Text dataset, the system has shown a dramatic increase in performance with an F-measure value of 0.533, as compared to previously reported values in the range of 0.33 to 0.37. In order to build a powerful system, we introduced a novel deep learning architecture that achieved impressive performance on standard benchmarks. This architecture has been used as a backbone for the proposed attention model. A description of the proposed end-to-end system, as well as the implementation steps, will be detailed in the following sections.
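The F-measure quoted in the abstract above is the harmonic mean of precision and recall. The sketch below shows only that arithmetic, assuming detections have already been matched to ground-truth boxes; the benchmark-specific matching protocols are not reproduced, and the counts used are made-up illustrative numbers, not results from the dissertation.

```python
def f_measure(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall from detection counts."""
    predicted = true_positives + false_positives  # boxes the detector output
    actual = true_positives + false_negatives     # boxes in the ground truth
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 80 correct boxes, 20 spurious boxes, 40 missed words.
score = f_measure(true_positives=80, false_positives=20, false_negatives=40)
```

Because it is a harmonic mean, the score is pulled towards the weaker of precision and recall, so a detector cannot score well by flooding the image with boxes or by reporting only its most confident few.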
|
10 |
CloudCV: Deep Learning and Computer Vision on the Cloud
Agrawal, Harsh (20 June 2016)
We are witnessing a proliferation of massive visual data. Visual content is arguably the fastest growing data on the web. Photo-sharing websites like Flickr and Facebook now host more than 6 and 90 billion photos, respectively. Unfortunately, scaling existing computer vision algorithms to large datasets leaves researchers repeatedly solving the same algorithmic and infrastructural problems. Designing and implementing efficient and provably correct computer vision algorithms is extremely challenging. Researchers must repeatedly solve the same low-level problems: building and maintaining a cluster of machines, formulating each component of the computer vision pipeline, designing new deep learning layers, writing custom hardware wrappers, etc. This thesis introduces CloudCV, an ambitious system that contains algorithms for end-to-end processing of visual content.
The goal of the project is to democratize computer vision; one should not have to be an expert in computer vision, big data and deep learning to have access to state-of-the-art distributed computer vision algorithms. We provide researchers, students and developers with access to state-of-the-art distributed computer vision and deep learning algorithms as a cloud service through a web interface and APIs. / Master of Science
|