  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
91

3D Reconstruction from Satellite Imagery Using Deep Learning

Yngesjö, Tim January 2021 (has links)
Learning-based multi-view stereo (MVS) has shown promising results in the domain of general 3D reconstruction. However, no work before this thesis has applied learning-based MVS to urban 3D reconstruction from satellite images. In this thesis, learning-based MVS is used to infer depth maps from satellite images. Models are trained on both synthetic and real satellite images from Las Vegas with ground truth data from a high-resolution aerial-based 3D model. This thesis also evaluates different methods for reconstructing digital surface models (DSMs) and compares them to existing satellite-based 3D models at Maxar Technologies. The DSMs are created either by post-processing point clouds obtained from predicted depth maps or by an end-to-end approach in which the depth map for an orthographic satellite image is predicted. This thesis concludes that learning-based MVS can be used to predict accurate depth maps. Models trained on synthetic data yielded relatively good results, but not nearly as good as models trained on real satellite images. The trained models also generalize relatively well to cities not present in the training data. This thesis also concludes that the reconstructed DSMs achieve better quantitative results than the existing 3D model in Las Vegas, and similar results for the test sets from other cities. Compared to ground truth, the best-performing method achieved L1 and L2 errors 14% and 29% lower, respectively, than Maxar's current 3D model. The method that uses a point cloud as an intermediate step achieves better quantitative results than the end-to-end system. Very promising qualitative results are achieved with the proposed methods, especially when utilizing an end-to-end approach.
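The L1 and L2 errors quoted above are height-error metrics between a reconstructed DSM and a ground-truth surface. A minimal sketch of how such metrics might be computed (the function name and toy grids are illustrative, not from the thesis):

```python
import numpy as np

def dsm_errors(predicted, ground_truth):
    """Mean L1 and root-mean-square (L2) height errors between two DSMs.

    Both inputs are 2D height grids (metres) on the same georeferenced raster.
    """
    diff = predicted - ground_truth
    l1 = np.mean(np.abs(diff))
    l2 = np.sqrt(np.mean(diff ** 2))
    return l1, l2

# Toy example: a flat ground truth and a prediction with a uniform +0.5 m bias.
gt = np.zeros((4, 4))
pred = gt + 0.5
l1, l2 = dsm_errors(pred, gt)
```

For a constant height offset the two errors coincide; on real DSMs the L2 (RMS) error penalizes large outliers, such as missed building edges, more heavily than L1.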
92

Assessment of malalignment factors related to the Invisalign treatment time using artificial intelligence

Lee, Sanghee 09 August 2022 (has links)
No description available.
93

Particle detection, extraction, and state estimation in single particle tracking microscopy

Lin, Ye 20 June 2022 (has links)
Single Particle Tracking (SPT) plays an important role in the study of physical and dynamic properties of biomolecules moving in their native environment. To date, many algorithms have been developed for localization and parameter estimation in SPT. Though the performance of these methods is good when the signal level is high and the motion model simple, they begin to fail as the signal level decreases or model complexity increases. In addition, the inputs to SPT algorithms are sequences of images that are cropped from a larger data set and that focus on a single particle. This motivates us to seek machine learning tools for that initial step of extracting data from larger images containing multiple particles. This thesis makes contributions both to the data extraction question and to the problem of state and parameter estimation. First, we build upon the Expectation Maximization (EM) algorithm to create a generic framework for joint localization refinement and parameter estimation in SPT. Under the EM-based scheme, two representative methods are considered for generating the filtered and smoothed distributions needed by EM: Sequential Monte Carlo Expectation Maximization (SMC-EM) and Unscented Expectation Maximization (U-EM). The selection of filtering and smoothing algorithms is very flexible so long as they provide the necessary distributions for EM. The versatility and reliability of the EM-based framework have been validated via data-intensive modeling and simulation in which we considered a variety of influential factors, such as a wide range of signal-to-background ratios (SBRs), diffusion speeds, motion blur, camera types, image length, etc.
Meanwhile, under the EM-based scheme, we make an effort to improve overall computational efficiency by simplifying the mathematical expression of the models, replacing filtering/smoothing algorithms with more efficient ones (trading some accuracy for reduced computation time), and using parallel computation and other computing techniques. In terms of localization refinement and parameter estimation in SPT, we also conduct an overall quantitative comparison between the EM-based methods and standard two-step methods. Regarding U-EM, we apply transformation methods to adapt it to the nonlinearities and complexities of the measurement model. We also extend the application of U-EM to more complicated SPT scenarios, including time-varying parameters and additional observation models that are relevant to the biophysical setting. The second area of contribution is the particle detection and extraction problem, whose solution creates the data to feed into the EM-based approaches. Here we build Particle Identification Networks (PINs) covering three different network architectures. The first, PIN-CNN, is based on a standard Convolutional Neural Network (CNN) structure that has previously been applied successfully to particle detection and localization. The second, PIN-ResNet, uses a Residual Neural Network (ResNet) architecture that is significantly deeper than the CNN, while the third, PIN-FPN, is based on a more advanced Feature Pyramid Network (FPN) that can take advantage of multi-scale information in an image. All networks are trained using the same collection of simulated data created with a range of SBRs and fluorescence emitter densities, as well as with three different Point Spread Functions (PSFs): a standard Born-Wolf model, a model for astigmatic imaging to allow localization in three dimensions, and a model of the Double-Helix engineered PSF. All PINs are evaluated and compared through data-intensive simulation and experiments under a variety of settings.
In the final contribution, we link all of the above together to create an algorithm that takes in raw camera data and produces trajectories and parameter estimates for multiple particles in an image sequence.
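The EM-based scheme described above alternates a filtering/smoothing E-step with a parameter-update M-step. As a hedged illustration, here is EM for the simplest linear-Gaussian SPT-like model, a scalar random walk observed in noise, with a Kalman filter and RTS smoother in the E-step; the thesis's SMC-EM and U-EM swap in particle and unscented filters for nonlinear models, and all variable names here are ours:

```python
import numpy as np

def em_random_walk(y, Q=1.0, R=1.0, n_iter=50):
    """EM for x_t = x_{t-1} + w_t, w_t ~ N(0, Q);  y_t = x_t + v_t, v_t ~ N(0, R).

    E-step: Kalman filter + RTS smoother.  M-step: re-estimate Q and R from
    smoothed moments.  The lag-one cross-covariance uses the common simple
    approximation Cov(x_t, x_{t-1} | Y) ~= G_{t-1} * P_{t|T}.
    """
    T = len(y)
    for _ in range(n_iter):
        m = np.zeros(T); P = np.zeros(T)      # filtered moments
        mp = np.zeros(T); Pp = np.zeros(T)    # predicted moments
        m_prev, P_prev = 0.0, 1e3             # diffuse prior on x_0
        for t in range(T):
            mp[t] = m_prev
            Pp[t] = P_prev + (Q if t > 0 else 0.0)
            K = Pp[t] / (Pp[t] + R)           # Kalman gain
            m[t] = mp[t] + K * (y[t] - mp[t])
            P[t] = (1.0 - K) * Pp[t]
            m_prev, P_prev = m[t], P[t]
        ms = m.copy(); Ps = P.copy()          # smoothed moments (RTS backward pass)
        G = np.zeros(T)
        for t in range(T - 2, -1, -1):
            G[t] = P[t] / Pp[t + 1]
            ms[t] = m[t] + G[t] * (ms[t + 1] - mp[t + 1])
            Ps[t] = P[t] + G[t] ** 2 * (Ps[t + 1] - Pp[t + 1])
        Pc = G[:-1] * Ps[1:]                  # approximate lag-one covariance
        Q = np.mean((ms[1:] - ms[:-1]) ** 2 + Ps[1:] + Ps[:-1] - 2.0 * Pc)
        R = np.mean((y - ms) ** 2 + Ps)
    return Q, R, ms
```

Exact EM replaces the lag-one approximation with a separate backward recursion; for a sketch of the scheme's structure the approximation suffices.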
94

AMMNet: an Attention-based Multi-scale Matting Network

Niu, Chenxiao January 2019 (has links)
Matting, which aims to separate the foreground object from the background of an image, is an important problem in computer vision. Most existing methods rely on auxiliary information such as trimaps or scribbles to alleviate the difficulty arising from the underdetermined nature of the matting problem. However, such methods tend to be sensitive to the quality of the auxiliary information and are unsuitable for real-time deployment. In this paper, we propose a novel Attention-based Multi-scale Matting Network (AMMNet), which can estimate the alpha matte from a given RGB image without resorting to any auxiliary information. The proposed AMMNet consists of three (sub-)networks: 1) a multi-scale neural network designed to provide the semantic information of the foreground object, 2) a U-Net-like network for attention mask generation, and 3) a Convolutional Neural Network (CNN) customized to integrate high- and low-level features extracted by the first two (sub-)networks. The AMMNet is generic in nature and can be trained end-to-end in a straightforward manner. The experimental results indicate that the performance of AMMNet is competitive against state-of-the-art matting methods, which either require additional side information or are tailored to images with a specific type of content (e.g., portrait). / Thesis / Master of Applied Science (MASc)
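The underdetermined nature mentioned above comes from the compositing equation I = αF + (1 − α)B: each pixel gives three knowns (RGB) but seven unknowns (α plus the foreground and background colors). A sketch of the forward model that matting inverts (array shapes and names are illustrative):

```python
import numpy as np

def composite(fg, bg, alpha):
    """Composite a foreground over a background with an alpha matte.

    fg, bg: (H, W, 3) float arrays in [0, 1]; alpha: (H, W) in [0, 1].
    Matting is the inverse problem: given the composite image, recover alpha
    (and ideally F and B), which is underdetermined per pixel.
    """
    a = alpha[..., None]                  # broadcast matte over RGB channels
    return a * fg + (1.0 - a) * bg

fg = np.ones((2, 2, 3))                   # white foreground
bg = np.zeros((2, 2, 3))                  # black background
alpha = np.array([[1.0, 0.5], [0.0, 0.25]])
img = composite(fg, bg, alpha)            # here pixel values equal the matte
```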
95

Synthesizing Realistic Data for Vision Based Drone-to-Drone Detection

Yellapantula, Sudha Ravali 15 July 2019 (has links)
In this thesis, we aimed to build a robust UAV (drone) detection algorithm through which one drone could detect another drone in flight. Though this is a straightforward object detection problem, the biggest challenge we faced for drone detection was the limited amount of drone imagery available for training. To address this issue, we used Generative Adversarial Networks, CycleGAN to be precise, to generate realistic-looking fake images that were indistinguishable from real data. CycleGAN is a classic example of the image-to-image translation technique, and we applied it to our situation, where synthetic images from one domain were transformed into another domain containing real data. The model, once trained, was capable of generating realistic-looking images from synthetic data without the presence of real images. Following this, we employed a state-of-the-art object detection model, YOLO (You Only Look Once), to build a drone detection model trained on the generated images. Finally, the performance of this model was compared on different datasets in order to evaluate it. / Master of Science / In recent years, technologies like deep learning and machine learning have seen many rapid developments. Among their many applications, object detection is one of the most widely used and well-established problems. In our thesis, we deal with a scenario where we have a swarm of drones, and our aim is for one drone to recognize another drone in its field of vision. As there was no drone image dataset readily available, we explored different ways of generating realistic data to address this issue. Finally, we proposed a solution to generate realistic images using deep learning techniques and trained an object detection model on them, evaluating how well it performed against other models.
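CycleGAN trains two generators with adversarial losses plus a cycle-consistency term that pulls a twice-translated image back to its input. A sketch of just the cycle term (the generator stand-ins here are placeholders, not trained networks):

```python
import numpy as np

def cycle_consistency_loss(x_real, x_syn, G, F, lam=10.0):
    """L1 cycle-consistency term used in CycleGAN-style training.

    G maps synthetic -> real-looking images; F maps real -> synthetic.
    Full CycleGAN training adds adversarial losses from two discriminators;
    only the cycle term is sketched here.
    """
    loss_syn = np.mean(np.abs(F(G(x_syn)) - x_syn))     # syn -> "real" -> syn
    loss_real = np.mean(np.abs(G(F(x_real)) - x_real))  # real -> "syn" -> real
    return lam * (loss_syn + loss_real)

# With identity "generators" the twice-translated batch matches the input exactly.
identity = lambda batch: batch
x = np.random.rand(4, 8, 8, 3)            # a toy image batch
loss = cycle_consistency_loss(x, x, identity, identity)
```

The weight `lam` (10 in the original CycleGAN paper) balances cycle consistency against the adversarial terms.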
96

Collaborative Path Planning and Control for Ground Agents Via Photography Collected by Unmanned Aerial Vehicles

Wood, Sami Warren 24 June 2022 (has links)
Natural disasters damage infrastructure and create significant obstacles to humanitarian aid efforts. Roads may become unusable, hindering or halting efforts to provide food, water, shelter, and life-saving emergency care. Finding a safe route during a disaster is especially difficult because as the disaster unfolds, the usability of roads and other infrastructure can change quickly, rendering most navigation services useless. With the proliferation of cheap cameras and unmanned aerial vehicles [UAVs], the rapid collection of aerial data after a natural disaster has become increasingly common. This data can be used to quickly appraise the damage to critical infrastructure, which can help solve navigational and logistical problems that may arise after the disaster. This work focuses on a framework in which a UAV is paired with an unmanned ground vehicle [UGV]. The UAV follows the UGV with a downward-facing camera and helps the ground vehicle navigate the flooded environment. This work makes several contributions: a simulation environment is created to allow for automated data collection in hypothetical disaster scenarios. The simulation environment uses real-world satellite and elevation data to emulate natural disasters such as floods. The environment partially simulates the dynamics of the UAV and UGV, allowing agents to explore during hypothetical disasters. Several semantic image segmentation models are tested for efficacy in identifying obstacles and creating cost maps for navigation within the environment, as seen by the UAV. A deep homography model incorporates temporal relations across video frames to stitch cost maps together. A weighted version of a navigation algorithm is presented to plan a path through the environment. The synthesis of these modules leads to a novel framework wherein a UAV may guide a UGV safely through a disaster area. / Master of Science / Damage to infrastructure after a natural disaster can make navigation a major challenge.
Imagine a hurricane has hit someone's house; they are hurt and need to go to the hospital. Using a traditional GPS navigation system or even their memory may not work, as many roads could be impassable. However, if the GPS could be quickly updated as to which roads were not flooded, it could still be used to navigate and avoid hazards. While the system presented is designed to work with a self-driving vehicle, it could easily be extended to give directions to a human. The goal of this work is to provide a system that could be used as a replacement for a GPS based on aerial photography. The advantage of this system is that flooded or damaged infrastructure can be identified and avoided in real time. The system could even identify other possible routes by using photography, such as driving across a field to reach higher ground. Like a GPS, the system works automatically, tracking a user's position and suggesting turns, aiding navigation. A contribution of this work is a simulation of the environment designed in a video game engine. The game engine creates a video game world that can be flooded and used to test the new navigation system. The video game environment is used to train an artificial intelligence computer model to identify hazards and create routes that would avoid them. The system could be used in a real-world disaster following training in a video game world.
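The "weighted version of a navigation algorithm" mentioned above is in the spirit of weighted A* over a UAV-derived cost map. A hedged sketch (the grid, costs, and weight are illustrative; the thesis's actual planner may differ):

```python
import heapq

def weighted_astar(cost, start, goal, w=1.5):
    """Weighted A* over a 2D cost map (4-connected grid).

    cost[r][c] is the traversal cost of entering a cell; float('inf') marks
    impassable cells (e.g. flooded roads).  w > 1 inflates the heuristic,
    trading path optimality for fewer node expansions.  The Manhattan
    heuristic assumes per-cell costs are >= 1.
    """
    rows, cols = len(cost), len(cost[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan distance
    open_heap = [(w * h(start), start)]
    g_best = {start: 0.0}
    parent = {start: None}
    closed = set()
    while open_heap:
        _, node = heapq.heappop(open_heap)
        if node in closed:
            continue
        closed.add(node)
        if node == goal:                      # reconstruct path back to start
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        r, c = node
        for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nb
            if 0 <= nr < rows and 0 <= nc < cols and cost[nr][nc] != float('inf'):
                ng = g_best[node] + cost[nr][nc]
                if ng < g_best.get(nb, float('inf')):
                    g_best[nb] = ng
                    parent[nb] = node
                    heapq.heappush(open_heap, (ng + w * h(nb), nb))
    return None                               # goal unreachable
```

Setting w = 1 recovers ordinary A* with an admissible heuristic on unit-cost grids; larger w finds paths faster at the cost of optimality.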
97

Towards a Resource Efficient Framework for Distributed Deep Learning Applications

Han, Jingoo 24 August 2022 (has links)
Distributed deep learning has achieved tremendous success in solving scientific problems in research and discovery over the past years. Deep learning training is quite challenging because it requires training on large-scale, massive datasets, especially with graphics processing units (GPUs) in the latest high-performance computing (HPC) supercomputing systems. HPC architectures bring different performance trends in training throughput compared to the existing studies. Multiple GPUs and high-speed interconnects are used for distributed deep learning on HPC systems. Extant distributed deep learning systems are designed for non-HPC systems without considering efficiency, leading to under-utilization of expensive HPC hardware. In addition, increasing resource heterogeneity has a negative effect on resource efficiency in distributed deep learning methods, including federated learning. Thus, it is important to address the increasing demand for both high performance and high resource efficiency in distributed deep learning systems, including the latest HPC systems and federated learning systems. In this dissertation, we explore and design novel methods and frameworks to improve the resource efficiency of distributed deep learning training. We address the following five important topics: performance analysis of deep learning for supercomputers, GPU-aware deep learning job scheduling, topology-aware virtual GPU training, heterogeneity-aware adaptive scheduling, and a token-based incentive algorithm. In the first part (Chapter 3), we focus on analyzing the performance trends of distributed deep learning on the latest HPC systems, such as the Summitdev supercomputer at Oak Ridge National Laboratory. We provide insights by conducting a comprehensive performance study of how deep learning workloads affect the performance of HPC systems with large-scale parallel processing capabilities.
In the second part (Chapter 4), we design and develop MARBLE, a novel deep learning job scheduler that considers the efficiency of GPU resources based on the non-linear scalability of GPUs in a single node and improves GPU utilization by sharing GPUs among multiple deep learning training workloads. The third part of this dissertation (Chapter 5) proposes TOPAZ, a topology-aware virtual GPU training system specifically designed for distributed deep learning on recent HPC systems. In the fourth part (Chapter 6), we explore an innovative holistic federated learning scheduler that employs a heterogeneity-aware adaptive selection method to improve resource efficiency and accuracy, coupled with resource usage profiling and accuracy monitoring to achieve multiple goals. In the fifth part of this dissertation (Chapter 7), we focus on how to provide incentives to participants according to their contribution to the performance of the final federated model, with tokens used as a means of paying for the services of participants and for the training infrastructure. / Doctor of Philosophy / Distributed deep learning is widely used for solving critical scientific problems with massive datasets. However, to accelerate scientific discovery, resource efficiency is also important for deployment on real-world systems, such as high-performance computing (HPC) systems. Deployment of existing deep learning applications on these distributed systems may lead to underutilization of HPC hardware resources. In addition, extreme resource heterogeneity has negative effects on distributed deep learning training.
However, much of the prior work has not focused on the specific challenges in distributed deep learning, including HPC systems and heterogeneous federated systems, in terms of optimizing resource utilization. This dissertation addresses the challenges in improving the resource efficiency of distributed deep learning applications through performance analysis of deep learning for supercomputers, GPU-aware deep learning job scheduling, topology-aware virtual GPU training, and heterogeneity-aware adaptive federated learning scheduling and incentive algorithms.
98

A Comparison of Image Classification with Different Activation Functions in Balanced and Unbalanced Datasets

Zhang, Moqi 04 June 2021 (has links)
When the novel coronavirus (COVID-19) outbreak began to ring alarm bells worldwide, rapid, efficient diagnosis was critical to the emergency response. The limited capacity of medical systems and the increasing number of daily cases pushed researchers to investigate automated models. The use of deep neural networks to help doctors make the correct diagnosis has dramatically reduced the pressure on the healthcare system. Improving diagnosis networks depends not only on the network structure design but also on activation function performance. To identify an optimal activation function, this study investigates the correlation between activation function selection and image classification performance on balanced and imbalanced datasets. Our analysis evaluates various network architectures on both commonly used and novel datasets and presents a comprehensive analysis of ten widely used activation functions. The experimental results show that the swish and softplus functions enhance the classification ability of state-of-the-art networks. Finally, this thesis compares neural networks using the ten activation functions, analyzes their pros and cons, and puts forward detailed suggestions on choosing appropriate activation functions in future work. / Master of Science / When the novel coronavirus (COVID-19) outbreak began to ring alarm bells worldwide, rapid, efficient diagnosis was critical to the emergency response. Manual diagnosis of chest X-rays by radiologists is time- and cost-consuming. Compared with traditional diagnostic technology, an artificial intelligence medical system can simultaneously analyze and diagnose hundreds of medical images and speedily deliver high-precision, high-efficiency results. Machines are brilliant at learning new things and never sleep; if machines can be used to replace human beings in some positions, the pressure on the medical system can be relieved significantly, buying time for medical practitioners to concentrate more on the research of new technologies. The critical decision unit of an intelligent diagnosis system is the activation function. Therefore, this work provides an in-depth evaluation and comparison of traditional, widely used activation functions with emerging ones, which helps improve the accuracy of the most advanced diagnostic models on the COVID-19 image dataset. In addition, the results of this study summarize the pros and cons of the various activation functions and provide suggestions for future work.
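The swish and softplus functions singled out above are both smooth alternatives to ReLU. A small numerically stable sketch of the two:

```python
import numpy as np

def swish(x, beta=1.0):
    """Swish: x * sigmoid(beta * x).  Smooth and non-monotonic near zero."""
    return x / (1.0 + np.exp(-beta * x))

def softplus(x):
    """Softplus: log(1 + exp(x)), a smooth approximation of ReLU.

    Written in the overflow-safe form max(x, 0) + log1p(exp(-|x|)) so that
    large positive inputs do not overflow exp().
    """
    return np.maximum(x, 0.0) + np.log1p(np.exp(-np.abs(x)))

x = np.linspace(-4.0, 4.0, 9)
y_swish, y_softplus = swish(x), softplus(x)   # swish(0) = 0, softplus(0) = ln 2
```

Both are differentiable everywhere, unlike ReLU at zero, which is one reason they can behave differently during training.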
99

Capsule Networks: Framework and Application to Disentanglement for Generative Models

Moghimi, Zahra 30 June 2021 (has links)
Generative models are one of the most prominent components of unsupervised learning models that have a plethora of applications in various domains such as image-to-image translation, video prediction, and generating synthetic data where accessing real data is expensive, unethical, or compromising privacy. One of the main challenges in designing a generative model is creating a disentangled representation of generative factors which gives control over various characteristics of the generated data. Since the architecture of variational autoencoders is centered around latent variables and their objective function directly governs the generative factors, they are the perfect choice for creating a more disentangled representation. However, these architectures generate samples that are blurry and of lower quality compared to other state-of-the-art generative models such as generative adversarial networks. Thus, we attempt to increase the disentanglement of latent variables in variational autoencoders without compromising the generated image quality. In this thesis, a novel generative model based on capsule networks and a variational autoencoder is proposed. Motivated by the concept of capsule neural networks and their vectorized output, these structures are employed to create a disentangled representation of latent features in variational autoencoders. In particular, the proposed structure, called CapsuleVAE, utilizes a capsule encoder whose vector outputs can translate to latent variables in a meaningful way. It is shown that CapsuleVAE generates results that are sharper and more diverse based on FID score and a metric inspired by the inception score. Furthermore, two different methods for training CapsuleVAE are proposed, and the generated results are investigated. In the first method, an objective function with regularization is proposed, and the optimal regularization hyperparameter is derived. 
In the second method, called sequential optimization, a novel training technique for training CapsuleVAE is proposed and the results are compared to the first method. Moreover, a novel metric for measuring disentanglement in latent variables is introduced. Based on this metric, it is shown that the proposed CapsuleVAE creates more disentangled representations. In summary, our proposed generative model enhances the disentanglement of latent variables which contributes to the model's generalizing well to new tasks and more control over the generated data. Our model also increases the generated image quality which addresses a common disadvantage in variational autoencoders. / Master of Science / Generative models are algorithms that, given a large enough initial dataset, create data points (such as images) similar to the initial dataset from random input numbers. These algorithms have various applications in different fields, such as generating synthetic healthcare data, wireless systems data generation in extreme or rare conditions, generating high-resolution, colorful images from grey-scale photos or sketches, and in general, generating synthetic data for applications where obtaining real data is expensive, inaccessible, unethical, or compromising privacy. Some generative models create a representation for the data and divide it into several "generative factors". Researchers have shown that a better data representation is one where the generative factors are "disentangled", meaning that each generative factor is responsible for only one particular feature in the generated data. Unfortunately, creating a model with disentangled generative factors sacrifices the image quality. In this work, we design a generative model that enhances the disentanglement of generative factors without compromising the quality of the generated images.
In order to design a generative model with more disentangled generative factors, we employ capsule networks in the architecture of the generative model. Capsule networks are algorithms that classify the inputted information into different categories. We show that by using capsule networks, our designed generative model achieves higher performance in the quality of the generated images and creates a more disentangled representation of generative factors.
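In a VAE like the one CapsuleVAE builds on, disentanglement pressure comes largely from the KL term of the objective, which pulls the approximate posterior toward the standard-normal prior. A sketch of that term for a diagonal-Gaussian encoder (function and variable names are illustrative):

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims.

    In the VAE objective this term regularizes the encoder toward the prior;
    up-weighting it (as in beta-VAE) is a common route to more disentangled
    latents, at some cost in reconstruction quality.
    mu, log_var: (batch, latent_dim) arrays from the encoder.
    """
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)

# The KL vanishes exactly when the posterior equals the standard-normal prior.
mu = np.zeros((1, 4))
log_var = np.zeros((1, 4))
kl = kl_to_standard_normal(mu, log_var)
```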
100

Condition Assessment of Civil Infrastructure and Materials Using Deep Learning

Liu, Fangyu 24 August 2022 (has links)
Deep learning's powerful regression abilities and capacity to process multiple data types allow it to complete multiple tasks effectively and accurately, which meets the needs of civil engineering. A growing body of work shows that deep learning has become a powerful and increasingly popular tool for civil engineering. On this basis, this dissertation developed deep learning studies for the condition assessment of civil infrastructure and materials. This dissertation included five main works: (1) Deep learning and infrared thermography for asphalt pavement crack severity classification. This work focused on longitudinal or transverse cracking. It first built a dataset with four severity levels (no, low-severity, medium-severity, and high-severity) and three image types (visible, infrared, and fusion). Then this work applied a convolutional neural network (CNN) to classify crack severity based on two strategies (deep learning from scratch and transfer learning). This work also investigated the effect of image type on the accuracy of these two strategies and on the classification of different severity levels. (2) Asphalt pavement crack detection based on a convolutional neural network and infrared thermography. This work first built an open dataset with three image types (visible, infrared, and fusion) and different conditions (single, multiple, thin, and thick cracks; clean, rough, light, and dark backgrounds) and periods (morning, noon, and dusk). Then this work evaluated the performance of the CNN model based on accuracy and complexity (computational and model). (3) An artificial neural network model of the tensile behavior of hybrid steel-PVA fiber reinforced concrete containing fly ash and slag powder. This work considered a total of 23 factors for predicting the tensile behavior of hybrid fiber reinforced concrete (HFRC), including the fibers' characteristics, the mechanical properties of plain concrete, and the concrete composition.
Then this work compared the performance of the artificial neural network (ANN) method and the traditional equation-based method in terms of predicting tensile stress, tensile strength, and the strain corresponding to tensile strength. (4) Deep transfer learning-based vehicle classification by asphalt pavement vibration. This work first applied a pavement vibration IoT monitoring system to collect raw vibration signals and performed a wavelet transform to obtain denoised vibration signals. Then this work represented the vibration signals in two different ways: a time-domain graph and a time-frequency graph. Finally, this work proposed two deep transfer learning-based vehicle classification methods according to these two representations of the vibration signals. (5) A physics-informed long short-term memory (PI-LSTM) network for data-driven structural response modeling. This work first applied a single-degree-of-freedom (SDOF) system to investigate the performance of the proposed PI-LSTM network compared with existing methods. Then this work further investigated and validated the proposed PI-LSTM network on the experimental results of one six-story building and the numerical simulation results of another six-story building. / Doctor of Philosophy / With the development of technology, deep learning has been applied in numerous fields to improve accuracy and efficiency. A growing body of work shows that deep learning has become a powerful and increasingly popular tool for civil engineering. Since civil infrastructure and materials play a dominant role in civil engineering, this dissertation applied deep learning to their condition assessment. Deep learning methods were applied to detect cracks in asphalt pavements. The mechanical properties of fiber reinforced concrete were investigated by deep learning methods.
Based on asphalt pavement vibration, vehicle types were classified by deep learning methods. Deep learning methods were also used to investigate structural responses.
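The SDOF system used above to validate the PI-LSTM network obeys m·x'' + c·x' + k·x = f(t); responses for training such a surrogate can be simulated, for example, by central-difference time stepping (a sketch under zero initial conditions; all parameter values are illustrative, not from the dissertation):

```python
import numpy as np

def sdof_response(m, c, k, f, dt):
    """Displacement of an SDOF oscillator m*x'' + c*x' + k*x = f(t),
    integrated with the central difference method (stable for dt << 2/omega_n).

    Assumes zero initial displacement and velocity; f is the force sampled
    every dt seconds.
    """
    n = len(f)
    x = np.zeros(n)
    x_prev = 0.0                        # x at t = -dt (zero initial conditions)
    a = m / dt**2 + c / (2.0 * dt)      # coefficient of x_{i+1}
    b = k - 2.0 * m / dt**2             # coefficient of x_i
    d = m / dt**2 - c / (2.0 * dt)      # coefficient of x_{i-1}
    for i in range(n - 1):
        x_next = (f[i] - b * x[i] - d * x_prev) / a
        x_prev, x[i + 1] = x[i], x_next
    return x

# A step load on a lightly damped oscillator settles at the static deflection f/k.
m, c, k, dt = 1.0, 0.1, 1.0, 0.01
f = np.ones(10000)                      # 100 s of a unit step force
x = sdof_response(m, c, k, f, dt)
```

Trajectories like `x` (paired with the force history) are the kind of input-output data a physics-informed sequence model can be trained and checked against.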
