111

FROM SEEING BETTER TO UNDERSTANDING BETTER: DEEP LEARNING FOR MODERN COMPUTER VISION APPLICATIONS

Tianqi Guo (12890459) 17 June 2022
<p>In this dissertation, we document a few of our recent attempts to bridge the gap between fast-evolving deep learning research and the vast industry need for solutions to computer vision challenges. More specifically, we developed novel deep-learning-based techniques for the following application-driven computer vision challenges: image super-resolution with quality restoration, motion estimation by optical flow, object detection for shape reconstruction, and object segmentation for motion tracking. These four topics cover the computer vision hierarchy from the low level, where digital images are processed to restore missing information for better human perception, to the middle level, where certain objects of interest are recognized and their motions are analyzed, and finally to the high level, where the scene captured in the video footage is interpreted for further analysis. In building a whole package of ready-to-deploy solutions, we center our efforts on designing and training the most suitable convolutional neural networks for the particular computer vision problem at hand. Complementary procedures for data collection, data annotation, and post-processing of network outputs tailored to specific application needs, as well as deployment details, are also discussed where necessary. We hope our work demonstrates the applicability and versatility of convolutional neural networks for real-world computer vision tasks on a broad spectrum, from seeing better to understanding better.</p>
112

A SYSTEMATIC STUDY OF SPARSE DEEP LEARNING WITH DIFFERENT PENALTIES

Xinlin Tao (13143465) 25 April 2023
<p>Deep learning has been the driving force behind many successful data science achievements. However, the deep neural network (DNN) that forms the basis of deep learning is often over-parameterized, leading to training, prediction, and interpretation challenges. To address this issue, it is common practice to apply an appropriate penalty to each connection weight, limiting its magnitude. This approach is equivalent to imposing a prior distribution on each connection weight from a Bayesian perspective. This project offers a systematic investigation into the selection of the penalty function or prior distribution. Specifically, under the general theoretical framework of posterior consistency, we prove that consistent sparse deep learning can be achieved with a variety of penalty functions or prior distributions. Examples include amenable regularization penalties (such as MCP and SCAD), spike-and-slab priors (such as mixture Gaussian distribution and mixture Laplace distribution), and polynomial decayed priors (such as the student-t distribution). Our theory is supported by numerical results.</p>
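The two penalties named in the abstract above, MCP and SCAD, have well-known closed forms for a single weight. The sketch below writes them out in plain Python as an illustration only; the function names, the default values of `lam`, `gamma`, and `a`, and the `total_penalty` helper are assumptions made here and are not taken from the thesis.

```python
def mcp_penalty(w, lam=1.0, gamma=3.0):
    """Minimax concave penalty (MCP) for one weight; requires gamma > 1."""
    t = abs(w)
    if t <= gamma * lam:
        # Quadratically tapered L1 penalty near zero.
        return lam * t - t * t / (2.0 * gamma)
    # Constant beyond gamma * lam, so large weights are not shrunk further.
    return 0.5 * gamma * lam * lam


def scad_penalty(w, lam=1.0, a=3.7):
    """Smoothly clipped absolute deviation (SCAD) penalty; requires a > 2."""
    t = abs(w)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2.0 * a * lam * t - t * t - lam * lam) / (2.0 * (a - 1.0))
    return 0.5 * (a + 1.0) * lam * lam


def total_penalty(weights, penalty, **kwargs):
    """Sum a per-weight penalty over all connection weights (hypothetical helper)."""
    return sum(penalty(w, **kwargs) for w in weights)
```

Both penalties behave like an L1 penalty for small weights and flatten out for large ones, which is why they tend to produce sparse networks without over-shrinking the weights that remain.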
113

TOWARD ROBUST AND INTERPRETABLE GRAPH AND IMAGE REPRESENTATION LEARNING

Juan Shu (14816524) 27 April 2023
<p>Although deep learning models continue to gain momentum, their robustness and interpretability have always been a major concern because of the complexity of such models. In this dissertation, we studied several topics on the robustness and interpretability of convolutional neural networks (CNNs) and graph neural networks (GNNs). We first identified the structural problem of deep convolutional neural networks that leads to adversarial examples and defined DNN uncertainty regions. We also argued that the generalization error, the large-sample theoretical guarantee established for DNNs, cannot adequately capture the phenomenon of adversarial examples. Secondly, we studied dropout in GNNs, which is an effective regularization approach to prevent overfitting. Unlike CNNs, GNNs usually have a shallow structure because deep GNNs normally suffer performance degradation. We studied different dropout schemes and established a connection between dropout and over-smoothing in GNNs. We therefore developed layer-wise compensation dropout, which allows GNNs to go deeper without suffering performance degradation. We also developed a heteroscedastic dropout that effectively deals with a large number of missing node features due to heavy experimental noise or privacy issues. Lastly, we studied the interpretability of graph neural networks. We developed a self-interpretable GNN structure that denoises useless edges or features, leading to a more efficient message-passing process. The GNN prediction and explanation accuracy were boosted compared with baseline models.</p>
114

Analysis and Comparison of Distributed Training Techniques for Deep Neural Networks in a Dynamic Environment / Analys och jämförelse av distribuerade tränings tekniker för djupa neurala nätverk i en dynamisk miljö

Gebremeskel, Ermias January 2018
Deep learning models' prediction accuracy tends to improve with the size of the model. The implication is that the amount of computational power needed to train models is continuously increasing. Distributed deep learning training tries to address this issue by spreading the computational load onto several devices. In theory, distributing computation onto N devices should give an N-fold (xN) performance improvement. In reality, the improvement is rarely xN, due to communication and other overheads. This thesis studies the communication overhead incurred when distributing deep learning training. Hopsworks is a platform designed for data science. The purpose of this work is to explore a feasible way of deploying distributed deep learning training on a shared cluster and to analyze the performance of different distributed deep learning algorithms to be used on this platform. The findings of this study show that bandwidth-optimal communication algorithms like ring all-reduce scale better than many-to-one communication algorithms like parameter server, but are less fault tolerant. Furthermore, the system usage statistics collected revealed a network bottleneck when training is distributed on multiple machines. This work also shows that it is possible to run MPI on a Hadoop cluster by building a prototype that orchestrates resource allocation, deployment, and monitoring of MPI-based training jobs. Even though the experiments did not cover different cluster configurations, the results are still relevant in showing what considerations need to be made when distributing deep learning training.
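To make the contrast between ring all-reduce and a parameter server concrete, the following is a minimal single-process simulation, not the thesis's implementation. All function names are hypothetical, and the traffic formulas assume the textbook setting in which every worker holds a gradient of `m` elements and a single parameter server receives and returns the full gradient for each worker.

```python
def ring_all_reduce(vectors):
    """Simulate ring all-reduce (sum) over equally long lists, one per worker.

    Returns (per-worker results, total number of elements sent on the ring).
    """
    n, length = len(vectors), len(vectors[0])
    bounds = [(i * length) // n for i in range(n + 1)]  # chunk boundaries
    data = [list(v) for v in vectors]
    sent = 0

    def exchange(chunk_of, combine):
        nonlocal sent
        # Snapshot all outgoing chunks first so sends are simultaneous.
        msgs = []
        for rank in range(n):
            c = chunk_of(rank)
            msgs.append((c, data[rank][bounds[c]:bounds[c + 1]]))
        for rank, (c, payload) in enumerate(msgs):
            dst = (rank + 1) % n
            for k, val in enumerate(payload):
                i = bounds[c] + k
                data[dst][i] = combine(data[dst][i], val)
            sent += len(payload)

    # Phase 1, reduce-scatter: each worker ends up owning one summed chunk.
    for step in range(n - 1):
        exchange(lambda r: (r - step) % n, lambda old, new: old + new)
    # Phase 2, all-gather: the summed chunks are circulated to everyone.
    for step in range(n - 1):
        exchange(lambda r: (r + 1 - step) % n, lambda old, new: new)
    return data, sent


def ring_traffic_per_worker(n, m):
    """Elements each worker sends in a ring all-reduce of m elements."""
    return 2 * (n - 1) * m / n


def parameter_server_traffic(n, m):
    """Elements crossing the single server's link: n pushes plus n pulls."""
    return 2 * n * m
```

Under these assumptions the per-worker ring traffic stays below `2 * m` no matter how many workers join, while the load on the server's link grows linearly with the number of workers, which is consistent with the scaling behaviour the abstract reports. The simulation also hints at the fault-tolerance cost: every worker sits on the ring, so a single failed worker stalls both phases.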
115

[pt] DETECÇÃO VISUAL DE FILEIRA DE PLANTAÇÃO COM TAREFA AUXILIAR DE SEGMENTAÇÃO PARA NAVEGAÇÃO DE ROBÔS MÓVEIS / [en] VISUAL CROP ROW DETECTION WITH AUXILIARY SEGMENTATION TASK FOR MOBILE ROBOT NAVIGATION

IGOR FERREIRA DA COSTA 07 November 2023
Autonomous robots for agricultural tasks have been researched extensively in recent years, as they could greatly improve field efficiency. Navigating an open crop field, however, is still a great challenge. RTK-GNSS is an excellent tool for tracking the robot's position, but it needs precise mapping and planning, is expensive, and depends on signal quality. As such, onboard systems that can sense the field directly to guide the robot are a good alternative. Those systems detect the rows with image processing techniques and estimate the position by applying algorithms to the obtained mask, such as the Hough transform or linear regression. In this work, a direct approach is presented by training a neural network model to obtain the position of crop lines directly from an RGB image. While the camera in these kinds of systems usually looks down at the field, a camera near the ground is proposed to take advantage of tunnels or walls of plants formed between rows. A simulation environment for evaluating both the model's performance and camera placement was developed and made available on GitHub. Four datasets to train the models are also proposed, two for the simulations and two for the real-world tests. The results from the simulation are shown across different resolutions and stages of plant growth, indicating the system's capabilities and limitations. Some of the best configurations are then verified in two types of agricultural environments.
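The classical pipeline mentioned in the abstract, which estimates a row's position by fitting a line to a segmentation mask, can be sketched with an ordinary least-squares fit. This illustrates that baseline only, not the direct neural-network approach the thesis proposes. The function name and the choice to model the image column `x` as a linear function of the image row `y` (convenient because crop rows are often close to vertical in the image) are assumptions made here.

```python
def fit_row_line(mask):
    """Fit x = slope * y + intercept to the nonzero pixels of a binary mask.

    `mask` is a list of image rows; mask[y][x] is truthy for crop pixels.
    """
    points = [(x, y) for y, row in enumerate(mask)
              for x, v in enumerate(row) if v]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two mask pixels to fit a line")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_y = sum((y - mean_y) ** 2 for _, y in points)
    if var_y == 0:
        raise ValueError("all mask pixels lie on a single image row")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = cov_xy / var_y
    intercept = mean_x - slope * mean_y
    return slope, intercept
```

In such a pipeline, the fitted slope gives a heading estimate and the intercept a lateral offset. A Hough transform would typically be preferred when the mask contains several rows or heavy outliers, since a single least-squares fit assumes one dominant line.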
116

Efficient Decentralized Learning Methods for Deep Neural Networks

Sai Aparna Aketi (18258529) 26 March 2024
<p dir="ltr">Decentralized learning is the key to training deep neural networks (DNNs) over large distributed datasets generated at different devices and locations, without the need for a central server. It enables next-generation applications that require DNNs to interact with and learn from their environment continuously. The practical implementation of decentralized algorithms brings its own set of challenges. In particular, these algorithms should be (a) compatible with time-varying graph structures, (b) compute and communication efficient, and (c) resilient to heterogeneous data distributions. The objective of this thesis is to enable efficient decentralized learning in deep neural networks by addressing these challenges. First, a communication-efficient decentralized algorithm (Sparse-Push) that supports directed and time-varying graphs with error-compensated communication compression is proposed. Second, a low-precision decentralized training method that aims to reduce memory requirements and computational complexity is proposed. Here, we design “Range-EvoNorm” as the normalization activation layer, which is better suited for low-precision decentralized training. Finally, to address the problem of data heterogeneity, three advancements are proposed: Neighborhood Gradient Mean (NGM), Global Update Tracking (GUT), and Cross-feature Contrastive Loss (CCL). NGM utilizes extra communication rounds to obtain cross-agent gradient information, whereas GUT tracks global update information with no communication overhead, improving the performance on heterogeneous data. CCL explores an orthogonal direction of using a data-free knowledge distillation approach to handle heterogeneous data in decentralized setups. All the algorithms are evaluated on computer vision tasks using standard image-classification datasets. 
We conclude this dissertation by presenting a summary of the proposed decentralized methods and their trade-offs for heterogeneous data distributions. Overall, the methods proposed in this thesis address the critical limitations of training deep neural networks in a decentralized setup and advance the state-of-the-art in this domain.</p>
117

<strong>A LARGE-SCALE UAV AUDIO DATASET AND AUDIO-BASED UAV CLASSIFICATION USING CNN</strong>

Yaqin Wang (8797037) 17 July 2023
<p>The growing popularity and increased accessibility of unmanned aerial vehicles (UAVs) have raised concerns about potential threats they may pose. In response, researchers have devoted significant efforts to developing UAV detection and classification systems, utilizing diverse methodologies such as computer vision, radar, radio frequency, and audio-based approaches. However, the availability of publicly accessible UAV audio datasets remains limited. Consequently, this research was undertaken to address this gap by collecting a comprehensive UAV audio dataset and developing a precise and efficient audio-based UAV classification system.</p> <p>This research project is structured into three distinct phases, each serving a unique purpose in data collection and training the proposed UAV classifier. These phases encompass data collection, dataset evaluation, the implementation of a proposed convolutional neural network, training procedures, as well as an in-depth analysis and evaluation of the obtained results. To assess the effectiveness of the model, several evaluation metrics are employed, including training accuracy, loss rate, the confusion matrix, and ROC curves.</p> <p>The findings from this study conclusively demonstrate that the proposed CNN classifier exhibits nearly flawless performance in accurately classifying UAVs across 22 distinct categories.</p>
118

HIGHLY ACCURATE MACROMOLECULAR STRUCTURE COMPLEX DETECTION, DETERMINATION AND EVALUATION BY DEEP LEARNING

Xiao Wang (17405185) 17 November 2023
<p dir="ltr">In life sciences, the determination of macromolecular structures and their functions, particularly proteins and protein complexes, is of paramount importance, as these molecules play critical roles within cells. The specific physical interactions of macromolecules govern molecular and cellular functions, making the 3D structure elucidation of these entities essential for comprehending the mechanisms underlying life processes, diseases, and drug discovery. Cryo-electron microscopy (cryo-EM) has emerged as a promising experimental technique for obtaining 3D macromolecular structures. In the course of my research, I proposed CryoREAD, an innovative AI-based method for <i>de novo</i> DNA/RNA structure modeling. This novel approach represents the first fully automated solution for DNA/RNA structure modeling from cryo-EM maps at near-atomic resolution. However, as the resolution decreases, structure modeling becomes significantly more challenging. To address this challenge, I introduced Emap2sec+, a 3D deep convolutional neural network designed to identify protein secondary structures, RNA, and DNA information from cryo-EM maps at intermediate resolutions ranging from 5 to 10 Å. Additionally, I presented Alpha-EM-Multimer, a groundbreaking method for automatically building full protein complexes from cryo-EM maps at intermediate resolution. Alpha-EM-Multimer employs a diffusion model to trace the protein backbone and subsequently fits the AlphaFold predicted single-chain structure to construct the complete protein complex. Notably, this method stands as the first to enable the modeling of protein complexes with more than 10,000 residues for cryo-EM maps at intermediate resolution, achieving an average TM-score of predicted protein complexes above 0.8, which closely approximates the native structure. 
Furthermore, I addressed the recognition of local structural errors in predicted and experimental protein structures by proposing DAQ, an evaluation approach for experimental protein structure quality that utilizes detection probabilities derived from cryo-EM maps via a pretrained multi-task neural network. In the pursuit of evaluating protein complexes generated through computational methods, I developed GNN-DOVE and DOVE, leveraging convolutional neural networks and graph neural networks to assess the accuracy of predicted protein complex structures. These advancements in cryo-EM-based structural modeling and evaluation methodologies hold significant promise for advancing our understanding of complex macromolecular systems and their biological implications.</p>
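The abstract reports model quality as a TM-score. For readers unfamiliar with the metric, the sketch below evaluates the standard TM-score formula for one fixed superposition, given the distances between aligned residue pairs. It is an illustration under stated assumptions: the function names are hypothetical, the search over superpositions that the full metric maximizes over is omitted, and the lower bound of 0.5 on `d0` for short chains mirrors common practice in reference implementations rather than anything stated in the thesis.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (in angstroms) of the TM-score."""
    if target_length <= 15:
        return 0.5  # assumed floor for very short chains
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score_fixed_superposition(distances, target_length):
    """TM-score for one given superposition.

    `distances` holds the distance in angstroms between each aligned residue
    pair; residues of the target that are not aligned contribute zero.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length
```

A score of 1.0 corresponds to a perfect match, and each aligned pair at exactly `d0` contributes one half, so an average above 0.8 means most residues sit well within the distance scale of the native structure.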
119

Asymmetry Learning for Out-of-distribution Tasks

Chandra Mouli Sekar (18437814) 02 May 2024
<p dir="ltr">Despite their astonishing capacity to fit data, neural networks have difficulties extrapolating beyond the training data distribution. When the out-of-distribution prediction task is formalized as a counterfactual query on a causal model, the reason for their extrapolation failure is clear: neural networks learn spurious correlations in the training data rather than features that are causally related to the target label. This thesis proposes to perform a causal search over a known family of causal models to learn robust (maximally invariant) predictors for single- and multiple-environment extrapolation tasks.</p><p dir="ltr">First, I formalize the out-of-distribution task as a counterfactual query over a structural causal model. For single-environment extrapolation, I argue that symmetries of the input data are valuable for training neural networks that can extrapolate. I introduce Asymmetry learning, a new learning paradigm that is guided by the hypothesis that all (known) symmetries are mandatory even without evidence in training, unless the learner deems a symmetry inconsistent with the training data. Asymmetry learning performs a causal model search to find the simplest causal model defining a causal connection between the target labels and the symmetry transformations that affect the label. My experiments on a variety of out-of-distribution tasks on images and sequences show that the proposed methods extrapolate much better than standard neural networks.</p><p dir="ltr">Then, I consider multiple-environment out-of-distribution tasks in dynamical system forecasting that arise due to shifts in initial conditions or parameters of the dynamical system. I identify key OOD challenges in the existing deep learning and physics-informed machine learning (PIML) methods for these tasks. To mitigate these drawbacks, I combine meta-learning and causal structure discovery over a family of given structural causal models to learn the underlying dynamical system. 
In three simulated forecasting tasks, I show that the proposed approach is 2x to 28x more robust than the baselines.</p>
120

Size-Adaptive Convolutional Neural Network with Parameterized-Swish Activation for Enhanced Object Detection

Yashwanth Raj Venkata Krishnan (18322572) 03 June 2024
<p>In computer vision, accurately detecting objects of varying sizes is essential for applications such as autonomous vehicle navigation and medical imaging diagnostics. Addressing the variance in object sizes presents a significant challenge requiring advanced computational solutions for reliable object recognition and processing. This research introduces a size-adaptive Convolutional Neural Network (CNN) framework to enhance detection performance across different object sizes. By dynamically adjusting the CNN’s configuration based on the observed distribution of object sizes, the framework employs statistical analysis and algorithmic decision-making to improve detection capabilities. Further innovation is presented through the Parameterized-Swish activation function. Distinguished by its dynamic parameters, this function is designed to better adapt to varying input patterns. It exceeds the performance of traditional activation functions by enabling faster model convergence and increasing detection accuracy, showcasing the effectiveness of adaptive activation functions in enhancing object detection systems. The implementation of this model has led to notable performance improvements: an 11.4% increase in mean Average Precision (mAP) and a 40.63% increase in frames per second (FPS) for small objects, demonstrating enhanced detection speed and accuracy. The model has achieved a 48.42% reduction in training time for medium-sized objects while still improving mAP, indicating significant efficiency gains without compromising precision. Large objects have seen a 16.9% reduction in training time and a 76.04% increase in inference speed, showcasing the model’s ability to expedite processing times substantially. 
Collectively, these advancements contribute to a more than 12% increase in detection efficiency and accuracy across various scenarios, highlighting the model’s robustness and adaptability in addressing the critical challenge of size variance in object detection. </p>
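The abstract does not give the functional form of its Parameterized-Swish activation. As a point of reference only, the sketch below shows the standard Swish activation with a slope parameter `beta`, which is one common way to parameterize it; treating `beta` as the adjustable (possibly learned) parameter is an assumption made here, and the thesis's function may differ.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def swish(x, beta=1.0):
    """Swish activation x * sigmoid(beta * x) with slope parameter beta."""
    return x * sigmoid(beta * x)


def swish_grad(x, beta=1.0):
    """Derivative of swish with respect to x, as needed for backpropagation."""
    s = sigmoid(beta * x)
    return s + beta * x * s * (1.0 - s)
```

With `beta = 0` the function reduces to the linear map `x / 2`, and as `beta` grows it approaches ReLU, so a tunable `beta` lets the activation interpolate between the two shapes.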
