121 |
<strong>A LARGE-SCALE UAV AUDIO DATASET AND AUDIO-BASED UAV CLASSIFICATION USING CNN</strong>
Yaqin Wang (8797037) 17 July 2023 (has links)
<p>The growing popularity and accessibility of unmanned aerial vehicles (UAVs) have raised concerns about the threats they may pose. In response, researchers have devoted significant effort to developing UAV detection and classification systems, using methodologies such as computer vision, radar, radio frequency, and audio-based approaches. However, publicly accessible UAV audio datasets remain scarce. This research addresses that gap by collecting a comprehensive UAV audio dataset and developing a precise and efficient audio-based UAV classification system.</p>
<p>This research project is structured into three distinct phases, each serving a unique purpose in data collection and training the proposed UAV classifier. These phases encompass data collection, dataset evaluation, the implementation of a proposed convolutional neural network, training procedures, as well as an in-depth analysis and evaluation of the obtained results. To assess the effectiveness of the model, several evaluation metrics are employed, including training accuracy, loss rate, the confusion matrix, and ROC curves.</p>
<p>The findings from this study conclusively demonstrate that the proposed CNN classifier exhibits nearly flawless performance in accurately classifying UAVs across 22 distinct categories.</p>
|
122 |
HIGHLY ACCURATE MACROMOLECULAR STRUCTURE COMPLEX DETECTION, DETERMINATION AND EVALUATION BY DEEP LEARNING
Xiao Wang (17405185) 17 November 2023 (has links)
<p dir="ltr">In life sciences, the determination of macromolecular structures and their functions, particularly proteins and protein complexes, is of paramount importance, as these molecules play critical roles within cells. The specific physical interactions of macromolecules govern molecular and cellular functions, making the 3D structure elucidation of these entities essential for comprehending the mechanisms underlying life processes, diseases, and drug discovery. Cryo-electron microscopy (cryo-EM) has emerged as a promising experimental technique for obtaining 3D macromolecular structures. In the course of my research, I proposed CryoREAD, an innovative AI-based method for <i>de novo</i> DNA/RNA structure modeling. This novel approach represents the first fully automated solution for DNA/RNA structure modeling from cryo-EM maps at near-atomic resolution. However, as the resolution decreases, structure modeling becomes significantly more challenging. To address this challenge, I introduced Emap2sec+, a 3D deep convolutional neural network designed to identify protein secondary structures, RNA, and DNA information from cryo-EM maps at intermediate resolutions ranging from 5-10 Å. Additionally, I presented Alpha-EM-Multimer, a groundbreaking method for automatically building full protein complexes from cryo-EM maps at intermediate resolution. Alpha-EM-Multimer employs a diffusion model to trace the protein backbone and subsequently fits the AlphaFold predicted single-chain structure to construct the complete protein complex. Notably, this method stands as the first to enable the modeling of protein complexes with more than 10,000 residues for cryo-EM maps at intermediate resolution, achieving an average TM-Score of predicted protein complexes above 0.8, which closely approximates the native structure. 
Furthermore, I addressed the recognition of local structural errors in predicted and experimental protein structures by proposing DAQ, an evaluation approach for experimental protein structure quality that utilizes detection probabilities derived from cryo-EM maps via a pretrained multi-task neural network. In the pursuit of evaluating protein complexes generated through computational methods, I developed GNN-DOVE and DOVE, leveraging convolutional neural networks and graph neural networks to assess the accuracy of predicted protein complex structures. These advancements in cryo-EM-based structural modeling and evaluation methodologies hold significant promise for advancing our understanding of complex macromolecular systems and their biological implications.</p>
|
123 |
Asymmetry Learning for Out-of-distribution Tasks
Chandra Mouli Sekar (18437814) 02 May 2024 (has links)
<p dir="ltr">Despite their astonishing capacity to fit data, neural networks have difficulties extrapolating beyond the training data distribution. When the out-of-distribution prediction task is formalized as a counterfactual query on a causal model, the reason for their extrapolation failure is clear: neural networks learn spurious correlations in the training data rather than features that are causally related to the target label. This thesis proposes to perform a causal search over a known family of causal models to learn robust (maximally invariant) predictors for single- and multiple-environment extrapolation tasks.</p><p dir="ltr">First, I formalize the out-of-distribution task as a counterfactual query over a structural causal model. For single-environment extrapolation, I argue that symmetries of the input data are valuable for training neural networks that can extrapolate. I introduce Asymmetry learning, a new learning paradigm that is guided by the hypothesis that all (known) symmetries are mandatory even without evidence in training, unless the learner deems them inconsistent with the training data. Asymmetry learning performs a causal model search to find the simplest causal model defining a causal connection between the target labels and the symmetry transformations that affect the label. My experiments on a variety of out-of-distribution tasks on images and sequences show that the proposed methods extrapolate much better than standard neural networks.</p><p dir="ltr">Then, I consider multiple-environment out-of-distribution tasks in dynamical system forecasting that arise due to shifts in initial conditions or parameters of the dynamical system. I identify key OOD challenges in the existing deep learning and physics-informed machine learning (PIML) methods for these tasks. To mitigate these drawbacks, I combine meta-learning and causal structure discovery over a family of given structural causal models to learn the underlying dynamical system. 
In three simulated forecasting tasks, I show that the proposed approach is 2x to 28x more robust than the baselines.</p>
|
124 |
Minimalism Yields Maximum Results: Deep Learning with Limited Resource
Haoyu Wang (19193416) 22 July 2024 (has links)
<p dir="ltr">Deep learning models have demonstrated remarkable success across diverse domains, including computer vision and natural language processing. These models heavily rely on resources, encompassing annotated data, computational power, and storage. However, mobile devices often face constraints on computing power, and in scenarios such as medical or multilingual contexts, ample data annotation is prohibitively expensive. Developing deep learning models for such resource-constrained scenarios presents a formidable challenge. Our primary goal is to enhance the efficiency of state-of-the-art neural network models tailored for resource-limited scenarios. Our commitment lies in crafting algorithms that not only mitigate annotation requirements but also reduce computational complexity and alleviate storage demands. Our dissertation focuses on two key areas: Parameter-efficient Learning and Data-efficient Learning. In Part 1, we present our studies on parameter-efficient learning. This approach targets the creation of lightweight models for efficient storage or inference. The proposed solutions are tailored for diverse tasks, including text generation, text classification, and text/image retrieval. In Part 2, we showcase our proposed methods for data-efficient learning, concentrating on cross-lingual and multi-lingual text classification applications.</p>
|
125 |
Size-Adaptive Convolutional Neural Network with Parameterized-Swish Activation for Enhanced Object Detection
Yashwanth Raj Venkata Krishnan (18322572) 03 June 2024 (has links)
<p>In computer vision, accurately detecting objects of varying sizes is essential for various applications, such as autonomous vehicle navigation and medical imaging diagnostics. Addressing the variance in object sizes presents a significant challenge requiring advanced computational solutions for reliable object recognition and processing. This research introduces a size-adaptive Convolutional Neural Network (CNN) framework to enhance detection performance across different object sizes. By dynamically adjusting the CNN’s configuration based on the observed distribution of object sizes, the framework employs statistical analysis and algorithmic decision-making to improve detection capabilities. Further innovation is presented through the Parameterized-Swish activation function. Distinguished by its dynamic parameters, this function is designed to better adapt to varying input patterns. It exceeds the performance of traditional activation functions by enabling faster model convergence and increasing detection accuracy, showcasing the effectiveness of adaptive activation functions in enhancing object detection systems. The implementation of this model has led to notable performance improvements: an 11.4% increase in mean Average Precision (mAP) and a 40.63% increase in frames per second (FPS) for small objects, demonstrating enhanced detection speed and accuracy. The model has achieved a 48.42% reduction in training time for medium-sized objects while still improving mAP, indicating significant efficiency gains without compromising precision. Large objects have seen a 16.9% reduction in training time and a 76.04% increase in inference speed, showcasing the model’s ability to expedite processing times substantially. 
Collectively, these advancements contribute to a more than 12% increase in detection efficiency and accuracy across various scenarios, highlighting the model’s robustness and adaptability in addressing the critical challenge of size variance in object detection. </p>
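The abstract above does not state the formula of the Parameterized-Swish function. As a rough illustration only, the sketch below shows the widely used Swish form, x · sigmoid(βx), with a single adjustable slope parameter. The name `swish`, the parameter `beta`, and the single-parameter form are assumptions made for this example; the thesis may define its activation differently.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # z < 0, so exp(z) cannot overflow
    return ez / (1.0 + ez)


def swish(x: float, beta: float = 1.0) -> float:
    """Swish with a slope parameter (assumed form): x * sigmoid(beta * x).

    beta = 0 gives the linear function x / 2; large beta approaches ReLU.
    In a trainable network, beta would be a learned parameter.
    """
    return x * sigmoid(beta * x)


if __name__ == "__main__":
    for beta in (0.0, 1.0, 10.0):
        print(beta, [round(swish(x, beta), 4) for x in (-2.0, 0.0, 2.0)])
```

Varying `beta` changes how sharply the function transitions around zero, which is one way an activation can adapt to different input patterns.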
|
126 |
ENHANCING PRECISION OF OBJECT DETECTORS: BRIDGING CLASSIFICATION AND LOCALIZATION GAPS FOR 2D AND 3D MODELS
NIRANJAN RAVI (7013471) 03 June 2024 (has links)
<p dir="ltr">Artificial Intelligence (AI) has revolutionized and accelerated significant advancements in various fields such as healthcare, finance, education, agriculture and the development of autonomous vehicles. We are rapidly approaching Level 5 Autonomy due to recent developments in autonomous technology, including self-driving cars, robot navigation, smart traffic monitoring systems, and dynamic routing. This success has been made possible due to Deep Learning technologies and advanced Computer Vision (CV) algorithms. With the help of perception sensors such as Camera, LiDAR and RADAR, CV algorithms enable a self-driving vehicle to interact with the environment and make intelligent decisions. Object detection lays the foundations for various applications, such as collision and obstacle avoidance, lane detection, pedestrian and vehicular safety, and object tracking. Object detection has two significant components: image classification and object localization. In recent years, enhancing the performance of 2D and 3D object detectors has sparked interest in the research community. This research aims to resolve the drawbacks associated with localization loss estimation of 2D and 3D object detectors by addressing the bounding box regression problem, addressing the class imbalance issue affecting the confidence loss estimation, and finally proposing a dynamic cross-model 3D hybrid object detector with enhanced localization and confidence loss estimation.</p><p dir="ltr">This research aims to address challenges in object detectors through four key contributions. In the first part, we aim to address the problems associated with the image classification component of 2D object detectors. Class imbalance is a common problem associated with supervised training. Common causes are noisy data, a scene with a tiny object surrounded by background pixels, or a dense scene with too many objects. 
These scenarios can produce many negative samples compared to positive ones, affecting the network learning and reducing the overall performance. We examined these drawbacks and proposed an Enhanced Hard Negative Mining (EHNM) approach, which utilizes anchor boxes with 20% to 50% overlap and positive and negative samples to boost performance. The efficiency of the proposed EHNM was evaluated using the Single Shot Multibox Detector (SSD) architecture on the PASCAL VOC dataset, indicating that the detection accuracy of tiny objects increased by 3.9% and 4% and the overall accuracy improved by 0.9%.</p><p dir="ltr">To address localization loss, our second approach investigates drawbacks associated with existing bounding box regression methods, such as poor convergence and incorrect regression. We analyzed various cases, such as when one object is contained within another, when two objects share the same centre, and when two objects share the same centre and have similar aspect ratios. During our analysis, we observed that the existing Intersection over Union (IoU) loss and its variants fail to address them. We proposed two new loss functions, Improved Intersection Over Union (IIoU) and Balanced Intersection Over Union (BIoU), to enhance performance and minimize computational efforts. Two variants of the YOLOv5 model, YOLOv5n6 and YOLOv5s, were utilized to demonstrate the superior performance of IIoU on PASCAL VOC and CGMU datasets. With the help of ROS and NVIDIA’s devices, inference speed was observed in real-time. Extensive experiments were performed to evaluate the performance of BIoU on object detectors. The evaluation results indicated that the MASK_RCNN network trained on the COCO dataset, the YOLOv5n6 network trained on SKU-110K, and YOLOv5x trained on the custom e-scooter dataset demonstrated a 3.70% increase on small objects, 6.20% on 55% overlap, and 9.03% on 80% overlap.</p><p dir="ltr">In the earlier parts, we primarily focused on 2D object detectors. 
Owing to its success, we extended the scope of our research to 3D object detectors in the later parts. The third portion of our research aims to solve bounding box problems associated with 3D rotated objects. Existing axis-aligned loss functions suffer a performance gap if the objects are rotated. We enhanced the earlier proposed IIoU loss by considering two additional parameters: the objects’ Z-axis and rotation angle. These two parameters aid in localizing the object in 3D space. Evaluation was performed on LiDAR and Fusion methods on 3D KITTI and nuScenes datasets.</p><p dir="ltr">Once we addressed the drawbacks associated with confidence and localization loss, we further explored ways to increase the performance of cross-model 3D object detectors. We discovered from previous studies that perception sensors are vulnerable to harsh environmental conditions, sunlight, and blurry motion. In the final portion of our research, we propose a hybrid 3D cross-model detection network (MAEGNN) equipped with Masked Autoencoders (MAE) and Graph Neural Networks (GNN) along with the earlier proposed IIoU and EHNM. The performance evaluation of MAEGNN on the KITTI validation dataset and KITTI test set yielded a detection accuracy of 69.15%, 63.99%, 58.46% and 40.85%, 37.37% on 3D pedestrians with overlap of 50%. This developed hybrid detector overcomes the challenges of localization error and confidence estimation and outperforms many state-of-the-art 3D object detectors for autonomous platforms.</p>
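One of the failure cases named in the abstract above, where one box lies inside another, can be illustrated with the standard axis-aligned IoU. The sketch below is an illustration of plain IoU only, not the IIoU or BIoU losses proposed in the thesis, whose definitions the abstract does not give. The function name `iou`, the `(x1, y1, x2, y2)` box layout, and the example boxes are assumptions made for this example.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), with x1 < x2 and y1 < y2


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Width and height of the overlap region, clamped at zero when disjoint.
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    target = (0.0, 0.0, 10.0, 10.0)
    centred = (4.0, 4.0, 6.0, 6.0)  # prediction inside the target, at its centre
    corner = (0.0, 0.0, 2.0, 2.0)   # same-sized prediction, pushed into a corner
    print(iou(target, centred), iou(target, corner))  # both are 0.04
```

Both predictions score the same IoU even though one is far better centred, so a loss of the form 1 − IoU cannot tell them apart. This is the kind of gap that motivates adding extra terms to the loss.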
|
127 |
Image Analysis and Deep Learning for Applications in Microscopy
Ishaq, Omer January 2016 (has links)
Quantitative microscopy deals with the extraction of quantitative measurements from samples observed under a microscope. Recent developments in microscopy systems, sample preparation and handling techniques have enabled high throughput biological experiments resulting in large amounts of image data, at biological scales ranging from subcellular structures such as fluorescently tagged nucleic acid sequences to whole organisms such as zebrafish embryos. Consequently, methods and algorithms for automated quantitative analysis of these images have become increasingly important. These methods range from traditional image analysis techniques to the use of deep learning architectures. Many biomedical microscopy assays result in fluorescent spots. Robust detection and precise localization of these spots are two important, albeit sometimes overlapping, areas for application of quantitative image analysis. We demonstrate the use of popular deep learning architectures for spot detection and compare them against more traditional parametric model-based approaches. Moreover, we quantify the effect of pre-training and change in the size of training sets on detection performance. Thereafter, we determine the potential of training deep networks on synthetic and semi-synthetic datasets and compare them with networks trained on manually annotated real data. In addition, we present a two-alternative forced-choice based tool for assisting in manual annotation of real image data. On a spot localization track, we parallelize a popular compressed sensing based localization method and evaluate its performance in conjunction with different optimizers, noise conditions and spot densities. We investigate its sensitivity to different point spread function estimates. Zebrafish is an important model organism, attractive for whole-organism image-based assays for drug discovery campaigns. The effect of drug-induced neuronal damage may be expressed in the form of zebrafish shape deformation. 
First, we present an automated method for accurate quantification of tail deformations in multi-fish micro-plate wells using image analysis techniques such as illumination correction, segmentation, generation of branch-free skeletons of partial tail-segments and their fusion to generate complete tails. Later, we demonstrate the use of a deep learning-based pipeline for classifying micro-plate wells as either drug-affected or negative controls, resulting in competitive performance, and compare the performance from deep learning against that from traditional image analysis approaches.
|
128 |
Packaging Demand Forecasting in Logistics using Deep Neural Networks
Bachu, Yashwanth January 2019 (has links)
Background: Logistics plays a vital role in supply chain management, and logistics operations depend on the availability of packaging material for packing the goods and material to be shipped. Forecasting packaging material demand over a long period helps the organization plan to meet that demand. This research proposes using time-series data with Deep Neural Networks (DNNs) for long-term forecasting. Objectives: This study aims to identify the DNNs used for forecasting packaging demand and for similar problems, that is, problems whose data resemble the data available at the organization (Volvo). It then identifies the best-practice approach for long-term forecasting and combines that approach with the selected DNNs. The end objective of the thesis is to suggest the best DNN model for packaging demand forecasting. Methods: An experiment is conducted to evaluate the DNN models selected for demand forecasting. Three models are selected through a preliminary systematic literature review. Another systematic literature review is performed in parallel to identify metrics for evaluating model performance. Results from the preliminary literature review were instrumental in performing the experiment. Results: The three models observed in this study perform well, with considerable forecasting values. Given the type and amount of historical data the models were trained on, the three models differ only very slightly in their forecasting performance measures. Comparisons are made using the different measures selected by the literature review. To better understand the impact of batch size on model performance, the three models were developed with two different batch sizes. Conclusions: The proposed models produce considerable forecasts of packaging demand for planning the next 52 weeks (∼ 1 year). 
Results show that by adopting DNNs, reliable packaging demand forecasts can be produced from time-series data for packaging material. The CNN-LSTM combination performs better than the respective individual models by a small margin. Extending the forecasting to the granular level of the supply chain (individual suppliers and plants) will benefit the organization by helping it control inventory and avoid excess inventory.
|
129 |
Watermarking in Audio using Deep Learning
Tegendal, Lukas January 2019 (has links)
Watermarking is a technique used to mark the ownership of media such as audio or images by embedding a watermark, e.g. copyright information, into the media. A good watermarking method should perform this embedding without affecting the quality of the media. Recent methods for watermarking in images use deep learning to embed and extract the watermark in the images. In this thesis, we investigate watermarking in the hearable frequencies of audio using deep learning. More specifically, we try to create a watermarking method for audio that is robust to noise in the carrier, and that allows for the extraction of the embedded watermark from the audio after being played over-the-air. The proposed method consists of two deep convolutional neural networks trained end-to-end on music with simulated noise. Experiments show that the proposed method successfully creates watermarks robust to simulated noise with moderate quality reductions, but it is not robust to the real-world noise introduced after playing and recording the audio over-the-air.
|
130 |
Attributed Multi-Relational Attention Network for Fact-checking URL Recommendation
You, Di 11 July 2019 (has links)
To combat fake news, researchers have mostly focused on detecting fake news, and journalists have built and maintained fact-checking sites (e.g., Snopes.com and Politifact.com). However, fake news dissemination has been greatly promoted by social media sites, and these fact-checking sites have not been fully utilized. To overcome these problems and complement existing methods against fake news, in this thesis, we propose a deep-learning based fact-checking URL recommender system to mitigate the impact of fake news on social media sites such as Twitter and Facebook. In particular, our proposed framework consists of a multi-relational attentive module and a heterogeneous graph attention network to learn complex/semantic relationships between user-URL pairs, user-user pairs, and URL-URL pairs. Extensive experiments on a real-world dataset show that our proposed framework outperforms seven state-of-the-art recommendation models, achieving improvements of at least 3% to 5.3%.
|