Global ETD Search

121	Size-Adaptive Convolutional Neural Network with Parameterized-Swish Activation for Enhanced Object Detection Yashwanth Raj Venkata Krishnan (18322572) 03 June 2024 (has links) In computer vision, accurately detecting objects of varying sizes is essential for various applications, such as autonomous vehicle navigation and medical imaging diagnostics. Addressing the variance in object sizes presents a significant challenge requiring advanced computational solutions for reliable object recognition and processing. This research introduces a size-adaptive Convolutional Neural Network (CNN) framework to enhance detection performance across different object sizes. By dynamically adjusting the CNN’s configuration based on the observed distribution of object sizes, the framework employs statistical analysis and algorithmic decision-making to improve detection capabilities. Further innovation is presented through the Parameterized-Swish activation function. Distinguished by its dynamic parameters, this function is designed to better adapt to varying input patterns. It exceeds the performance of traditional activation functions by enabling faster model convergence and increasing detection accuracy, showcasing the effectiveness of adaptive activation functions in enhancing object detection systems. The implementation of this model has led to notable performance improvements: an 11.4% increase in mean Average Precision (mAP) and a 40.63% increase in frames per second (FPS) for small objects, demonstrating enhanced detection speed and accuracy. The model has achieved a 48.42% reduction in training time for medium-sized objects while still improving mAP, indicating significant efficiency gains without compromising precision. Large objects have seen a 16.9% reduction in training time and a 76.04% increase in inference speed, showcasing the model’s ability to expedite processing times substantially. Collectively, these advancements contribute to a more than 12% increase in detection efficiency and accuracy across various scenarios, highlighting the model’s robustness and adaptability in addressing the critical challenge of size variance in object detection. Deep learning Neural networks neural network algorithm Deep Learning Theory
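The abstract of record 121 does not state the form of the Parameterized-Swish function. As a point of reference only, the sketch below shows the widely used Swish form, x · sigmoid(βx), with a slope parameter β. The names `swish`, `stable_sigmoid` and `beta` are assumptions made for illustration and are not taken from the thesis.

```python
import math


def stable_sigmoid(z: float) -> float:
    """Logistic sigmoid written so that math.exp never overflows."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def swish(x: float, beta: float = 1.0) -> float:
    """Reference Swish form: x * sigmoid(beta * x).

    `beta` is a hypothetical slope parameter; in a trainable variant it
    would be learned together with the network weights.
    """
    return x * stable_sigmoid(beta * x)
```

With `beta = 0` the function reduces to the linear map x / 2, and for large `beta` it approaches ReLU, which is one reason a tunable slope is attractive in this family of activations.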
122	ENHANCING PRECISION OF OBJECT DETECTORS: BRIDGING CLASSIFICATION AND LOCALIZATION GAPS FOR 2D AND 3D MODELS NIRANJAN RAVI (7013471) 03 June 2024 (has links) Artificial Intelligence (AI) has revolutionized and accelerated significant advancements in various fields such as healthcare, finance, education, agriculture and the development of autonomous vehicles. We are rapidly approaching Level 5 Autonomy due to recent developments in autonomous technology, including self-driving cars, robot navigation, smart traffic monitoring systems, and dynamic routing. This success has been made possible due to Deep Learning technologies and advanced Computer Vision (CV) algorithms. With the help of perception sensors such as Camera, LiDAR and RADAR, CV algorithms enable a self-driving vehicle to interact with the environment and make intelligent decisions. Object detection lays the foundations for various applications, such as collision and obstacle avoidance, lane detection, pedestrian and vehicular safety, and object tracking. Object detection has two significant components: image classification and object localization. In recent years, enhancing the performance of 2D and 3D object detectors has sparked interest in the research community. This research aims to resolve the drawbacks associated with localization loss estimation of 2D and 3D object detectors by addressing the bounding box regression problem, addressing the class imbalance issue affecting the confidence loss estimation, and finally proposing a dynamic cross-model 3D hybrid object detector with enhanced localization and confidence loss estimation.
This research aims to address challenges in object detectors through four key contributions. In the first part, we aim to address the problems associated with the image classification component of 2D object detectors. Class imbalance is a common problem associated with supervised training. Common causes are noisy data, a scene with a tiny object surrounded by background pixels, or a dense scene with too many objects. These scenarios can produce many negative samples compared to positive ones, affecting the network learning and reducing the overall performance. We examined these drawbacks and proposed an Enhanced Hard Negative Mining (EHNM) approach, which utilizes anchor boxes with 20% to 50% overlap and positive and negative samples to boost performance. The efficiency of the proposed EHNM was evaluated using the Single Shot Multibox Detector (SSD) architecture on the PASCAL VOC dataset, indicating that the detection accuracy of tiny objects increased by 3.9% and 4% and the overall accuracy improved by 0.9%.
To address localization loss, our second approach investigates drawbacks associated with existing bounding box regression problems, such as poor convergence and incorrect regression. We analyzed various cases, such as when objects are inclusive of one another, two objects with the same centre, and two objects with the same centre and similar aspect ratios. During our analysis, we observed that the existing Intersection over Union (IoU) loss and its variants fail to address these cases. We proposed two new loss functions, Improved Intersection Over Union (IIoU) and Balanced Intersection Over Union (BIoU), to enhance performance and minimize computational efforts. Two variants of the YOLOv5 model, YOLOv5n6 and YOLOv5s, were utilized to demonstrate the superior performance of IIoU on PASCAL VOC and CGMU datasets. With the help of ROS and NVIDIA’s devices, inference speed was observed in real time. Extensive experiments were performed to evaluate the performance of BIoU on object detectors. The evaluation results indicated that the Mask R-CNN network trained on the COCO dataset, the YOLOv5n6 network trained on SKU-110K and the YOLOv5x network trained on the custom e-scooter dataset demonstrated increases of 3.70% on small objects, 6.20% at 55% overlap and 9.03% at 80% overlap.
In the earlier parts, we primarily focused on 2D object detectors. Owing to that success, we extended the scope of our research to 3D object detectors in the later parts. The third portion of our research aims to solve bounding box problems associated with 3D rotated objects. Existing axis-aligned loss functions suffer a performance gap if the objects are rotated. We enhanced the earlier proposed IIoU loss by considering two additional parameters: the objects’ Z-axis and rotation angle. These two parameters aid in localizing the object in 3D space. Evaluation was performed on LiDAR and Fusion methods on the 3D KITTI and nuScenes datasets.
Once we addressed the drawbacks associated with confidence and localization loss, we further explored ways to increase the performance of cross-model 3D object detectors. We discovered from previous studies that perception sensors are vulnerable to harsh environmental conditions, sunlight, and motion blur. In the final portion of our research, we propose a hybrid 3D cross-model detection network (MAEGNN) equipped with Masked Autoencoders (MAE) and Graph Neural Networks (GNN) along with the earlier proposed IIoU and EHNM. The performance evaluation of MAEGNN on the KITTI validation dataset and KITTI test set yielded detection accuracies of 69.15%, 63.99%, 58.46% and 40.85%, 37.37% on 3D pedestrians at an overlap of 50%. This hybrid detector overcomes the challenges of localization error and confidence estimation and outperforms many state-of-the-art 3D object detectors for autonomous platforms. Computer vision Deep learning neural network deep learning object detection 2D 3D IoU KITTI YOLO regression
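The bounding-box cases mentioned in record 122 (one box enclosing another, boxes sharing a centre) can be illustrated with the plain IoU measure. The sketch below computes standard IoU for axis-aligned 2D boxes; it is not the IIoU or BIoU loss proposed in the thesis, whose definitions are not given in the abstract. The `Box` layout and the function names are assumptions for illustration.

```python
from typing import Tuple

# Hypothetical box layout: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def area(box: Box) -> float:
    """Area of an axis-aligned box; degenerate boxes have zero area."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a: Box, b: Box) -> float:
    """Plain Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


outer = (0.0, 0.0, 4.0, 4.0)
centred_inner = (1.0, 1.0, 3.0, 3.0)  # same centre as `outer`
corner_inner = (0.0, 0.0, 2.0, 2.0)   # same size, pushed into a corner
```

Both inner boxes give the same IoU against `outer`, because for an enclosed box the value depends only on the ratio of the two areas. A loss built on plain IoU alone therefore carries no signal about where the enclosed box sits, which is the kind of case that motivates modified IoU losses.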
123	Minimalism Yields Maximum Results: Deep Learning with Limited Resource Haoyu Wang (19193416) 22 July 2024 (has links) Deep learning models have demonstrated remarkable success across diverse domains, including computer vision and natural language processing. These models heavily rely on resources, encompassing annotated data, computational power, and storage. However, mobile devices, particularly in scenarios like medical or multilingual contexts, often face constraints with computing power, making ample data annotation prohibitively expensive. Developing deep learning models for such resource-constrained scenarios presents a formidable challenge. Our primary goal is to enhance the efficiency of state-of-the-art neural network models tailored for resource-limited scenarios. Our commitment lies in crafting algorithms that not only mitigate annotation requirements but also reduce computational complexity and alleviate storage demands. Our dissertation focuses on two key areas: Parameter-efficient Learning and Data-efficient Learning. In Part 1, we present our studies on parameter-efficient learning. This approach targets the creation of lightweight models for efficient storage or inference. The proposed solutions are tailored for diverse tasks, including text generation, text classification, and text/image retrieval. In Part 2, we showcase our proposed methods for data-efficient learning, concentrating on cross-lingual and multi-lingual text classification applications. Natural language processing Data mining and knowledge discovery Deep learning deep learning language model efficiency cross-lingual
124	Image Analysis and Deep Learning for Applications in Microscopy Ishaq, Omer January 2016 (has links) Quantitative microscopy deals with the extraction of quantitative measurements from samples observed under a microscope. Recent developments in microscopy systems, sample preparation and handling techniques have enabled high throughput biological experiments resulting in large amounts of image data, at biological scales ranging from subcellular structures such as fluorescently tagged nucleic acid sequences to whole organisms such as zebrafish embryos. Consequently, methods and algorithms for automated quantitative analysis of these images have become increasingly important. These methods range from traditional image analysis techniques to use of deep learning architectures. Many biomedical microscopy assays result in fluorescent spots. Robust detection and precise localization of these spots are two important, albeit sometimes overlapping, areas for application of quantitative image analysis. We demonstrate the use of popular deep learning architectures for spot detection and compare them against more traditional parametric model-based approaches. Moreover, we quantify the effect of pre-training and change in the size of training sets on detection performance. Thereafter, we determine the potential of training deep networks on synthetic and semi-synthetic datasets and their comparison with networks trained on manually annotated real data. In addition, we present a two-alternative forced-choice based tool for assisting in manual annotation of real image data. On a spot localization track, we parallelize a popular compressed sensing based localization method and evaluate its performance in conjunction with different optimizers, noise conditions and spot densities. We investigate its sensitivity to different point spread function estimates. Zebrafish is an important model organism, attractive for whole-organism image-based assays for drug discovery campaigns. 
The effect of drug-induced neuronal damage may be expressed in the form of zebrafish shape deformation. First, we present an automated method for accurate quantification of tail deformations in multi-fish micro-plate wells using image analysis techniques such as illumination correction, segmentation, generation of branch-free skeletons of partial tail-segments and their fusion to generate complete tails. Later, we demonstrate the use of a deep learning-based pipeline for classifying micro-plate wells as either drug-affected or negative controls, resulting in competitive performance, and compare the performance from deep learning against that from traditional image analysis approaches. Machine learning Deep learning Image analysis Quantitative microscopy Bioimaging
125	Packaging Demand Forecasting in Logistics using Deep Neural Networks Bachu, Yashwanth January 2019 (has links) Background: Logistics plays a vital role in supply chain management, and logistics operations depend on the availability of packaging material for packing the goods and material to be shipped. Forecasting packaging material demand over a long period helps an organization plan to meet that demand. Using time-series data with Deep Neural Networks (DNNs) for long-term forecasting is proposed for research. Objectives: This study aims to identify the DNNs used for forecasting packaging demand and for similar problems, with data similar to that available at the organization (Volvo); to identify the best-practice approach for long-term forecasting; and to combine that approach with the selected DNNs. The end objective of the thesis is to suggest the best DNN model for packaging demand forecasting. Methods: An experiment is conducted to evaluate the DNN models selected for demand forecasting. Three models are selected by a preliminary systematic literature review. Another systematic literature review is performed in parallel to identify metrics for evaluating model performance. Results from the preliminary literature review were instrumental in performing the experiment. Results: The three models observed in this study perform well, with considerable forecasting values. However, given the type and amount of historical data the models were given to learn from, the three models differ only slightly in forecasting performance. Comparisons are made using the measures selected by the literature review. To better understand the impact of batch size on model performance, the three models were each developed with two different batch sizes. 
Conclusions: The proposed models forecast packaging demand considerably well for planning the next 52 weeks (∼1 year). Results show that, by adopting DNNs, packaging demand can be reliably forecast from time-series data for packaging material. The CNN-LSTM combination performs better than the respective individual models by a small margin. Extending the forecasting to the granular level of the supply chain (individual suppliers and plants) will benefit the organization by controlling inventory and avoiding excess inventory. Deep Learning Forecasting Logistics Computer Sciences Datavetenskap (datalogi)
126	Watermarking in Audio using Deep Learning Tegendal, Lukas January 2019 (has links) Watermarking is a technique used to mark the ownership of media such as audio or images by embedding a watermark, e.g. copyright information, into the media. A good watermarking method should perform this embedding without affecting the quality of the media. Recent methods for watermarking in images use deep learning to embed and extract the watermark in the images. In this thesis, we investigate watermarking in the hearable frequencies of audio using deep learning. More specifically, we try to create a watermarking method for audio that is robust to noise in the carrier, and that allows for the extraction of the embedded watermark from the audio after being played over-the-air. The proposed method consists of two deep convolutional neural networks trained end-to-end on music with simulated noise. Experiments show that the proposed method successfully creates watermarks robust to simulated noise with moderate quality reductions, but it is not robust to the real-world noise introduced after playing and recording the audio over-the-air. Machine Learning Deep Learning Watermarking Signal Processing Signalbehandling
127	Attributed Multi-Relational Attention Network for Fact-checking URL Recommendation You, Di 11 July 2019 (has links) To combat fake news, researchers have mostly focused on detecting fake news, and journalists have built and maintained fact-checking sites (e.g., Snopes.com and Politifact.com). However, fake news dissemination has been greatly promoted by social media sites, and these fact-checking sites have not been fully utilized. To overcome these problems and complement existing methods against fake news, in this thesis, we propose a deep-learning-based fact-checking URL recommender system to mitigate the impact of fake news on social media sites such as Twitter and Facebook. In particular, our proposed framework consists of a multi-relational attentive module and a heterogeneous graph attention network to learn complex/semantic relationships between user-URL pairs, user-user pairs, and URL-URL pairs. Extensive experiments on a real-world dataset show that our proposed framework outperforms seven state-of-the-art recommendation models, achieving at least 3~5.3% improvement. deep learning fact-checking graph neural network recommender system
128	3D Visualization of MPC-based Algorithms for Autonomous Vehicles Sörliden, Pär January 2019 (has links) The area of autonomous vehicles is an interesting research topic, which is popular in both research and industry worldwide. Linköping University is no exception, and some of its research is based on using Model Predictive Control (MPC) for autonomous vehicles. The researchers use MPC to plan a path and control the autonomous vehicles. Additionally, they use different methods (for example deep learning or likelihood) to calculate collision probabilities for the obstacles. These are very complex algorithms, and it is not always easy to see how they work. Therefore, it is interesting to study whether a visualization tool, where the algorithms are presented in a three-dimensional way, can be useful in understanding them, and whether it can be useful in the development of the algorithms. This project has consisted of implementing such a visualization tool and evaluating it. This has been done by implementing a visualization using a 3D library, and then evaluating it both analytically and empirically. The evaluation showed positive results: the proposed tool is shown to be helpful when developing algorithms for autonomous vehicles, but some aspects of the algorithms would still need more research on how they could be visualized. This concerns the neural networks, which were shown to be difficult to visualize, especially given the available data. It was found that more information about the internal variables in the network would be needed to make a better visualization of them. visualization 3d autonomous vehicles mpc deep learning Computer Systems Datorsystem
129	Skin lesion segmentation and classification using deep learning Unknown Date (has links) Melanoma, a severe and life-threatening skin cancer, is commonly misdiagnosed or left undiagnosed. Advances in artificial intelligence, particularly deep learning, have enabled the design and implementation of intelligent solutions to skin lesion detection and classification from visible light images, which are capable of performing early and accurate diagnosis of melanoma and other types of skin diseases. This work presents solutions to the problems of skin lesion segmentation and classification. The proposed classification approach leverages convolutional neural networks and transfer learning. Additionally, the impact of segmentation (i.e., isolating the lesion from the rest of the image) on the performance of the classifier is investigated, leading to the conclusion that there is an optimal region between “dermatologist segmented” and “not segmented” that produces best results, suggesting that the context around a lesion is helpful as the model is trained and built. Generative adversarial networks, in the context of extending limited datasets by creating synthetic samples of skin lesions, are also explored. The robustness and security of skin lesion classifiers using convolutional neural networks are examined and stress-tested by implementing adversarial examples. / Includes bibliography. / Thesis (M.S.)--Florida Atlantic University, 2018. / FAU Electronic Theses and Dissertations Collection Melanoma Medical imaging Deep learning Skin diseases--Classification Image segmentation
130	Parallel Distributed Deep Learning on Cluster Computers Unknown Date (has links) Deep Learning is an increasingly important subdomain of artificial intelligence. Deep Learning architectures, artificial neural networks characterized by having both a large breadth of neurons and a large depth of layers, benefit from training on Big Data. The size and complexity of the model combined with the size of the training data make the training procedure very computationally and temporally expensive. Accelerating the training procedure of Deep Learning using cluster computers faces many challenges, ranging from distributed optimizers to the large communication overhead specific to a system with off-the-shelf networking components. In this thesis, we present a novel synchronous data parallel distributed Deep Learning implementation on HPCC Systems, a cluster computer system. We discuss research that has been conducted on the distribution and parallelization of Deep Learning, as well as the concerns relating to cluster environments. Additionally, we provide case studies that evaluate and validate our implementation. / Includes bibliography. / Thesis (M.S.)--Florida Atlantic University, 2018. / FAU Electronic Theses and Dissertations Collection Deep learning. Neural networks (Computer science). Artificial intelligence. Machine learning.
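Record 130 describes a synchronous data-parallel training scheme without giving implementation details. As a rough illustration of the general idea only, and not of the HPCC Systems implementation, the sketch below splits a toy dataset into shards, lets each simulated worker compute a local gradient for a one-parameter linear model, and averages those gradients before a single shared update. All names (`local_gradient`, `synchronous_step`, the toy data) are assumptions made for the example.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y) pair for the toy model y ≈ w * x


def local_gradient(w: float, shard: List[Sample]) -> float:
    """Gradient of the mean squared error on one worker's shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)


def synchronous_step(w: float, shards: List[List[Sample]], lr: float) -> float:
    """One synchronous update: every worker reports, then all share one step."""
    grads = [local_gradient(w, shard) for shard in shards]  # done in parallel on a cluster
    averaged = sum(grads) / len(grads)  # stands in for an all-reduce average
    return w - lr * averaged


# Toy data generated by y = 2x, split into two equally sized shards.
data: List[Sample] = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
shards = [data[:2], data[2:]]

w = 0.0
for _ in range(200):
    w = synchronous_step(w, shards, lr=0.01)
```

Because the shards here are the same size, the averaged gradient equals the full-batch gradient, so the synchronous scheme follows the same trajectory as single-machine training. On a real cluster the cost that matters is the communication needed for the averaging step, which is the overhead the abstract refers to.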
