  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
181

AUTONOMOUS SAFE LANDING ZONE DETECTION FOR UAVs UTILIZING MACHINE LEARNING

Nepal, Upesh 01 May 2022
One of the main challenges of the integration of unmanned aerial vehicles (UAVs) into today’s society is the risk of in-flight failures, such as motor failure, occurring in populated areas that can result in catastrophic accidents. We propose a framework to manage the consequences of an in-flight system failure and to bring down the aircraft safely without causing any serious accident to people, property, and the UAV itself. This can be done in three steps: a) Detecting a failure, b) Finding a safe landing spot, and c) Navigating the UAV to the safe landing spot. In this thesis, we will look at part b. Specifically, we are working to develop an active system that can detect landing sites autonomously without any reliance on UAV resources. To detect a safe landing site, we are using a deep learning algorithm named "You Only Look Once" (YOLO) that runs on a Jetson Xavier NX computing module, which is connected to a camera, for image processing. YOLO is trained using the DOTA dataset and we show that it can detect landing spots and obstacles effectively. Then by avoiding the detected objects, we find a safe landing spot. The effectiveness of this algorithm will be shown first by comprehensive simulations. We also plan to experimentally validate this algorithm by flying a UAV and capturing ground images, and then applying the algorithm in real-time to see if it can effectively detect acceptable landing spots.
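The abstract does not say how the landing spot is chosen once objects have been detected. As an illustration only, the step of avoiding the detected objects could be sketched as a grid search for the image point with the greatest clearance from every detected bounding box. The function names, the `(x1, y1, x2, y2)` box format, and the grid step are assumptions made for this sketch, not details taken from the thesis.

```python
from math import hypot


def box_distance(px, py, box):
    """Distance from point (px, py) to an axis-aligned box; 0 if the point is inside."""
    x1, y1, x2, y2 = box
    dx = max(x1 - px, 0, px - x2)  # horizontal gap to the box, 0 if overlapping in x
    dy = max(y1 - py, 0, py - y2)  # vertical gap to the box, 0 if overlapping in y
    return hypot(dx, dy)


def safest_point(width, height, obstacles, step=10):
    """Hypothetical helper: grid point with the largest clearance from all obstacle boxes."""
    best, best_clearance = None, -1.0
    for y in range(0, height + 1, step):
        for x in range(0, width + 1, step):
            # Clearance is the distance to the nearest detected object.
            clearance = min(
                (box_distance(x, y, b) for b in obstacles),
                default=float("inf"),
            )
            if clearance > best_clearance:
                best, best_clearance = (x, y), clearance
    return best, best_clearance


# Example: one detected object covering the left half of a 100x100 image.
point, clearance = safest_point(100, 100, [(0, 0, 50, 100)])
```

A real system would also need to account for terrain type, ground slope, and the footprint of the UAV, none of which this sketch models.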
182

Methods and Algorithms for Efficient Programming of FPGA-based Heterogeneous Systems for Object Detection

Kalms, Lester 14 March 2023
Nowadays, there is a high demand for computer vision applications in numerous application areas, such as autonomous driving or unmanned aerial vehicles. However, the application areas and scenarios are becoming increasingly complex, and their data requirements are growing. Meeting these requirements takes increasingly powerful computing systems. FPGA-based heterogeneous systems offer an excellent solution in terms of energy efficiency, flexibility, and performance, especially in the field of computer vision. Due to complex applications and the use of FPGAs in combination with other architectures, efficient programming is becoming increasingly difficult. Thus, developers need a comprehensive framework with efficient automation, good usability, reasonable abstraction, and seamless integration of tools. It should provide an easy entry point and reduce the effort needed to learn new concepts, programming languages, and tools. Additionally, it needs optimized libraries so that the user can focus on developing applications without getting involved in the underlying details. These should be well integrated, easy to use, and cover a wide range of possible use cases. The framework needs efficient algorithms to execute applications on heterogeneous architectures with maximum performance. These algorithms should distribute applications across various nodes with low fragmentation and communication overhead and find a near-optimal solution in a reasonable amount of time. This thesis addresses the research problem of an efficient implementation of object detection applications, their distribution across FPGA-based heterogeneous systems, and methods for automation and integration using toolchains. Within this, the three contributions are the HiFlipVX object detection library, the DECISION framework, and the APARMAP application distribution algorithm. HiFlipVX is an open-source HLS-based FPGA library optimized for performance and resource efficiency.
It contains 66 highly parameterizable computer vision functions including neural networks, ideal for design space exploration. It extends the OpenVX standard for feature extraction, which is challenging due to unknown element size at design time. All functions are streaming capable to achieve maximum performance by increasing parallelism and reducing off-chip memory access. It does not require external or vendor libraries, which eases project integration, device coverage, and vendor portability, as shown for Intel. The library consumed on average 0.39% FFs and 0.32% LUTs for a set of image processing functions compared to a vendor library. A HiFlipVX implementation of the AKAZE feature detector computes between 3.56 and 4.13 times more pixels per second than the related work, while its resource consumption is comparable to optimized VHDL designs. Its neural network extension achieved a speedup of 3.23 for an AlexNet layer in comparison to a related work, while consuming 73% less on-chip memory. Furthermore, this thesis proposes an improved feature extraction implementation that achieves a repeatability of 72.57% when weighting complex cases, while the next best algorithm only achieves 62.99%. DECISION is a framework consisting of two toolchains for the efficient programming of FPGA-based heterogeneous systems. Both integrate HiFlipVX and use a joint OpenVX-based frontend to implement computer vision applications. It abstracts the underlying hardware and algorithm details while covering a wide range of architectures and applications. The first toolchain targets x86-based systems consisting of CPUs, GPUs, and FPGAs using OpenCL (Open Computing Language). To create a heterogeneous schedule, it considers device profiles, kernel profiles and estimates, and FPGA dataflow characteristics. It manages synchronization, memory transfers, and data coherence at design time. It creates a runtime-optimized program characterized by high parallelism and low overhead.
Additionally, this thesis looks at the integration of OpenCL-based libraries, automatic OpenCL kernel generation, and OpenCL kernel optimization and comparison for different architectures. The second toolchain creates an application-specific and adaptive NoC-based architecture. The streaming-optimized architecture enables the reuse of vision functions by multiple applications to improve resource efficiency while maintaining high performance. For a set of example applications, the resource consumption was more than halved, while its overhead was only 0.015% in terms of performance. APARMAP is an application distribution algorithm for partition-based and mesh-like FPGA topologies. It uses a NoC (Network-on-Chip) as communication infrastructure to connect reconfigurable regions and generate an application-specific hardware architecture. The algorithm uses load balancing techniques to find reasonable solutions within a predictable and scalable amount of time. It optimizes solutions using various heuristics, such as Simulated Annealing and Tabu Search. It uses a multithreaded grid-based approach to prevent threads from calculating the same solution and getting stuck in local minima. Its constraints and objectives are the FPGA resource utilization, NoC bandwidth consumption, NoC hop count, and execution time of the proposed algorithm. The evaluation showed that the algorithm can deal with heterogeneous and irregular host graph topologies. The algorithm showed good scalability in terms of computation time for an increasing number of nodes and partitions. It was able to achieve an optimal placement for a set of example graphs up to a size of 196 nodes on host graphs of up to 49 partitions. For a real application with 271 nodes and 441 edges, it was able to achieve a distribution with low resource fragmentation in an average time of 149 ms.
183

Automated Foreign Object Detection on Conveyor Belts

Sundelius, Kim January 2023
Ore is transported using belt conveyor systems. The transported ore contains various anomalous objects that must be removed to prevent damage to the system. Currently, anomalies are detected manually by human operators. This leads to increased wage costs and to system damage from missed anomalies. The thesis aims to solve this problem by using trained neural networks that can run on relatively cheap systems with greater accuracy than humans. A set of neural networks was trained both on the BCS dataset, consisting of data collected from the belt conveyor system, and on the MVTec dataset. The latter dataset was used to check the correctness of the implementation of the models. As training neural networks usually requires large datasets, this thesis also focuses on the effect of the proportion of labelled versus unlabelled data on the models. Labelling data can be time consuming and expensive, so investigating whether, and how much, data can be left unlabelled with little or no loss of accuracy could lead to further cost reductions. The convolutional autoencoder (CAE) performed best on the classification-based task on the BCS dataset, where it managed to classify most of the dataset correctly, with an F1-score of 0.94 on data without anomalies and an F1-score of 0.86 on data with anomalies, as long as suitable thresholds were set. ResNet performed somewhat well, with a 0.91 F1-score in detecting anomaly-free data and a 0.50 F1-score in detecting anomaly-containing data. The SimCLR and SimCLRv2 models were unable to learn from the data and defaulted to always assuming the data contained anomalies. The CAE model performed best when trained with the L1 loss function, with an IoU of 0.272, and worst with the SSIM-based loss function, with an IoU of 0.160. The effect of labelled versus unlabelled data using the MVTec dataset was tested using the SimCLR and SimCLRv2 models, and the models performed best with the fully labelled dataset, which was expected.
The SimCLR model was able to identify all categories with an F1-score greater than 0.67 whereas the other splits performed worse overall with two or more categories completely misclassified. The SimCLRv2 was able to classify six categories with an F1-score greater than 0.0 which was significantly better than all other labelled and unlabelled splits.
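The abstract reports per-class F1-scores that depend on a suitable threshold. The following is a minimal sketch of how such scores can be computed from anomaly scores; the function names, the example scores, and the threshold value are all hypothetical, and the thesis may compute its metrics differently.

```python
def classify(scores, threshold):
    """Flag a sample as anomalous when its anomaly score exceeds the threshold."""
    return [s > threshold for s in scores]


def f1_for_class(y_true, y_pred, positive):
    """F1-score treating `positive` as the class of interest."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
    return 2 * tp / denom if denom else 0.0


# Made-up anomaly scores (e.g. reconstruction errors) and ground-truth labels.
scores = [0.1, 0.2, 0.8, 0.9, 0.4]
labels = [False, False, True, True, True]  # True means "contains an anomaly"

predictions = classify(scores, threshold=0.5)
f1_anomaly = f1_for_class(labels, predictions, positive=True)
f1_normal = f1_for_class(labels, predictions, positive=False)
```

Reporting the score separately for anomalous and anomaly-free data, as the abstract does, shows when a model does well on one class only.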
184

Object Detection and Classification Based on Point Separation Distance Features of Point Cloud Data

Ji, Jiajie 07 August 2023
No description available.
185

Enhancing Object Detection Methods by Knowledge Distillation for Automotive Driving in Real-World Settings

Kian, Setareh 07 August 2023
No description available.
186

Incorporating spatial relationship information in signal-to-text processing

Davis, Jeremy Elon 13 May 2022
This dissertation outlines the development of a signal-to-text system that incorporates spatial relationship information to generate scene descriptions. Existing signal-to-text systems generate accurate descriptions with regard to the information contained in an image. However, to date, no signal-to-text system incorporates spatial relationship information. A survey of related work in the fields of object detection, signal-to-text, and spatial relationships in images is presented first. Three methodologies, each followed by an evaluation, were used to create the signal-to-text system: 1) generation of object localization results from a set of input images, 2) derivation of Level One Summaries from an input image, and 3) inference of Level Two Summaries from the derived Level One Summaries. Validation processes are described for the second and third evaluations, as the first evaluation has been previously validated in the related original works. The goal of this research is to show that a signal-to-text system that incorporates spatial information results in more informative descriptions of the content contained in an image. An additional goal of this research is to demonstrate that the signal-to-text system can be easily applied to data sets other than those used to train the system and achieve results similar to those on the training sets. To achieve this goal, a validation study was conducted and is presented to the reader.
187

Image Analysis For Plant Phenotyping

Enyu Cai (15533216) 17 May 2023
Plant phenotyping focuses on the measurement of plant characteristics throughout the growing season, typically with the goal of evaluating genotypes for plant breeding and management practices related to nutrient applications. Estimating plant characteristics is important for finding the relationship between the plant's genetic data and observable traits, which is also related to the environment and management practices. Recent machine learning approaches provide promising capabilities for high-throughput plant phenotyping using images. In this thesis, we focus on estimating plant traits for a field-based crop using images captured by Unmanned Aerial Vehicles (UAVs). We propose a method for estimating plant centers by transferring an existing model to a new scenario using limited ground truth data. We describe the use of transfer learning using a model fine-tuned for a single field or a single type of plant on a varied set of similar crops and fields. We introduce a method for rapidly counting panicles using images acquired by UAVs. We evaluate three different deep neural network structures for panicle counting and location. We propose a method for sorghum flowering time estimation using multi-temporal panicle counting. We present an approach that uses synthetic training images from generative adversarial networks for data augmentation to enhance the performance of sorghum panicle detection and counting. We reduce the amount of training data for sorghum panicle detection via semi-supervised learning. We create synthetic sorghum and maize images using diffusion models. We propose a method for tomato plant segmentation by color correction and color space conversion. We also introduce the methods for detecting and classifying bacterial tomato wilting from images.
188

Feature Construction Using Evolution-COnstructed Features for General Object Recognition

Lillywhite, Kirt D. 05 March 2012
Object recognition is a well-studied but extremely challenging field. Human detection is an especially important part of object recognition, as it has played a role in machine and human interaction, biometrics, unmanned vehicles, as well as tracking and surveillance. We first present a hardware implementation of the successful Histograms of Oriented Gradients (HOG) method for human detection. The implementation significantly speeds up the method, achieving 38 frames a second on VGA video while testing 11,160 sliding windows per frame. The accuracy remains comparable to the CPU implementation. Analysis of the HOG method and other popular object recognition methods led to a novel approach for object detection using a feature construction method called Evolution-COnstructed (ECO) features. Most other approaches rely on human experts to construct features for object recognition. ECO features are automatically constructed by uniquely employing a standard genetic algorithm to discover series of transforms that are highly discriminative. Using ECO features provides several advantages over other object detection algorithms, including: no need for a human expert to build feature sets or tune their parameters, the ability to generate specialized feature sets for different objects, and no limitation to certain types of image sources. We show in our experiments that ECO features perform better than or comparably to state-of-the-art object recognition algorithms, making it the first feature construction method to compete with features created by human experts at general object recognition. An analysis of ECO features is given, which includes a visualization of ECO features and improvements made to the algorithm.
189

Comparison Of Object Detection Models - to detect recycle logos on tetra packs

Kamireddi, Sree Chandan January 2022
Background: The manufacturing and production of everyday products from recyclable materials has risen steeply over the past few years. The recyclable packages considered in this thesis are Tetra Packs. Tetra packs are widely used for packaging liquid foods. A few recycling methods are in use for such tetra packs; they scan the barcode on the back of the pack to determine which recycling method the particular tetra pack has to go through. In some cases, the barcode might get worn off due to excessive usage, leading to a problem. Therefore, research needs to be carried out to address this problem and find a solution to it. Objectives: The objectives to address and fulfill the aim of this thesis are: To find/create the necessary data set containing clear pictures of the tetra packs with visible recycle logos. To draw bounding boxes around the objects, i.e., logos, for training the models. To test the data set by applying all four Deep Learning models. To compare each of the models on speed and the performance metrics, i.e., mAP and IoU, and identify the best algorithm among them. Methods: To answer the research question we have chosen one research methodology, which is Experiment. Results: YOLOv5 is considered the best algorithm among the four algorithms we are comparing. The detection speeds of YOLOv5, SSD and Faster-RCNN were found to be similar, at 0.2 seconds, whereas Mask-RCNN was the slowest with a detection speed of 1.0 seconds. The mAP score of SSD is 0.86, which is the highest among the four, followed by YOLOv5 at 0.771, Faster-RCNN at 0.67 and Mask-RCNN at 0.62. The IoU score of Faster-RCNN is 0.96, which is the highest among the four, followed by YOLOv5 at 0.95, SSD at 0.50 and Mask-RCNN at 0.321. On comparing all the above results, YOLOv5 is concluded to be the best algorithm among the four, as it is relatively fast and accurate without any major drawbacks in any category.
Conclusions: Among the four algorithms (Faster-RCNN, YOLOv5, SSD and Mask-RCNN), YOLOv5 is declared the best after comparing all the models based on speed and the performance metrics mAP and IoU.
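For readers unfamiliar with the IoU metric used in this comparison, it is the area of overlap between a predicted and a ground-truth bounding box divided by the area of their union. The sketch below assumes boxes given as `(x1, y1, x2, y2)` corner coordinates; the function name and box format are illustrative choices, not taken from the thesis.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping region, clamped to 0 when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


# A predicted logo box shifted half a box-width to the right of the ground truth.
score = iou((0, 0, 10, 10), (5, 0, 15, 10))
```

An IoU of 1.0 means the two boxes coincide exactly, and 0.0 means they do not overlap at all.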
190

End-to-End Tabular Information Extraction in Datasheets with Deep Learning

Kara, Ertugrul 09 July 2019
The advent of Industry 4.0 has been transforming information management regarding the specifications of electronic components. This change affects many organizations, including global supply chains that optimize many product chains, such as raw materials or electronic components. Supply chains consist of thousands of manufacturers and connect them to other organizations and end users, and they include billions of distinct components. The digitization of critical information has to be carried out automatically since there are millions of documents. Although the documents vary greatly in shape and style, the essential information is usually presented in tables in a condensed format. Extracting the structured information from tables is done by human operators, which costs human effort, time and corporate resources. Based on the motivation that AI-based solutions are automating many processes, this thesis proposes to use deep learning-based solutions for three main problems: (i) table detection, (ii) table internal structure detection and (iii) End-to-End (E2E) tabular structure detection. To this end, deep learning models are trained mostly with public datasets, and a private dataset (after labelling 2000+ documents) which was provided to us by our industry partner. To achieve accurate table detection, we propose a method based on the successful Mask-Region-Based Convolutional Neural Network (Mask-RCNN) instance segmentation model. With some modifications to the training set labels, we have achieved state-of-the-art detection rates with 99% AP and 100% recall. We use the PASCAL Visual Object Classes (VOC) 11-point Average Precision (AP) metric to compare the evaluated deep learning-based methods. Detecting tables is the initial step towards semantic modelling of e-components. Therefore, the structure should also be detected in order to extract information.
With this in mind, we introduce another method based on the Mask-RCNN model, which is able to detect table structure with around 96% AP. Combining these two networks, or developing a new model, is a necessity. To this end, inspired by the success of Mask-RCNN models, we introduce the following Mask-RCNN based models to realize E2E tabular structure detection: A stitched E2E model, achieved by bridging the output of the table detection model into the structure detection model, attained more than 77% AP on the difficult public UNLV dataset with various post-processing steps applied when bridging the two networks. Single-pass E2E detection networks were able to attain a higher AP of 86% but with lower recall. This thesis concludes that deep learning-based object detection and instance segmentation networks can accomplish state-of-the-art performance.
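The PASCAL VOC 11-point AP mentioned above averages, over the recall levels 0.0, 0.1, ..., 1.0, the highest precision observed at any recall greater than or equal to that level. A minimal sketch follows, assuming the precision and recall values along the ranked detection list have already been computed; the function name and the example values are hypothetical and do not come from the thesis.

```python
def voc_ap_11pt(recalls, precisions):
    """11-point interpolated Average Precision from paired recall/precision values."""
    total = 0.0
    for i in range(11):
        level = i / 10  # recall levels 0.0, 0.1, ..., 1.0
        # Interpolated precision: best precision at any recall >= this level,
        # or 0 if that recall level is never reached.
        reachable = [p for r, p in zip(recalls, precisions) if r >= level]
        total += max(reachable, default=0.0)
    return total / 11


# Made-up curve: precision 1.0 up to recall 0.5, dropping to 0.5 at full recall.
ap = voc_ap_11pt(recalls=[0.5, 1.0], precisions=[1.0, 0.5])
```

Because the interpolation takes the maximum precision to the right of each recall level, short dips in the precision-recall curve do not lower the score.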
