181

DRIVING-SCENE IMAGE CLASSIFICATION USING DEEP LEARNING NETWORKS: YOLOV4 ALGORITHM

Rahman, Muhammad Tamjid January 2022
The objective of the thesis is to explore an approach to classifying and localizing different objects in driving-scene images using the YOLOv4 algorithm trained on a custom dataset. YOLOv4, a one-stage object detection algorithm, aims for better accuracy and speed. The deep learning (convolutional) network-based classification model was trained and validated on a subset of the SODA10M dataset annotated with six different classes of objects (Car, Cyclist, Truck, Bus, Pedestrian, and Tricycle), which are the most commonly seen objects on the road. Another model based on YOLOv3 (the previous version of YOLOv4) will be trained on the same dataset and the performance will be compared with the YOLOv4 model. Both algorithms are fast but have difficulty detecting some objects, especially small objects. Larger quantities of properly annotated training data can improve the algorithm's accuracy.
182

Camera Based Deep Learning Algorithms with Transfer Learning in Object Perception

Hu, Yujie January 2021
The perception system is the key for autonomous vehicles to sense and understand the surrounding environment. As the cheapest and most mature sensor, monocular cameras create a rich and accurate visual representation of the world. The objective of this thesis is to investigate whether camera-based deep learning models with transfer learning techniques can achieve 2D object detection, License Plate Detection and Recognition (LPDR), and highway lane detection in real time. The You Only Look Once version 3 (YOLOv3) algorithm, with and without transfer learning, is applied to the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) dataset for car, cyclist, and pedestrian detection. This application shows that objects can be detected in real time and that transfer learning boosts detection performance. The Convolutional Recurrent Neural Network (CRNN) algorithm with a pre-trained model is applied to multiple License Plate (LP) datasets for real-time LP recognition. The optimized model is then used to recognize Ontario LPs and achieves high accuracy. The Efficient Residual Factorized ConvNet (ERFNet) algorithm with transfer learning and a cubic spline model are modified and implemented on the TuSimple dataset for lane segmentation and interpolation. The detection performance and speed are comparable with other state-of-the-art algorithms. / Thesis / Master of Applied Science (MASc)
183

AUTONOMOUS SAFE LANDING ZONE DETECTION FOR UAVs UTILIZING MACHINE LEARNING

Nepal, Upesh 01 May 2022
One of the main challenges of the integration of unmanned aerial vehicles (UAVs) into today’s society is the risk of in-flight failures, such as motor failure, occurring in populated areas that can result in catastrophic accidents. We propose a framework to manage the consequences of an in-flight system failure and to bring down the aircraft safely without causing any serious accident to people, property, and the UAV itself. This can be done in three steps: a) Detecting a failure, b) Finding a safe landing spot, and c) Navigating the UAV to the safe landing spot. In this thesis, we will look at part b. Specifically, we are working to develop an active system that can detect landing sites autonomously without any reliance on UAV resources. To detect a safe landing site, we are using a deep learning algorithm named "You Only Look Once" (YOLO) that runs on a Jetson Xavier NX computing module, which is connected to a camera, for image processing. YOLO is trained using the DOTA dataset and we show that it can detect landing spots and obstacles effectively. Then by avoiding the detected objects, we find a safe landing spot. The effectiveness of this algorithm will be shown first by comprehensive simulations. We also plan to experimentally validate this algorithm by flying a UAV and capturing ground images, and then applying the algorithm in real-time to see if it can effectively detect acceptable landing spots.
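The step of finding a landing spot by avoiding detected objects can be illustrated with a minimal sketch. This is not the method from the thesis; it assumes the detector returns axis-aligned bounding boxes in pixel coordinates and simply picks the free grid cell farthest from any detection. The function name `safest_cell` and the `cell` size parameter are hypothetical.

```python
from math import hypot


def safest_cell(width, height, boxes, cell=10):
    """Return the centre (x, y) of the free grid cell farthest from any box.

    boxes: iterable of (x_min, y_min, x_max, y_max) in pixels (assumed format).
    Returns None when every cell is covered by a detection.
    """
    cols, rows = width // cell, height // cell
    occupied = set()
    for x0, y0, x1, y1 in boxes:
        # Mark every grid cell touched by the bounding box as an obstacle.
        for c in range(max(0, int(x0) // cell), min(cols - 1, int(x1) // cell) + 1):
            for r in range(max(0, int(y0) // cell), min(rows - 1, int(y1) // cell) + 1):
                occupied.add((c, r))
    free = [(c, r) for c in range(cols) for r in range(rows) if (c, r) not in occupied]
    if not free:
        return None
    if not occupied:
        c, r = cols // 2, rows // 2  # nothing detected: fall back to the centre
    else:
        # Choose the free cell whose nearest obstacle cell is as far away as possible.
        c, r = max(
            free,
            key=lambda f: min(hypot(f[0] - o[0], f[1] - o[1]) for o in occupied),
        )
    return (c * cell + cell / 2, r * cell + cell / 2)
```

A real system would also have to account for terrain type, slope, and the UAV footprint, none of which this sketch models.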
184

Methods and Algorithms for Efficient Programming of FPGA-based Heterogeneous Systems for Object Detection

Kalms, Lester 14 March 2023
Nowadays, there is a high demand for computer vision applications in numerous application areas, such as autonomous driving or unmanned aerial vehicles. However, the application areas and scenarios are becoming increasingly complex, and their data requirements are growing. Meeting these requirements calls for increasingly powerful computing systems. FPGA-based heterogeneous systems offer an excellent solution in terms of energy efficiency, flexibility, and performance, especially in the field of computer vision. Due to complex applications and the use of FPGAs in combination with other architectures, efficient programming is becoming increasingly difficult. Thus, developers need a comprehensive framework with efficient automation, good usability, reasonable abstraction, and seamless integration of tools. It should provide an easy entry point and reduce the effort to learn new concepts, programming languages, and tools. Additionally, it needs optimized libraries so that the user can focus on developing applications without getting involved with the underlying details. These should be well integrated, easy to use, and cover a wide range of possible use cases. The framework needs efficient algorithms to execute applications on heterogeneous architectures with maximum performance. These algorithms should distribute applications across various nodes with low fragmentation and communication overhead and find a near-optimal solution in a reasonable amount of time. This thesis addresses the research problem of an efficient implementation of object detection applications, their distribution across FPGA-based heterogeneous systems, and methods for automation and integration using toolchains. Within this, the three contributions are the HiFlipVX object detection library, the DECISION framework, and the APARMAP application distribution algorithm. HiFlipVX is an open-source HLS-based FPGA library optimized for performance and resource efficiency.
It contains 66 highly parameterizable computer vision functions including neural networks, ideal for design space exploration. It extends the OpenVX standard for feature extraction, which is challenging due to unknown element size at design time. All functions are streaming capable to achieve maximum performance by increasing parallelism and reducing off-chip memory access. It does not require external or vendor libraries, which eases project integration, device coverage, and vendor portability, as shown for Intel. The library consumed on average 0.39% FFs and 0.32% LUTs for a set of image processing functions compared to a vendor library. A HiFlipVX implementation of the AKAZE feature detector computes between 3.56 and 4.13 times more pixels per second than the related work, while its resource consumption is comparable to optimized VHDL designs. Its neural network extension achieved a speedup of 3.23 for an AlexNet layer in comparison to related work, while consuming 73% less on-chip memory. Furthermore, this thesis proposes an improved feature extraction implementation that achieves a repeatability of 72.57% when weighting complex cases, while the next best algorithm only achieves 62.99%. DECISION is a framework consisting of two toolchains for the efficient programming of FPGA-based heterogeneous systems. Both integrate HiFlipVX and use a joint OpenVX-based frontend to implement computer vision applications. It abstracts the underlying hardware and algorithm details while covering a wide range of architectures and applications. The first toolchain targets x86-based systems consisting of CPUs, GPUs, and FPGAs using OpenCL (Open Computing Language). To create a heterogeneous schedule, it considers device profiles, kernel profiles and estimates, and FPGA dataflow characteristics. It manages synchronization, memory transfers, and data coherence at design time. It creates a runtime-optimized program that excels through high parallelism and low overhead.
Additionally, this thesis looks at the integration of OpenCL-based libraries, automatic OpenCL kernel generation, and OpenCL kernel optimization and comparison for different architectures. The second toolchain creates an application specific and adaptive NoC-based architecture. The streaming-optimized architecture enables the reusability of vision functions by multiple applications to improve the resource efficiency while maintaining high performance. For a set of example applications, the resource consumption was more than halved, while its overhead was only 0.015% in terms of performance. APARMAP is an application distribution algorithm for partition-based and mesh-like FPGA topologies. It uses a NoC (Network-on-Chip) as communication infrastructure to connect reconfigurable regions and generate an application-specific hardware architecture. The algorithm uses load balancing techniques to find reasonable solutions within a predictable and scalable amount of time. It optimizes solutions using various heuristics, such as Simulated Annealing and Tabu Search. It uses a multithreaded grid-based approach to prevent threads from calculating the same solution and getting stuck in local minimums. Its constraints and objectives are the FPGA resource utilization, NoC bandwidth consumption, NoC hop count, and execution time of the proposed algorithm. The evaluation showed that the algorithm can deal with heterogeneous and irregular host graph topologies. The algorithm showed a good scalability in terms of computation time for an increasing number of nodes and partitions. It was able to achieve an optimal placement for a set of example graphs up to a size of 196 nodes on host graphs of up to 49 partitions. For a real application with 271 nodes and 441 edges, it was able to achieve a distribution with low resource fragmentation in an average time of 149 ms.
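To make the idea of heuristic placement on a mesh concrete, here is a minimal simulated annealing sketch. It is not APARMAP: it assumes one node per partition on a rectangular mesh and minimises only the total Manhattan hop count, ignoring resource utilization, bandwidth, and the multithreaded grid approach described above. The names `hop_cost` and `anneal`, and the linear cooling schedule, are illustrative assumptions.

```python
import math
import random


def hop_cost(placement, edges, cols):
    """Total Manhattan hop count over all edges for a node -> partition mapping."""
    total = 0
    for a, b in edges:
        pa, pb = placement[a], placement[b]
        total += abs(pa % cols - pb % cols) + abs(pa // cols - pb // cols)
    return total


def anneal(num_nodes, edges, cols, rows, steps=5000, t0=2.0, seed=0):
    """Place one node per partition on a cols x rows mesh (needs >= 2 partitions).

    Returns (best_placement, best_cost), where best_placement[i] is the
    partition index assigned to node i.
    """
    assert num_nodes <= cols * rows
    rng = random.Random(seed)
    current = list(range(cols * rows))  # entries past num_nodes are empty partitions
    rng.shuffle(current)
    cost = hop_cost(current, edges, cols)
    best, best_cost = current[:], cost
    for step in range(steps):
        temp = t0 * (1 - step / steps) + 1e-9  # linear cooling (assumed schedule)
        i, j = rng.sample(range(len(current)), 2)
        current[i], current[j] = current[j], current[i]  # propose a swap
        new_cost = hop_cost(current, edges, cols)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = current[:], cost
        else:
            current[i], current[j] = current[j], current[i]  # undo the swap
    return best[:num_nodes], best_cost
```

Tracking the best placement separately means a late uphill move cannot discard a good solution found earlier.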
185

Automated Foreign Object Detection on Conveyor Belts

Sundelius, Kim January 2023
Ore is transported using belt conveyor systems. The transported ore contains various anomalous objects that must be removed to prevent damage to the system. Currently, anomalies are detected manually by humans. This leads to increased wage costs and to system damage from missed anomalies. The thesis aims to solve this problem with trained neural networks that can run on relatively cheap systems with greater accuracy than humans. A set of neural networks was trained on both the BCS dataset, consisting of data collected from the belt conveyor system, and the MVTec dataset. The latter dataset was used to check the correctness of the implementation of the models. As training neural networks usually requires large datasets, this thesis also focuses on the effect of the proportion of labelled versus unlabelled data on the models. Labelling data can be time consuming and expensive, so investigating whether, or how much, data can be left unlabelled with little or no loss of accuracy could lead to further cost reductions. The convolutional autoencoder (CAE) performed best on the classification-based task on the BCS dataset, where it managed to classify most of the dataset correctly, with an F1-score of 0.94 on data without anomalies and an F1-score of 0.86 on data with anomalies, as long as suitable thresholds were set. ResNet performed somewhat well, with a 0.91 F1-score in detecting anomaly-free data and a 0.50 F1-score in detecting anomaly-containing data. The SimCLR and SimCLRv2 models were unable to learn from the data and defaulted to always assuming the data contained anomalies. The CAE model trained using the L1 loss function performed best, with an IoU of 0.272, and performed worst with the SSIM-based loss function, with an IoU of 0.160. The effect of labelled versus unlabelled data using the MVTec dataset was tested using the SimCLR and SimCLRv2 models, and the models performed best with the fully labelled dataset, which was expected.
The SimCLR model was able to identify all categories with an F1-score greater than 0.67 whereas the other splits performed worse overall with two or more categories completely misclassified. The SimCLRv2 was able to classify six categories with an F1-score greater than 0.0 which was significantly better than all other labelled and unlabelled splits.
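For readers unfamiliar with the two metrics quoted above, the following sketch shows their standard definitions. It assumes the F1-score is computed per class from true positive, false positive, and false negative counts, and that IoU is computed between binary predicted and ground-truth masks; the abstract does not state how the thesis computed either, so these are assumptions, and the function names are illustrative.

```python
def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision and recall, from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mask_iou(pred, truth):
    """Intersection over union of two equally sized binary masks (nested 0/1 lists)."""
    inter = union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly, so treat that case as IoU = 1.
    return inter / union if union else 1.0
```

A model that always predicts "anomaly", as SimCLR reportedly did on the BCS data, has no true positives for the anomaly-free class, which gives that class an F1-score of 0 under this definition.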
186

Object Detection and Classification Based on Point Separation Distance Features of Point Cloud Data

Ji, Jiajie 07 August 2023
No description available.
187

Enhancing Object Detection Methods by Knowledge Distillation for Automotive Driving in Real-World Settings

Kian, Setareh 07 August 2023
No description available.
188

Incorporating spatial relationship information in signal-to-text processing

Davis, Jeremy Elon 13 May 2022
This dissertation outlines the development of a signal-to-text system that incorporates spatial relationship information to generate scene descriptions. Existing signal-to-text systems generate accurate descriptions with regard to information contained in an image. However, to date, no signal-to-text system incorporates spatial relationship information. A survey of related work in the fields of object detection, signal-to-text, and spatial relationships in images is presented first. Three methodologies followed by evaluations were conducted in order to create the signal-to-text system: 1) generation of object localization results from a set of input images, 2) derivation of Level One Summaries from an input image, and 3) inference of Level Two Summaries from the derived Level One Summaries. Validation processes are described for the second and third evaluations, as the first evaluation has been previously validated in the related original works. The goal of this research is to show that a signal-to-text system that incorporates spatial information results in more informative descriptions of the content contained in an image. An additional goal of this research is to demonstrate that the signal-to-text system can be easily applied to additional data sets, other than the sets used to train the system, and achieve results similar to those on the training sets. To achieve this goal, a validation study was conducted and is presented to the reader.
189

Image Analysis For Plant Phenotyping

Enyu Cai (15533216) 17 May 2023
Plant phenotyping focuses on the measurement of plant characteristics throughout the growing season, typically with the goal of evaluating genotypes for plant breeding and management practices related to nutrient applications. Estimating plant characteristics is important for finding the relationship between the plant's genetic data and observable traits, which is also related to the environment and management practices. Recent machine learning approaches provide promising capabilities for high-throughput plant phenotyping using images. In this thesis, we focus on estimating plant traits for a field-based crop using images captured by Unmanned Aerial Vehicles (UAVs). We propose a method for estimating plant centers by transferring an existing model to a new scenario using limited ground truth data. We describe the use of transfer learning using a model fine-tuned for a single field or a single type of plant on a varied set of similar crops and fields. We introduce a method for rapidly counting panicles using images acquired by UAVs. We evaluate three different deep neural network structures for panicle counting and location. We propose a method for sorghum flowering time estimation using multi-temporal panicle counting. We present an approach that uses synthetic training images from generative adversarial networks for data augmentation to enhance the performance of sorghum panicle detection and counting. We reduce the amount of training data for sorghum panicle detection via semi-supervised learning. We create synthetic sorghum and maize images using diffusion models. We propose a method for tomato plant segmentation by color correction and color space conversion. We also introduce methods for detecting and classifying bacterial tomato wilting from images.
190

Feature Construction Using Evolution-COnstructed Features for General Object Recognition

Lillywhite, Kirt D. 05 March 2012
Object recognition is a well-studied but extremely challenging field. Human detection is an especially important part of object recognition, as it has played a role in machine and human interaction, biometrics, unmanned vehicles, and tracking and surveillance. We first present a hardware implementation of the successful Histograms of Oriented Gradients (HOG) method for human detection. The implementation significantly speeds up the method, achieving 38 frames a second on VGA video while testing 11,160 sliding windows per frame. The accuracy remains comparable to the CPU implementation. Analysis of the HOG method and other popular object recognition methods led to a novel approach for object detection using a feature construction method called Evolution-COnstructed (ECO) features. Most other approaches rely on human experts to construct features for object recognition. ECO features are automatically constructed by employing a standard genetic algorithm to discover series of transforms that are highly discriminative. Using ECO features provides several advantages over other object detection algorithms, including no need for a human expert to build feature sets or tune their parameters, the ability to generate specialized feature sets for different objects, and no limitation to certain types of image sources. We show in our experiments that ECO features perform better than or comparably to state-of-the-art object recognition algorithms, making it the first feature construction method to compete with features created by human experts at general object recognition. An analysis of ECO features is given, which includes a visualization of ECO features and improvements made to the algorithm.
