381. Air Reconnaissance Analysis using Convolutional Neural Network-based Object Detection. Fasth, Niklas; Hallblad, Rasmus. January 2020.
The Swedish armed forces use the Single Source Intelligent Cell (SSIC), developed by Saab, for analysis of aerial reconnaissance video and report generation. The analysis can be time-consuming and demanding for a human operator, and identifying vehicles is an important part of the workflow. Artificial intelligence is widely used for analysis in many industries to aid or replace a human worker. This paper investigates the possibility of aiding the human operator with air reconnaissance data analysis, specifically object detection for finding cars in aerial images. Many state-of-the-art object detection models for vehicle detection in aerial images are based on a Convolutional Neural Network (CNN) architecture. Models based on Faster R-CNN and SSD, both built on this architecture, are implemented. Comprehensive experiments are conducted with these models on two similar datasets, both consisting of aerial images with vehicles: the open Video Verification of Identity (VIVID) dataset and a confidential dataset provided by Saab. Initial experiments are conducted to find suitable configurations for the proposed models. Finally, an experiment compares the performance of a human operator and a machine. The results show that object detection can support air reconnaissance image analysis with respect to inference time; given the detectors' current accuracy, applications where speed matters more than accuracy are the most suitable.
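Comparing a detector's output against ground truth (or against a human operator's annotations) rests on box overlap. A minimal Intersection-over-Union sketch, with illustrative names not taken from the thesis:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Overlap rectangle (empty if the boxes are disjoint)
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

A predicted box is typically counted as a correct vehicle detection when its IoU with a ground-truth box exceeds a threshold such as 0.5.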
382. Applying UAVs to Support the Safety in Autonomously Operated Open Surface Mines. Hamren, Rasmus. January 2021.
Unmanned aerial vehicles (UAVs) are attracting growing interest in numerous industries for various applications. UAV development is advancing worldwide, with new sensor attachments and functions continually being added, so multi-function UAVs can now be used in areas where they could not be deployed before. Because they are accessible, cheap to purchase, and easy to use, they are replacing expensive systems such as helicopter and airplane surveillance. UAVs are also being applied to surveillance, combining object detection on video with the mobility to find objects from the air without interfering with vehicles or humans on the ground. This thesis addresses the use of a UAV on autonomously operated sites: finding objects and critical situations, and supporting site operators with an extra safety layer from the UAV's camera. After an object is found, GPS coordinates from the UAV are used to place the detected object onto a grid map, giving the operator a coordinate map that shows where the objects are and whether a critical situation can occur. On top of the object detection, critical situations are reported through a safety-distance circle that issues warnings if objects come too close to each other. The system itself only supports the operator with extra safety information and warnings, leaving the choice of pressing the emergency stop to the operator. Object detection uses You Only Look Once (YOLO) as the main neural network, combined with edge detection to improve accuracy in bird's-eye views and motion detection to help find all moving objects on site, even those the UAV's detector misses.
The results show that UAV surveillance of an autonomous site is an effective way to add extra safety, both when the operator is out of focus and for finding objects on site before startup, since the operator can fly the UAV around the site and locate humans before operations begin. The UAV can also be moved to a specific position where extra safety is needed, informing the operator to limit autonomous vehicle speed in that area because humans are working there. A single object detection method has limited effect, but the combined detection methods lead to promising results, and projecting the detected objects onto a global positioning system (GPS) map opens a new field of study, giving the operator a viewable interface outside of the object detection libraries.
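The safety-distance-circle idea can be sketched as a pairwise check over detected objects placed on the GPS-derived grid map; the function and threshold names are illustrative assumptions, not the thesis's implementation:

```python
import math

def safety_warnings(detections, safety_radius_m):
    """Warn for every pair of detected objects closer than the safety radius.

    detections: list of (object_id, x_m, y_m), positions in metres on the grid map.
    Returns a list of (id_a, id_b) pairs that violate the safety distance.
    """
    warnings = []
    for i in range(len(detections)):
        for j in range(i + 1, len(detections)):
            id_a, xa, ya = detections[i]
            id_b, xb, yb = detections[j]
            # Euclidean separation on the local grid map
            if math.hypot(xa - xb, ya - yb) < safety_radius_m:
                warnings.append((id_a, id_b))
    return warnings
```

As in the thesis, such warnings would only inform the operator, who still decides whether to press the emergency stop.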
383. Detection of crack-like indications in digital radiography by global optimisation of a probabilistic estimation function. Alekseychuk, Oleksandr. 10 May 2006.
A new algorithm for the detection of longitudinal crack-like indications in radiographic images is developed in this work. Conventional local detection techniques give unsatisfactory results for this task due to the low signal-to-noise ratio (SNR ~ 1) of crack-like indications in radiographic images. Using global features of crack-like indications provides the necessary noise resistance, but at the price of prohibitive computational complexity and difficulties in formally describing the indication shape. Conventionally, the excessive computational complexity is reduced by heuristics, which are selected on a trial-and-error basis, are problem-dependent, and do not guarantee the optimal solution. The distinctive feature of the algorithm developed here is that it avoids such heuristics. Instead, a global characteristic of a crack-like indication (the estimation function) is used, whose maximum over the space of all possible positions, lengths, and shapes can be found exactly, i.e. without any heuristics. The proposed estimation function is defined as the sum of the a posteriori information gains about the hypothesis of indication presence at each point along the whole hypothetical indication; each gain results from analysing the underlying image in a local area. Such an estimation function is theoretically justified and exhibits the desired behaviour on changing signals. The algorithm is implemented in C++ and tested on synthetic as well as real images. It delivers good results (a high correct-detection rate at a given false-alarm rate), comparable to the performance of trained human inspectors.
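Because the estimation function is additive along the indication, its exact maximum over all paths can be found by dynamic programming rather than heuristics. A minimal sketch under a simplified shape model (a roughly horizontal path whose row changes by at most one per column, spanning every column; the thesis's actual shape and length model is richer):

```python
def best_path_score(gain):
    """Exact maximum of an additive score over all column-to-column paths.

    gain[r][c] is the local information gain for 'indication present' at
    row r, column c. A path visits one row per column and may move up or
    down by at most one row between adjacent columns.
    """
    rows, cols = len(gain), len(gain[0])
    # best[r] = maximal accumulated gain of any path ending at row r
    best = [gain[r][0] for r in range(rows)]
    for c in range(1, cols):
        new_best = []
        for r in range(rows):
            # predecessor may come from rows r-1, r, r+1 (clipped at borders)
            prev = max(best[max(0, r - 1):min(rows, r + 2)])
            new_best.append(prev + gain[r][c])
        best = new_best
    return max(best)
```

The optimum is global and exact for this shape family, which mirrors the abstract's claim that no heuristic search is needed.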
384. Comparing CNN methods for detection and tracking of ships in satellite images. Torén, Rickard. January 2020.
Knowing where ships are located is a key factor in supporting safe maritime transports and harbor management, as well as preventing accidents and illegal activities at sea. International solutions for geopositioning in the maritime domain already exist, such as the Automatic Identification System (AIS); however, AIS requires ships to constantly transmit their location. Real-time imagery from geostationary satellites has recently been proposed to complement AIS, making locating and tracking more robust. This thesis investigated and compared two machine learning image analysis approaches, Faster R-CNN and SSD with FPN, for detection and tracking of ships in satellite images. Faster R-CNN is a two-stage model that first proposes regions of interest and then performs detection based on those proposals. SSD is a one-stage model that detects objects directly, with a Feature Pyramid Network (FPN) added for better detection of objects covering few pixels. The MAritime SATellite Imagery dataset (MASATI), with 5600 images from a wide variety of locations, was used for training and evaluating the candidate models, and the TensorFlow Object Detection API was used to implement both. On the unseen test split, Faster R-CNN achieved 30.3% mean Average Precision (mAP) while SSD with FPN achieved only 0.0005% mAP. The study concluded that Faster R-CNN is a candidate for identifying and tracking ships in satellite images, whereas SSD with FPN seems less suitable for this task. It also concluded that the amount of training and the choice of hyperparameters impacted the results.
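The mAP figures above are averages of per-class average precision. A simplified sketch of AP as the area under the precision-recall curve (all-point form; the COCO/Pascal protocols the detection API uses add interpolation and IoU thresholds on top of this):

```python
def average_precision(scored_detections, num_ground_truth):
    """Average precision from ranked detections.

    scored_detections: list of (confidence, is_true_positive) for each detection.
    num_ground_truth: number of ground-truth objects (for recall).
    """
    ranked = sorted(scored_detections, key=lambda d: -d[0])
    tp = fp = 0
    ap, prev_recall = 0.0, 0.0
    for _conf, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recall = tp / num_ground_truth
        precision = tp / (tp + fp)
        # accumulate precision over each increment of recall
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap
```

mAP is then the mean of this quantity over all object classes, here effectively the single "ship" class.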
385. Real-time Object Detection on Raspberry Pi 4: Fine-tuning an SSD Model Using TensorFlow and Web Scraping. Ferm, Oliwer. January 2020.
Edge AI is a growing area. Deep learning on low-cost machines such as the Raspberry Pi may be used more than ever thanks to its ease of use, availability, and high performance. A quantized pretrained SSD object detection model was deployed to a Raspberry Pi 4 B to evaluate whether the throughput is sufficient for real-time object recognition. With an input size of 300x300, an inference time of 185 ms was obtained. This is an improvement over the Raspberry Pi 3 B+, which achieved 238 ms with an input size of 96x96 in a related study. A lightweight model trades some accuracy for higher throughput. To compensate for the loss of accuracy, a custom object detection model was trained using transfer learning in TensorFlow, fine-tuning a pretrained SSD model. The fine-tuned model was trained on images scraped from the web showing people in winter landscapes, whereas the pretrained model had been trained to detect various objects, including people, in diverse environments. Predictions show that the custom model performs significantly better at detecting people in snow. The conclusion is that web scraping can be used to fine-tune a model. However, scraped images are often of poor quality, so it is important to thoroughly clean the data and select which images are suitable to keep for a given application.
386. AI Vision as an Application in the Steel Industry: Focusing on Object Detection and Image Classification. Wenger, Jakob. January 2020.
As Industry 4.0 sweeps across today's industries, applications within artificial intelligence (AI) are developing. A relatively new application area is commonly called AI vision or computer vision; in this study the term AI vision is used. The application is about giving computers and machines the ability to interpret visual content: an intelligent model is trained that can make decisions based on visual data such as images and video. Within AI vision, this study focuses on the techniques of object detection and image classification. In object detection, one or more specific objects are detected in an image containing many complex lines and shapes; the technique is used in a variety of applications such as robot navigation and automatic vehicle control. The purpose of image classification, sometimes called image recognition, is to classify and categorize an image by identifying and sorting essential data, in an attempt to ascertain what the image itself represents. To frame this work appropriately, its main goal is to describe how object detection and image classification models are constructed, to explain the underlying intelligence in these models, and to describe the tools and methods used to create them. The thesis also presents prospective applications in the steel industry, with proposals for specific applications. The results section mainly presents the construction of an object detection application that handles personal safety, and further application proposals are presented in the discussion section. This is intended to lay the groundwork for possible implementation in real production equipment in the future.
387. A Smart Surveillance System Using Edge-Devices for Wildlife Preservation in Animal Sanctuaries. Linder, Johan; Olsson, Oscar. January 2022.
The Internet of Things is a constantly developing field. With advances in algorithms for object detection and classification in images and videos, the possibilities of what can be done with small, cost-efficient edge devices are increasing. This work presents how camera traps and deep learning can be utilized for surveillance in remote environments, such as animal sanctuaries in the African savannah. The camera traps connect to a smart surveillance network where images and sensor data are analysed; the analysis can then produce valuable information, such as the location of endangered animals or unauthorized humans, for park rangers working to protect the wildlife in these sanctuaries. Different motion detection algorithms are tested and evaluated against related research on the subject. This thesis builds on two previous theses within Project Ngulia. The implemented surveillance system consists of camera sensors, a database, a REST API, a classification service, an FTP server, and a web dashboard for displaying sensor data and resulting images. One contribution of this work is an end-to-end smart surveillance system that can use different camera sources to produce valuable information for stakeholders. The camera software developed here targets the ESP32-based M5Stack Timer Camera and runs a motion detection algorithm based on Self-Organizing Maps, which improves the selection of data fed to the image classifier on the server. The thesis also contributes an algorithm for iterative image classification that addresses the problem of objects occupying only a small part of an image, which makes them harder to classify correctly.
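The iterative-classification idea for small objects can be sketched as classifying overlapping crops so a distant animal fills a whole crop instead of a corner of the frame. The tiling scheme and names below are illustrative assumptions, not the thesis's exact algorithm:

```python
def classify_by_tiles(image, tile, classify):
    """Run a classifier over overlapping square crops of an image.

    image: 2D list of pixel values; tile: crop side length;
    classify: callable taking a crop and returning a label or None.
    Returns (top, left, label) for every crop that produced a label.
    """
    h, w = len(image), len(image[0])
    step = max(1, tile // 2)  # 50% overlap between neighbouring crops
    hits = []
    for top in range(0, max(1, h - tile + 1), step):
        for left in range(0, max(1, w - tile + 1), step):
            crop = [row[left:left + tile] for row in image[top:top + tile]]
            label = classify(crop)
            if label is not None:
                hits.append((top, left, label))
    return hits
```

On the real system, `classify` would be the server-side image classifier; here it can be any callable, which keeps the sketch testable.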
388. Training a computer vision model using semi-supervised learning and applying post-training quantizations. Vedin, Albernn. January 2022.
Electric scooters have gained a lot of attention and popularity among commuters all around the world since they entered the market; they have proven to be an efficient and cost-effective mode of transportation for commuters and travelers. Today, electric scooters are firmly established in the micromobility industry, with increasing global demand. As the industry booms, however, so do accidents and dangerous riding situations, and there is growing concern about scooter safety as more and more people are injured. This research focuses on training a computer vision model using semi-supervised learning to help detect traffic rule violations and prevent collisions for electric scooter riders. Deploying a computer vision model on an embedded system is challenging due to the limited capabilities of the hardware, which is where post-training quantization comes in. This thesis examines which post-training quantization performs best and whether it can outperform the non-quantized model. Three post-training quantizations are applied to the model: dynamic range, full integer, and float16. The results showed that the non-quantized model achieved a mean average precision (mAP) of 0.03894, with mean average training and validation losses of 22.10 and 28.11. Comparing the non-quantized model with the three quantized variants in terms of mAP, the dynamic range post-training quantization achieved the best performance, with a mAP of 0.03933.
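The principle behind dynamic-range quantization, the best-performing variant here, is mapping float weights to 8-bit integers plus a per-tensor scale. A simplified symmetric sketch of the idea (TFLite's actual converter handles per-channel scales, activations, and operator coverage):

```python
def quantize_dynamic_range(weights):
    """Symmetric int8 quantization of a flat weight tensor.

    Returns (int8_values, scale) such that float ~= int8_value * scale.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    # clamp to the int8 range after rounding
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [v * scale for v in q]
```

Storing int8 values instead of float32 cuts the weight footprint roughly fourfold, which is what makes the model fit embedded hardware better.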
389. Cross-layer optimization for joint visual-inertial localization and object detection on resource-constrained devices. Baldassari, Elisa. January 2021.
Expectations for running high-performance cyber-physical applications on resource-constrained devices are continuously increasing. The available hardware remains a main limitation in this context, in terms of both computation capability and energy limits. At the same time, one must ensure the robust and accurate execution of the deployed applications, since their failure may entail risks for humans and the surrounding environment; the limits and risks grow when multiple applications are executed on the same device. This thesis provides a trade-off between required performance and power consumption for two fundamental applications in the mobile autonomous vehicle scenario: localization and object detection. The multi-objective optimization is performed in a cross-layer manner, exploring both application and platform configurable parameters with Design Space Exploration (DSE), with localization and detection accuracy, detection latency, and power consumption as the metrics of interest. Predictive models are designed to estimate these metrics and to ensure robust execution by excluding potentially faulty configurations from the design space. The research is approached empirically, with tests on the Nvidia Jetson AGX and NX platforms. The results show that configurations that are optimal for a single application are in general sub-optimal or faulty for the concurrent-execution case, while the opposite sometimes holds.
390. Hybrid pool-based deep active learning for object detection using intermediate network embeddings. Marbinah, Johan. January 2021.
With the advancements in deep learning, object detection networks have become more robust. Nevertheless, a challenge in training deep networks is finding enough labelled training data for the model to perform well, given the constraints associated with acquiring relevant data. For this reason, active learning is used to minimize the labelling cost by sampling the unlabeled examples that increase performance the most. In the field of object detection, little work has explored effective hybrid active learning strategies that exploit the intermediate feature embeddings of neural networks. In this work, hybrid active learning methods are proposed and tested, using various uncertainty sampling techniques together with the well-established core-set method as the representation-based strategy. In addition, experiments are conducted with network embeddings to find a suitable strategy for modelling the representation of all available samples. The experiments show mixed outcomes as to whether the hybrid methods perform better than the core-set method used on its own.