251

Objektföljning med roterbar kamera / Object tracking with rotatable camera

Zetterlund, Joel January 2021 (has links)
Today, it is common for events to be filmed without a professional video photographer: little-league football matches, conference meetings, teaching, or YouTube clips. To film without a camera operator, one can use an object tracking camera, a camera that follows an object's position over time without a human steering it. This thesis describes how object tracking works and compares an object tracking camera based on computer vision with a human camera operator. To enable the comparison, a prototype was built, consisting of a Raspberry Pi 4B running MOSSE, an object tracking algorithm, and SSD300, an object detection algorithm; the steering consists of a gimbal with three brushless motors that control the camera via a regulator. The resulting prototype can follow a person walking at up to 100 pixels per second, or 1 meter per second in full frame, with a maximum distance of 11.4 meters outdoors, whereas a camera operator can follow a person at 300-800 pixels per second, or 3 meters per second. The prototype is not as good as a camera operator, but could be used to follow a person who teaches and walks slowly, provided the prototype were robust, which it currently is not. Better results would require a stronger processor and better algorithms than those used in the prototype, since a major problem was the low update rate of the detection algorithm.
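As a concrete illustration of the loop such a prototype implements, the sketch below pairs OpenCV's MOSSE tracker with a proportional controller that re-centers the target. It assumes the opencv-contrib-python build of OpenCV; the Gimbal class is a hypothetical stand-in for the thesis's brushless-motor interface, not its actual code.

```python
# Minimal track-and-steer sketch: MOSSE keeps the target's bounding box between
# detections, and a proportional controller nudges the gimbal so the target
# stays centered in the frame.
import cv2

class Gimbal:
    """Hypothetical stand-in for the brushless-motor controller."""
    def move(self, pan_rate: float, tilt_rate: float) -> None:
        print(f"pan={pan_rate:+.4f} tilt={tilt_rate:+.4f}")

def track(video_source=0, kp=0.002):
    cap = cv2.VideoCapture(video_source)
    ok, frame = cap.read()
    if not ok:
        raise RuntimeError("could not read from video source")
    # In the thesis setup, the SSD300 detector would supply this initial box.
    bbox = cv2.selectROI("init", frame)
    tracker = cv2.legacy.TrackerMOSSE_create()  # fast correlation-filter tracker
    tracker.init(frame, bbox)
    gimbal, (h, w) = Gimbal(), frame.shape[:2]
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        ok, (x, y, bw, bh) = tracker.update(frame)
        if not ok:
            continue  # a full system would fall back to re-detection here
        # Proportional control: steer so the box center approaches frame center.
        gimbal.move(kp * ((x + bw / 2) - w / 2), kp * ((y + bh / 2) - h / 2))

if __name__ == "__main__":
    track()
```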
252

MACHINE LEARNING FOR RESILIENT AND SUSTAINABLE ENERGY SYSTEMS UNDER CLIMATE CHANGE

Min Soo Choi (16790469) 07 August 2023 (has links)
Climate change is recognized as one of the most significant challenges of the 21st century. Anthropogenic activities have led to a substantial increase in greenhouse gases (GHGs) since the Industrial Revolution, with the energy sector being one of the biggest contributors globally. The energy sector now faces unique challenges, not only due to decarbonization goals but also due to the increased risk of climate extremes under climate change.

This dissertation focuses on leveraging machine learning, specifically utilizing unstructured data such as images, to address many of the unprecedented challenges faced by energy systems. The dissertation begins (Chapter 1) by providing an overview of the risks posed by climate change to modern energy systems and explaining how machine learning applications can help address these risks. By harnessing the power of machine learning and unstructured data, this research aims to contribute to the development of more resilient and sustainable energy systems, as described briefly below.

Accurate generation forecasting is essential for mitigating the risks associated with the increased penetration of intermittent, non-dispatchable variable renewable energy (VRE). In Chapters 2 and 3, deep learning techniques are proposed to predict solar irradiance, a crucial factor in solar energy generation, in order to address the uncertainty inherent in solar energy. Specifically, Chapter 2 introduces a cost-efficient, fully exogenous solar irradiance forecasting model that effectively incorporates atmospheric cloud dynamics using satellite imagery. Building upon Chapter 2, Chapter 3 extends the model to a fully probabilistic framework that not only forecasts the future point value of irradiance but also quantifies the uncertainty of the prediction, which is particularly important for high-risk decision making in energy systems.

While the energy system is a major contributor to GHG emissions, it is also vulnerable to climate change risks. Given the essential role of energy systems infrastructure in modern society, ensuring reliable and sustainable operation is of utmost importance. However, our understanding of reliability in electricity transmission networks is limited by the lack of access to large-scale transmission network topology datasets, and previous research has mostly relied on proxy or synthetic datasets. Chapter 4 addresses this research gap by proposing a novel deep learning-based object detection method that utilizes satellite images to construct a comprehensive large-scale transmission network dataset.
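The fully exogenous image-to-irradiance idea of Chapter 2 can be sketched, under assumed tensor shapes and layer sizes, as a small convolutional regressor; this is an illustrative toy, not the dissertation's architecture.

```python
# Schematic sketch: a small CNN maps a short history of satellite cloud images
# to a single future irradiance value (fully exogenous, no past irradiance input).
import torch
import torch.nn as nn

class IrradianceForecaster(nn.Module):
    """Maps a stack of cloud-image frames to one future irradiance value."""
    def __init__(self, n_frames: int = 4):
        super().__init__()
        self.encoder = nn.Sequential(       # stacked frames treated as channels
            nn.Conv2d(n_frames, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Point forecast; a probabilistic variant (Chapter 3's direction) would
        # instead emit a mean and a variance.
        self.head = nn.Linear(64, 1)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(frames))

x = torch.randn(8, 4, 128, 128)             # batch of 4-frame satellite crops
print(IrradianceForecaster()(x).shape)      # torch.Size([8, 1])
```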
253

DETECTION AND SUB-PIXEL LOCALIZATION OF DIM POINT OBJECTS

Mridul Gupta (15426011) 08 May 2023 (has links)
Detection of dim point objects plays an important role in many imaging applications such as early warning systems, surveillance, astronomy, and microscopy. In satellite imaging, natural phenomena such as clouds can confound object detection methods. We propose an object detection method that uses spatial, spectral, and temporal information to reject detections that are not consistent with a moving object, achieving a high probability of detection at a low false alarm rate. We propose another method for dim object detection using convolutional neural networks (CNNs), which augments a conventional space-based detection processing chain with a lightweight CNN to improve detection performance. To evaluate the performance of our proposed methods, we used a set of curated satellite images and generated receiver operating characteristic (ROC) curves.

Most satellite images have adequate spatial resolution and signal-to-noise ratio (SNR) for the detection and localization of common large objects, such as buildings. In many applications, however, the spatial resolution of the imaging system is not sufficient to localize a point object, or two closely-spaced objects (CSOs), described by only a few pixels or less than one pixel; a low SNR, as when the objects are dim, adds to the difficulty. We describe a method to estimate the objects' amplitudes and spatial locations with sub-pixel accuracy using non-linear optimization and information from multiple spectral bands. We also propose a machine learning method that minimizes a cost function derived from the maximum likelihood estimation of the observed image to determine an object's sub-pixel spatial location and amplitude. We derive the Cramér-Rao Lower Bound and compare the proposed estimators' variance with this bound.
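The sub-pixel estimation approach lends itself to a brief sketch: under Gaussian noise, the maximum likelihood estimate of a point object's amplitude and location reduces to non-linear least squares against a point-spread-function model. The Gaussian PSF and fixed sigma below are illustrative assumptions, not the dissertation's sensor model.

```python
# Fit a dim point object's amplitude and sub-pixel (x, y) location by
# non-linear least squares against a Gaussian PSF sampled on the pixel grid.
import numpy as np
from scipy.optimize import least_squares

def psf(params, xx, yy, sigma=1.2):
    a, x0, y0 = params
    return a * np.exp(-((xx - x0) ** 2 + (yy - y0) ** 2) / (2 * sigma ** 2))

rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:9, 0:9].astype(float)
truth = (5.0, 4.3, 3.7)                       # amplitude and sub-pixel location
image = psf(truth, xx, yy) + 0.1 * rng.standard_normal(xx.shape)

# MLE under Gaussian noise == least squares on the pixel residuals.
fit = least_squares(lambda p: (psf(p, xx, yy) - image).ravel(),
                    x0=[1.0, 4.0, 4.0])       # start near the brightest pixel
print(fit.x)                                  # close to (5.0, 4.3, 3.7)
```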
254

Multitask Deep Learning models for real-time deployment in embedded systems / Deep Learning-modeller för multitaskproblem, anpassade för inbyggda system i realtidsapplikationer

Martí Rabadán, Miquel January 2017 (has links)
Multitask Learning (MTL) was conceived as an approach to improve the generalization ability of machine learning models. When applied to neural networks, multitask models take advantage of shared resources to reduce total inference time, memory footprint, and model size. We propose MTL as a way to speed up deep learning models for applications in which multiple tasks must be solved simultaneously, which is particularly useful in embedded, real-time systems such as those found in autonomous cars or UAVs. To study this approach, we apply MTL to a computer vision problem in which both object detection and semantic segmentation are solved, based on the Single Shot Multibox Detector and Fully Convolutional Networks with skip connections respectively, using a ResNet-50 as the base network. We train multitask models on two different datasets: Pascal VOC, which is used to validate the design decisions made, and a combination of datasets with aerial-view images captured from UAVs. Finally, we analyse the challenges that appear when training multitask networks and try to overcome them. These challenges limit the ability of our multitask models to reach the performance of the best single-task models trained without the constraints imposed by MTL. Nevertheless, the multitask networks benefit from sharing resources and are 1.6x faster, lighter, and use less memory than deploying the single-task models in parallel, which becomes essential when running them on a Jetson TX1 SoC, as the parallel approach does not fit into memory. We conclude that MTL has the potential to give superior performance on the object detection and semantic segmentation tasks, in exchange for a more complex training process that requires overcoming challenges not present in the training of single-task models.
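The resource-sharing idea can be sketched as one backbone feeding two task heads, so the expensive feature extraction runs once per frame. The heads below are simplified placeholders, not the SSD and FCN heads used in the thesis.

```python
# One shared ResNet-50 backbone, two heads: a detection-style head predicting
# per-anchor box offsets and class scores, and a segmentation-style head
# predicting a per-pixel class map.
import torch
import torch.nn as nn
from torchvision.models import resnet50

class MultitaskNet(nn.Module):
    def __init__(self, n_classes=21, n_anchors=6):
        super().__init__()
        backbone = resnet50(weights=None)
        # Drop avgpool and fc, keeping the final convolutional feature map.
        self.features = nn.Sequential(*list(backbone.children())[:-2])
        self.det_head = nn.Conv2d(2048, n_anchors * (4 + n_classes), 3, padding=1)
        self.seg_head = nn.Sequential(
            nn.Conv2d(2048, n_classes, 1),
            nn.Upsample(scale_factor=32, mode="bilinear", align_corners=False),
        )

    def forward(self, x):
        f = self.features(x)            # shared computation for both tasks
        return self.det_head(f), self.seg_head(f)

det, seg = MultitaskNet()(torch.randn(1, 3, 224, 224))
print(det.shape, seg.shape)             # per-anchor outputs, per-pixel outputs
```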
255

Partially Observable Markov Decision Processes for Faster Object Recognition

Olafsson, Björgvin January 2016 (has links)
Object recognition in the real world is a big challenge in the field of computer vision. Given the potentially enormous size of the search space, it is essential to make intelligent decisions about where in the visual field to obtain information, in order to reduce the computational resources needed. In this report, a POMDP (Partially Observable Markov Decision Process) learning framework, using a policy gradient method and information rewards as a training signal, has been implemented and used to train fixation policies that aim to maximize the information gathered in each fixation. The purpose of such policies is to make object recognition faster by reducing the number of fixations needed. The trained policies are evaluated in simulation and compared with several fixed policies. Finally, it is shown that the framework can train policies that outperform the fixed policies for certain observation models.
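A toy sketch of that training signal: a softmax policy chooses a fixation, the reward is the entropy drop of the belief over object classes, and REINFORCE updates the policy. The observation model here is an invented stand-in, not the report's.

```python
# REINFORCE with an information reward: the policy learns to pick fixation
# locations whose observations shrink the entropy of the class belief.
import numpy as np

rng = np.random.default_rng(1)
n_locs, n_classes, lr = 16, 5, 0.1
theta = np.zeros(n_locs)                      # policy logits over fixations

def entropy(p):
    return -np.sum(p * np.log(p + 1e-12))

def observe(belief, loc, true_class):
    """Toy observation model: some fixations are more informative than others."""
    like = np.full(n_classes, 0.5)
    like[true_class] += 0.5 * (loc % 4) / 3.0
    post = belief * like
    return post / post.sum()

for episode in range(2000):
    belief = np.full(n_classes, 1.0 / n_classes)
    true_class = rng.integers(n_classes)
    probs = np.exp(theta - theta.max()); probs /= probs.sum()
    loc = rng.choice(n_locs, p=probs)
    new_belief = observe(belief, loc, true_class)
    reward = entropy(belief) - entropy(new_belief)   # information gained
    grad = -probs; grad[loc] += 1.0                  # d log pi(loc) / d theta
    theta += lr * reward * grad                      # REINFORCE step
```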
256

A Path Planning Approach for Context Aware Autonomous UAVs used for Surveying Areas in Developing Regions / En Navigeringsstrategi för Autonoma Drönare för Utforskning av Utvecklingsregioner

Kringberg, Fredrika January 2018 (has links)
Developing regions are often characterized by large areas that are poorly reachable or explored. The mapping and census of roaming populations in these areas are often difficult and sporadic. Recent progress in small aerial vehicles has made them the perfect tool to efficiently and accurately monitor such areas. This thesis presents an approach to aid area surveying through the use of Unmanned Aerial Vehicles (UAVs). The two main components of this approach are an efficient on-device deep learning object identification component that captures and infers images with acceptable performance (latency and accuracy), and a dynamic path planning component informed by the object identification component. In particular, this thesis illustrates the development of the path planning component, which exploits potential field methods to dynamically adapt the path based on inputs from the vision system. It also describes the integration work performed to implement the approach on a prototype platform, with the aim of achieving autonomous flight with onboard computation. The path planning component was developed to gain information about the populations detected by the object identification component, while respecting the limited energy and computational power onboard a UAV. The developed algorithm was compared with navigation along a predefined path, where the UAV does not react to the environment. Results from the comparison show that the algorithm provides more information about the objects of interest, with a very small change in flight time. The integration of the object identification and path planning components on the prototype platform was evaluated in terms of end-to-end latency, power consumption, and resource utilization. Results show that the proposed approach is feasible for area surveying in developing regions. Parts of this work have been published in the DroNet workshop, co-located with MobiSys, under the title Surveying Areas in Developing Regions Through Context Aware Drone Mobility. The work was carried out in collaboration with Alessandro Montanari, Alice Valentini, Cecilia Mascolo, and Amanda Prorok.
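The potential-field idea can be sketched in a few lines: detections attract the UAV, already-visited positions repel it, and the drone steps along the resulting force. Gains and ranges below are illustrative assumptions, not the thesis's parameters.

```python
# Potential-field step: attractive pull toward detected objects of interest,
# repulsive push away from already-covered positions; the UAV follows the
# negative gradient of the summed potential.
import numpy as np

def step(pos, targets, visited, k_att=1.0, k_rep=2.0, rep_range=1.5, dt=0.1):
    force = np.zeros(2)
    for t in targets:                      # attraction toward detections
        force += k_att * (t - pos)
    for v in visited:                      # repulsion from covered ground
        d = np.linalg.norm(pos - v)
        if 1e-6 < d < rep_range:
            force += k_rep * (1 / d - 1 / rep_range) * (pos - v) / d ** 3
    return pos + dt * force

pos = np.array([0.0, 0.0])
targets = [np.array([5.0, 4.0])]
visited = [np.array([1.0, 1.0])]
for _ in range(50):
    pos = step(pos, targets, visited)
print(pos)   # drifts toward the detection while skirting the visited cell
```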
257

Low-Cost UAV Swarm for Real-Time Object Detection Applications

Valdovinos Miranda, Joel 01 June 2022 (has links) (PDF)
With unmanned aerial vehicles (UAVs), also known as drones, becoming readily available and affordable, applications for these devices have grown immensely. One type of application is the use of drones to fly over large areas and detect desired entities. For example, a swarm of drones could detect marine creatures near the surface of the ocean and provide users with the location and type of animal found. However, even with the reduction in the cost of drone technology, such applications remain costly due to the use of custom hardware with built-in advanced capabilities. The focus of this thesis is therefore to compile an easily customizable, low-cost drone design with the hardware necessary for autonomous behavior, swarm coordination, and on-board object detection. Additionally, this thesis outlines the network architecture needed to handle the interconnection and bandwidth requirements of the drone swarm. The drone's on-board system uses a PixHawk 4 flight controller to handle flight mechanics, a Raspberry Pi 4 as a companion computer for general-purpose computing power, and an NVIDIA Jetson Nano Developer Kit to perform object detection in real time. The implemented network follows the 802.11s standard for multi-hop communications with the HWMP routing protocol. This topology allows drones to forward packets through the network, significantly extending the flight range of the swarm. Our experiments show that the selected hardware and implemented network provide direct point-to-point communications at a range of up to 1000 feet, with extended range possible through message forwarding. The network also provides sufficient bandwidth for bandwidth-intensive data such as live video streams. With an expected flight time of about 17 minutes, the proposed design offers a low-cost drone swarm solution for mid-range aerial surveillance applications.
258

VL Tasks: Which Models Suit? : Investigate Different Models for Swedish Image-Text Relation Task / VL-uppgifter: Vilka modeller passar? : Undersök olika modeller för svensk bild-text relationsuppgift

Gou, Meinan January 2022 (has links)
Loosely speaking, modality refers to the number of input domains a model covers. Multi-modal or cross-modal models can handle two or more domains simultaneously; common examples include Vision-Language models, Speech-Language models, and Vision-Speech models. A Vision-Language (VL) model is a network architecture that can interpret both textual and visual inputs, which has always been challenging. Driven by the interest in exploring this area, this thesis implements several VL models and investigates their performance on a specific VL task: the image-text relation task. Instead of using English as the context language, the thesis focuses on languages with fewer available resources. Swedish is chosen as a case study, and the results can be extended to other languages. The experiments show that the Transformer-style architecture handles both textual and visual inputs efficiently, even when trained with simple loss functions. The work suggests an innovative path for future development of cross-modal models, especially for VL tasks.
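A toy dual-encoder sketch of the image-text relation setup: each modality is embedded by a small Transformer encoder into a shared space, and a cosine score decides relatedness. Dimensions and inputs are invented for illustration; the thesis evaluates real pretrained VL models rather than this toy.

```python
# Two small Transformer encoders project image patches and text tokens into a
# shared embedding space; a cosine similarity above a threshold means "related".
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, in_dim, d=128):
        super().__init__()
        self.proj = nn.Linear(in_dim, d)
        layer = nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True)
        self.enc = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, x):                     # x: (batch, seq, in_dim)
        h = self.enc(self.proj(x))
        return nn.functional.normalize(h.mean(dim=1), dim=-1)

img_enc = Encoder(in_dim=768)                 # e.g. flattened 16x16 RGB patches
txt_enc = Encoder(in_dim=300)                 # e.g. Swedish word embeddings

img = torch.randn(2, 196, 768)                # two images, 196 patches each
txt = torch.randn(2, 20, 300)                 # two captions, 20 tokens each
score = (img_enc(img) * txt_enc(txt)).sum(-1) # cosine similarity per pair
print(score)                                  # threshold decides related / not
```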
259

Federated Learning for edge computing : Real-Time Object Detection

Memia, Ardit January 2023 (has links)
In domains where data is sensitive or private, there is great value in methods that can learn in a distributed manner without the data ever leaving the local devices. Federated Learning (FL) has recently emerged as a promising solution to collaborative machine learning challenges while maintaining data privacy. With FL, multiple entities, whether cross-device or cross-silo, can jointly train models without compromising the locality or privacy of their data. Instead of moving data to a central storage system or cloud for model training, code is moved to the data owners' local sites, and incremental local updates are combined into a global model. In this way FL enhances data privacy and, to a certain extent, reduces the probability of eavesdropping. In this thesis we have applied Federated Learning to a Real-Time Object Detection (RTOB) model in order to investigate its performance and privacy awareness compared with a traditional centralized ML environment. Several object detection models were built using the YOLO framework and trained on a custom dataset for indoor object detection. Local tests were performed and the most optimal model was chosen by evaluating training and testing metrics; an NVIDIA Jetson Nano edge device was then used to train the model and integrate it into a Federated Learning environment using an open-source FL framework. Experiments were conducted along the way to choose the optimal YOLO model (YOLOv8) and the FL framework best suited to our study (FEDn). We observed a gradual enhancement in balancing the APC factors (Accuracy-Privacy-Communication) as we transitioned from basic local models to the YOLOv8 implementation within the FEDn system, both locally and in the SSC Cloud production environment. Although we encountered technical challenges deploying the YOLOv8-FEDn system on the SSC Cloud, preventing it from reaching a finalized state, our preliminary findings indicate its potential as a robust foundation for FL applications in RTOB models at the edge.
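The "local updates combined into a global model" step is, in its simplest form, federated averaging. The sketch below shows one generic FedAvg round with a toy model standing in for YOLOv8; it does not use the FEDn framework's actual API.

```python
# One FedAvg communication round: each client trains locally on private data,
# only the weights travel to the server, and the server averages them
# (weighted by local dataset size) into the next global model.
import copy
import torch

def fedavg(global_model, client_states, client_sizes):
    """Weighted average of client state_dicts into the global model."""
    total = sum(client_sizes)
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        avg[key] = sum(s[key] * (n / total)
                       for s, n in zip(client_states, client_sizes))
    global_model.load_state_dict(avg)
    return global_model

model = torch.nn.Linear(10, 2)                 # toy stand-in for YOLOv8
states = [{k: v + 0.01 * torch.randn_like(v)   # pretend 3 clients trained locally
           for k, v in model.state_dict().items()}
          for _ in range(3)]
model = fedavg(model, states, client_sizes=[100, 50, 150])
```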
260

Comparison and performance analysis of deep learning techniques for pedestrian detection in self-driving vehicles

Botta, Raahitya, Aditya, Aditya January 2023 (has links)
Background: Self-driving cars, also known as automated cars, are vehicles that can move without a driver or human involvement to control them. They employ numerous devices to forecast the car's navigation, and the car's path is determined from the output of these devices. Numerous methods are available to anticipate the path of self-driving cars, and pedestrian detection is critical for autonomous cars to avoid fatalities and accidents. Objectives: In this research, we focus on machine learning and deep learning algorithms for detecting pedestrians on the road, and on determining the most accurate algorithm for pedestrian detection in automated cars, performing a literature review to select the algorithms. Methods: The methodologies we use are a literature review and experimentation. The literature review helps us find efficient algorithms for pedestrian detection in terms of accuracy, computational complexity, and related criteria; after performing it, we selected the most efficient algorithms for evaluation and comparison. The second methodology, experimentation, evaluates these algorithms under different conditions and scenarios, letting us monitor the factors that affect them and measure performance quantitatively with metrics such as accuracy and loss. Results: Based on the literature study, we focused on the pedestrian detection deep learning models CNN, SSD, and RPN for this thesis project. After evaluating and comparing the algorithms using these performance metrics, the experiments demonstrated that RPN was the best-performing algorithm, with 95.63% accuracy and a loss of 0.0068, followed by SSD with 95.29% accuracy and a loss of 0.0142, and CNN with 70.84% accuracy and a loss of 0.0622. Conclusions: Among the three deep learning models evaluated for pedestrian identification (CNN, RPN, and SSD), RPN is the most efficient model with the best performance on the metrics assessed.
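The comparison protocol, measuring accuracy and loss under identical conditions, can be sketched as a common evaluation harness; the models below are toy stand-ins, not the CNN, SSD, and RPN detectors actually compared.

```python
# Evaluate several candidate models on the same held-out data and rank them
# on the two metrics reported above: accuracy and mean loss.
import torch
import torch.nn as nn

def evaluate(model, loader, loss_fn=nn.CrossEntropyLoss()):
    model.eval()
    correct, total, loss_sum = 0, 0, 0.0
    with torch.no_grad():
        for x, y in loader:
            logits = model(x)
            loss_sum += loss_fn(logits, y).item() * len(y)
            correct += (logits.argmax(dim=1) == y).sum().item()
            total += len(y)
    return correct / total, loss_sum / total

# Toy stand-ins compared under identical conditions:
models = {"CNN": nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 2)),
          "SSD": nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 2)),
          "RPN": nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 2))}
loader = [(torch.randn(16, 3, 32, 32), torch.randint(0, 2, (16,)))
          for _ in range(4)]                   # synthetic pedestrian/背景-free toy data
for name, m in models.items():
    acc, loss = evaluate(m, loader)
    print(f"{name}: accuracy={acc:.2%} loss={loss:.4f}")
```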
