161
Multi-Modal Deep Learning with Sentinel-1 and Sentinel-2 Data for Urban Mapping and Change Detection. Hafner, Sebastian. January 2022
Driven by the rapid growth in population, urbanization is progressing at an unprecedented rate in many places around the world. Earth observation has become an invaluable tool to monitor urbanization on a global scale by either mapping the extent of cities or detecting newly constructed urban areas within and around cities. In particular, the Sentinel-1 (S1) Synthetic Aperture Radar (SAR) and Sentinel-2 (S2) MultiSpectral Instrument (MSI) missions offer new opportunities for urban mapping and urban Change Detection (CD) due to the capability of systematically acquiring wide-swath high-resolution images with frequent revisits globally. Current trends in both urban mapping and urban CD have shifted from employing traditional machine learning methods to Deep Learning (DL) models, specifically Convolutional Neural Networks (CNNs). Recent urban mapping efforts achieved promising results by training CNNs on available built-up data using S2 images. Likewise, DL models have been applied to urban CD problems using S2 data with promising results. However, the quality of current methods strongly depends on the availability of local reference data for supervised training, especially since CNNs applied to unseen areas often produce unsatisfactory results due to their insufficient across-region generalization ability. Since multitemporal reference data are even more difficult to obtain, unsupervised learning was suggested for urban CD. While unsupervised models may perform more consistently across different regions, they often perform considerably worse than their supervised counterparts. To alleviate these shortcomings, it is desirable to leverage Semi-Supervised Learning (SSL) that exploits unlabeled data to improve upon supervised learning, especially because satellite data is plentiful. 
Furthermore, the integration of SAR data into the current optical frameworks (i.e., data fusion) has the potential to produce models with better generalization ability, because the representation of urban areas in SAR images is largely invariant across cities, while spectral signatures vary greatly. In this thesis, a novel Domain Adaptation (DA) approach using SSL is first presented. The DA approach jointly exploits Multi-Modal (MM) S1 SAR and S2 MSI data to improve across-region generalization for built-up area mapping. Specifically, two identical sub-networks are incorporated into the proposed model to perform built-up area segmentation from SAR and optical images separately. Assuming that consistent built-up area segmentation should be obtained across data modalities, an unsupervised loss for unlabeled data was designed that penalizes inconsistent segmentation from the two sub-networks. Therefore, the use of complementary data modalities as real-world perturbations for Consistency Regularization (CR) is proposed. For the final prediction, the model takes both data modalities into account. Experiments conducted on a test set comprising sixty representative sites across the world showed that the proposed DA approach achieves strong improvements (F1 score 0.694) over supervised learning from S1 SAR data (F1 score 0.574), S2 MSI data (F1 score 0.580) and their input-level fusion (F1 score 0.651). The comparison with two state-of-the-art global human settlement maps, namely GHS-S2 and WSF2019, showed that our model is capable of producing built-up area maps with comparable or even better quality. For urban CD, a new network architecture for the fusion of SAR and optical data is proposed. Specifically, a dual-stream concept was introduced to process different data modalities separately, before combining extracted features at a later decision stage. The individual streams are based on the U-Net architecture.
The proposed strategy outperformed other U-Net-based approaches in combination with uni-modal data and MM data with feature-level fusion. Furthermore, our approach achieved state-of-the-art performance on a popular urban CD dataset (F1 score 0.600). In addition, a new network architecture is proposed to adapt Multi-Modal Consistency Regularization (MMCR) for urban CD. Using bi-temporal S1 SAR and S2 MSI image pairs as input, the MM Siamese Difference (Siam-Diff) Dual-Task (DT) network not only predicts changes using a difference decoder, but also segments buildings in each image with a semantic decoder. The proposed network is trained in a semi-supervised fashion using the underlying idea of MMCR, namely that building segmentation across sensor modalities should be consistent, to learn more robust features. The proposed method was tested on an urban CD task using the 60 sites of the SpaceNet7 dataset. A domain gap was introduced by only using labels for sites located in the Western World, where geospatial data are typically denser than in the Global South. MMCR achieved an average F1 score of 0.444 when applied to sites outside of the source domain, a considerable improvement over several supervised models (F1 scores between 0.107 and 0.424). The combined findings of this thesis contribute to the mapping and monitoring of cities on a global scale, which is crucial to support sustainable planning and urban SDG indicator monitoring.
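The unsupervised consistency term described in this abstract, which penalizes disagreement between the SAR and optical sub-networks on unlabeled data, can be sketched in a few lines. This is a minimal NumPy illustration rather than the thesis implementation; the function names, the squared-error distance, and the averaging fusion are assumptions made for the example.

```python
import numpy as np

def consistency_loss(p_sar, p_opt):
    """Unsupervised consistency term: mean squared disagreement between
    the SAR-stream and optical-stream per-pixel built-up probabilities.
    (A sketch of the idea; the actual distance used may differ.)"""
    p_sar = np.asarray(p_sar, dtype=float)
    p_opt = np.asarray(p_opt, dtype=float)
    return float(np.mean((p_sar - p_opt) ** 2))

def fused_prediction(p_sar, p_opt):
    """The final prediction takes both modalities into account; averaging
    the two streams is one simple decision-level fusion (an assumption here)."""
    return (np.asarray(p_sar, dtype=float) + np.asarray(p_opt, dtype=float)) / 2.0

# Identical segmentations incur no penalty; disagreement is penalized.
p = np.array([[0.9, 0.1], [0.2, 0.8]])
q = np.array([[0.1, 0.9], [0.8, 0.2]])
loss_same = consistency_loss(p, p)
loss_diff = consistency_loss(p, q)
```

Minimizing this term on unlabeled scenes pushes the two sub-networks toward agreement, which is how the complementary modality acts as a "real-world perturbation" for consistency regularization.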
162
Öppen källkodslösning för datorseende: Skapande av testmiljö och utvärdering av OpenCV / Open source solution for computer vision: Creating a test environment and evaluating OpenCV. Lokkin, Caj; Bragd, Sebastian. January 2021
Computer vision is a subject in computer science that has evolved over many years, and its functionality is more accessible than ever. Among other things, it can be used for non-contact measurement to locate, verify, and detect defects of objects. The question is whether it is possible to create an open source solution for computer vision equivalent to a closed source solution. In other words, can a company using a closed source commercial program instead use a free open source code library and produce equivalent results? In this report we describe the design of a prototype that uses the open source library for computer vision, OpenCV. In order to evaluate our prototype, we let it identify blocks in a tower in an image in a series of test cases. We compare the results from the prototype with the results obtained with a commercial solution, created with the program "Vision Builder for Automated Inspection". Results from the cases tested show that OpenCV seems to have performance and functionality equivalent to the commercial solution, but with some limitations. As OpenCV's focus is on programmatic development of computer vision solutions, the quality of the resulting solution depends on the user's skills in programming and program design. Based on the tests that we have performed, we believe that OpenCV can replace a licensed commercial program, but the license cost may come to be replaced by other development costs.
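As a sketch of the kind of inspection step the prototype performs, the code below binarizes a grayscale image and returns one bounding box per connected bright region. In OpenCV this corresponds roughly to `cv2.threshold` followed by `cv2.findContours` and `cv2.boundingRect`; here it is spelled out with NumPy only so the logic is visible. The function name and toy image are invented for illustration.

```python
import numpy as np
from collections import deque

def find_blocks(gray, thresh=128):
    """Binarize a grayscale image and return the bounding box
    (row0, col0, row1, col1) of each 4-connected foreground component."""
    binary = np.asarray(gray) >= thresh
    seen = np.zeros_like(binary, dtype=bool)
    rows, cols = binary.shape
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if binary[r, c] and not seen[r, c]:
                # Breadth-first flood fill of one component.
                q = deque([(r, c)])
                seen[r, c] = True
                r0 = r1 = r
                c0 = c1 = c
                while q:
                    y, x = q.popleft()
                    r0, r1 = min(r0, y), max(r1, y)
                    c0, c1 = min(c0, x), max(c1, x)
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny, nx] and not seen[ny, nx]):
                            seen[ny, nx] = True
                            q.append((ny, nx))
                boxes.append((r0, c0, r1, c1))
    return boxes

# A toy "tower" with two bright blocks on a dark background.
img = np.zeros((10, 6), dtype=np.uint8)
img[1:4, 1:5] = 200   # upper block
img[6:9, 1:5] = 200   # lower block
blocks = find_blocks(img)
```

The same result in OpenCV takes a handful of library calls; the report's point stands that getting there requires programming skill rather than a configured GUI pipeline.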
163
Implementering av objektföljning med hjälp av maskinseende / Implementation of object tracking using machine vision. Rönnholm, Robin. January 2020
Object tracking can be used in everything from major sporting events, smart cars, and camera surveillance to combating malaria-infected mosquitoes. The technique works by performing computer analysis on captured images, which in most cases have been taken by a digital camera. During the analysis, an assessment is made of whether the object to be tracked is present in the image and where in the image it is located. This gives the computer a way to see and identify objects, so-called machine vision. This study investigates object identification when color identification and edge detection, together with the Hough circle transform, are implemented on a compact single-board computer, a Raspberry Pi. The single-board computer also sends the object's center position to an Arduino microcontroller, which is outside the scope of this paper but which aims a laser pointer at the center position on a target. In addition to implementing the object tracking, a platform is constructed for Syntronic AB to use for demonstration purposes. The platform consists of three targets with photoresistors and a digital camera. The digital camera is, together with a laser pointer, mounted on a small servo-controlled platform that can pan and tilt.
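The color-identification step and the center position that gets sent to the Arduino can be illustrated with a small NumPy sketch: mask pixels whose HSV values fall in a target range, then take the centroid of the mask. In OpenCV the masking would be `cv2.inRange` and circle fitting would use `cv2.HoughCircles`; the function name, thresholds, and toy image below are assumptions for the example.

```python
import numpy as np

def color_centroid(hsv, h_lo, h_hi, s_min=50, v_min=50):
    """Mask pixels whose hue lies in [h_lo, h_hi] with sufficient
    saturation and value, and return the mask centroid (row, col),
    or None if no pixel matches. This centroid is the kind of center
    position the single-board computer would forward to the Arduino."""
    hsv = np.asarray(hsv)
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    mask = (h >= h_lo) & (h <= h_hi) & (s >= s_min) & (v >= v_min)
    ys, xs = np.nonzero(mask)
    if len(ys) == 0:
        return None
    return float(ys.mean()), float(xs.mean())

# A saturated patch around hue 30 on an otherwise black image.
hsv = np.zeros((8, 8, 3), dtype=np.uint8)
hsv[2:5, 3:6] = (30, 200, 200)
center = color_centroid(hsv, 25, 35)
```

In practice the centroid (or the Hough-fitted circle center) would be serialized over UART to the microcontroller, which maps it to servo pan/tilt angles.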
164
Evaluating DCNN architectures for multinomial area classification using satellite data / Utvärdering av DCNN-arkitekturer för multinomial arealklassificering med hjälp av satellitdata. Wojtulewicz, Karol; Agbrink, Viktor. January 2020
The most common approach to analysing satellite imagery is building or object segmentation, which expects an algorithm to find and segment objects with specific boundaries that are present in the satellite imagery. The company Vricon takes satellite imagery analysis further with the goal of reproducing the entire world into a 3D mesh. This 3D reconstruction is performed by a set of complex algorithms excelling in different object reconstructions, which need sufficient labeling in the original 2D satellite imagery to ensure valid transformations. Vricon believes that the labeling of areas can be used to further improve the algorithm selection process. Therefore, the company wants to investigate if multinomial large-area classification can be performed successfully using the satellite image data available at the company. To enable this type of classification, the company's gold-standard dataset containing labeled objects such as individual buildings, single trees, and roads, among others, has been transformed into a large-area gold-standard dataset in an unsupervised manner. This dataset was later used to evaluate large-area classification using several state-of-the-art Deep Convolutional Neural Network (DCNN) semantic segmentation architectures on both RGB as well as RGB and Digital Surface Model (DSM) height data. The results yield close to 63% mIoU and close to 80% pixel accuracy on validation data without using the DSM height data in the process. This thesis additionally contributes a novel approach for large-area gold-standard creation from existing object-labeled datasets.
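The two metrics reported above, mean Intersection over Union (mIoU) and pixel accuracy, are standard for semantic segmentation and are easy to compute from a confusion matrix. The sketch below, with invented function names and a toy example, shows the computation; it is not code from the thesis.

```python
import numpy as np

def confusion(pred, gt, n_classes):
    """n_classes x n_classes confusion matrix (rows: ground truth,
    cols: prediction) from flat label arrays."""
    idx = n_classes * gt.astype(int) + pred.astype(int)
    return np.bincount(idx, minlength=n_classes * n_classes).reshape(n_classes, n_classes)

def miou_and_pixel_acc(pred, gt, n_classes):
    """mIoU = mean over classes of TP / (TP + FP + FN);
    pixel accuracy = total TP / total pixels."""
    cm = confusion(pred.ravel(), gt.ravel(), n_classes)
    tp = np.diag(cm).astype(float)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    iou = tp / np.maximum(union, 1)   # guard against empty classes
    return float(iou.mean()), float(tp.sum() / cm.sum())

# Toy 6-pixel example with 3 classes.
gt   = np.array([0, 0, 1, 1, 2, 2])
pred = np.array([0, 1, 1, 1, 2, 0])
miou, acc = miou_and_pixel_acc(pred, gt, 3)
```

Note that mIoU weights every class equally, so it is typically lower than pixel accuracy on imbalanced land-cover data, consistent with the 63% vs. 80% figures reported.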
165
Detect obstacles for forest machinery from laser scanned data. Söderström, Simon. January 2020
No description available.
166
Improved Data Association for Multi-Pedestrian Tracking Using Image Information. Flodin, Frida. January 2020
Multi-pedestrian tracking (MPT) is the task of localizing and following the trajectory of pedestrians in a sequence. An MPT algorithm is an important part of preventing pedestrian-vehicle collisions in Automated Driving (AD) and Advanced Driving Assistance Systems (ADAS). It has benefited greatly from the advances in computer vision and machine learning in the last decades. Using a pedestrian detector, the tracking consists of associating the detections between frames and maintaining pedestrian identities throughout the sequence. This can be a challenging task due to occlusions, missed detections and complex scenes. The number of pedestrians is unknown, and it varies with time. Finding new methods for improving MPT is an active research field and there are many approaches found in the literature. This work focuses on improving the detection-to-track association, the data association, with the help of extracted color features for each pedestrian. Utilizing the recent improvements in object detection, this work shows that classical color features are still relevant in pedestrian tracking for real-time applications with limited computational resources. The appearance is not only used in the data association but also integrated into a newly proposed method to avoid tracking errors due to missed detections. The results show that even with simple models the color appearance can be used to improve the tracking results. Evaluation on the commonly used Multi-Object Tracking benchmark shows an improvement in the Multi-Object Tracking Accuracy and identity switches, while keeping other measures essentially unchanged.
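Detection-to-track data association with color features can be sketched as: compute an appearance cost between every track and every detection, then match greedily under a cost gate. The histogram distance, the greedy strategy, and all names below are assumptions for illustration; a full tracker would also gate on predicted position and motion, and might use optimal (Hungarian) assignment instead of greedy matching.

```python
import numpy as np

def hist_distance(h1, h2):
    """Bhattacharyya distance between two normalized color histograms:
    0 for identical distributions, 1 for non-overlapping ones."""
    bc = float(np.sum(np.sqrt(np.asarray(h1) * np.asarray(h2))))
    return float(np.sqrt(max(0.0, 1.0 - bc)))

def associate(track_feats, det_feats, max_dist=0.5):
    """Greedy detection-to-track association on appearance cost alone.
    Returns (track_index, detection_index) pairs; costs above max_dist
    are left unmatched (the detection may start a new track)."""
    cost = [[hist_distance(t, d) for d in det_feats] for t in track_feats]
    order = sorted((cost[t][d], t, d)
                   for t in range(len(track_feats))
                   for d in range(len(det_feats)))
    pairs, used_t, used_d = [], set(), set()
    for c, t, d in order:
        if c <= max_dist and t not in used_t and d not in used_d:
            pairs.append((t, d))
            used_t.add(t)
            used_d.add(d)
    return pairs

# Two tracks (red-ish, blue-ish) and two detections arriving in swapped order.
red  = np.array([1.0, 0.0, 0.0])
blue = np.array([0.0, 0.0, 1.0])
matches = associate([red, blue], [blue, red])
```

Even this crude appearance cue resolves the identity swap in the example, which is the kind of identity-switch reduction the evaluation reports.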
167
Geometry measurements using a smartphone. Wiklund, Joakim. January 2020
Quality assurance, an important part of many industrial processes, involves different methods of determining the quality of a product. One of these methods is deflectometry, a method that uses a display to show patterns and a camera to capture the reflection of these patterns off the specular object being measured. The goal and end result of these measurements is a height profile of the object's surface. While there are many ways of performing deflectometry using different types of patterns and setups, this project focuses on using only a single smartphone to capture all data required for measurements. This involves showing a sequence of patterns on the smartphone's display and using its front-facing camera to capture the reflection off the specular object. The patterns chosen for this purpose are binary checkerboard patterns that are simple enough for the camera to capture reliably and efficient enough to perform the calculations in a reasonable time frame. Using this method, the ability of a smartphone to perform deflectometric measurements was evaluated by testing on several different types of mirrors as well as on a real car body. The method can produce results that closely replicate the real-world object and can calculate quantities that are used to measure the quality of car assembly, often with an accuracy of well under 1 mm. The method can handle a large variety of reflective objects at varying distances and of varying form, while only requiring some known parameters of the smartphone used in testing.
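The core of any coded-pattern deflectometry sequence is establishing, per camera pixel, which display position is being reflected. The thesis uses binary checkerboard patterns; the sketch below illustrates the same encode/decode idea with the closely related binary stripe (Gray-code) patterns, which are a common choice in structured-light work. The function names and pattern choice are assumptions for the example, not the thesis's exact scheme.

```python
import numpy as np

def gray_code_patterns(width, n_bits):
    """One binary stripe pattern per bit: pattern[b][x] is bit b of the
    Gray code of display column x. Shown in sequence on the display,
    the reflected stripes encode which column each camera pixel sees."""
    x = np.arange(width)
    g = x ^ (x >> 1)                 # binary-reflected Gray code
    return [(g >> b) & 1 for b in range(n_bits)]

def decode_column(bits):
    """Invert the encoding: given the observed bit per pattern for one
    camera pixel, recover the display column index."""
    g = 0
    for b, v in enumerate(bits):
        g |= int(v) << b
    x = 0
    while g:                         # Gray code back to binary
        x ^= g
        g >>= 1
    return x

pats = gray_code_patterns(16, 4)
# Round-trip every display column through the pattern stack.
ok = all(decode_column([p[x] for p in pats]) == x for x in range(16))
```

Once every pixel is decoded, the pixel-to-display correspondences give surface normals via the law of reflection, which are then integrated into the height profile.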
168
Creating a self-driving terrain vehicle in a simulated environment. Markgren, Jonas. January 2020
Outside of the city environment, there are many unstructured and rough environments that are challenging in vehicle navigation tasks. In these environments, vehicle vibrations caused by rough terrain can be harmful for humans. In addition, a human operator cannot work around the clock. A promising solution is to use artificial intelligence to replace human operators. I test this by using the artificial intelligence technique known as reinforcement learning, with the algorithm Proximal Policy Optimization, to perform some basic locomotion tasks in a simulated environment with a simple terrain vehicle. The terrain vehicle consists of two chassis, each with two wheels attached, connected to each other with an articulation joint that can rotate to turn the vehicle. I show that a trained model can learn to operate the terrain vehicle and complete basic tasks, such as finding and following a path while avoiding obstacles. I tested robustness by evaluating performance on sloped terrains with a model trained to operate on flat ground. The results from the tests with different slopes show that, for most environments, the trained model could handle slopes up to around 7.5-10 degrees without much issue, even though it had no way of detecting the slope. This tells us that the models can perform their tasks quite well even when disturbances are introduced, as long as these disturbances do not require them to significantly change their behaviors.
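Proximal Policy Optimization, the algorithm named above, limits how far each update can move the policy by clipping the probability ratio between the new and old policy. The clipped surrogate objective can be computed in a few lines; this NumPy sketch shows only that objective, not a full training loop, and the function name is an assumption.

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Per-sample PPO clipped surrogate, averaged over the batch:
    mean( min( r*A, clip(r, 1-eps, 1+eps)*A ) ),
    where r is pi_new(a|s)/pi_old(a|s) and A the advantage estimate."""
    ratio = np.asarray(ratio, dtype=float)
    advantage = np.asarray(advantage, dtype=float)
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return float(np.mean(np.minimum(unclipped, clipped)))

# With a positive advantage, pushing the ratio past 1+eps gains nothing:
a = ppo_clip_objective([1.5], [1.0])
b = ppo_clip_objective([1.2], [1.0])
# With a negative advantage, the min keeps the more pessimistic value:
c = ppo_clip_objective([0.5], [-1.0])
```

The clipping is what makes PPO tolerant of noisy, rough-terrain dynamics: no single batch of experience can drag the steering policy far from its previous behavior.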
169
Detection and Tracking in Thermal Infrared Imagery. Berg, Amanda. January 2016
Thermal cameras have historically been of interest mainly for military applications. Increasing image quality and resolution combined with decreasing price and size during recent years have, however, opened up new application areas. They are now widely used for civilian applications, e.g., within industry, to search for missing persons, in automotive safety, as well as for medical applications. Thermal cameras are useful as soon as it is possible to measure a temperature difference. Compared to cameras operating in the visual spectrum, they are advantageous due to their ability to see in total darkness, robustness to illumination variations, and less intrusion on privacy. This thesis addresses the problem of detection and tracking in thermal infrared imagery. Visual detection and tracking of objects in video are research areas that have been and currently are subject to extensive research. Indications of their popularity are recent benchmarks such as the annual Visual Object Tracking (VOT) challenges, the Object Tracking Benchmarks, the series of workshops on Performance Evaluation of Tracking and Surveillance (PETS), and the workshops on Change Detection. Benchmark results indicate that detection and tracking are still challenging problems. A common belief is that detection and tracking in thermal infrared imagery is identical to detection and tracking in grayscale visual imagery. This thesis argues that this assumption is not true. The characteristics of thermal infrared radiation and imagery pose certain challenges to image analysis algorithms. The thesis describes these characteristics and challenges as well as presents evaluation results confirming the hypothesis. Detection and tracking are often treated as two separate problems. However, some tracking methods, e.g., template-based tracking methods, base their tracking on repeated specific detections. They learn a model of the object that is adaptively updated.
That is, detection and tracking are performed jointly. The thesis includes a template-based tracking method designed specifically for thermal infrared imagery, describes a thermal infrared dataset for evaluation of template-based tracking methods, and provides an overview of the first challenge on short-term, single-object tracking in thermal infrared video. Finally, two applications employing detection and tracking methods are presented.
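The classic baseline behind template-based tracking is to slide a learned template over each frame and pick the location with the highest normalized cross-correlation; the thesis's tracker is more elaborate, so the sketch below, with invented names and a toy single-channel "thermal" image, illustrates only this baseline idea.

```python
import numpy as np

def ncc_match(image, template):
    """Slide the template over the image and return the (row, col) of the
    highest zero-mean normalized cross-correlation score, plus the score.
    Zero-mean normalization gives some robustness to offsets in absolute
    intensity, which matters for uncalibrated thermal data."""
    ih, iw = image.shape
    th, tw = template.shape
    t = template - template.mean()
    tn = np.sqrt((t ** 2).sum())
    best, best_pos = -np.inf, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            w = image[r:r + th, c:c + tw]
            wz = w - w.mean()
            denom = np.sqrt((wz ** 2).sum()) * tn
            if denom == 0:
                continue      # flat window: correlation undefined
            score = float((wz * t).sum() / denom)
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos, best

# A warm 3x3 "object" signature embedded in a cold background.
image = np.zeros((12, 12))
image[4:7, 5:8] = np.arange(9, dtype=float).reshape(3, 3)
template = np.arange(9, dtype=float).reshape(3, 3)
pos, score = ncc_match(image, template)
```

Adaptive template trackers update `template` from the matched window each frame, which is the joint detection-and-tracking loop described above.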
170
On Pose Estimation in Room-Scaled Environments. Nyqvist, Hanna E. January 2016
Pose (position and orientation) tracking in room-scaled environments is an enabling technique for many applications. Today, Virtual Reality (VR) and Augmented Reality (AR) are two examples of such applications, receiving high interest both from the public and the research community. Accurate pose tracking of the VR or AR equipment, often a camera or a headset, or of different body parts, is crucial to trick the human brain and make the virtual experience realistic. Pose tracking in room-scaled environments is also needed for reference tracking and metrology. This thesis focuses on an application to metrology. In this application, photometric models of a photo studio are needed to perform realistic scene reconstruction and image synthesis. Pose tracking of a dedicated sensor enables creation of these photometric models. The demands on the tracking system used in this application are high. It must be able to provide sub-centimeter and sub-degree accuracy and at the same time be easy to move and install in new photo studios. The focus of this thesis is to investigate and develop methods for a pose tracking system that satisfies the requirements of the intended metrology application. The Bayesian filtering framework is suggested because of its firm theoretical foundation in informatics and because it enables straightforward fusion of measurements from several sensors. Sensor fusion is in this thesis seen as a way to exploit complementary characteristics of different sensors to increase tracking accuracy and robustness. Four different types of measurements are considered: inertial measurements, images from a camera, range (time-of-flight) measurements from Ultra-Wideband (UWB) radio signals, and range and velocity measurements from echoes of transmitted acoustic signals. A simulation study and a study of the Cramér-Rao lower filtering bound (CRLB) show that an inertial-camera system has the potential to reach the required tracking accuracy.
It is however assumed that known fiducial markers, which can be detected and recognized in images, are deployed in the environment. The study shows that many markers are required. This makes the solution more of a stationary solution, and the mobility requirement is not fulfilled. A Simultaneous Localization and Mapping (SLAM) solution, where naturally occurring features are used instead of known markers, is suggested to solve this problem. Evaluation using real data shows that the provided inertial-camera SLAM filter suffers from drift but that support from UWB range measurements eliminates this drift. The SLAM solution is then only dependent on knowing the position of very few stationary UWB transmitters, compared to a large number of known fiducial markers. As a last step, to increase the accuracy of the SLAM filter, it is investigated if and how range measurements can be complemented with velocity measurements obtained as a result of the Doppler effect. In particular, focus is put on analyzing the correlation between the range and velocity measurements and the implications this correlation has for filtering. The investigation is done in a theoretical study of reflected known signals (compare with radar and sonar), where the CRLB is used as an analysis tool. The theory is validated on real data from acoustic echoes in an indoor environment.
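The Bayesian filtering framework named above reduces, in its simplest linear-Gaussian form, to the Kalman filter: predict with a motion model, then correct with each measurement. The 1-D constant-velocity sketch below is a minimal stand-in for the thesis's filters, which fuse inertial, camera, UWB-range, and acoustic measurements in 6-DoF; all names and noise values are assumptions for the example.

```python
import numpy as np

def kalman_1d(zs, dt=0.1, q=0.01, r=0.25):
    """1-D constant-velocity Kalman filter over a list of position
    measurements zs. State x = [position, velocity]."""
    F = np.array([[1.0, dt], [0.0, 1.0]])   # motion model (predict step)
    H = np.array([[1.0, 0.0]])              # we measure position only
    Q = q * np.eye(2)                       # process noise covariance
    R = np.array([[r]])                     # measurement noise covariance
    x = np.zeros((2, 1))
    P = np.eye(2)
    estimates = []
    for z in zs:
        # Predict with the motion model.
        x = F @ x
        P = F @ P @ F.T + Q
        # Correct with the measurement.
        y = np.array([[z]]) - H @ x         # innovation
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
        x = x + K @ y
        P = (np.eye(2) - K @ H) @ P
        estimates.append(float(x[0, 0]))
    return estimates

# Noise-free measurements of a target moving at 1 unit/s (0.1 per step).
true = [0.1 * k for k in range(1, 51)]
est = kalman_1d(true)
```

Adding further sensors to such a filter amounts to adding further correction steps with their own `H` and `R`, which is the "straightforward fusion" the framework provides.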
|