11

Using Satellite Images and Deep Learning to Detect Water Hidden Under the Vegetation : A cross-modal knowledge distillation-based method to reduce manual annotation work / Användning Satellitbilder och Djupinlärning för att Upptäcka Vatten Gömt Under Vegetationen : En tvärmodal kunskapsdestillationsbaserad metod för att minska manuellt anteckningsarbete

Cristofoli, Ezio January 2024 (has links)
Detecting water under vegetation is critical to tracking the status of ecosystems like wetlands. Researchers use different methods to estimate water presence, avoiding costly on-site measurements. Optical satellite imagery allows the automatic delineation of water using the Normalised Difference Water Index (NDWI). Still, optical imagery is subject to visibility conditions and cannot detect water under vegetation, a typical situation for wetlands. Synthetic Aperture Radar (SAR) imagery works under all visibility conditions and can detect water under vegetation, but deep network algorithms are needed to segment water presence, and manual annotation work is required to train the deep models. This project uses DEEPAQUA, a cross-modal knowledge distillation method, to eliminate the manual annotation needed to extract water presence from SAR imagery with deep neural networks. In this method, a deep student model (e.g., UNET) is trained to segment water in SAR imagery, using the NDWI algorithm as a non-parametric, cross-modal teacher. The key prerequisite is that NDWI operates on optical imagery taken at the same location and time as the SAR imagery. Three deep architectures are tested in this project: UNET, SegNet, and UNET++, with the Otsu method as the baseline. Experiments on imagery of Swedish wetlands from 2020-2022 show that cross-modal distillation consistently achieved better segmentation performance across architectures than the baseline. Additionally, the UNET family of algorithms performed better than SegNet at the 95% confidence level. The UNET++ model achieved the highest Intersection over Union (IoU), but no statistical evidence emerged that UNET++ performs better than UNET at the 95% confidence level. In conclusion, this project shows that cross-modal knowledge distillation works well across architectures and removes tedious and expensive manual annotation hours when detecting water from SAR imagery. Further research could evaluate performance on other datasets and student architectures.
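A minimal sketch of the teacher side of this setup, assuming standard NDWI thresholding as the pseudo-label source (the function name, the threshold of 0.0, and the training-loop outline are illustrative assumptions, not DEEPAQUA's actual code):

```python
import numpy as np

def ndwi_pseudo_labels(green: np.ndarray, nir: np.ndarray,
                       threshold: float = 0.0) -> np.ndarray:
    """Compute NDWI from optical bands and threshold it into a binary water mask.

    NDWI = (Green - NIR) / (Green + NIR); pixels above the threshold are
    treated as water. The mask serves as the teacher's pseudo-label.
    """
    ndwi = (green - nir) / (green + nir + 1e-8)  # epsilon avoids division by zero
    return (ndwi > threshold).astype(np.uint8)

# Training-loop outline: the student (e.g., a UNET) sees only the SAR image,
# while the loss compares its prediction against the NDWI teacher mask derived
# from co-located, co-temporal optical imagery — no manual annotation needed.
# for sar_img, green, nir in dataset:
#     target = ndwi_pseudo_labels(green, nir)
#     pred = student_model(sar_img)
#     loss = binary_cross_entropy(pred, target)
```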
12

Machine Learning for Automatic Annotation and Recognition of Demographic Characteristics in Facial Images / Maskininlärning för Automatisk Annotering och Igenkänning av Demografiska Egenskaper hos Ansiktsbilder

Gustavsson Roth, Ludvig, Rimér Högberg, Camilla January 2024 (has links)
The recent increase in widespread use of facial recognition technologies has accelerated the use of demographic information extracted from facial features, yet it is accompanied by ethical concerns. It is therefore crucial to ensure that algorithms such as the face recognition algorithms employed in legal proceedings are equitable and thoroughly documented across diverse populations. Accurate classification of demographic traits is therefore essential for enabling a comprehensive understanding of other algorithms. This thesis explores how classical machine learning algorithms compare to deep-learning models in predicting sex, age, and skin color. The more compute-heavy deep-learning models significantly outperform their classical counterparts: the best deep models achieved an MCC of 0.99, 0.48, and 0.85 for sex, age, and skin color respectively, against 0.57, 0.22, and 0.54 at best for the classical models. Once the deep-learning models were established as superior, further methods such as semi-supervised learning, a multi-characteristic classifier, sex-specific age classifiers, and using tightly cropped facial images instead of upper-body images were employed to try to improve the deep-learning results. Throughout all deep-learning experiments, state-of-the-art vision transformer and convolutional neural network architectures were compared. While the two architectures performed remarkably alike, the convolutional neural network had a slight edge. The results further show that using cropped facial images generally improves model performance and that more specialized models achieve modest improvements over their less specialized counterparts. Semi-supervised learning showed potential to improve the models slightly further. The predictive performance achieved in this thesis indicates that the deep-learning models can predict demographic features with reliability close to, or surpassing, that of a human.
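For reference, the MCC (Matthews correlation coefficient) reported above can be computed with scikit-learn; the labels below are a toy example, not the thesis data:

```python
from sklearn.metrics import matthews_corrcoef

# Illustrative binary labels only; the thesis reports MCC of 0.99 (sex),
# 0.48 (age) and 0.85 (skin color) for its best deep-learning models.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 0, 1, 1, 0, 0, 1, 1]

# MCC ranges from -1 (total disagreement) through 0 (chance) to +1 (perfect),
# and stays informative under class imbalance, unlike plain accuracy.
print(matthews_corrcoef(y_true, y_pred))  # 0.5 for this toy example
```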
13

Event-Cap – Event Ranking and Transformer-based Video Captioning / Event-Cap – Event rankning och transformerbaserad video captioning

Cederqvist, Gabriel, Gustafsson, Henrik January 2024 (has links)
In the field of video surveillance, vast amounts of data are gathered each day. To identify what occurred during a recorded session, a human annotator has to go through the footage and annotate the different events. This is a tedious and expensive process that takes up a large amount of time. With the rise of machine learning, and in particular deep learning, the fields of both image and video captioning have seen large improvements. Contrastive Language-Image Pre-training can efficiently learn a multimodal space that merges the understanding of text and images, enabling visual features to be extracted and processed into text describing the visual content. This thesis presents a system for extracting and ranking important events from surveillance videos, as well as a way of automatically generating a description of each event. By utilizing the pre-trained models X-CLIP and GPT-2 to extract visual information from the videos and process it into text, a video captioning model was created that requires very little training. Additionally, a ranking system was implemented to extract important parts of a video, utilizing anomaly detection as well as polynomial regression. Captions were evaluated using the metrics BLEU, METEOR, ROUGE, and CIDEr, and the model receives scores comparable to other video captioning models. Captions were also evaluated by experts in the field of video surveillance, who rated them on accuracy, reaching up to 62.9%, and semantic quality, reaching 99.2%. Furthermore, the ranking system was evaluated by the experts, who agreed with it 78% of the time.
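The abstract does not detail the ranking design, but one way to combine anomaly detection with polynomial regression, as described, is to fit a trend over per-frame anomaly scores and rank frames by how far they rise above it. The sketch below is an assumption along those lines, not the thesis's actual implementation:

```python
import numpy as np

def rank_events(anomaly_scores: np.ndarray, degree: int = 4, top_k: int = 3):
    """Fit a polynomial trend to per-frame anomaly scores and rank frames by
    their positive residual — a rough proxy for 'important events'."""
    frames = np.arange(len(anomaly_scores))
    trend = np.polyval(np.polyfit(frames, anomaly_scores, degree), frames)
    residual = anomaly_scores - trend          # positive spikes = unusual activity
    return np.argsort(residual)[::-1][:top_k]  # frame indices, most anomalous first

scores = np.random.rand(300)   # stand-in for a detector's per-frame scores
print(rank_events(scores))
```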
14

Anotação automática de dados geográficos baseada em bancos de dados abertos e interligados. / Automatic annotation of spatial data based on open and interconnected databases.

HENRIQUES, Hamon Barros August 2015 (has links)
Recently, Spatial Data Infrastructures (SDI) have become popular as an important solution for easing the interoperability of geographic data offered by different organizations. An important challenge that must be overcome by such infrastructures is allowing their users to easily locate the available data and services. Presently, this task is implemented by means of catalog services. Although such services represent an important advance for the retrieval of geographic data, they still have serious limitations. Some of these limitations arise because catalog services resolve their queries based on information contained in their metadata records, which normally describe the characteristics of a service as a whole. In addition, many current catalogs solve queries with thematic restrictions based only on keywords, and have no formal means for describing the semantics of available resources. To resolve the lack of semantics, this dissertation presents a solution for the automatic semantic annotation of feature types and their attributes available in an SDI. With this, search engines that use ontologies as input for resolving their queries will find the geographic data that are semantically related to a particular searched topic. This research also describes an evaluation of the performance of the proposed solution on a sample of Web Feature Service (WFS) services.
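A hedged sketch of the kind of matching such an annotator might perform, comparing WFS layer titles against ontology labels; the ontology URIs, labels, and similarity threshold here are hypothetical, and the dissertation's actual method may differ:

```python
import difflib

# Hypothetical ontology: concept URI -> labels. A real system would load
# these from an OWL/RDF ontology rather than a hard-coded dict.
ontology = {
    "http://example.org/onto#River": ["river", "rio", "stream"],
    "http://example.org/onto#Road":  ["road", "rodovia", "highway"],
    "http://example.org/onto#City":  ["city", "municipio", "urban area"],
}

def annotate_layer(layer_title: str, min_ratio: float = 0.8):
    """Return ontology concepts whose labels closely match the layer title."""
    matches = []
    for uri, labels in ontology.items():
        score = max(difflib.SequenceMatcher(None, layer_title.lower(), l).ratio()
                    for l in labels)
        if score >= min_ratio:
            matches.append((uri, score))
    return sorted(matches, key=lambda m: -m[1])

# A layer title as it might appear in a WFS GetCapabilities response:
print(annotate_layer("Rios"))  # hypothetical layer name
```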
15

Dataset Generation in a Simulated Environment Using Real Flight Data for Reliable Runway Detection Capabilities

Tagebrand, Emil, Gustafsson Ek, Emil January 2021 (has links)
Implementing object detection methods for runway detection during landing approaches is limited in the safety-critical aircraft domain. This limitation is due to the difficulty of verifying the design and of understanding how the object detection behaves during operation. During operation, object detection needs to consider the aircraft's position, environmental factors, different runways, and aircraft attitudes. Training such an object detection model requires a comprehensive dataset that covers the features mentioned above, and each feature's impact on detection capabilities needs to be analysed to ensure the correct distribution of images in the dataset. Gathering real images for all these scenarios would be costly, given the aviation industry's safety standards. Synthetic data can be used to limit the cost and time required to create a dataset in which all features occur. By generating datasets in a simulated environment, these features can be applied to the dataset directly. The features can also be implemented separately in different datasets and compared to each other to analyse their impact on object detection capabilities. Utilising this method for the features mentioned above yielded the following results. For object detection to cover most landing cases and different runways, the dataset needs to replicate real flight data and generate additional extreme landing cases. The dataset also needs to consider landings at different altitudes, which can differ between airports. Environmental conditions such as clouds and time of day reduce detection capabilities far from the runway, while attitude and runway appearance reduce them at close range. Runway appearance also affected detection at long ranges, but only for darker runways.
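One plausible way to realise such a generation scheme is to sample scenario parameters, widened for extreme landing cases, and feed each configuration to the simulator; all names and ranges below are illustrative assumptions, not the thesis's values:

```python
import random

def sample_landing_scenario(extreme: bool = False) -> dict:
    """Draw one synthetic landing-approach configuration. A real pipeline
    would derive the nominal distribution from recorded flight data and
    then widen it to cover extreme cases."""
    pitch_range = (-8, 8) if extreme else (-4, 2)   # degrees, illustrative
    return {
        "altitude_ft":  random.uniform(200, 3000),  # varies between airports
        "pitch_deg":    random.uniform(*pitch_range),
        "roll_deg":     random.uniform(-15, 15) if extreme else random.uniform(-5, 5),
        "time_of_day":  random.choice(["dawn", "noon", "dusk", "night"]),
        "cloud_cover":  random.uniform(0.0, 1.0),
        "runway_tone":  random.choice(["light", "dark"]),  # darker runways proved harder
    }

# 1000 scenarios with roughly 10% extreme landing cases mixed in:
scenarios = [sample_landing_scenario(extreme=i % 10 == 0) for i in range(1000)]
```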
