Global ETD Search

101	Towards visual urban scene understanding for autonomous vehicle path tracking using GPS positioning data. / Vers l'interprétation de scènes urbaines pour le suivi de trajectoires pour véhicule autonome en utilisant les positions GPS. Gamez Serna, Citlalli 29 April 2019 (has links) Cette thèse de doctorat s’intéresse au suivi de trajectoire basé sur la perception visuelle et la localisation en milieu urbain. L'approche proposée comprend deux systèmes. Le premier concerne la perception de l'environnement. Cette tâche est effectuée en utilisant des techniques d'apprentissage profond pour extraire automatiquement les caractéristiques visuelles 2D et utiliser ces dernières pour apprendre à distinguer les différents objets dans les scénarios de conduite. Trois techniques d'apprentissage profond sont adoptées : la segmentation sémantique pour assigner chaque pixel d’une image à une classe, la segmentation d'instance pour identifier les instances séparées de la même classe et la classification d'image pour reconnaître davantage les étiquettes spécifiques des instances. Ici, notre système considère 15 classes d'objets et reconnaît les panneaux de signalisation. Le deuxième système fait référence au suivi de chemin numérisé. Dans un premier temps, le véhicule équipé enregistre l'itinéraire avec un système de vision stéréo et un récepteur GPS (étape d'apprentissage ou numérisation du chemin). Ensuite, le système proposé analyse hors ligne la trajectoire GPS et identifie exactement les emplacements des courbes dangereuses (brusques) et les limitations de vitesse via les données visuelles. Enfin, une fois que le véhicule est capable de se localiser lui-même durant la phase de suivi de chemin, le module de contrôle du véhicule, piloté avec notre algorithme de négociation de vitesse, prend en compte les informations extraites et calcule la vitesse idéale à exécuter. 
Grâce aux résultats expérimentaux des deux systèmes, nous prouvons que le premier est capable de détecter et de reconnaître précisément les objets d'intérêt dans les scénarios urbains, tandis que le suivi de trajectoire réduit significativement les erreurs latérales entre le trajet appris et le trajet parcouru. Nous soutenons que la fusion des deux systèmes améliorera le suivi de chemin pour prévenir les accidents ou assurer la conduite autonome. / This PhD thesis focuses on developing a path tracking approach based on visual perception and localization in urban environments. The proposed approach comprises two systems. The first one concerns environment perception. This task is carried out using deep learning techniques to automatically extract 2D visual features and use them to learn to distinguish the different objects in driving scenarios. Three deep learning techniques are adopted: semantic segmentation to assign each image pixel to a class, instance segmentation to identify separate instances of the same class, and image classification to further recognize the specific labels of the instances. Here our system segments 15 object classes and performs traffic sign recognition. The second system refers to path tracking. In order to follow a path, the equipped vehicle first travels and records the route with a stereo vision system and a GPS receiver (learning step). The proposed system then analyses the GPS path off-line and identifies the exact locations of dangerous (sharp) curves and speed limits. Later, once the vehicle is able to localize itself, the vehicle control module, together with our speed negotiation algorithm, takes into account the extracted information and computes the ideal speed to execute. 
Through experimental results of both systems, we show that the first one is capable of precisely detecting and recognizing objects of interest in urban scenarios, while the path tracking one significantly reduces the lateral errors between the learned and traveled paths. We argue that the fusion of both systems will improve the tracking approach for preventing accidents or implementing autonomous driving. Conduite autonome Perception de l'environnement urbain Panneaux de signalisation CNN Suivi de trajectoire Autonomous driving Urban environment perception Traffic signs CNN Path tracking 004 620
102	Semantic Segmentation : Using Convolutional Neural Networks and Sparse dictionaries Andersson, Viktor January 2017 (has links) The two main bottlenecks when using deep neural networks are data dependency and training time. This thesis proposes a novel method for weight initialization of the convolutional layers in a convolutional neural network, based on sparse dictionaries. A sparse dictionary optimized on domain-specific data can be seen as a set of intelligent feature-extracting filters. This thesis investigates the effect of using such filters as kernels in the convolutional layers of the neural network. How do they affect the training time and final performance? The dataset used here is the Cityscapes dataset, which is a library of 25000 labeled road scene images. The sparse dictionary was acquired using the K-SVD method. The filters were added to two different networks, one much deeper than the other, whose performance was tested individually. Results are presented for both networks. They show that filter initialization is an important aspect which should be taken into consideration when training deep networks for semantic segmentation. convolution neural network sparse dictionaries cnn computer vision machine learning road scene artificial intelligence neuronnät maskininlärning datorseende artificiell intelligens cnn Signal Processing Signalbehandling
103	Evaluation of CNN in ESM Data Classification by Perspective of Military Utility / Utvärdering av convolutional neural networks för ESM-dataklassifikation genom perspektivet av militär nytta Johansson, Jimmy January 2020 (has links) Modern society has seen an increase in automation using AI in a variety of applications. To keep up with recent development, it is therefore logical to investigate the application of AI programs to military tasks. The great advantage of automation lies in the possible increase in efficiency and the possible reallocation of personnel resources to other tasks. Therefore, this study aims to evaluate the use of Convolutional Neural Networks (CNN) in classification of communication and radar emitters based on collected Electronic Support Measures (ESM) data and to estimate to what extent human analysts could be replaced. The evaluation was performed by applying the concept of military utility as a framework for evaluation, with the addition of Technology Readiness Level (TRL) to survey how far the technology has developed. Data was collected using two methods: firstly, through a literature review of research done on the application of CNNs in classifying information such as spectrograms and images; secondly, by interviewing a subject matter expert from SAAB, who mainly helped estimate the TRL of the technology’s components. The study found that CNN appears suitable for the proposed task and that the program could potentially replace human analysts to a great extent, at least when doing routine classifications. Full automation seems unlikely, as analysts would be required for more challenging classifications, especially those outside the range of the training data used in teaching the CNN. Finally, challenges involved with deep learning programs' inherent structure, demands and application to military tasks are discussed, and subjects for future research are proposed. 
/ Det moderna samhället har sett en ökad automatisering med AI i en mängd olika applikationer och för att hålla jämna steg med den senaste utvecklingen är det därför logiskt att undersöka tillämpningen av AI-program på militära uppgifter. Den stora fördelen med automatisering ligger i den möjliga ökningen av effektivitet och möjlig flytt av personalresurser till andra uppgifter. Därför syftar denna studie till att utvärdera användningen av convolutional neural networks (CNN) vid klassificering av kommunikations- och radarsändare baserat på insamlade data från elektronisk stödverksamhet (sv. ES motsvara eng. ESM) och att uppskatta i vilken utsträckning mänskliga analytiker kan ersättas. Utvärderingen genomfördes genom att använda konceptet militär nytta som ett ramverk för utvärdering med tillägg av technology readiness level (TRL) för att kartlägga hur långt tekniken har utvecklats. Data samlades in med två metoder: För det första genom en litteraturöversikt av forskning som gjorts om tillämpningen av CNN för att klassificera information såsom spektrogram och bilder. För det andra genom att intervjua en ämnesexpert från SAAB, som främst hjälpte till att uppskatta TRL för teknikens komponenter. Studien fann att CNN verkar lämplig att använda till den föreslagna uppgiften och att programmet potentiellt skulle kunna ersätta mänskliga analytiker i stor utsträckning, åtminstone for rutinklassificeringar. En fullständig automatisering verkar osannolik eftersom analytiker skulle krävas med mer utmanande klassificeringar, särskilt de som ligger utanför utbildningsdata som används för att lära upp programmet. Slutligen diskuteras utmaningar kopplade till djup-inlärningsprogrammens struktur, krav och tillämpning på militära uppgifter samt att ämnen för framtida forskning föreslås. CNN emitter classification ESM spectrogram military utility CNN sändarklassifikation ES ESM spektrogram militär nytta Social Sciences Interdisciplinary
104	Weed Detection in UAV Images of Cereal Crops with Instance Segmentation Gromova, Arina January 2021 (has links) Modern weeding is predominantly carried out by spraying whole fields with toxic pesticides, a process that accomplishes the main goal of eliminating weeds, but at a cost to the local environment. Weed management systems based on AI solutions enable more targeted actions, such as site-specific spraying, which is essential in reducing the need for chemicals. To introduce sustainable weeding in Swedish farmlands, we propose implementing a state-of-the-art Deep Learning (DL) algorithm capable of instance segmentation for remote sensing of weeds, before coupling it with an automated sprayer vehicle. Cereals have been chosen as the target crop in this study as they are among the most commonly cultivated plants in Northern Europe. We used Unmanned Aerial Vehicles (UAV) to capture images from several fields and trained a Mask R-CNN computer vision framework to accurately recognize and localize unique instances of weeds among plants. Moreover, we evaluated three different backbones (ResNet-50, ResNet-101, ResNeXt-101) pre-trained on the MS COCO dataset and, through transfer learning, tuned the model towards our classification task. Some well-reported limitations in building an accurate model include occlusion among instances as well as the high similarity between weeds and crops. Our system handles these challenges fairly well. We achieved a precision of 0.82, recall of 0.61, and F1 score of 0.70. Still, improvements can be made in data preparation and pre-processing to further improve the recall rate. All in all, the main outcome of this study is the system pipeline which, together with post-processing using geographical field coordinates, could serve as a detector for half of the weeds in an end-to-end weed removal system. 
/ Site-specific Weed Control in Swedish Agriculture computer vision deep learning CNN mask R-CNN weed detection Agricultural Science Jordbruksvetenskap
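As a quick consistency check on the figures reported for entry 104, the F1 score is the harmonic mean of precision and recall. The sketch below recomputes it from the stated precision (0.82) and recall (0.61). The function name `f1_score` is a hypothetical helper for illustration only, not code from the thesis.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (hypothetical helper)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract of entry 104.
f1 = f1_score(0.82, 0.61)
print(f"{f1:.2f}")  # prints 0.70, matching the reported F1 score
```

The recomputed value agrees with the reported 0.70 after rounding to two decimals.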
105	Action Recognition in Still Images and Inference of Object Affordances Girish, Deeptha S. 15 October 2020 (has links) No description available. Electrical Engineering Action recognition in still images object affordances human object interaction rank based dimensionality reduction unsupervised understanding of CNN layers
106	Multi-Task Convolutional Learning for Flame Characterization Ur Rehman, Obaid January 2020 (has links) This thesis explores multi-task learning for combustion flame characterization, i.e., learning different characteristics of the combustion flame. We propose a multi-task convolutional neural network for two tasks, i.e., PFR (pilot fuel ratio) and fuel type classification, based on images of stable combustion. We utilize transfer learning and adopt VGG16 to develop a multi-task convolutional neural network that jointly learns the aforementioned tasks. We also compare the performance of individual CNN models for the two tasks with that of the multi-task CNN, which learns these two tasks jointly by sharing visual knowledge among the tasks. We show the effectiveness of our proposed approach on a private company’s dataset. To the best of our knowledge, this is the first work on jointly learning different characteristics of the combustion flame. / This work was done with Siemens, and we have applied for a patent which is still pending. Multi task learning multi task convolutional learning transfer learning VGG16 CNN convolutional neural networks MTL MTL CNN Computer Systems Datorsystem Probability Theory and Statistics Sannolikhetsteori och statistik
107	News article segmentation using multimodal input : Using Mask R-CNN and sentence transformers / Artikelsegmentering med multimodala artificiella neuronnätverk : Med hjälp av Mask R-CNN och sentence transformers Henning, Gustav January 2022 (has links) In this century and the last, serious efforts have been made to digitize the content housed by libraries across the world. In order to open up these volumes to content-based information retrieval, independent elements such as headlines, body text, bylines, images and captions ideally need to be connected semantically as article-level units. To query on facets such as author, section, content type or other metadata, further processing of these documents is required. Even though humans have shown exceptional ability to segment different types of elements into related components, even in languages foreign to them, this task has proven difficult for computers. The challenge of semantic segmentation in newspapers lies in the diversity of the medium: Newspapers have vastly different layouts, covering diverse content, from news articles to ads to weather reports. State-of-the-art object detection and segmentation models have been trained to detect and segment real-world objects. It is not clear whether these architectures can perform equally well when applied to scanned images of printed text. In the domain of newspapers, in addition to the images themselves, we have access to textual information through Optical Character Recognition. The recent progress made in the field of instance segmentation of real-world objects using deep learning techniques begs the question: Can the same methodology be applied in the domain of newspaper articles? In this thesis we investigate one possible approach to encode the textual signal into the image in an attempt to improve performance. 
Based on newspapers from the National Library of Sweden, we investigate the predictive power of visual and textual features and their capacity to generalize across different typographic designs. Results show impressive mean Average Precision scores (>0.9) for test sets sampled from the same newspaper designs as the training data when using only the image modality. / I detta och det förra århundradet har kraftiga åtaganden gjorts för att digitalisera traditionellt medieinnehåll som tidigare endast tryckts i pappersformat. För att kunna stödja sökningar och fasetter i detta innehåll krävs bearbetning på semantisk nivå, det vill säga att innehållet styckas upp på artikelnivå, istället för per sida. Trots att människor har lätt att dela upp innehåll på semantisk nivå, även på ett främmande språk, fortsätter arbetet för automatisering av denna uppgift. Utmaningen i att segmentera nyhetsartiklar återfinns i mångfalden av utseende och format. Innehållet är även detta mångfaldigt, där man återfinner allt ifrån faktamässiga artiklar, till debatter, listor av fakta och upplysningar, reklam och väder bland annat. Stora framsteg har gjorts inom djupinlärning just för objektdetektering och semantisk segmentering bara det senaste årtiondet. Frågan vi ställer oss är: Kan samma metodik appliceras inom domänen nyhetsartiklar? Dessa modeller är skapta för att klassificera världsliga ting. I denna domän har vi tillgång till texten och dess koordinater via en potentiellt bristfällig optisk teckenigenkänning. Vi undersöker ett sätt att utnyttja denna textinformation i ett försök att förbättra resultatet i denna specifika domän. Baserat på data från Kungliga Biblioteket undersöker vi hur väl denna metod lämpar sig för uppstyckandet av innehåll i tidningar längsmed tidsperioder där designen förändrar sig markant. Resultaten visar att Mask R-CNN lämpar sig väl för användning inom domänen nyhetsartikelsegmentering, även utan texten som input till modellen. 
Historical newspapers Image segmentation Multimodal learning Deep learning Digital humanities Mask R-CNN Historiska tidningar Bildsegmentering Multimodal inlärning Djupinlärning Digital humaniora Mask R-CNN Computer Sciences Datavetenskap (datalogi)
108	Dataset Evaluation Method for Vehicle Detection Using TensorFlow Object Detection API / Utvärderingsmetod för dataset inom fordonsigenkänning med användning av TensorFlow Object Detection API Furundzic, Bojan, Mathisson, Fabian January 2021 (has links) Recent developments in the field of object detection have highlighted a significant variation in quality between visual datasets. As a result, there is a need for a standardized approach to validating visual dataset features and their performance contribution. With a focus on vehicle detection, this thesis aims to develop an evaluation method for comparing visual datasets. This method was used to determine the dataset that contributed to the detection model with the greatest ability to detect vehicles. The visual datasets compared in this research were BDD100K, KITTI and Udacity, each used to train a separate model. Applying the developed evaluation method gave a strong indication of BDD100K's superior performance. Further analysis and feature extraction of dataset size, label distribution and average labels per image was conducted. In addition, a real-world experiment was performed in order to validate the developed evaluation method. All features and experimental results pointed to BDD100K's superiority over the other datasets, validating the developed evaluation method. Furthermore, the TensorFlow Object Detection API's ability to improve the performance gain from a visual dataset was studied. Through the use of augmentations, it was concluded that the TensorFlow Object Detection API serves as a great tool to increase the performance gain for visual datasets. / Inom fältet av objektdetektering har ny utveckling demonstrerat stor kvalitetsvariation mellan visuella dataset. Till följd av detta finns det ett behov av standardiserade valideringsmetoder för att jämföra visuella dataset och deras prestationsförmåga. 
Detta examensarbete har, med ett fokus på fordonsigenkänning, som syfte att utveckla en pålitlig valideringsmetod som kan användas för att jämföra visuella dataset. Denna valideringsmetod användes därefter för att fastställa det dataset som bidrog till systemet med bäst förmåga att detektera fordon. De dataset som användes i denna studien var BDD100K, KITTI och Udacity, som tränades på individuella igenkänningsmodeller. Genom att applicera denna valideringsmetod, fastställdes det att BDD100K var det dataset som bidrog till systemet med bäst presterande igenkänningsförmåga. En analys av dataset storlek, etikettdistribution och genomsnittliga antalet etiketter per bild var även genomförd. Tillsammans med ett experiment som genomfördes för att testa modellerna i verkliga sammanhang, kunde det avgöras att valideringsmetoden stämde överens med de fastställda resultaten. Slutligen studerades TensorFlow Object Detection APIs förmåga att förbättra prestandan som erhålls av ett visuellt dataset. Genom användning av ett modifierat dataset, kunde det fastställas att TensorFlow Object Detection API är ett lämpligt modifieringsverktyg som kan användas för att öka prestandan av ett visuellt dataset. Deep Learning Vehicle Detection Machine Learning Dataset Evaluation Method Artificial Intelligence TensorFlow Object Detection SSD Faster R-CNN CNN Neural Networks Engineering and Technology Teknik och teknologier
109	A comparative analysis of CNN and LSTM for music genre classification / En jämförande analys av CNN och LSTM för klassificering av musikgenrer Gessle, Gabriel, Åkesson, Simon January 2019 (has links) The music industry has seen a great influx of new channels to browse and distribute music. This does not come without drawbacks. As the data rapidly increases, manual curation becomes a much more difficult task. Audio files have a plethora of features that could be used to make parts of this process a lot easier. It is possible to extract these features, but the best way to handle these for different tasks is not always known. This thesis compares the two deep learning models, convolutional neural network (CNN) and long short-term memory (LSTM), for music genre classification when trained using mel-frequency cepstral coefficients (MFCCs) in hopes of making audio data as useful as possible for future usage. These models were tested on two different datasets, GTZAN and FMA, and the results show that the CNN had a 56.0% and 50.5% prediction accuracy, respectively. This outperformed the LSTM model that instead achieved a 42.0% and 33.5% prediction accuracy. / Musikindustrin har sett en stor ökning i antalet sätt att hitta och distribuera musik. Det kommer däremot med sina nackdelar, då mängden data ökar fort så blir det svårare att hantera den på ett bra sätt. Ljudfiler har mängder av information man kan extrahera och därmed göra den här processen enklare. Det är möjligt att använda sig av de olika typer av information som finns i filen, men bästa sättet att hantera dessa är inte alltid känt. Den här rapporten jämför två olika djupinlärningsmetoder, convolutional neural network (CNN) och long short-term memory (LSTM), tränade med mel-frequency cepstral coefficients (MFCCs) för klassificering av musikgenre i hopp om att göra ljuddata lättare att hantera inför framtida användning. 
Modellerna testades på två olika dataset, GTZAN och FMA, där resultaten visade att CNN:et fick en träffsäkerhet på 56.0% och 50.5% tränat på respektive dataset. Denna utpresterade LSTM modellen som istället uppnådde en träffsäkerhet på 42.0% och 33.5%. Bachelor thesis music genre classification GTZAN FMA CNN LSTM Kandidatexamensarbete klassificering av musikgenrer GTZAN FMA CNN LSTM Computer and Information Sciences Data- och informationsvetenskap
110	Estimation of Water Depth from Multispectral Drone Imagery : A suitability assessment of CNN models for bathymetry retrieval in shallow water areas / Uppskattning av vattendjup från multispektrala drönarbilder : En lämplighetsbedömning av CNN-modeller för att hämta batymetri i grunda vattenområden. Shen, Qianyao January 2022 (has links) Aedes aegypti and Aedes albopictus are the main vector species for dengue and Zika, two arboviruses that affect a substantial fraction of the global population. These mosquitoes breed in very slow-moving or standing pools of water, so detecting and managing these potential breeding habitats is a crucial step in preventing the spread of these diseases. Using high-resolution images collected by unmanned aerial vehicles (UAV) and their multispectral mapping data, this thesis investigates a bathymetry retrieval model for shallow water areas to help improve habitat detection accuracy. While previous studies have found some success with shallow water bathymetry inversion on satellite imagery, accurate centimeter-level water depth regression from high-resolution drone multispectral imagery still remains a challenge. Unlike previous retrieval methods, which generally rely on retrieval factor extraction and linear regression, this thesis introduces CNN methods that account for the nonlinear relationship between image pixel reflectance values and water depth. In order to examine the potential of CNNs to retrieve shallow water depths from multispectral images captured by a drone, this thesis conducts a variety of case studies to specify a suitable CNN architecture and to compare its performance across different datasets, band combinations and depth ranges, and against other general bathymetry retrieval algorithms. 
In summary, the CNN-based model achieves the best regression accuracy of overall root mean square error lower than 0.5, in comparison with another machine learning algorithm, random forest, and 2 other semi-empirical methods, linear and ratio model, suggesting this thesis’s practical significance. / Aedes aegypti och Aedes albopictus är de viktigaste vektorarterna för dengue och zika, två arbovirus som drabbar en stor del av den globala befolkningen. Dessa myggor förökar sig i mycket långsamt rörliga eller stillastående vattensamlingar, så att upptäcka och hantera dessa potentiella förökningsmiljöer är ett avgörande steg för att förhindra spridningen av dessa sjukdomar. Med hjälp av högupplösta bilder som samlats in av obemannade flygfarkoster (UAV) och deras multispektrala kartläggningsdata undersöktes i den här artikeln en modell för att hämta batymetri i grunda vattenområden för att förbättra noggrannheten i upptäckten av livsmiljöer. Även om tidigare studier har haft viss framgång med inversion av bathymetri på grunt vatten med hjälp av satellitbilder, är det fortfarande en utmaning att göra en exakt regression av vattendjupet på centimeternivå från högupplösta, multispektrala bilder från drönare. Till skillnad från tidigare metoder som i allmänhet bygger på extrahering av återvinningsfaktorer och linjär regression, infördes i denna avhandling CNN-metoder som tar hänsyn till det icke-linjära förhållandet mellan bildpixlarnas reflektionsvärden och vattendjupet. För att undersöka CNN:s potential att hämta grunda vattendjup från multispektrala bilder som tagits av en drönare genomförs i denna avhandling en rad fallstudier för att specificera en lämplig CNN-arkitektur, jämföra dess prestanda i olika datamängder, bandkombinationer, djupintervall och med andra allmänna algoritmer för att hämta batymetri. 
Sammanfattningsvis uppnår den CNN-baserade modellen den bästa regressionsnoggrannheten med ett totalt medelkvadratfel som är lägre än 0,5, i jämförelse med en annan maskininlärningsalgoritm, random forest, och två andra halvempiriska metoder, linjär och kvotmodell, vilket tyder på den praktiska betydelsen av denna avhandling. Bathymetry Retrieval Multispectral Imagery Convolutional Neural Network (CNN) Hämtning Av Batymetri Multispektrala Bilder Konvolutionellt Neuralt Nätverk (CNN) Elektroteknik och elektronik
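Entry 110 reports its accuracy as a root mean square error (RMSE). As a minimal sketch of how that metric is computed from predicted and reference depths, the helper below may be useful. The function name `rmse` and the sample depth values are illustrative assumptions, not data or code from the thesis.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Root mean square error between two equal-length sequences (hypothetical helper)."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up depth values, for illustration only.
predicted_depths = [0.40, 0.75, 1.10, 1.60]
observed_depths = [0.35, 0.80, 1.00, 1.75]
print(rmse(predicted_depths, observed_depths))
```

Because errors are squared before averaging, a few large depth misestimates weigh more heavily in RMSE than many small ones.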
