31

Uncertainty, Edge, and Reverse-Attention Guided Generative Adversarial Network for Automatic Building Detection in Remotely Sensed Images

Somrita Chattopadhyay (12210671) 18 April 2022
Despite recent advances in deep-learning-based semantic segmentation, automatic building detection from remotely sensed imagery is still a challenging problem owing to the large variability in the appearance of buildings across the globe. Errors occur mostly around the boundaries of the building footprints, in shadow areas, and when detecting buildings whose exterior surfaces have reflectivity properties very similar to those of the surrounding regions. To overcome these problems, we propose a generative adversarial network based segmentation framework with an uncertainty attention unit and a refinement module embedded in the generator. The refinement module, composed of edge and reverse attention units, is designed to refine the predicted building map. The edge attention enhances the boundary features to estimate building boundaries with greater precision, and the reverse attention allows the network to explore the features missing in the previously estimated regions. The uncertainty attention unit assists the network in resolving uncertainties in classification. As a measure of the power of our approach, as of January 5, 2022, it ranks second on DeepGlobe's public leaderboard, despite the fact that the main focus of our approach, refinement of the building edges, does not align exactly with the metrics used for leaderboard rankings. Our overall F1-score on DeepGlobe's challenging dataset is 0.745. We also report improvements on the previous best results for the challenging INRIA Validation Dataset, for which our network achieves an overall IoU of 81.28% and an overall accuracy of 97.03%. Along the same lines, for the official INRIA Test Dataset, our network scores 77.86% and 96.41% in overall IoU and accuracy. We have also improved upon the previous best results on two other datasets: for the WHU Building Dataset, our network achieves 92.27% IoU, 96.73% precision, 95.24% recall and 95.98% F1-score. Finally, for the Massachusetts Buildings Dataset, our network achieves a 96.19% relaxed IoU score and a 98.03% relaxed F1-score over the previous best scores of 91.55% and 96.78% respectively; in terms of non-relaxed F1 and IoU scores, our network outperforms the previous best scores by 2.77% and 3.89% respectively.
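
As a rough illustration of the reverse-attention idea described above, the sketch below re-weights feature maps by the complement of a coarse prediction so that the network attends to regions it has not yet explained. This is a minimal PyTorch sketch under assumed shapes and module names, not the authors' exact architecture.

    import torch
    import torch.nn as nn

    class ReverseAttention(nn.Module):
        """Minimal sketch of a reverse-attention unit: the complement of the
        current prediction highlights regions the network has not yet
        explained, and is used to re-weight the incoming features.
        (Illustrative only; shapes and the residual form are assumptions.)"""
        def __init__(self, in_channels):
            super().__init__()
            self.conv = nn.Conv2d(in_channels, 1, kernel_size=3, padding=1)

        def forward(self, features, coarse_pred):
            # coarse_pred: (N, 1, H, W) logits from an earlier decoding stage
            reverse = 1.0 - torch.sigmoid(coarse_pred)   # attend to "missed" regions
            refined = features * reverse                 # re-weight features
            return coarse_pred + self.conv(refined)      # residual refinement

    x = torch.randn(2, 64, 128, 128)        # feature maps
    p = torch.randn(2, 1, 128, 128)         # coarse building-probability logits
    print(ReverseAttention(64)(x, p).shape)  # torch.Size([2, 1, 128, 128])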
32

Computer vision-based systems for environmental monitoring applications

Porto Marques, Tunai 12 April 2022
Environmental monitoring refers to a host of activities involving the sampling or sensing of diverse properties from an environment in an effort to monitor, study and better understand it. While potentially rich and scientifically valuable, these data often create challenging interpretation tasks because of their volume and complexity. This thesis explores the efficiency of Computer Vision-based frameworks for processing large amounts of visual environmental monitoring data. While considering every potential type of visual environmental monitoring measurement is not possible, this thesis selects three data streams as representatives of diverse monitoring layouts: a visual out-of-water stream, a visual underwater stream, and an active acoustic underwater stream. The detailed structure, objectives, challenges, solutions and insights of each are presented and used to assess the feasibility of Computer Vision within the environmental monitoring context. This thesis starts by providing an in-depth analysis of the definition and goals of environmental monitoring, as well as the Computer Vision systems typically used in conjunction with it. The document continues by studying the visual underwater stream via the design of a novel system employing a contrast-guided approach to the enhancement of low-light underwater images. This enhancement system outperforms multiple state-of-the-art methods, as supported by a group of commonly employed metrics. A pair of detection frameworks capable of identifying schools of herring, salmon and hake and swarms of krill is also presented in this document. The inputs used in their development, echograms, are visual representations of acoustic backscatter data from echosounder instruments, thus covering the active acoustic underwater stream. These detectors use different Deep Learning paradigms to account for the unique challenges presented by each pelagic species. Specifically, the detection of krill and finfish is accomplished with a novel semantic segmentation network (U-MSAA-Net) capable of leveraging local and contextual information from feature maps of multiple scales. To explore the out-of-water visual data stream, we examine a large dataset composed of years' worth of images from a coastal region with heavy marine vessel traffic, which has been associated with significant anthropogenic footprints upon marine environments. A novel system that combines "traditional" Computer Vision and Deep Learning is proposed for the identification of such vessels under diverse visual appearances in this monitoring imagery. Thorough experimentation shows that this system is able to efficiently detect vessels of diverse sizes, shapes, colors and levels of visibility. The results and reflections presented in this thesis reinforce the hypothesis that Computer Vision offers an extremely powerful set of methods for the automatic, accurate, time- and space-efficient interpretation of large amounts of visual environmental monitoring data. / Graduate
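
As one concrete, hedged example of contrast-guided low-light enhancement, the sketch below applies CLAHE to the lightness channel of an underwater frame. This is a common baseline for such enhancement, not necessarily the system proposed in the thesis, and the file names are hypothetical.

    import cv2

    def enhance_low_light(path_in, path_out, clip_limit=3.0, tile=8):
        """Boost local contrast of a dark underwater frame by equalizing
        the lightness channel only, leaving chroma untouched.
        (Baseline sketch; parameter values are assumptions.)"""
        bgr = cv2.imread(path_in)
        lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
        l, a, b = cv2.split(lab)
        clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=(tile, tile))
        lab = cv2.merge((clahe.apply(l), a, b))
        cv2.imwrite(path_out, cv2.cvtColor(lab, cv2.COLOR_LAB2BGR))

    enhance_low_light("frame_dark.png", "frame_enhanced.png")  # hypothetical files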
33

Learning new representations for 3D point cloud semantic segmentation

Thomas, Hugues 19 November 2019
In recent years, new technologies have allowed the acquisition of large and precise 3D scenes as point clouds. They have opened up new applications, such as self-driving vehicles and infrastructure monitoring, that rely on efficient large-scale point cloud processing. Convolutional deep learning methods cannot be used directly with point clouds. In the case of images, convolutional filters brought the ability to learn new representations, which were previously hand-crafted in older computer vision methods. Following the same line of thought, this thesis presents a study of the hand-crafted representations previously used for point cloud processing. We propose several contributions to serve as a basis for the design of a new convolutional representation for point cloud processing. They include a new definition of multiscale radius neighborhoods, a comparison with multiscale k-nearest neighbors, a new active learning strategy, the semantic segmentation of large-scale point clouds, and a study of the influence of density in multiscale representations. Following these contributions, we introduce the Kernel Point Convolution (KPConv), which uses radius neighborhoods and a set of kernel points that play the role of the kernel pixels in image convolution. Our convolutional networks outperform state-of-the-art semantic segmentation approaches in almost every situation. In addition to these strong results, we designed KPConv with great flexibility and a deformable version. To conclude, we offer several insights into the representations that our method is able to learn.
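
The core of KPConv can be written compactly: each kernel point influences a neighboring point linearly with distance, mirroring the role a pixel plays in an image convolution kernel. The following is a minimal PyTorch sketch of that correlation under assumed shapes and an assumed influence radius sigma, not the official implementation.

    import torch

    def kpconv(neighbor_xyz, neighbor_feats, kernel_pts, weights, sigma=0.3):
        """neighbor_xyz: (n, 3) neighbor positions relative to the center point
        neighbor_feats: (n, c_in) input features
        kernel_pts: (k, 3) kernel point positions; weights: (k, c_in, c_out)
        Each kernel point influences a neighbor linearly with distance.
        (Sketch; shapes and sigma are assumptions.)"""
        dist = torch.cdist(neighbor_xyz, kernel_pts)          # (n, k)
        corr = torch.clamp(1.0 - dist / sigma, min=0.0)       # linear correlation
        # sum correlation-weighted features, then apply each kernel-point matrix
        per_kernel = torch.einsum('nk,nc->kc', corr, neighbor_feats)  # (k, c_in)
        return torch.einsum('kc,kco->o', per_kernel, weights)         # (c_out,)

    out = kpconv(torch.randn(20, 3) * 0.1, torch.randn(20, 16),
                 torch.randn(15, 3) * 0.1, torch.randn(15, 16, 32))
    print(out.shape)  # torch.Size([32])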
34

Adversarial Framework with Temperature as a Regularizer for Semantic Segmentation

Kim, Chanho 14 January 2022
Semantic segmentation processes RGB scenes and classifies pixels collectively as objects. Recent deep learning methods have shown promising results in both the accuracy and the speed of semantic segmentation. However, deep learning models inevitably overfit to the training data because of their data-centric nature. Numerous regularization methods exist to overcome overfitting, such as data augmentation, additional loss terms (e.g., Euclidean or least-squares penalties), and structural methods that add or modify layers, like Dropout and DropConnect. Among these, penalizing a model via an additional loss or a weight constraint requires no increase in memory. With this in mind, our work aims to improve a given segmentation model through temperatures and a lightweight discriminator. Temperatures generate different versions of the probability maps by dividing the logits in the softmax calculation. On top of the probability maps produced with temperatures, we attach a simple discriminator after the segmentation network so that ground-truth feature maps compete with the modified feature maps. We pass the additional loss calculated from those probability maps back to the principal network. Our contribution consists of two parts. First, we use the adversarial loss as the regularization loss in the segmentation networks and validate that it can substitute for the L2 regularization loss with better validation results. Second, we apply temperatures to the segmentation probability maps to provide different information without adding convolutional layers. The experiments indicate that spiking the temperature in the generator while keeping the original probability map in the discriminator improves the model in terms of pixel accuracy and mean Intersection-over-Union (mIoU). Our framework shows that the segmentation model can be improved with a small increase in training time and parameter count.
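
The temperature mechanism described above amounts to dividing the logits by a constant T before the softmax, which softens (T > 1) or sharpens (T < 1) the resulting probability maps. A minimal sketch, with illustrative values:

    import torch
    import torch.nn.functional as F

    def temperature_softmax(logits, T):
        """T > 1 softens the probability map, T < 1 sharpens ('spikes') it;
        T = 1 recovers the ordinary softmax."""
        return F.softmax(logits / T, dim=1)

    logits = torch.randn(1, 21, 4, 4)          # (N, classes, H, W) segmentation logits
    soft = temperature_softmax(logits, T=2.0)  # smoother map, e.g. for the generator
    hard = temperature_softmax(logits, T=0.5)  # spikier map
    print(soft.sum(dim=1).allclose(torch.ones(1, 4, 4)))  # True: still a distribution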
35

Multi-Task Learning SegNet Architecture for Semantic Segmentation

Sorg, Bradley R. January 2018
No description available.
36

Semi-Supervised Learning for Accurate Segmentation of Roughly Labeled Data

Rajan, Rachel 01 September 2020
No description available.
37

Improved U-Net architecture for Crack Detection in Sand Moulds

Ahmed, Husain; Bajo, Hozan January 2023
The detection of cracks in sand moulds has long been a challenge for both safety and maintenance purposes. Traditional image processing techniques have been employed to identify and quantify these defects but have often proven inefficient, labour-intensive, and time-consuming. To address this issue, we sought to develop a more effective approach using deep learning techniques, specifically semantic segmentation. We initially examined three architectures (U-Net, SegNet, and DeepCrack) to evaluate their performance in crack detection. Through testing and comparison, U-Net emerged as the most suitable choice for our project. To further enhance the model's accuracy, we combined U-Net with the VGG-19, VGG-16, and ResNet architectures. However, these combinations did not yield the expected improvements in performance. Consequently, we introduced a new layer to the U-Net architecture, which significantly increased its accuracy and F1 score, making it more efficient for crack detection. Throughout the project, we conducted extensive comparisons between models to better understand the effects of techniques such as batch normalization and dropout. To evaluate and compare the models, we tracked the loss, accuracy, and F1 score, training with the Adam optimizer. Tables and figures compare the models through example images and evaluation metrics to show which model performs best. The evaluations revealed that the U-Net architecture, when enhanced with the extra layer, was superior to the other models, achieving the highest scores and accuracy. This architecture proved to be the most effective model for crack detection, laying the foundation for a more cost-efficient and trustworthy approach to detecting and monitoring structural deficiencies.
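
As a hedged sketch of the kind of building block being compared above, a standard U-Net double-convolution stage with the batch normalization and dropout the authors discuss is shown below. The abstract does not specify the extra layer the authors added, so this is illustrative only.

    import torch
    import torch.nn as nn

    def unet_block(c_in, c_out, p_drop=0.1):
        """One U-Net encoder stage: two 3x3 convolutions with batch
        normalization, ReLU, and optional dropout.
        (Generic sketch; channel sizes and dropout rate are assumptions.)"""
        return nn.Sequential(
            nn.Conv2d(c_in, c_out, 3, padding=1),
            nn.BatchNorm2d(c_out),
            nn.ReLU(inplace=True),
            nn.Conv2d(c_out, c_out, 3, padding=1),
            nn.BatchNorm2d(c_out),
            nn.ReLU(inplace=True),
            nn.Dropout2d(p_drop),
        )

    x = torch.randn(1, 1, 256, 256)    # a grayscale mould image
    print(unet_block(1, 64)(x).shape)  # torch.Size([1, 64, 256, 256])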
38

Terrain Classification to find Drivable Surfaces using Deep Neural Networks : Semantic segmentation for unstructured roads combined with the use of Gabor filters to determine drivable regions trained on a small dataset

Guin, Agneev January 2018
Autonomous vehicles face various challenges under difficult terrain conditions, such as marginal rural or back-country roads, owing to the lack of lane information, road signs or traffic signals. In this thesis, we investigate a novel approach of using Deep Neural Networks (DNNs) to classify off-road surfaces into terrain types with the aim of supporting autonomous navigation in unstructured environments. For example, off-road surfaces can be classified as asphalt, gravel, grass, mud, snow, etc. Images from a camera mounted on a mining truck were used to perform semantic segmentation and to classify road surface types. The camera images were manually segmented for training into sets of 16 and 9 classes, covering all relevant classes and the drivable classes respectively. A small but diverse dataset of 100 images was augmented and expanded with nearby frames from the video clips. Neural networks were used to test classification performance under these off-road conditions. A pre-trained AlexNet was compared to the same networks without pre-training. Gabor filters, known to distinguish textured surfaces, were further used to improve the results of the neural network. The experiments show that pre-trained networks perform well with small datasets and many classes. A combination of Gabor filters with pre-trained networks can establish a dependable navigation path under difficult terrain conditions. While the results seem positive for images similar to the training scenes, the networks fail to perform well in other situations. Though the tests imply that larger datasets are required for dependable results, this is a step closer to making autonomous vehicles capable of driving under off-road conditions.
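
The Gabor-filter step described above can be sketched as a small oriented filter bank whose responses separate textures such as gravel, grass and asphalt. The parameter values below are assumptions, not those used in the thesis, and the input file name is hypothetical.

    import cv2
    import numpy as np

    def gabor_responses(gray, n_orientations=4):
        """Filter an image with Gabor kernels at several orientations;
        the responses help distinguish textured road surfaces.
        (Kernel size, sigma, wavelength and gamma are illustrative.)"""
        responses = []
        for i in range(n_orientations):
            theta = i * np.pi / n_orientations
            kern = cv2.getGaborKernel((21, 21), sigma=4.0, theta=theta,
                                      lambd=10.0, gamma=0.5, psi=0)
            responses.append(cv2.filter2D(gray, cv2.CV_32F, kern))
        return np.stack(responses, axis=-1)   # (H, W, n_orientations)

    img = cv2.imread("road_frame.png", cv2.IMREAD_GRAYSCALE)  # hypothetical frame
    print(gabor_responses(img).shape)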
39

The "What"-"Where" Network: A Tool for One-Shot Image Recognition and Localization

Hurlburt, Daniel 06 January 2021
One common shortcoming of modern computer vision is the inability of most models to generalize to new classes, i.e., one- or few-shot image recognition. We propose a new problem formulation for this task and present a network architecture and training methodology to solve it. Further, we provide insights into how not just the data itself, but the way the data is presented to the model, can have a significant impact on performance. Using these methods, we achieve high accuracy in few-shot image recognition tasks.
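
The abstract does not detail the architecture, but few-shot recognition is commonly posed as comparison against class prototypes built from the few labeled examples. The sketch below shows that generic prototype approach; it is an assumption for illustration, not the "What"-"Where" network itself.

    import torch
    import torch.nn.functional as F

    def few_shot_classify(query_emb, support_embs, support_labels, n_classes):
        """Classify a query embedding by cosine similarity to per-class
        prototypes averaged over the (few) support embeddings.
        (Generic prototype baseline; embedding size is an assumption.)"""
        protos = torch.stack([support_embs[support_labels == c].mean(0)
                              for c in range(n_classes)])
        return F.cosine_similarity(query_emb.unsqueeze(0), protos, dim=1).argmax()

    emb = torch.randn(10, 128)             # 10 support examples, 128-d embeddings
    labels = torch.arange(5).repeat(2)     # 5 novel classes, two examples each
    print(few_shot_classify(torch.randn(128), emb, labels, 5))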
40

Scene Recognition and Collision Avoidance System for Robotic Combine Harvesters Based on Deep Learning

Li, Yang 23 September 2020
Kyoto University / 0048 / New-system doctoral program / Doctor of Agricultural Science / Dissertation No. 甲22784 / Agriculture Doctorate No. 2427 / Kyoto University Graduate School of Agriculture, Division of Regional Environmental Science / Examiners: Prof. 飯田訓久 (chief), Prof. 近藤直, Prof. 中嶋洋 / Qualified under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Agricultural Science / Kyoto University / DFAM
