41

Automated Gland Detection in Colorectal Histopathological Images

Al Zorgani, Maisun M., Mehmood, Irfan, Ugail, Hassan 25 March 2022 (has links)
Clinical morphological analysis of histopathological specimens is a well-established approach for diagnosing benign and malignant disease. Analysing glandular architecture is a major challenge for colon histopathologists because morphological structures in malignant glandular tumours are difficult to identify: gland boundaries are distorted, and the appearance of stained specimens varies. Despite these challenges, several deep learning methods have shown encouraging performance in automatic gland segmentation, enabling reliable analysis of colon specimens. In histopathology, the main obstacle is the vast number of annotated images required to train deep learning algorithms. In this work, we propose an end-to-end trainable Convolutional Neural Network (CNN) for detecting glands automatically. More specifically, a Modified Res-U-Net is employed to segment colorectal glands in Haematoxylin and Eosin (H&E) stained images from the challenging Gland Segmentation (GlaS) dataset. The proposed Res-U-Net outperformed prior methods that use the U-Net architecture on the GlaS dataset images.
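For readers unfamiliar with the architecture family, here is a minimal PyTorch sketch of the residual block that Res-U-Net variants graft onto the U-Net encoder-decoder. The thesis' actual Modified Res-U-Net is not reproduced in the abstract, so channel sizes, normalisation, and activation choices below are assumptions.

```python
# Hypothetical sketch of a residual U-Net building block; the thesis'
# Modified Res-U-Net may differ in depth, normalisation, and activations.
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Two 3x3 convolutions with a residual (skip) connection."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
        )
        # 1x1 projection so the identity path matches the output channels
        self.proj = nn.Conv2d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()

    def forward(self, x):
        return torch.relu(self.conv(x) + self.proj(x))
```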
42

Collaborative Unmanned Air and Ground Vehicle Perception for Scene Understanding, Planning and GPS-denied Localization

Christie, Gordon A. 05 January 2017 (has links)
Autonomous robot missions in unknown environments are challenging. In many cases, the systems involved cannot rely on a priori information about the scene (e.g. road maps). This is especially true in disaster response scenarios, where existing maps are out of date. Areas without GPS are another concern, especially when the involved systems are tasked with navigating a path planned by a remote base station. Scene understanding from the robots' perception data (e.g. images) can greatly assist in overcoming these challenges. This dissertation makes three contributions toward that end, with a focus on the application of autonomously searching for radiation sources with unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs) in unknown and unstructured environments: (1) an approach that overcomes the challenges of simultaneously reasoning about 2D and 3D information about the environment; (2) algorithms and experiments involving scene understanding for real-world autonomous search tasks, in which a UAV and a UGV search for potentially hazardous sources of radiation in an unknown environment; and (3) an approach to registering a UGV in GPS-denied areas using 2D image data and 3D data, where localization is performed in an overhead map generated from imagery captured from the air. / Ph. D.
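The abstract does not detail the GPS-denied registration step. Purely as an illustration of localizing a ground image within an overhead map, the sketch below matches ORB features between a UGV frame and an aerial mosaic and estimates a homography with RANSAC; all function choices and thresholds are illustrative, not the dissertation's method.

```python
# Illustrative sketch (not the dissertation's implementation): register a
# UGV camera frame against an overhead map via feature matching plus a
# RANSAC-estimated homography. Inputs are grayscale uint8 images.
import cv2
import numpy as np

def register_to_overhead(ugv_image, overhead_map):
    orb = cv2.ORB_create(2000)
    kp1, des1 = orb.detectAndCompute(ugv_image, None)
    kp2, des2 = orb.detectAndCompute(overhead_map, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    src = np.float32([kp1[m.queryIdx].pt for m in matches[:100]]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches[:100]]).reshape(-1, 1, 2)
    # The homography maps UGV image coordinates into overhead-map
    # coordinates (needs at least 4 good matches to succeed).
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H
```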
43

Improving 3D Point Cloud Segmentation Using Multimodal Fusion of Projected 2D Imagery Data

He, Linbo January 2019 (has links)
Semantic segmentation is a key approach to comprehensive image data analysis. It can be applied to 2D images, videos, and even point clouds containing 3D data points. On the first two, CNNs have achieved remarkable progress, but on point cloud segmentation the results are less satisfactory, owing to challenges such as limited memory resources and the difficulty of annotating 3D points. A research study carried out by the Computer Vision Lab at Linköping University aimed to ease the semantic segmentation of 3D point clouds. The idea is that by first projecting 3D data points to 2D space and then focusing only on the analysis of the resulting 2D images, we can reduce the overall workload of the segmentation process and exploit existing, well-developed 2D semantic segmentation techniques. To improve the performance of CNNs for 2D semantic segmentation, that study used input data derived from different modalities. However, how different modalities can be optimally fused is still an open question. Building on that study, this thesis aims to improve the multistream framework architecture. More concretely, we investigate how different singlestream architectures impact the multistream framework under a given fusion method, and how different fusion methods contribute to the overall performance of a given multistream framework. Our proposed fusion architecture outperformed all the investigated traditional fusion methods. Along with the best singlestream candidate and a few additional training techniques, our final multistream framework obtained a relative gain of 7.3% mIoU over the baseline on the Semantic3D point cloud test set, raising its ranking from 12th to 5th position on the benchmark leaderboard.
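The thesis' proposed fusion architecture is not specified in the abstract. The following sketch shows one simple learned alternative to summation or averaging: fusing per-modality feature maps with a 1x1 convolution. Stream count, channel sizes, and layer choices are assumptions.

```python
# Minimal sketch of fusing per-modality feature maps (e.g. colour, depth,
# normals rendered from the point cloud); the thesis' architecture is
# more elaborate than this illustration.
import torch
import torch.nn as nn

class ConvFusion(nn.Module):
    """Fuse N modality streams with a learned 1x1 convolution rather than
    simple averaging or summation."""
    def __init__(self, n_streams, channels, n_classes):
        super().__init__()
        self.fuse = nn.Conv2d(n_streams * channels, channels, kernel_size=1)
        self.classify = nn.Conv2d(channels, n_classes, kernel_size=1)

    def forward(self, streams):          # streams: list of (B, C, H, W) tensors
        x = torch.cat(streams, dim=1)    # channel-wise concatenation
        return self.classify(torch.relu(self.fuse(x)))
```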
44

Semantic Segmentation of Iron Ore Pellets with Neural Networks

Svensson, Terese January 2019 (has links)
This master's thesis evaluates five existing Convolutional Neural Network (CNN) models for semantic segmentation of optical microscopy images of iron ore pellets: PSPNet, FC-DenseNet, DeepLabv3+, BiSeNet and GCN. The dataset used for training and evaluation contains 180 microscopy images of iron ore pellets collected from LKAB's experimental blast furnace in Luleå, Sweden. The thesis also investigates the impact of dataset size and data augmentation on performance. The best performing CNN model on the task was PSPNet, which reached an average accuracy of 91.7% on the dataset. Simple data augmentation techniques, horizontal and vertical flipping, improved the models' average accuracy by 3.4% on average. From these results, it was concluded that CNNs offer clear benefits for the analysis of iron ore pellets, with time savings and improved analysis as the two most notable.
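A sketch of the flip augmentation described above, applied identically to an image and its label mask, might look as follows (assuming NumPy arrays; the thesis' exact pipeline is not given):

```python
# Sketch of horizontal/vertical flip augmentation for segmentation:
# the image and its mask must receive the same random transform.
import random
import numpy as np

def flip_augment(image, mask):
    if random.random() < 0.5:                 # horizontal flip
        image, mask = np.fliplr(image), np.fliplr(mask)
    if random.random() < 0.5:                 # vertical flip
        image, mask = np.flipud(image), np.flipud(mask)
    return image.copy(), mask.copy()          # copy to drop negative strides
```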
45

Discriminative hand-object pose estimation from depth images using convolutional neural networks

Goudie, Duncan January 2018 (has links)
This thesis investigates the task of estimating the pose of a hand interacting with an object from a depth image. The main contribution is the development of our discriminative one-shot hand-object pose estimation system. To the best of our knowledge, this is the first attempt at a one-shot hand-object pose estimation system. It is a two-stage system consisting of convolutional neural networks. The first stage segments the object out of the hand from the depth image. This hand-minus-object depth image is combined with the original input depth image to form a 2-channel image for use in the second stage, pose estimation. We show that using this 2-channel image produces better pose estimation performance than a single-stage pose estimation system taking just the input depth map as input. We also believe we are amongst the first to research hand-object segmentation. We use fully convolutional neural networks to perform hand-object segmentation from a depth image and show that this approach is superior to random decision forests for the task. Datasets were created to train our hand-object pose estimation stage and hand-object segmentation stage. The hand-object pose labels were estimated semi-automatically with a combined manual annotation and generative approach. The segmentation labels were inferred automatically with colour thresholding. To the best of our knowledge, there were no public datasets for these two tasks when we were developing our system. These datasets have been, or are in the process of being, publicly released.
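A minimal sketch of the 2-channel input construction described above, under the assumption that the first stage outputs a boolean object mask:

```python
# Hypothetical sketch: the first-stage segmentation output masks the
# object out of the depth map, and the result is stacked with the
# original depth map as a 2-channel input for the second (pose) stage.
import numpy as np

def build_pose_input(depth, object_mask):
    """depth: (H, W) float array; object_mask: (H, W) bool array,
    True where the first-stage network predicts 'object'."""
    hand_minus_object = depth.copy()
    hand_minus_object[object_mask] = 0.0          # remove object pixels
    return np.stack([depth, hand_minus_object])   # shape (2, H, W)
```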
46

Localisation of a vehicle through low-cost sensors and geographic information systems fusion

Salehi, Achkan 11 April 2018 (has links)
The fusion between sensors and databases whose errors are independent is the most reliable and therefore most widespread solution to the localization problem. Current autonomous and semi-autonomous vehicles, as well as augmented reality applications targeting industrial contexts, exploit large sensor and database graphs that are difficult and expensive to synchronize and calibrate. Thus, the democratization of these technologies requires exploring the possibility of exploiting low-cost and easily accessible sensors and databases. These information sources are naturally tainted by higher uncertainty levels, and many obstacles to their effective and efficient practical usage persist. Moreover, the recent but dazzling successes of deep neural networks in various tasks suggest that they could be a viable and low-cost alternative to some components of current SLAM systems. In this thesis, we focus on large-scale localization of a vehicle in a georeferenced coordinate frame from a low-cost system, based on the fusion between a monocular video stream, 3D non-textured but georeferenced building models, terrain elevation models, and data either from a low-cost GPS or from vehicle odometry. Our work targets the resolution of two problems. The first is the fusion, via barrier-term optimization, of VSLAM and positioning measurements provided by a low-cost GPS. This method is, to the best of our knowledge, the most robust against GPS uncertainties, but it is more demanding in terms of computational resources than fusion via linear cost functions. We propose an algorithmic optimization of that approach based on the definition of a novel barrier term. The second problem is the data association problem between the primitives that represent the geometry of the scene (e.g. 3D points) and the 3D building models. Previous works in this area use simple geometric criteria and are therefore very sensitive to occlusions in urban environments. We exploit deep convolutional neural networks to identify and associate elements of the map that correspond to 3D building model façades. Although our contributions are for the most part independent of the underlying SLAM system, we based our experiments on constrained key-frame-based bundle adjustment. The solutions that we propose are evaluated on synthetic sequences as well as on real urban sequences covering several kilometres. These experiments show important performance gains for VSLAM/GPS fusion, and considerable improvements in the robustness of building constraints to occlusions.
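The barrier term's exact form is not given in the abstract. As a generic illustration of the idea, a log-barrier can be added to the bundle-adjustment cost to keep each estimated camera centre inside the uncertainty radius of its GPS fix; here \(\pi\) projects 3D point \(X_i\) into camera \(C_j\), \(x_{ij}\) is the observed feature, \(\rho\) a robust loss, \(g_j\) and \(r_j\) the GPS position and uncertainty radius, and \(\mu\) the barrier weight (all notation assumed, not the thesis' formulation):

```latex
E(\theta) = \sum_{i,j} \rho\left( \left\| \pi\left(C_j(\theta), X_i(\theta)\right) - x_{ij} \right\|^2 \right)
          \;-\; \mu \sum_{j} \log\left( r_j^2 - \left\| c_j(\theta) - g_j \right\|^2 \right)
```

The barrier diverges as a camera centre \(c_j(\theta)\) approaches the boundary of its GPS uncertainty disc, which is what makes this fusion robust to GPS outliers but costlier to optimize than a quadratic penalty.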
47

Urban scene understanding by combining 2D/3D information

Bauda, Marie-Anne 13 June 2016 (has links)
This thesis deals with the semantic segmentation of a calibrated sequence of images acquired in an urban environment. The problem is, specifically, to partition each image into regions representing the objects in the scene (facades, roads, etc.), so that each region is associated with a semantic label. In our approach, the labelling is done through mid-level visual primitives called super-pixels, which group similar pixels according to various criteria proposed in the literature, whether photometric (based on colour) or geometric (limiting the size of the super-pixels formed). Unlike the state of the art, where recent work on the same problem takes an initial over-segmentation as input without calling it into question, our idea is to propose, in a multi-view context, a new super-pixel construction approach based on a three-dimensional analysis of the scene and, in particular, of its planar structures. To construct "better" super-pixels, we introduce a local planarity measure that quantifies how closely the image region in question corresponds to a planar surface of the scene. This measure is evaluated from a homographic rectification between two close images, induced by a candidate plane supporting the 3D points associated with the region. We analyse the contribution of the UQI (Universal Quality Image) measure and show that it compares favourably with other metrics that have the potential to detect planar structures. We then introduce a new super-pixel construction algorithm, based on the SLIC (Simple Linear Iterative Clustering) algorithm, whose principle is to group nearest neighbours according to a distance merging colour and spatial similarities, and which integrates this planarity measure. The resulting over-segmentation, coupled with inter-image coherence obtained by validating the local planarity constraint of the scene, makes it possible to assign a label to each entity and thus obtain a semantic segmentation that partitions the image into planar objects.
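One way to integrate a planarity score into the SLIC distance, sketched here with assumed names and weighting (the thesis' exact formulation is not reproduced in the abstract):

```python
# Illustrative sketch of extending the SLIC pixel-to-centre distance with
# a planarity term; the weighting scheme and variable names are
# assumptions, not the thesis' formulation.
import numpy as np

def slic_distance(pixel, centre, S, m, w_p):
    """pixel/centre: dicts with 'lab' (CIELAB colour, np array), 'xy'
    (position, np array) and 'planarity' (score in [0, 1], e.g. a
    UQI-based measure where 1 means locally planar)."""
    d_color = np.linalg.norm(pixel["lab"] - centre["lab"])
    d_space = np.linalg.norm(pixel["xy"] - centre["xy"])
    d_plane = abs(pixel["planarity"] - centre["planarity"])
    # Standard SLIC trades colour vs. space via m/S; the third term keeps
    # pixels with similar planarity in the same super-pixel.
    return np.sqrt(d_color**2 + (m * d_space / S)**2 + (w_p * d_plane)**2)
```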
48

Semantic segmentation of terrain and road terrain for advanced driver assistance systems

Gheorghe, I. V. January 2015 (has links)
Modern automobiles, and particularly those with off-road lineage, possess subsystems that can be configured to better negotiate certain terrain types. Different terrain classes amount to different adherence (or surface grip) and compressibility properties that impact vehicle manoeuvrability and should therefore incur a tailored throttle response, suspension stiffness and so on. This thesis explores prospective terrain recognition for a driver assistance system that anticipates terrain response. Recognition of terrain and road terrain is cast as a semantic segmentation task whereby forward driving images or point clouds are pre-segmented into atomic units and subsequently classified. Terrain classes are typically of amorphous spatial extent, containing homogeneous or granularly repetitive patterns. For this reason, colour and texture appearance is the saliency of choice for monocular vision. In this work, colour, texture and surface saliency of atomic units are obtained with a bag-of-features approach. Five terrain classes are considered, namely grass, dirt, gravel, shrubs and tarmac. Since colour can be ambiguous among terrain classes such as dirt and gravel, several texture flavours are explored with scalar and structured output learning in a bid to devise an appropriate visual terrain saliency and predictor combination. Texture variants are obtained using local binary patterns (LBP), filter responses (or textons) and dense key-point descriptors with DAISY. Learning algorithms tested include support vector machine (SVM), random forest (RF) and logistic regression (LR) as scalar predictors, while a conditional random field (CRF) is used for structured output learning. The latter encourages smooth labelling by incorporating the prior knowledge that neighbouring segments with similar saliency are likely segments of the same class. Once a suitable texture representation is devised, the attention shifts from monocular to stereo vision: surface saliency from reconstructed point clouds can be used to enhance terrain recognition. Previous superpixels span corresponding supervoxels in real-world coordinates, and two surface saliency variants are proposed and tested with all predictors, one using the height coordinates of point clouds and the other using fast point feature histograms (FPFH). Upon the realisation that road recognition and terrain recognition can be treated as equivalent problems in urban environments, the most accurate models, consisting of CRFs, are augmented with compositional high-order pattern potentials (CHOPP). This leads to models that strike a good balance between smooth local labelling and global road shape. For urban environments the label set is restricted to road and non-road (or equivalently tarmac and non-tarmac). Experiments are conducted using a proprietary terrain dataset and a public road evaluation dataset.
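As an illustration of the LBP texture flavour mentioned above, a per-superpixel histogram of uniform LBP codes can be computed with scikit-image (a sketch; the thesis' parameters are not given):

```python
# Minimal sketch of the 8-neighbour local binary pattern (LBP) descriptor
# using scikit-image; such histograms feed a bag-of-features pipeline.
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(gray_patch, P=8, R=1.0):
    """Return a normalised histogram of uniform LBP codes for one
    superpixel patch (a 2D grayscale array)."""
    codes = local_binary_pattern(gray_patch, P, R, method="uniform")
    # 'uniform' LBP yields codes in [0, P+1], hence P+2 histogram bins.
    hist, _ = np.histogram(codes, bins=P + 2, range=(0, P + 2), density=True)
    return hist
```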
49

Deep neural networks for semantic segmentation

Bojja, Abhishake Kumar 28 April 2020 (has links)
Segmenting an image into multiple meaningful regions is an essential task in Computer Vision. Deep learning has been highly successful for segmentation, benefiting from the availability of annotated datasets and deep neural network architectures. However, depth-based hand segmentation, an important application area of semantic segmentation, has yet to benefit from rich, large-scale datasets. In addition, while deep methods provide robust solutions, they are often not efficient enough for low-powered devices. In this thesis, we focus on these two problems. To tackle the lack of rich data, we propose an automatic method for generating high-quality annotations and introduce a large-scale hand segmentation dataset. By exploiting the visual cues given by an RGBD sensor and a pair of colored gloves, we automatically generate dense annotations for two-hand segmentation. Our automatic annotation method lowers the cost and complexity of creating high-quality datasets and makes it easy to expand the dataset in the future. To reduce the computational requirements and allow real-time segmentation on low-power devices, we propose a new representation and architecture for deep networks that predict segmentation maps based on Voronoi Diagrams. Voronoi Diagrams split space into discrete regions based on proximity to a set of points, making them a powerful representation of regions, which we can then use to represent our segmentation outcomes. Specifically, we propose to estimate the location and class for these sets of points, which are then rasterized into an image. Notably, we use a differentiable definition of the Voronoi Diagram based on the softmax operator, enabling its use as a decoder layer in an end-to-end trainable network. As rasterization can take place at any given resolution, our method especially excels at rendering high-resolution segmentation maps from a low-resolution image. We believe that our new HandSeg dataset will open new frontiers in hand segmentation research, and our cost-effective automatic annotation pipeline can benefit other relevant labeling tasks. Our newly proposed segmentation network enables high-quality segmentation representations that are not practically possible on low-power devices using existing approaches. / Graduate
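A hypothetical PyTorch sketch of a softmax-based differentiable Voronoi rasterisation, in the spirit described above; the temperature, distance metric, and output head are assumptions, not the thesis' implementation:

```python
# Hypothetical soft-Voronoi rasteriser: each pixel takes a soft
# assignment over sites weighted by negative squared distance, so the
# rasterisation is differentiable w.r.t. site locations and logits.
import torch

def soft_voronoi(sites_xy, sites_logits, H, W, tau=0.01):
    """sites_xy: (N, 2) coordinates in [0, 1]; sites_logits: (N, C)
    class scores; returns a (C, H, W) class-probability map."""
    ys = torch.linspace(0, 1, H)
    xs = torch.linspace(0, 1, W)
    grid = torch.stack(torch.meshgrid(ys, xs, indexing="ij"), dim=-1)  # (H, W, 2)
    # Squared distance from every pixel to every site: (H, W, N)
    d2 = ((grid[:, :, None, :] - sites_xy[None, None, :, :]) ** 2).sum(-1)
    weights = torch.softmax(-d2 / tau, dim=-1)        # soft nearest-site assignment
    class_map = torch.einsum("hwn,nc->chw", weights, torch.softmax(sites_logits, -1))
    return class_map
```

Because the grid resolution (H, W) is independent of the number of sites, the same set of predicted sites can be rasterised at any output resolution, which matches the high-resolution rendering property claimed above.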
50

AUTOMATIC ASSESSMENT OF BURN INJURIES USING ARTIFICIAL INTELLIGENCE

Daniela Chanci Arrubla (11154033) 20 July 2021 (has links)
Accurate assessment of burn injuries is critical for the correct management of such wounds. Depending on the total body surface area affected by the burn and the severity of the injury, the optimal treatment and the surgical requirements are selected. However, such assessment is considered a clinical challenge. To address it, this thesis proposes and implements an automatic framework that segments the burn using RGB images and classifies the injury by severity using ultrasound images. With this framework, the conventional assessment approach, which relies exclusively on a physical and visual examination of the injury performed by medical practitioners, could be complemented and supported, yielding accurate results. Ultrasound is a noninvasive imaging modality that gives access to internal body structures not visible during the typical physical examination of the burn, providing complementary and useful information. The semantic segmentation module of the proposed approach was evaluated through one experiment. Similarly, the classification module was evaluated through two experiments, the second of which assessed the effect of incorporating texture features as extra features for the classification task. Experimental results and evaluation metrics demonstrate the satisfactory performance of the proposed framework on the segmentation and classification problems. This work therefore acts as a first step towards a Computer-Aided Diagnosis and Detection system for burn injury assessment.
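As an example of the kind of texture features the second experiment might add, grey-level co-occurrence matrix (GLCM) statistics can be computed with scikit-image (a sketch; the thesis' actual feature set is not specified in the abstract):

```python
# Illustrative sketch of extracting GLCM texture features from an
# ultrasound image to augment a classifier's input; distances, angles,
# and the chosen properties are assumptions.
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(gray_ultrasound):
    """gray_ultrasound: 2D uint8 image. Returns a 4-vector of mean
    contrast, homogeneity, energy, and correlation."""
    glcm = graycomatrix(gray_ultrasound, distances=[1],
                        angles=[0, np.pi / 2], levels=256,
                        symmetric=True, normed=True)
    props = ["contrast", "homogeneity", "energy", "correlation"]
    return np.array([graycoprops(glcm, p).mean() for p in props])
```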
