241 |
Computer Vision for Quarry Applications
Christie, Gordon A. 11 June 2013 (has links)
This thesis explores the use of computer vision to facilitate three different processes of a quarry's operation. The first is the blasting process, in which operators determine where to drill in order to execute an efficient and safe blast. Having an operator manually determine the drilling angles and positions can lead to inefficient and dangerous blasts. With two cameras oriented vertically and separated by a fixed baseline, Structure from Motion techniques can be used to create a scaled 3D model of a bench, which can then be analyzed to provide operators with borehole locations and drilling angles relative to fixed reference targets.
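As a minimal sketch of the scale-recovery step this paragraph describes, assuming an SfM pipeline has already produced the two camera centers and an unscaled point cloud (all names here are illustrative, not the thesis implementation):

import numpy as np

def apply_metric_scale(points, cam1_center, cam2_center, baseline_m):
    # An SfM reconstruction is only defined up to an unknown global scale.
    # The known physical distance between the two rigidly mounted cameras
    # fixes that scale: rescale so the reconstructed baseline matches it.
    reconstructed_baseline = np.linalg.norm(cam2_center - cam1_center)
    scale = baseline_m / reconstructed_baseline
    return points * scale

# Illustrative use: cameras 0.5 m apart, reconstruction in arbitrary units.
cloud = np.random.rand(1000, 3)
scaled = apply_metric_scale(cloud, np.zeros(3), np.array([2.0, 0.0, 0.0]), 0.5)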
The second process explored is the crushing process, where the rocks pass through different crushers that reduce them to smaller sizes. The crushed rocks are then dropped onto a moving conveyor belt. The maximum dimension of the rocks exiting each crusher should not exceed a size threshold specific to that crusher. This thesis presents a 2D vision system capable of estimating the size distribution of the rocks by attempting to segment the rocks in each image. The size distribution, based on each rock's maximum dimension, is estimated by measuring that dimension in the image in pixels and converting it to inches.
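A hedged sketch of that measurement step, assuming a binary segmentation mask per frame and a known pixels-per-inch calibration for the belt; approximating the maximum dimension by the longer side of a minimum-area rotated rectangle is our assumption, not necessarily the thesis method:

import cv2
import numpy as np

def rock_max_dimensions_inches(mask, pixels_per_inch):
    # mask: binary image where each connected blob is one segmented rock.
    # The maximum dimension of a rock is approximated by the longer side
    # of its minimum-area rotated bounding rectangle.
    contours, _ = cv2.findContours(mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    sizes = []
    for contour in contours:
        (_, _), (w, h), _ = cv2.minAreaRect(contour)
        sizes.append(max(w, h) / pixels_per_inch)
    return sizes  # one estimated maximum dimension per rock, in inches

# A histogram of these sizes over many frames gives the size distribution
# and flags rocks exceeding the crusher-specific threshold.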
The third process explored is stockpiling, where the final product is piled up to form stockpiles. For inventory purposes, operators often estimate the size of a stockpile manually. This thesis presents a vision system capable of providing a more accurate estimate of the stockpile's size by using Structure from Motion techniques to create a 3D reconstruction. User interaction helps identify the points in the resulting point cloud that belong to the stockpile, which are then used to estimate the volume. / Master of Science
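One way the volume estimate could look, as a sketch only: given the user-selected stockpile points (in meters, z up), close the pile at its base and take the convex hull volume. The flat-ground assumption is ours:

import numpy as np
from scipy.spatial import ConvexHull

def stockpile_volume(surface_points, ground_z=0.0):
    # surface_points: (N, 3) stockpile points picked from the point cloud.
    # Projecting each point onto the ground plane closes the pile at its
    # base; the convex hull volume then gives a repeatable estimate
    # (it slightly over-estimates concave piles).
    base = surface_points.copy()
    base[:, 2] = ground_z
    hull = ConvexHull(np.vstack([surface_points, base]))
    return hull.volume  # cubic meters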
|
242 |
Experiments in Image Segmentation for Automatic US License Plate Recognition
Diaz Acosta, Beatriz 09 July 2004 (has links)
License plate recognition/identification (LPR/I) applies image processing and character recognition technology to identify vehicles by automatically reading their license plates. In the United States, however, each state has its own standard-issue plates, plus several optional styles, referred to as special license plates or varieties. There is a clear absence of standardization, and multi-colored, complex backgrounds are becoming more frequent in license plates. Commercially available optical character recognition (OCR) systems generally fail when confronted with textured or poorly contrasted backgrounds, creating the need for proper image segmentation prior to classification. The image segmentation problem in LPR is examined in two stages: license plate region detection and license plate character extraction from the background. Three different approaches for license plate detection in a scene are presented: region distance from eigenspace, border location by edge detection and the Hough transform, and text detection by spectral analysis. The experiments for character segmentation involve the RGB, HSV/HSI and 1976 CIE L*a*b* color spaces as well as their Karhunen-Loève transforms. The segmentation techniques applied include multivariate hierarchical agglomerative clustering and minimum-variance color quantization. The trade-off between accuracy and computational expense is used to select a final reliable algorithm for license plate detection and character segmentation. The spectral analysis approach, together with color quantization in the Karhunen-Loève-transformed L*a*b* space, is found experimentally to be the best alternative for the two identified image segmentation stages of US license plate recognition. / Master of Science
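A sketch of the winning combination for character segmentation, under our own parameter choices (PCA standing in for the Karhunen-Loève transform, k-means for minimum-variance quantization):

import numpy as np
from skimage import color
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def quantize_plate(rgb_plate, n_colors=4):
    # Convert to 1976 CIE L*a*b*, decorrelate the channels with a PCA
    # (Karhunen-Loeve) transform, then quantize to a handful of colors so
    # characters separate from a textured or multi-colored background.
    lab = color.rgb2lab(rgb_plate).reshape(-1, 3)
    kl = PCA(n_components=3).fit_transform(lab)
    labels = KMeans(n_clusters=n_colors, n_init=10).fit_predict(kl)
    return labels.reshape(rgb_plate.shape[:2])  # per-pixel color cluster

# The cluster covering the darkest L* values would typically be kept as
# the character layer and passed to OCR.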
|
243 |
COUNTING SORGHUM LEAVES FROM RGB IMAGES BY PANOPTIC SEGMENTATION
Ian Ostermann (15321589) 19 April 2023 (has links)
<p dir="ltr">Meeting the nutritional requirements of an increasing population in a changing climate is the foremost concern of agricultural research in recent years. A solution to some of the many questions posed by this existential threat is breeding crops that more efficiently produce food with respect to land and water use. A key aspect to this optimization is geometric aspects of plant physiology such as canopy architecture that, while based in the actual 3D structure of the organism, does not necessarily require such a representation to measure. Although deep learning is a powerful tool to answer phenotyping questions that do not require an explicit intermediate 3D representation, training a network traditionally requires a large number of hand-segmented ground truth images. To bypass the enormous time and expense of hand- labeling datasets, we utilized a procedural sorghum image pipeline from another student in our group that produces images similar enough to the ground truth images from the phenotyping facility that the network can be directly used on real data while training only on automatically generated data. The synthetic data was used to train a deep segmentation network to identify which pixels correspond to which leaves. The segmentations were then processed to find the number of leaves identified in each image to use for the leaf-counting task in high-throughput phenotyping. Overall, our method performs comparably with human annotation accuracy by correctly predicting within a 90% confidence interval of the true leaf count in 97% of images while being faster and cheaper. This helps to add another expensive- to-collect phenotypic trait to the list of those that can be automatically collected.</p>
|
244 |
Using deep learning for IoT-enabled smart camera: a use case of flood monitoring
Mishra, Bhupesh K., Thakker, Dhaval, Mazumdar, S., Simpson, Sydney, Neagu, Daniel 15 July 2019 (has links)
In recent years, deep learning has been increasingly used for applications such as object analysis, feature extraction and image classification. This paper explores the use of deep learning in a flood monitoring application in the context of an EC-funded project, Smart Cities and Open Data REuse (SCORE). IoT sensors for detecting blocked gullies and drainages are notoriously hard to build, hence we propose a novel technique that utilises deep learning to build an IoT-enabled smart camera to address this need. In our work, we apply deep learning to classify drain blockage images and develop an effective image classification model for different severities of blockage. Using this model, an image can be analysed and classified into a number of classes depending on its context. In building such a model, we explored the use of filtering, in the form of segmentation, as an approach to increase classification accuracy by concentrating only on the area of interest within the image. Segmentation is applied at the data pre-processing stage of our application, before training. We used crowdsourced, publicly available images to train and test our model. Our model with segmentation showed an improvement in classification accuracy. / Research presented in this paper is funded by the European Commission Interreg project Smart Cities and Open Data REuse (SCORE).
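The filtering-by-segmentation idea could be sketched as follows; segment_drain and severity_model are hypothetical placeholders, not the paper's API:

import numpy as np

def restrict_to_region_of_interest(image, mask):
    # Zero out everything outside the segmented drain/gully region so the
    # classifier sees only the area of interest during training.
    return image * mask[..., None]

# Hypothetical pipeline (names are assumptions for illustration):
#   roi = restrict_to_region_of_interest(img, segment_drain(img))
#   severity = severity_model.predict(roi[None, ...])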
|
245 |
Image Analysis for Sliding Motility of Clostridium perfringens
Chopdekar, Nidhi 07 May 2024 (has links)
This research investigates the sliding motility of Clostridium perfringens by employing machine learning-based image segmentation and tracking to extract key quantitative characteristics of the bacteria's movement. C. perfringens cells maintain end-to-end connections after cell division and form elongated chains that expand in a one-dimensional fashion. Cells in an elongating chain push against each other to achieve a sliding movement at potentially high speeds. However, these chains are susceptible to breakage due to stress accumulated from rapid growth, which would undermine the efficiency of this passive sliding motility. Utilizing AI-powered image analysis, this research aims to obtain a detailed quantification of these dynamics and generate data for future mechanistic studies of sliding motility. Results from this work highlight the effectiveness of machine learning in detecting individual cells in microscopy images. The accurately segmented cells enable enhanced tracking and detailed analysis of bacterial motility, yielding useful quantitative data such as the growth rate, velocity, and division frequency of C. perfringens. / Master of Science / This research project explores the movement of Clostridium perfringens, a bacterium often responsible for food poisoning, by using machine learning techniques to observe and analyze how each bacterial cell moves within its colony. These bacteria form long, chain-like structures that help them move more rapidly. However, these chains can break when they become too long and undergo too much stress. By applying artificial intelligence-based tools to automatically detect and track cells in time-lapse microscopy videos, the project provides useful data on how these bacteria slide, grow and divide. These data will help us understand chain-based bacterial sliding in C. perfringens and its underlying mechanism.
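A sketch of one derived quantity, the chain growth rate, assuming tracked chain lengths over time; the exponential-growth model follows from every cell in the chain elongating, while the fitting choice here is ours:

import numpy as np

def chain_growth_rate(times_min, chain_lengths_um):
    # Total chain length grows exponentially when all cells elongate;
    # the rate is the slope of log(length) versus time.
    rate, _ = np.polyfit(times_min, np.log(chain_lengths_um), 1)
    return rate  # per minute; doubling time is ln(2) / rate

# Illustrative data: a chain doubling about every 30 minutes.
t = np.array([0.0, 10.0, 20.0, 30.0, 40.0])
lengths = 5.0 * 2.0 ** (t / 30.0)
print(np.log(2) / chain_growth_rate(t, lengths))  # ~30 minutes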
|
246 |
Addressing Occlusion in Panoptic Segmentation
Sarkaar, Ajit Bhikamsingh 20 January 2021 (has links)
Visual recognition tasks have witnessed vast improvements in performance since the advent of deep learning. Despite the gains in performance, image understanding algorithms are still not completely robust to partial occlusion. In this work, we propose a novel object classification method based on compositional modeling and explore its effect in the context of the newly introduced panoptic segmentation task. The panoptic segmentation task combines both semantic and instance segmentation to perform labelling of the entire image. The novel classification method replaces the object detection pipeline in UPSNet, a Mask R-CNN based design for panoptic segmentation. We also discuss an issue with the segmentation mask prediction of Mask R-CNN that affects overlapping instances. We perform extensive experiments and showcase results on the complex COCO and Cityscapes datasets. The novel classification method shows promising results for object classification on occluded instances in complex scenes. / Master of Science / Visual recognition tasks have witnessed vast improvements in performance since the advent of deep learning. Despite making significant improvements, algorithms for these tasks still do not perform well at recognizing partially visible objects in the scene. In this work, we propose a novel object classification method that uses compositional models to perform part based detection. The method first looks at individual parts of an object in the scene and then makes a decision about its identity. We test the proposed method in the context of the recently introduced panoptic segmentation task. The panoptic segmentation task combines both semantic and instance segmentation to perform labelling of the entire image. The novel classification method replaces the object detection module in UPSNet, a Mask R-CNN based algorithm for panoptic segmentation. Extensive experiments and evaluation show that the novel classification method gives promising results for object classification on occluded instances in complex scenes.
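A toy illustration, not the thesis implementation, of why part-based (compositional) scores resist occlusion: aggregating only the best-supported parts keeps an object hypothesis alive when some parts score near zero:

import numpy as np

def compositional_score(part_scores, top_k=3):
    # part_scores: detection confidences for the parts of one object
    # hypothesis. Averaging only the best-supported parts keeps the object
    # score high even when occluded parts score near zero.
    return float(np.sort(np.asarray(part_scores))[-top_k:].mean())

occluded_car = [0.90, 0.85, 0.80, 0.05, 0.10]  # two of five parts hidden
print(compositional_score(occluded_car))        # still a confident 0.85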
|
247 |
Pruning of U-Nets: For Faster and Smaller Machine Learning Models in Medical Image Segmentation
Hassler, Ture January 2024 (has links)
Accurate medical image segmentation is crucial for safely and effectively administering radiation therapy in cancer treatment. State-of-the-art methods for automatic segmentation of 3D images are currently based on the U-net machine learning architecture. Current U-net models are large, often containing millions of parameters; however, their size can be decreased by removing parts of the model, in what is called pruning. One algorithm, simultaneous training and pruning (STAMP), has been shown capable of reducing model size by upwards of 80% while keeping similar or higher levels of performance on medical image segmentation tasks. This thesis investigates the impact of using the STAMP algorithm to reduce model size and inference time for medical image segmentation of 3D images, using one MRI and two CT datasets. Surprisingly, we show that pruning convolutional filters randomly achieves performance comparable to, if not better than, STAMP, provided that the filters are always removed from the largest parts of the U-net. Inspired by these results, a modified "Flat U-net" is proposed, in which an equal number of convolutional filters is used in all parts of the U-net, similar to what was obtained after pruning with our simplified pruning algorithm. The modified U-net achieves test Dice scores similar to both a regular U-net and the STAMP pruning algorithm on multiple datasets, while avoiding pruning altogether. In addition, the proposed modification reduces the model size by more than a factor of 12, and the number of computations by around 35%, compared to a normal U-net with the same number of input-layer convolutional filters.
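A back-of-the-envelope check on why a flat filter allocation shrinks the model so much, counting only 3x3 convolution weights in the encoder under illustrative channel widths (two convolutions per level; the thesis's exact widths and decoder layers may differ, so this only conveys the flavor of the >12x overall reduction):

def conv_params(c_in, c_out, k=3):
    # 3x3 convolution weights plus biases.
    return c_in * c_out * k * k + c_out

def encoder_params(widths, in_channels=1):
    # Two convolutions per resolution level, as in the standard U-net.
    total, c = 0, in_channels
    for w in widths:
        total += conv_params(c, w) + conv_params(w, w)
        c = w
    return total

standard = encoder_params([32, 64, 128, 256, 512])  # widths double per level
flat = encoder_params([32, 32, 32, 32, 32])         # equal widths everywhere
print(standard / flat)  # ~56x for the encoder alone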
|
248 |
Segmentation d'image par intégration itérative de connaissances / Image segmentation by iterative knowledge integration
Chaibou salaou, Mahaman Sani 02 July 2019 (has links)
Image processing has been a very active area of research for years. The interpretation of images is one of its most important branches because of its socio-economic and scientific applications. However, interpretation, like most image processing pipelines, requires a segmentation phase to delimit the regions to be analyzed. Interpretation is the process that gives meaning to the regions detected by the segmentation phase, so the interpretation phase can only analyze the regions that segmentation detects. Although the ultimate objective of automatic interpretation is to produce the same result as a human, the logic of classical techniques in this field does not match that of human interpretation. Most conventional approaches to this task separate the segmentation phase from the interpretation phase: the images are first segmented, and the detected regions are then interpreted. In addition, conventional segmentation techniques scan images sequentially, in the order in which the pixels are stored. This order does not necessarily reflect the way an expert explores an image. Indeed, a human usually starts by scanning the image for possible regions of interest. When he finds a potential area, he analyzes it from three points of view, trying to recognize what object it is. First, he analyzes the area based on its physical characteristics. Then he considers the region's surrounding areas, and finally he zooms out to the whole image in order to have a wider view while considering the information local to the region and that of its neighbors. In addition to information gathered directly from the physical characteristics of the image, the expert uses several sources of information that he merges to interpret the image. These sources include knowledge acquired through professional experience, known constraints between the objects in this type of image, and so on. The idea of the approach proposed in this manuscript is that simulating the visual activity of the expert would allow better agreement between the results of automatic interpretation and those of the expert. From this analysis of the expert's behavior, we retain three important aspects of the image interpretation process that we model in this work: 1. Unlike what most segmentation techniques suggest, the segmentation process is not necessarily sequential, but rather a series of decisions, each of which may question the results of its predecessors; the main objective is to produce the best possible classification of regions, and interpretation must not be limited by segmentation. 2. The process of characterizing an area of interest is not one-way: the expert can go from a local view restricted to the region of interest to a wider view including its neighbors, and back again. 3. During the decision on a region's characterization, several information sources are gathered and merged for greater certainty. The proposed model of these three levels places particular emphasis on the knowledge used and the reasoning behind image segmentation.
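A toy sketch of the third modeled aspect, fusing several information sources into one decision about a region's label; the sources, scores, and weights below are illustrative assumptions, not the thesis's knowledge model:

import numpy as np

def fuse_evidence(scores_by_source, weights):
    # scores_by_source: {source: per-label score vector}. Each source (local
    # appearance, neighborhood context, domain knowledge) scores every
    # candidate label; a weighted product acts as a soft conjunction.
    fused = None
    for source, scores in scores_by_source.items():
        s = np.asarray(scores, dtype=float) ** weights[source]
        fused = s if fused is None else fused * s
    return fused / fused.sum()  # normalized belief over candidate labels

evidence = {
    "appearance":   [0.6, 0.3, 0.1],
    "neighborhood": [0.5, 0.4, 0.1],
    "knowledge":    [0.7, 0.2, 0.1],
}
weights = {"appearance": 1.0, "neighborhood": 0.5, "knowledge": 0.8}
print(fuse_evidence(evidence, weights))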
|
249 |
Vícetřídá segmentace 3D lékařských dat pomocí hlubokého učení / Multiclass segmentation of 3D medical data using deep learning
Slunský, Tomáš January 2019 (has links)
This Master's thesis deals with multiclass image segmentation using convolutional neural networks. The theoretical part focuses on image segmentation, covering the basic principles of neural networks and several segmentation approaches. In the practical part, the U-net architecture is chosen, described in more detail, and applied to a medical dataset. The processing pipeline for three-dimensional data is described, along with the data pre-processing methods applied for multiclass segmentation. The final part of the thesis evaluates the results.
|
250 |
GAN-Based Synthesis of Brain Tumor Segmentation Data: Augmenting a dataset by generating artificial images
Foroozandeh, Mehdi January 2020 (has links)
Machine learning applications within medical imaging often suffer from a lack of data, as a consequence of restrictions that hinder the free distribution of patient information. In this project, GANs (generative adversarial networks) are used to generate data synthetically, in an effort to circumvent this issue. The GAN framework PGAN is trained on the brain tumor segmentation dataset BraTS to generate new, synthetic brain tumor masks with the same visual characteristics as the real samples. The image-to-image translation network SPADE is subsequently trained on the image pairs in the real dataset, to learn a transformation from segmentation masks to brain MR images, and is in turn used to map the artificial segmentation masks generated by PGAN to corresponding artificial MR images. The images generated by these networks form a new, synthetic dataset, which is used to augment the original dataset. Different quantities of real and synthetic data are then evaluated in three different brain tumor segmentation tasks, where the image segmentation network U-Net is trained on this data to segment (real) MR images into the classes in question. The final segmentation performance of each training instance is evaluated over test data from the real dataset with the Weighted Dice Loss metric. The results indicate a slight increase in performance across all segmentation tasks evaluated in this project when some quantity of synthetic images is included. However, the differences were largest when the experiments were restricted to using only 20% of the real data, and less significant when the full dataset was made available. A majority of the generated segmentation masks appear visually convincing to an extent (although somewhat noisy with regard to the intra-tumoral classes), while a relatively large proportion appear heavily noisy and corrupted. However, the translation of segmentation masks to MR images via SPADE proved more reliable and consistent.
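For reference, a sketch of the per-class soft Dice score underlying a weighted Dice evaluation like the one described; the class-weight scheme here is an illustrative assumption:

import numpy as np

def soft_dice(pred, target, eps=1e-6):
    # pred, target: binary or probabilistic masks for a single class.
    intersection = (pred * target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def weighted_dice_loss(preds, targets, class_weights):
    # One mask pair per tumor class; the weighted average of per-class Dice
    # scores is turned into a loss as (1 - score).
    scores = [soft_dice(p, t) for p, t in zip(preds, targets)]
    return 1.0 - np.average(scores, weights=class_weights)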
|