11

Using Mask R-CNN for Instance Segmentation of Eyeglass Lenses / Användning av Mask R-CNN för instanssegmentering av glasögonlinser

Norrman, Marcus, Shihab, Saad January 2021 (has links)
This thesis investigates the performance of Mask R-CNN when utilizing transfer learning on a small dataset. The aim was to instance segment eyeglass lenses as accurately as possible from self-portrait images. Five different models were trained, where the key difference was the types of eyeglasses the models were trained on. The eyeglasses were grouped into three types: fully rimmed, semi-rimless, and rimless glasses. In total, 1550 images were used for training, validation, and testing. The models' performance was evaluated using TensorBoard training data and mean Intersection over Union (mIoU) scores. No major differences in performance were found among the four models that grouped all three types of glasses into one class; their mIoU scores ranged from 0.913 to 0.94, whereas the model with one class for each group of glasses performed worse, with an mIoU of 0.85. The thesis shows that strong instance segmentation results can be achieved with a limited dataset when taking advantage of transfer learning.
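For context, the mIoU figures above compare predicted lens masks against ground truth. Below is a minimal sketch of that metric, assuming paired binary masks; the exact matching protocol is not given in the abstract.

```python
import numpy as np

def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection over Union between two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter) / float(union) if union > 0 else 0.0

def mean_iou(pred_masks, gt_masks) -> float:
    """Mean IoU over paired predicted and ground-truth masks."""
    return float(np.mean([mask_iou(p, g) for p, g in zip(pred_masks, gt_masks)]))
```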
12

Learning to Measure Invisible Fish

Gustafsson, Stina January 2022 (has links)
In recent years, the EU has observed a decrease in the stocks of certain fish species due to unrestricted fishing. To combat the problem, many fisheries are investigating how to automatically estimate catch size and composition using sensors onboard the vessels. Yet measuring the size of fish in marine imagery is a difficult task: the images generally suffer from complex conditions caused by cluttered fish, motion blur, and dirty sensors. In this thesis, we propose a novel method for automatic measurement of fish size that can measure both visible and occluded fish. We use a Mask R-CNN to segment the visible regions of the fish, and then fill in the shape of the occluded fish using a U-Net. We train the U-Net to perform shape completion in a semi-supervised manner by simulating occlusions on an open-source fish dataset. Unlike previous shape completion work, we teach the U-Net when to fill in the shape and when not to, by including a small portion of fully visible fish in the training input. Our results show that the proposed method succeeds in filling in the shape of synthetically occluded fish as well as some of the cluttered fish in real marine imagery. We achieve an mIoU score of 93.9% on 1,000 synthetic test images and present qualitative results on real images captured onboard a fishing vessel. The qualitative results show that the U-Net can fill in the shapes of lightly occluded fish, but struggles when the tail fin is hidden and only parts of the fish body are visible. This task is difficult even for a human, and performance could perhaps be increased by including fish appearance in the shape completion task. The simulation-to-reality gap could perhaps also be reduced by fine-tuning the U-Net on some real occlusions, which could improve performance on the heavy occlusions in real marine imagery.
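The occlusion simulation used to create shape-completion training pairs could look roughly like this; the rectangular occluder, its size, and the fraction of fully visible fish are assumptions, since the abstract does not specify them.

```python
import numpy as np

def simulate_occlusion(mask: np.ndarray, keep_prob: float = 0.1, rng=None):
    """Build a (partial mask, full mask) training pair for the U-Net.

    With probability keep_prob the fish stays fully visible, teaching the
    network when *not* to fill in; otherwise a random rectangle is erased
    to mimic an occluding fish. Shapes and sizes are illustrative only.
    """
    rng = rng or np.random.default_rng()
    partial = mask.copy()
    if rng.random() > keep_prob:
        h, w = mask.shape
        y0 = int(rng.integers(0, h // 2))
        x0 = int(rng.integers(0, w // 2))
        partial[y0:y0 + h // 3, x0:x0 + w // 3] = 0  # erased region
    return partial, mask  # network input and its completion target
```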
13

Depth-Aware Deep Learning Networks for Object Detection and Image Segmentation

Dickens, James 01 September 2021 (has links)
The rise of convolutional neural networks (CNNs) in computer vision has occurred in tandem with the advancement of depth sensing technology. Depth cameras yield two-dimensional arrays storing, at each pixel, the distance from the sensor to objects and surfaces in the scene; aligned with a regular color image, these form so-called RGBD images. Inspired by prior models in the literature, this work develops a suite of RGBD CNN models to tackle the challenging tasks of object detection, instance segmentation, and semantic segmentation. Prominent architectures for object detection and image segmentation are modified to incorporate dual-backbone approaches that input RGB and depth images, combining features from both modalities through novel fusion modules. For each task, the models developed are competitive with state-of-the-art RGBD architectures. In particular, the proposed RGBD object detection approach achieves 53.5% mAP on the SUN RGBD 19-class object detection benchmark, while the proposed RGBD semantic segmentation architecture yields 69.4% accuracy on the SUN RGBD 37-class semantic segmentation benchmark. An original 13-class RGBD instance segmentation benchmark is introduced for the SUN RGBD dataset, on which the proposed model achieves 38.4% mAP. Additionally, an original depth-aware panoptic segmentation model is developed, trained, and tested on new benchmarks conceived for the NYUDv2 and SUN RGBD datasets. These benchmarks offer researchers a baseline for the task of RGBD panoptic segmentation on these datasets, where the novel depth-aware model outperforms a comparable RGB counterpart.
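The dual-backbone idea can be illustrated with a generic concatenate-and-project module; this is a sketch of the common pattern, not the thesis's novel fusion modules, whose internals the abstract does not describe.

```python
import torch
import torch.nn as nn

class FusionModule(nn.Module):
    """Fuse same-resolution feature maps from an RGB and a depth backbone."""

    def __init__(self, channels: int):
        super().__init__()
        self.project = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, rgb_feat: torch.Tensor, depth_feat: torch.Tensor) -> torch.Tensor:
        # Concatenate along the channel axis, then project back down
        return self.project(torch.cat([rgb_feat, depth_feat], dim=1))
```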
14

Thermal Imaging-Based Instance Segmentation for Automated Health Monitoring of Steel Ladle Refractory Lining / Infraröd-baserad Instanssegmentering för Automatiserad Övervakning av Eldfast Murbruk i Stålskänk

Bråkenhielm, Emil, Drinas, Kastrati January 2022 (has links)
Equipment and machines can be exposed to very high temperatures in the steel mill industry. One particularly critical piece of equipment is the ladle used to hold and pour molten iron into moulds. A refractory lining serves as an insulation layer between the outer steel shell and the molten iron to protect the ladle from the hot iron. Over time, or if the lining is not completely cured, the lining wears out or can potentially fail. Such a scenario can lead to a breakout of molten iron, which can damage equipment and, in the worst case, injure workers. Previous work analyzes how critical areas can be identified proactively: using thermal imaging, failing spots in the lining show up as high-temperature areas on the outside steel shell, the idea being that the outside temperature corresponds to the thickness of the insulating lining. These spots are detected when temperatures over a given threshold are registered within the thermal camera's field of view. The images must then be manually analyzed over time to follow the progression of a detected spot. The existing solution is also prone to background noise from other hot objects.  This thesis proposes an initial step toward automated monitoring of the health of refractory lining in steel ladles. The report investigates the use of instance segmentation to isolate the ladle from its background, thus reducing false alarms and background noise in an autonomous monitoring setup. Model training is based on Mask R-CNN applied to our own thermal images, with pre-trained weights from visual images. Detection is done on two classes: open or closed ladle. The approach proved reasonably successful on a small dataset of 1000 thermal images. Different models were trained with and without augmentation, pre-trained weights, and multi-phase fine-tuning. The highest mAP of 87.5% was achieved by a pre-trained model with image augmentation and without fine-tuning. Finally, though not tested in production, temperature readings could be extracted from the segmented ladle, decreasing the risk of false alarms from background noise.
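The final step, reading temperatures only inside the segmented ladle, can be sketched as a masked read-out of the thermal image; the hotspot threshold below is a placeholder, not a value from the thesis.

```python
import numpy as np

def ladle_temperature_stats(thermal: np.ndarray, ladle_mask: np.ndarray,
                            hotspot_threshold: float = 300.0) -> dict:
    """Summarize temperatures inside the predicted ladle mask only.

    thermal holds per-pixel temperature readings; ladle_mask is the boolean
    instance mask from the segmentation model. Background pixels are ignored,
    which is what suppresses false alarms from other hot objects.
    """
    pixels = thermal[ladle_mask]
    return {
        "max": float(pixels.max()),
        "mean": float(pixels.mean()),
        "hotspot_fraction": float((pixels > hotspot_threshold).mean()),
    }
```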
15

Point clouds in the application of Bin Picking

Anand, Abhijeet January 2023 (has links)
Automatic bin picking is a well-known problem in industrial automation and computer vision, where a robot picks an object from a bin and places it somewhere else. Research to improve contemporary solutions has been ongoing for many years. With camera technology advancing rapidly and fast computation resources becoming available, solving this problem with deep learning has become a current interest for several researchers. This thesis leverages current state-of-the-art deep learning methods for 3D instance segmentation and point cloud registration and combines them into a more performant and robust bin picking solution. The problem becomes complex when the bin contains identical objects with heavy occlusion. To address it, 3D instance segmentation is performed with the Fast Point Cloud Clustering (FPCC) method to detect and locate the objects in the bin, and an extraction strategy is proposed to choose one predicted instance at a time. In the next step, a point cloud registration technique based on the PointNetLK method is implemented to estimate the pose of the selected object. The implementation is trained, tested, and evaluated on synthetically generated datasets, which also contain several noisy point clouds to imitate real conditions. Real data captured at the company SICK IVP is also tested with the implemented model. It is observed that the 3D instance segmentation can detect and locate the objects in the bin. In a noisy environment, performance degrades as the noise level increases; however, the decrease is found to be modest. Point cloud registration is observed to work best with the full point cloud of the object, compared to point clouds with missing points.
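The step of choosing one predicted instance at a time could be sketched as follows; selecting the instance lying highest in the bin is an assumed heuristic, as the abstract does not state the actual criterion.

```python
import numpy as np

def select_instance(points: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Pick one segmented instance to register and grasp.

    points is an (N, 3) cloud and labels an (N,) array of instance ids from
    the segmentation step, with -1 assumed to mark background points. The
    instance with the largest mean z (highest in the bin) is returned.
    """
    best_id, best_z = None, -np.inf
    for inst_id in np.unique(labels):
        if inst_id < 0:
            continue
        mean_z = points[labels == inst_id, 2].mean()
        if mean_z > best_z:
            best_id, best_z = inst_id, mean_z
    return points[labels == best_id]
```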
16

Strategies for the Characterization and Virtual Testing of SLM 316L Stainless Steel

Hendrickson, Michael Paul 02 August 2023 (has links)
The selective laser melting (SLM) process allows control of unique part form and function characteristics not achievable with conventional manufacturing methods and has thus gained interest in several industries, such as the aerospace and biomedical fields. The fabrication processing parameters selected to manufacture a given part influence the resulting material microstructure and the final mechanical performance of the part. Understanding the process-structure and structure-performance relationships is very important for the design and quality assurance of SLM parts. Image-based analysis methods are commonly used to characterize material microstructures but are very time consuming, traditionally requiring manual segmentation of imaged features. Two Python-based image analysis tools are developed here to automate the instance segmentation of manufacturing defects and subgranular cell features commonly found in SLM 316L stainless steel (SS) for quantitative analysis. A custom-trained mask region-based convolutional neural network (Mask R-CNN) model is used to segment cell features from scanning electron microscopy (SEM) images with an instance segmentation accuracy nearly identical to that of a human researcher, but about four orders of magnitude faster. The defect segmentation tool uses techniques from the OpenCV Python library to identify and segment defect instances from optical images. A melt-pool structure generation tool is also developed to create custom melt-pool geometries from a few user inputs, with the ability to create functionally graded structures for use in a virtual testing framework. This tool allows the study of complex melt-pool geometries and graded structures commonly seen in SLM parts and is applied to three finite element analyses to investigate the effects of different melt-pool geometries on part stress concentrations. / Master of Science / Recent advancements in additive manufacturing (AM) processes like selective laser melting (SLM) are revolutionizing the way many products are manufactured. The geometric form and material microstructure of SLM parts can be controlled by manufacturing settings, referred to as fabrication processing parameters, in ways not previously possible with conventional manufacturing techniques such as machining and casting. The improved geometric control of SLM parts has enabled more complex part geometries as well as significant manufacturing cost savings for some parts. With improved control over the material microstructure, the mechanical performance of SLM parts can be finely tailored and optimized for a particular application. Complex functionally graded materials (FGMs) can also easily be created with the SLM process by varying the fabrication processing parameters spatially within the manufactured part to improve mechanical performance for a desired application. The added control offered by the SLM process has created a need to understand how changes in the fabrication processing parameters affect the material structure and, in turn, how the produced structure affects the mechanical properties of the part. This study presents three different tools developed for the automated characterization of SLM 316L stainless steel (SS) material structures and the generation of realistic material structures for numerical simulation of mechanical performance.
A defect content tool is presented to automatically identify and create binary segmentations of defects in SLM parts, consisting of small air pockets within the volume of the parts, from digital optical images. A machine learning-based instance segmentation tool is also trained on a custom dataset and used to measure the size of nanoscale cell features, unique to 316L SS and some other metal alloys processed with SLM, from scanning electron microscopy (SEM) images. Both tools automate the laborious process of segmenting individual objects of interest from hundreds or thousands of images and are shown to have an accuracy very close to that of manually produced human results. The results are also used to analyze three different samples produced with different fabrication processing parameters, which showed process-structure relationships consistent with other studies. The SLM structure generation tool is developed to create melt-pool structures similar to those seen in SLM parts, arising from the successive melting and solidification of material along the laser scanning path. This structural feature is unique to AM processes such as SLM, and the example test cases investigated in this study show that changes in the melt-pool geometry have a measurable effect, slightly above 10% difference, on the stress and strain response of the material under tensile load. The melt-pool structure generation tool can create complex geometries capable of varying spatially to form FGMs from a few user inputs and, when applied to existing simulation methods for SLM parts, offers improved estimates of their mechanical response.
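The defect segmentation tool's OpenCV-based pipeline might be sketched as threshold-and-label; Otsu thresholding, connected components, and the minimum-area filter are stand-in choices, not necessarily the tool's actual steps or parameters.

```python
import cv2
import numpy as np

def segment_defects(image_path: str, min_area: int = 20) -> list:
    """Segment pore-like defects (dark spots) in an optical cross-section image.

    Illustrative pipeline only: defects appear dark on a bright polished
    surface, so an inverted Otsu threshold isolates them, and connected
    components splits the binary image into per-defect instance masks.
    """
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
    return [labels == i for i in range(1, n)          # label 0 is background
            if stats[i, cv2.CC_STAT_AREA] >= min_area]
```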
17

Online Panoptic Mapping of Indoor Environments : A complete panoptic mapping framework / Realtid Panoptisk Kartläggning av Inomhusmiljöer : Ett komplett panoptiskt kartläggningsramverk

G Sneltvedt, Isak January 2024 (has links)
Replicating a real-world environment is crucial for simulation, computer vision, global and local path planning, and localization. While computer-aided design software is a standard tool for such a task, it may not always be practical or effective. An alternative approach is mapping, which uses sensory input and computer vision technologies to reconstruct the environment. However, developing such software requires knowledge of various fields, making it a challenging task. This thesis takes a deep dive into a state-of-the-art mapping framework and explores potential improvements, providing a foundation for an open-source project. The resulting software can replicate a real-world environment while storing panoptic classification data at the voxel level. Through 3D object matching and probability theory, the mapping software is resilient to object misclassifications and remains consistent across the different instances of observed objects. The final software is designed to be easy to use in other projects by substituting the simulation data with a semantic, instance, or panoptic segmentation model. Additionally, the software integrates functionality that facilitates visualizing diverse classes or a particular class instance.
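The voxel-level storage of panoptic class data with probability-based fusion can be sketched as a per-voxel evidence accumulator; the additive update below is an assumed rule, since the abstract does not give the framework's actual formulation.

```python
import numpy as np

class VoxelLabel:
    """Per-voxel class evidence, updated with every new observation."""

    def __init__(self, num_classes: int):
        self.counts = np.ones(num_classes)  # uniform prior over classes

    def observe(self, class_probs: np.ndarray) -> None:
        # Accumulate soft evidence from one segmented camera frame
        self.counts += class_probs

    @property
    def label(self) -> int:
        # Current most likely class; robust to occasional misclassifications
        return int(self.counts.argmax())
```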
18

Instance segmentation using 2.5D data

Öhrling, Jonathan January 2023 (has links)
Multi-modality fusion is an area of research that has shown promising results in 2D and 3D object detection. However, multi-modality fusion methods have largely not been utilized in instance segmentation. This master's thesis investigated whether multi-modality fusion methods can be applied to deep learning instance segmentation models to improve their performance on multi-modality data. The two fusion methods presented, input extension and feature fusion, were applied to a two-stage instance segmentation model, Mask R-CNN, and a single-stage instance segmentation model, RTMDet. Models were trained on different variations of preprocessed RGBD and ToF data provided by SICK IVP, as well as RGBD data from the publicly available NYUDepth dataset. The thesis concludes that feature fusion can be applied to the Mask R-CNN model to improve performance by 1.8 percentage points (pt.) bounding box mAP and 1.6 pt. segmentation mAP on SICK RGBD, 7.7 pt. bounding box mAP and 7.4 pt. segmentation mAP on ToF, and 7.4 pt. bounding box mAP and 7.4 pt. segmentation mAP on NYUDepth. The RTMDet model saw little to no improvement from the inclusion of depth but had baseline performance similar to the improved Mask R-CNN model that utilized feature fusion. The input extension method saw no performance improvements, as it faced technical implementation limitations.
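The input extension method can be sketched as widening a pretrained backbone's first convolution from three (RGB) to four (RGBD) channels; initializing the depth channel with the mean of the RGB filters is a common recipe and an assumption here, not necessarily the thesis's exact variant.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

def extend_input_to_rgbd(backbone: nn.Module) -> nn.Module:
    """Widen a ResNet stem to accept a fourth (depth) input channel.

    Pretrained RGB filters are copied; the new depth channel is set to
    their per-filter mean so pretrained features remain usable.
    """
    old = backbone.conv1
    new = nn.Conv2d(4, old.out_channels, kernel_size=old.kernel_size,
                    stride=old.stride, padding=old.padding, bias=False)
    with torch.no_grad():
        new.weight[:, :3] = old.weight
        new.weight[:, 3:] = old.weight.mean(dim=1, keepdim=True)
    backbone.conv1 = new
    return backbone

model = extend_input_to_rgbd(resnet50(weights="IMAGENET1K_V2"))
```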
19

Instance Segmentation for Printed Circuit Board (PCB) Component Analysis : Exploring CNNs and Transformers for Component Detection on Printed Circuit Boards

Möller, Oliver January 2023 (has links)
In the intricate domain of printed circuit boards (PCBs), object detection poses unique challenges, particularly given the broad size spectrum of components, ranging from a mere 2 pixels to several thousand pixels within a single high-resolution image, often around 4000x3000 pixels. Such resolutions are atypical in deep learning for computer vision, making the task even more demanding. Further complexity arises from significant intra-class variability and minimal inter-class differences for certain component classes. In this master's thesis, we rigorously evaluated the performance of a CNN-based object detection framework (FCOS) and a transformer model (DETR) for the task. Additionally, by integrating Meta's novel foundation model, "Segment Anything," we extended the pipeline to include instance segmentation. The resulting model is proficient in detecting and segmenting component instances in PCB images, achieving F1 scores of 81% and 82% for the primary component classes of resistors and capacitors, respectively. Aggregated over 18 component classes, the model attains an F1 score of 74%. This study not only underscores the potential of advanced deep learning techniques in PCB analysis but also paves the way for future work in this interdisciplinary convergence of electronics and computer vision.
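The handoff from detector boxes to instance masks can be sketched with the segment-anything Python package; the ViT-B backbone and checkpoint path are assumptions, and the boxes are whatever FCOS or DETR produced upstream.

```python
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Assumed backbone and checkpoint path; boxes come from FCOS or DETR upstream.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")
predictor = SamPredictor(sam)

def boxes_to_masks(image: np.ndarray, boxes: np.ndarray) -> list:
    """Turn detector bounding boxes (xyxy pixel coords) into instance masks."""
    predictor.set_image(image)  # expects an RGB uint8 HxWx3 array
    masks = []
    for box in boxes:
        mask, _, _ = predictor.predict(box=box, multimask_output=False)
        masks.append(mask[0])   # (1, H, W) -> (H, W) boolean mask
    return masks
```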
20

Live Cell Imaging Analysis Using Machine Learning and Synthetic Food Image Generation

Yue Han (18390447) 17 April 2024 (has links)
Live cell imaging is a method to optically investigate living cells using microscopy images. It plays an increasingly important role in biomedical research as well as drug development. In this thesis, we focus on label-free mammalian cell tracking and label-free segmentation of abnormally shaped nuclei in microscopy images. We propose a method that uses a precomputed velocity field to enhance cell tracking performance. Additionally, we propose an ensemble method, Weighted Mask Fusion (WMF), which combines the results of multiple segmentation models with shape analysis to improve the final nuclei segmentation mask. We also propose an edge-aware Mask R-CNN and introduce a hybrid architecture, an ensemble of CNNs and Swin-Transformer Edge Mask R-CNNs (HER-CNN), to accurately segment irregularly shaped nuclei in microscopy images. Our experiments indicate that the proposed methods outperform existing methods for cell tracking and abnormally shaped nuclei segmentation.

While image-based dietary assessment methods reduce the time and labor required for nutrient analysis, the major challenge with deep learning-based approaches is that performance depends heavily on dataset quality. Challenges with food data include high intra-class variance and class imbalance. In this thesis, we present an effective clustering-based training framework named ClusDiff for generating high-quality and representative food images. Experiments showcase our method's effectiveness in enhancing food image generation. Additionally, we conduct a study on using synthetic food images to address the class imbalance issue in long-tailed food classification.
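The Weighted Mask Fusion idea can be sketched as a weighted vote over the ensemble's masks; how the thesis derives the weights from shape analysis is not stated in the abstract, so they are taken as given here.

```python
import numpy as np

def weighted_mask_fusion(masks, weights, threshold: float = 0.5) -> np.ndarray:
    """Fuse binary masks from several segmentation models into one mask.

    masks is a list of (H, W) boolean arrays, one per model; weights is a
    matching list of non-negative model weights. Pixels whose weighted vote
    clears the threshold are kept in the fused mask.
    """
    w = np.asarray(weights, dtype=float)
    w /= w.sum()
    stacked = np.stack([m.astype(float) for m in masks])  # (M, H, W)
    vote = np.tensordot(w, stacked, axes=1)               # weighted average
    return vote >= threshold
```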
