  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

AI-based Quality Inspection for Short-Series Production : Using synthetic dataset to perform instance segmentation for quality inspection / AI-baserad kvalitetsinspektion för kortserieproduktion : Användning av syntetiska dataset för att utföra instans segmentering för kvalitetsinspektion

Russom, Simon Tsehaie January 2022
Quality inspection is an essential part of almost any industrial production line. However, designing customized solutions for defect detection for every product can be costly for the production line. This is especially the case for short-series production, where the production time is limited. That is because collecting and manually annotating the training data takes time. Therefore, a possible method for defect detection using only synthetic training data focused on geometrical defects is proposed in this thesis work. The method is partially inspired by previous related work. The proposed method makes use of an instance segmentation model and pose-estimator. However, this thesis work focuses on the instance segmentation part while using a pre-trained pose-estimator for demonstration purposes. The synthetic data was automatically generated using different data augmentation techniques from a 3D model of a given object. Moreover, Mask R-CNN was primarily used as the instance segmentation model and was compared with a rival model, HTC. The trials show promising results in developing a trainable general-purpose defect detection pipeline using only synthetic data.
22

Alternative Solution to Catastrophical Forgetting on FewShot Instance Segmentation

Álvarez Fernández Del Vallado, Juan January 2021
Video instance segmentation is a rapidly growing research area within the computer vision field. Models for segmentation require data that is already annotated, and producing such annotations can be a daunting task when starting from scratch. Although there are some publicly available datasets for image instance segmentation, they are limited to the application they target. This work proposes a new approach to training an instance segmentation model using transfer learning, notably reducing the need for annotated data. Transferring knowledge from domain A to domain B can result in catastrophic forgetting, leading to an algorithm unable to properly generalize and remember the previous knowledge acquired at the initial domain. This problem is studied and a solution is proposed based on data transformations applied precisely at the process of transferring knowledge to the target domain, following the empirical research method and using publicly available video instance segmentation datasets as resources for the experiments. Conclusions show there is a relationship between the data transformations and the ability to generalize across both domains. / Segmentering av videointervjuer är ett snabbt växande forskningsområde inom datorseende. Modeller för segmentering kräver data som redan är annoterade, vilket kan vara en krävande uppgift när man börjar från början. Även om det finns några offentligt tillgängliga datamängder för bildinstanssegmentering är de begränsade till den tillämpning de är inriktade på. I detta arbete föreslås en ny metod för att träna en modell för instanssegmentering med hjälp av överföringsinlärning, vilket framför allt minskar behovet av annoterade data. Överföring av kunskap från domän A till domän B kan resultera i katastrofal glömska, vilket leder till att en algoritm inte kan generalisera och komma ihåg den tidigare kunskap som förvärvats i den ursprungliga domänen. 
Detta problem studeras och en lösning föreslås som bygger på datatransformationer som tillämpas just vid överföringen av kunskap till måldomänen enligt den empiriska forskningsmetoden och med hjälp av offentligt tillgängliga datamängder för segmentering av videointervjuer som resurser för experimenten. Slutsatserna visar att det finns ett samband mellan datatransformationer och förmågan att generalisera båda områdena.
23

[pt] DESENVOLVIMENTO DE UMA METODOLOGIA PARA CARACTERIZAÇÃO DE FASES NO PELLET FEED UTILIZANDO MICROSCOPIA DIGITAL E APRENDIZAGEM PROFUNDA / [en] DEVELOPMENT OF A METHODOLOGY FOR PHASE CHARACTERIZATION IN PELLET FEED USING DIGITAL MICROSCOPY AND DEEP LEARNING

THALITA DIAS PINHEIRO CALDAS 09 November 2023
[pt] O minério de ferro é encontrado na natureza como agregado de minerais, dentre os principais minerais presentes em sua composição estão: hematita, magnetita, goethita e quartzo. Dada a importância do minério de ferro para a indústria, há um crescente interesse por sua caracterização com o objetivo de avaliar a qualidade do material. Com o avanço de pesquisas na área de análise de imagens e microscopia, rotinas de caracterização foram desenvolvidas utilizando ferramentas de Microscopia Digital e Processamento e Análise Digital de Imagens capazes de automatizar grande parte do processo. Porém esbarrava-se em algumas dificuldades, como por exemplo identificar e classificar as diferentes texturas das partículas de hematita, as diferentes formas de seus cristais ou discriminar quartzo e resina em imagens de microscopia ótica de luz refletida. Desta forma, a partir da necessidade de se construir sistemas capazes de aprender e se adaptar a possíveis variações das imagens deste material, surgiu a possibilidade de estudar a utilização de ferramentas de Deep Learning para esta função. Este trabalho propõe o desenvolvimento de uma nova metodologia de caracterização mineral baseada em Deep Learning utilizando o algoritmo Mask R-CNN. Através do qual é possível realizar segmentação de instâncias, ou seja, desenvolver sistemas capazes de identificar, classificar e segmentar objetos nas imagens. Neste trabalho, foram desenvolvidos dois modelos: Modelo 1 que realiza segmentação de instâncias para as classes compacta, porosa, martita e goethita em imagens obtidas em Campo Claro e o Modelo 2 que utiliza imagens adquiridas em Luz Polarizada Circularmente para segmentar as classes monocristalina, policristalina e martita. Para o Modelo 1 foi obtido F1-score em torno de 80 por cento e para o Modelo 2 em torno de 90 por cento. 
A partir da segmentação das classes foi possível extrair atributos importantes de cada partícula, como distribuição de quantidade, medidas de forma, tamanho e fração de área. Os resultados obtidos foram muito promissores e indicam que a metodologia desenvolvida pode ser viável para tal caracterização. / [en] Iron ore is found in nature as an aggregate of minerals. Among the main minerals in its composition are hematite, magnetite, goethite, and quartz. Given the importance of iron ore for the industry, there is a growing interest in its characterization to assess the material's quality. With the advancement of image analysis and microscopy research, characterization routines were developed using Digital Microscopy and Digital Image Processing and Analysis tools capable of automating a large part of the process. However, these routines encountered some difficulties, such as identifying and classifying the different textures of hematite particles, the different shapes of its crystals, or discriminating between quartz and resin in reflected-light optical microscopy images. Therefore, from the need to build systems capable of learning and adapting to possible variations of the images of this material, the possibility of studying the use of Deep Learning tools for this function arose. This work proposes developing a new mineral characterization methodology based on Deep Learning using the Mask R-CNN algorithm. Through this, it is possible to perform instance segmentation, that is, to develop systems capable of identifying, classifying, and segmenting objects in images. In this work, two models were developed: Model 1 performs instance segmentation for the compact, porous, martite, and goethite classes in images obtained in Bright Field, and Model 2 uses images acquired in Circularly Polarized Light to segment the classes monocrystalline, polycrystalline, and martite. For Model 1, an F1-score of around 80 percent was obtained, and for Model 2, around 90 percent. 
From the class segmentation, it was possible to extract important attributes of each particle, such as quantity distribution, shape measurements, size, and area fraction. The obtained results were very promising and indicated that the developed methodology could be viable for such characterization.
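
As a rough illustration of two quantities mentioned in this abstract, the sketch below computes an F1-score from detection counts and a per-class area fraction from instance mask areas. The function names, the class labels used as dictionary keys, and all numbers are hypothetical placeholders chosen for the example; they are not taken from the thesis, and the thesis may compute these values differently.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall from detection counts.

    Returns 0.0 when there are no true positives, where F1 is
    conventionally taken to be zero.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def area_fraction(mask_areas: dict[str, list[int]], image_area: int) -> dict[str, float]:
    """Fraction of the image covered by each class.

    mask_areas maps a class name to the pixel areas of its segmented
    instances (hypothetical input layout, assumed for this sketch).
    """
    return {cls: sum(areas) / image_area for cls, areas in mask_areas.items()}


# Made-up counts: 8 correct detections, 2 spurious, 2 missed.
example_f1 = f1_score(tp=8, fp=2, fn=2)

# Made-up mask areas (in pixels) for a 1000-pixel image.
example_fractions = area_fraction({"porous": [100, 150], "martite": [50]}, 1000)
```

With equal precision and recall, as in the made-up counts above, the F1-score equals both of them.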
24

Instance Segmentation of Multiclass Litter and Imbalanced Dataset Handling : A Deep Learning Model Comparison / Instanssegmentering av kategoriserat skräp samt hantering av obalanserat dataset

Sievert, Rolf January 2021
Instance segmentation has great potential for improving the current state of littering by autonomously detecting and segmenting different categories of litter. With this information, litter could, for example, be geotagged to aid litter pickers or to give precise locational information to unmanned vehicles for autonomous litter collection. Land-based litter instance segmentation is a relatively unexplored field, and this study aims to compare the instance segmentation models Mask R-CNN and DetectoRS using the multiclass litter dataset called Trash Annotations in Context (TACO) in conjunction with the Common Objects in Context precision and recall scores. TACO is an imbalanced dataset, and therefore imbalanced data-handling is addressed by applying a second-order relation iterative stratified split, and additionally oversampling when training Mask R-CNN. Mask R-CNN without oversampling resulted in a segmentation mAP of 0.127, and with oversampling 0.163. DetectoRS achieved a segmentation mAP of 0.167, and improves the segmentation mAP of small objects most noticeably, by a factor of at least 2, which is important within the litter domain since small objects such as cigarettes are overrepresented. In contrast, oversampling with Mask R-CNN does not seem to improve the general precision of small and medium objects, but only improves the detection of large objects. It is concluded that DetectoRS improves results compared to Mask R-CNN, as does oversampling. However, using a dataset that cannot have an all-class representation for train, validation, and test splits, together with an iterative stratification that does not guarantee all-class representations, makes it hard for future works to do exact comparisons to this study. Results are therefore approximate when all categories are considered, since 12 categories are missing from the test set, where 4 of those were impossible to split into train, validation, and test set. 
Further image collection and annotation to mitigate the imbalance would most noticeably improve results since results depend on class-averaged values. Applying oversampling to DetectoRS would also help improve results. There is also the option to combine the two datasets TACO and MJU-Waste to enforce training of more categories.
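
To make the idea of oversampling an imbalanced multiclass dataset more concrete, here is a minimal sketch of one simple image-level scheme: images containing a rare category are duplicated until that category appears in at least `target` sampled images. This is an assumption made for illustration, not the procedure used in the thesis; the function name `oversample`, the `(image_id, categories)` layout, and the toy data are all hypothetical.

```python
import random
from collections import Counter


def oversample(images, target, seed=0):
    """Duplicate images so every category appears in at least `target` images.

    images: list of (image_id, set_of_category_names) pairs
            (hypothetical layout, assumed for this sketch).
    Returns a new list containing the originals plus the duplicates.
    """
    rng = random.Random(seed)
    sampled = list(images)
    # Number of images in which each category occurs.
    counts = Counter(cat for _, cats in images for cat in cats)
    # Handle the rarest categories first.
    for cat in sorted(counts, key=counts.get):
        pool = [img for img in images if cat in img[1]]
        while counts[cat] < target:
            img = rng.choice(pool)
            sampled.append(img)
            # A duplicated image also raises the count of every other
            # category it contains, which is why common categories can
            # still grow under this scheme.
            counts.update(img[1])
    return sampled


# Toy data: "cigarette" occurs in one image, "bottle" in four.
toy = [
    ("a", {"bottle"}),
    ("b", {"bottle"}),
    ("c", {"bottle", "cigarette"}),
    ("d", {"bottle"}),
]
balanced = oversample(toy, target=3)
```

Because litter images usually contain several categories at once, duplicating an image for a rare class also duplicates its common classes, so a scheme like this reduces the imbalance rather than removing it.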
25

Towards meaningful and data-efficient learning : exploring GAN losses, improving few-shot benchmarks, and multimodal video captioning

Huang, Gabriel 09 1900
Ces dernières années, le domaine de l’apprentissage profond a connu des progrès énormes dans des applications allant de la génération d’images, détection d’objets, modélisation du langage à la réponse aux questions visuelles. Les approches classiques telles que l’apprentissage supervisé nécessitent de grandes quantités de données étiquetées et spécifiques à la tâches. Cependant, celles-ci sont parfois coûteuses, peu pratiques, ou trop longues à collecter. La modélisation efficace en données, qui comprend des techniques comme l’apprentissage few-shot (à partir de peu d’exemples) et l’apprentissage self-supervised (auto-supervisé), tentent de remédier au manque de données spécifiques à la tâche en exploitant de grandes quantités de données plus “générales”. Les progrès de l’apprentissage profond, et en particulier de l’apprentissage few-shot, s’appuient sur les benchmarks (suites d’évaluation), les métriques d’évaluation et les jeux de données, car ceux-ci sont utilisés pour tester et départager différentes méthodes sur des tâches précises, et identifier l’état de l’art. Cependant, du fait qu’il s’agit de versions idéalisées de la tâche à résoudre, les benchmarks sont rarement équivalents à la tâche originelle, et peuvent avoir plusieurs limitations qui entravent leur rôle de sélection des directions de recherche les plus prometteuses. De plus, la définition de métriques d’évaluation pertinentes peut être difficile, en particulier dans le cas de sorties structurées et en haute dimension, telles que des images, de l’audio, de la parole ou encore du texte. Cette thèse discute des limites et des perspectives des benchmarks existants, des fonctions de coût (training losses) et des métriques d’évaluation (evaluation metrics), en mettant l’accent sur la modélisation générative - les Réseaux Antagonistes Génératifs (GANs) en particulier - et la modélisation efficace des données, qui comprend l’apprentissage few-shot et self-supervised. 
La première contribution est une discussion de la tâche de modélisation générative, suivie d’une exploration des propriétés théoriques et empiriques des fonctions de coût des GANs. La deuxième contribution est une discussion sur la limitation des few-shot classification benchmarks, certains ne nécessitant pas de généralisation à de nouvelles sémantiques de classe pour être résolus, et la proposition d’une méthode de base pour les résoudre sans étiquettes en phase de testing. La troisième contribution est une revue sur les méthodes few-shot et self-supervised de détection d’objets , qui souligne les limites et directions de recherche prometteuses. Enfin, la quatrième contribution est une méthode efficace en données pour la description de vidéo qui exploite des jeux de données texte et vidéo non supervisés. / In recent years, the field of deep learning has seen tremendous progress for applications ranging from image generation, object detection, language modeling, to visual question answering. Classic approaches such as supervised learning require large amounts of task-specific and labeled data, which may be too expensive, time-consuming, or impractical to collect. Data-efficient methods, such as few-shot and self-supervised learning, attempt to deal with the limited availability of task-specific data by leveraging large amounts of general data. Progress in deep learning, and in particular, few-shot learning, is largely driven by the relevant benchmarks, evaluation metrics, and datasets. They are used to test and compare different methods on a given task, and determine the state-of-the-art. However, due to being idealized versions of the task to solve, benchmarks are rarely equivalent to the original task, and can have several limitations which hinder their role of identifying the most promising research directions. 
Moreover, defining meaningful evaluation metrics can be challenging, especially in the case of high-dimensional and structured outputs, such as images, audio, speech, or text. This thesis discusses the limitations and perspectives of existing benchmarks, training losses, and evaluation metrics, with a focus on generative modeling—Generative Adversarial Networks (GANs) in particular—and data-efficient modeling, which includes few-shot and self-supervised learning. The first contribution is a discussion of the generative modeling task, followed by an exploration of theoretical and empirical properties of the GAN loss. The second contribution is a discussion of a limitation of few-shot classification benchmarks, which is that they may not require class semantic generalization to be solved, and the proposal of a baseline method for solving them without test-time labels. The third contribution is a survey of few-shot and self-supervised object detection, which points out the limitations and promising future research for the field. Finally, the fourth contribution is a data-efficient method for video captioning, which leverages unsupervised text and video datasets, and explores several multimodal pretraining strategies.
