21

muGen : Generative AI as Machinic Exploration of Cultural Archives / muGen : Generativ AI som maskinell utforskning av kulturarkiv

Yu, Yan January 2023 (has links)
In recent years, generative AI has quickly become a new creative and artistic tool that could challenge our understanding of the creative process and the role of the machine. Despite having exhibited visually promising results, images generated by AI tools present various challenges, most notably their tendency to display cultural, gender and racial biases. The objective of the project is to speculate on the concept and prototype of an alternative text-to-image generation system, designed to mitigate biases arising from linguistic and cultural differences and to facilitate diversity in machine creativity. muGen, the final design, is a fictional system that allows the user to generate images using data in different languages, while adding user controls, such as time period, to better align the user's idea with the system.
22

Predictive MR Image Generation for Alzheimer’s Disease and Normal Aging Using Diffeomorphic Registration / Förutsägande generering av MR-bilder för Alzheimers sjukdom och normal åldrande med användning av diffeomorfisk registrering

Zheng, Yuqi January 2023 (has links)
Alzheimer's Disease (AD) is the most prevalent cause of dementia, signifying a progressive and degenerative brain disorder that causes cognitive function deterioration including memory loss, communication difficulties, impaired judgment, and changes in behavior and personality. Compared to normal aging, AD introduces more profound cognitive impairments and brain morphology changes. Understanding the morphological changes associated with both normal aging and AD holds pivotal significance for the study of brain health. In recent years, the flourishing development of Artificial Intelligence (AI) has facilitated the analysis of medical images and the study of longitudinal brain morphology evolution. Numerous advanced AI-based frameworks have emerged to generate unbiased and realistic medical templates that represent the common characteristics within a cohort, providing valuable insights for cohort studies. Among these, Atlas-GAN is a state-of-the-art framework which can generate high-quality conditional deformable templates using diffeomorphic registration. However, cohort studies are not sufficient for individualized healthcare and treatment, as each patient has a unique condition. Fortunately, the introduction of a mathematical mechanism, parallel transport, enables the inference of individual brain morphological evolution from cohort-level longitudinal templates. This project proposes an image generator that integrates the pole ladder, a tool for implementing parallel transport, into Atlas-GAN, to translate the cohort-level brain morphological evolution onto individual subjects, enabling the synthesis of anatomically plausible and personalized longitudinal Magnetic Resonance (MR) images based on a single Magnetic Resonance Imaging (MRI) scan. In clinics, the synthesized images enable physicians to retrospectively understand the patient's premorbid brain states and prospectively predict their brain morphology changes over time.
Such capabilities are of paramount importance for the prognosis, diagnosis, and early-stage intervention of AD, especially given the current absence of a cure. The primary contributions of this project include: (1) introduction of an image generator that combines parallel transport with Atlas-GAN to synthesize individual longitudinal MR images for both a normal aging cohort and an AD cohort, with anatomical plausibility and preservation of individualized characteristics; (2) exploration of the prediction of individual longitudinal MR images in the case of an individual undergoing a state transition, using the proposed generator; (3) qualitative and quantitative evaluation and analysis of the synthesized images.
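The pole ladder construction mentioned above can be illustrated with a flat-space toy example, where geodesics are straight lines and parallel transport should return the deformation unchanged. This is a hypothetical sketch for intuition only; in the thesis the construction operates on diffeomorphic deformations between longitudinal brain templates, not on points in Euclidean space, and the function names here are illustrative assumptions.

```python
import numpy as np

def pole_ladder_step(x0, x1, v, geodesic):
    """One rung of the pole ladder: transport the small deformation v,
    attached at x0, along the path from x0 to x1.

    `geodesic(a, b, t)` returns the point at fraction t of the geodesic
    from a to b; t > 1 extends the geodesic past b.
    """
    m = geodesic(x0, x1, 0.5)    # midpoint of the main geodesic (the "pole")
    p = x0 + v                   # endpoint of the deformation applied at x0
    p_ref = geodesic(p, m, 2.0)  # reflect p through the pole m
    return x1 - p_ref            # sign flip yields the transported deformation

# In flat space, geodesics are straight lines and transport is the identity.
line = lambda a, b, t: (1.0 - t) * a + t * b
v_t = pole_ladder_step(np.array([0.0, 0.0]), np.array([3.0, 1.0]),
                       np.array([0.5, -0.2]), line)
```

In a curved (diffeomorphic) setting, the geodesics would come from the registration model, and the step would be iterated along the cohort-level template trajectory.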
23

Live Cell Imaging Analysis Using Machine Learning and Synthetic Food Image Generation

Yue Han (18390447) 17 April 2024 (has links)
Live cell imaging is a method to optically investigate living cells using microscopy images. It plays an increasingly important role in biomedical research as well as drug development. In this thesis, we focus on label-free mammalian cell tracking and label-free segmentation of abnormally shaped nuclei in microscopy images. We propose a method that uses a precomputed velocity field to enhance cell tracking performance. Additionally, we propose an ensemble method, Weighted Mask Fusion (WMF), which combines the results of multiple segmentation models with shape analysis to improve the final nuclei segmentation mask. We also propose an edge-aware Mask R-CNN and introduce a hybrid architecture, an ensemble of CNNs and Swin-Transformer Edge Mask R-CNNs (HER-CNN), to accurately segment irregularly shaped nuclei in microscopy images. Our experiments indicate that our proposed methods outperform existing methods for cell tracking and abnormally shaped nuclei segmentation.

While image-based dietary assessment methods reduce the time and labor required for nutrient analysis, the major challenge with deep learning-based approaches is that performance depends heavily on the quality of the datasets. Food data suffers from high intra-class variance and class imbalance. In this thesis, we present an effective clustering-based training framework named ClusDiff for generating high-quality and representative food images. Our experiments show the method's effectiveness in enhancing food image generation. Additionally, we conduct a study on using synthetic food images to address the class imbalance issue in long-tailed food classification.
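The weighted-fusion idea behind WMF can be sketched as follows: each model's binary mask receives a weight (supplied by the caller here; in the thesis the weighting involves shape analysis of the predictions) and a pixel is kept where the weighted vote passes a threshold. The function name and interface are illustrative assumptions, not the thesis code.

```python
import numpy as np

def weighted_mask_fusion(masks, weights, threshold=0.5):
    """Fuse binary segmentation masks from several models into one.

    masks   : array of shape (n_models, H, W) with values in {0, 1}
    weights : per-model scores, e.g. from a shape analysis of each mask
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                       # normalise weights to sum to 1
    # Weighted per-pixel vote across models.
    fused = np.tensordot(w, np.asarray(masks, dtype=float), axes=1)
    return (fused >= threshold).astype(np.uint8)
```

With equal weights this reduces to a majority vote; unequal weights let a more trusted model dominate.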
24

Segmentation and Deconvolution of Fluorescence Microscopy Volumes

Soonam Lee (6738881) 14 August 2019 (has links)
Recent advances in optical microscopy have enabled biologists to collect fluorescence microscopy volumes of cellular and subcellular structures in living tissue. This results in large microscopy volume datasets and creates a need for automated quantification methods aided by image processing. Segmentation is the first and fundamental step in quantifying biological structures. Yet quantitative analysis of microscopy volumes is hampered by light diffraction, distortion created by lens aberrations in different directions, and the complex variation of biological structures. This thesis describes several proposed segmentation methods to identify various biological structures, such as nuclei or tubules, observed in fluorescence microscopy volumes. For nuclei segmentation, a multiscale edge detection method and a 3D active contour method with inhomogeneity correction are used. Our proposed 3D active contour method with inhomogeneity correction utilizes 3D microscopy volume information while addressing intensity inhomogeneity across vertical and horizontal directions. For tubule segmentation, an ellipse-model fitting method for tubule boundaries and a convolutional neural network with inhomogeneity correction are employed. More specifically, the ellipse fitting method uses a combination of adaptive and global thresholding, potentials, z-direction refinement, branch pruning, end point matching, and boundary fitting steps to delineate tubular objects. The deep learning based method combines intensity inhomogeneity correction and data augmentation, followed by a convolutional neural network architecture. Moreover, this thesis demonstrates a new deconvolution method that improves microscopy image quality without knowledge of the 3D point spread function, using a spatially constrained cycle-consistent adversarial network. The results of the proposed methods are compared visually and numerically with other methods. Experimental results demonstrate that our proposed methods achieve better performance than existing methods for nuclei/tubule segmentation as well as deconvolution.
25

TAIGA: uma abordagem para geração de dados de teste por meio de algoritmo genético para programas de processamento de imagens / TAIGA: an Approach to Test Image Generation for Image Processing Programs Using Genetic Algorithm

Rodrigues, Davi Silva 24 November 2017 (has links)
The massive presence of information systems in our lives has been increasing the importance of software testing activities. Image Processing (IP) programs have very complex input domains and, therefore, traditional testing for this kind of program is a costly and error-prone task. In traditional testing, testers usually create images by themselves or select them at random from image databases, which can make it harder to reveal faults in the software under test. In this context, a systematic mapping study was conducted and a gap was identified concerning automated test data generation in the image domain. Thus, an approach for generating test data for IP programs by means of genetic algorithms was proposed: TAIGA (Test imAge generatIon by Genetic Algorithm). This approach adapts traditional genetic operators (mutation and crossover) to the image domain and replaces the fitness function with an evaluation of the results of mutation testing. The proposed approach was validated by experiments involving eight distinct IP programs. TAIGA provided up to a 38.61% increase in mutation score when compared to traditional testing of IP programs. It is expected that automating test data generation will raise the quality of image processing systems development and reduce the cost of software testing activities in this domain.
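The TAIGA loop can be sketched as a standard genetic algorithm whose individuals are images. The operators below are simplified stand-ins for the thesis's image-domain mutation and crossover, and the `fitness` callable, which in TAIGA would run mutation testing on the IP program and return the mutation score, is left abstract here; all names are illustrative assumptions.

```python
import random
import numpy as np

def crossover(img_a, img_b):
    """Single-point column crossover adapted to the image domain (illustrative)."""
    cut = random.randrange(1, img_a.shape[1])
    return np.concatenate([img_a[:, :cut], img_b[:, cut:]], axis=1)

def mutate(img, rate=0.05):
    """Re-randomise a small fraction of pixels (illustrative mutation operator)."""
    out = img.copy()
    hits = np.random.rand(*img.shape) < rate
    out[hits] = np.random.randint(0, 256, hits.sum())
    return out

def evolve(population, fitness, generations=10):
    """Elitist GA loop: the top half survives, the rest are bred anew.
    In TAIGA, `fitness` would be the mutation score achieved by the image."""
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: len(population) // 2]
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(len(population) - len(parents))]
        population = parents + children
    return max(population, key=fitness)
```

Because the top-ranked individuals survive each generation unchanged, the best fitness in the population never decreases.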
27

Algorithmes, architecture et éléments optiques pour l'acquisition embarquée d'images totalement focalisées et annotées en distance / Algorithms, architecture and optical components for embedded acquisition of all-in-focus and distance-annotated images

Emberger, Simon 13 December 2017 (has links)
Acquiring the depth of a scene in addition to its image is a desirable feature for many applications which depend on the near environment. The state of the art in the field of depth extraction offers many methods, but very few are well adapted to small embedded systems. Some of them are too cumbersome because of their large optical systems. Others require a delicate calibration or processing methods which are difficult to implement in an embedded system. In this PhD thesis, we focus on methods with low hardware complexity in order to propose algorithms and optical solutions that extract the depth of the scene, provide a relevance evaluation of this measurement and produce all-in-focus images. We show that Depth from Focus (DfF) algorithms are the best adapted to embedded electronics constraints. This method consists in acquiring a cube of multi-focus images of the same scene at different focusing distances. The images are analyzed in order to annotate each zone of the scene with an index relative to its estimated depth. This index is then used to build an all-in-focus image. We worked on the sharpness criterion in order to propose low-complexity solutions, based only on additions and comparisons, that are easily adaptable to a hardware architecture. The proposed solution uses bidirectional local contrast analysis and then combines the most relevant depth estimations based on detection confidence at the end of treatment. It is declined in three approaches which need less and less processing, making them more and more suitable for a final embedded solution. For each method, depth and confidence maps are established, as well as an all-in-focus image composed of elements from the entire multi-focus cube. These approaches are compared in quality and complexity with other state-of-the-art methods of similar complexity. A hardware implementation of the best solution is proposed.
The design of these algorithms raises the problem of image quality. It is indeed essential to have a remarkable contrast evolution as well as a motionless scene during the capture of the multi-focus cube. A very often neglected effect in this type of approach is the parasitic zoom caused by the lens motion during a focus variation. This "focal zoom" weakens the invariance of the scene and causes artifacts in the depth and confidence maps and in the all-in-focus image. The search for optics adapted to DfF is thus a second line of research in this work. We evaluated industrial liquid lenses and experimental nematic liquid-crystal modal lenses designed during this thesis. These technologies were compared in terms of speed, image quality, generated focal-zoom intensity, supply voltage and, finally, the quality of the extracted depth maps and the reconstructed all-in-focus images. The lens and the algorithm best suited to this embedded DfF problem were then evaluated on a CPU-GPU development platform allowing real-time acquisition of depth maps, confidence maps and all-in-focus images.
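The DfF pipeline described above can be sketched with a sharpness measure built only from additions and absolute differences, in the spirit of the thesis's low-complexity constraint; the exact bidirectional contrast operator used there may differ, and slice indices stand in for focus distances.

```python
import numpy as np

def depth_from_focus(stack):
    """stack: multi-focus cube of shape (n_focus, H, W).

    Returns a depth-index map, a confidence map, and an all-in-focus
    image assembled from the sharpest slice at each pixel.
    """
    dx = np.abs(np.diff(stack, axis=2))   # horizontal local contrast
    dy = np.abs(np.diff(stack, axis=1))   # vertical local contrast
    sharp = np.zeros_like(stack)
    sharp[:, :, :-1] += dx
    sharp[:, :-1, :] += dy

    depth = np.argmax(sharp, axis=0)      # index of the sharpest slice
    confidence = np.max(sharp, axis=0)    # strength of that decision
    # Per pixel, take the value from the slice judged sharpest.
    all_in_focus = np.take_along_axis(stack, depth[None], axis=0)[0]
    return depth, confidence, all_in_focus
```

A hardware mapping would replace the numpy reductions with streaming adders and comparators over the incoming focus slices.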
28

Generation of Synthetic Retinal Images with High Resolution

Aubrecht, Tomáš January 2020 (has links)
Capturing images of the retina, the most important part of the human eye, requires special equipment: a fundus camera. The goal of this work is therefore to design and implement a system capable of generating such images without this camera. The proposed system maps an input black-and-white image of the retinal blood vessels to a colour output image of the whole retina. The system consists of two neural networks: a generator, which generates retina images, and a discriminator, which classifies the given images as real or synthetic. The system was trained on 141 images from publicly available databases. Subsequently, a new database was created containing more than 2,800 images of healthy retinas at a resolution of 1024x1024. This database can serve as a teaching aid for ophthalmologists or as a basis for developing various applications that work with retinal images.
29

Object Detection with Deep Convolutional Neural Networks in Images with Various Lighting Conditions and Limited Resolution / Detektion av objekt med Convolutional Neural Networks (CNN) i bilder med dåliga belysningförhållanden och lågupplösning

Landin, Roman January 2021 (has links)
Computer vision is a key component of any autonomous system. Real-world computer vision applications rely on proper and accurate detection and classification of objects. A detection algorithm that does not guarantee reasonable detection accuracy is not applicable in real-time scenarios where safety is the main objective. Factors that impact detection accuracy include illumination conditions and image resolution. Both contribute to degradation of objects and lead to low classification and detection accuracy. Recent development of Convolutional Neural Network (CNN) based algorithms offers possibilities for low-light (LL) image enhancement and super-resolution (SR) image generation, which makes it possible to combine such models in order to improve image quality and increase detection accuracy. This thesis evaluates different CNN models for SR generation and LL enhancement by comparing generated images against ground-truth images. To quantify the impact of each model on detection accuracy, a detection procedure was evaluated on the generated images. Experimental results on images selected from the NightOwls and Caltech Pedestrian datasets showed that super-resolution image generation and low-light image enhancement improve detection accuracy by a substantial margin. Additionally, a cascade of SR generation and LL enhancement further boosts detection accuracy. The main drawback of such cascades, however, is the increased computational time, which limits their applicability in a range of real-time applications.
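The enhancement-then-detection cascade evaluated above can be sketched with toy stand-ins: gamma correction in place of the low-light CNN, nearest-neighbour upsampling in place of the SR CNN, and a brightness threshold in place of the detector. All three functions are hypothetical simplifications that only illustrate the composition, not the thesis's models.

```python
import numpy as np

def enhance_low_light(img, gamma=0.5):
    """Toy stand-in for a CNN low-light enhancer: gamma correction."""
    return np.clip((img / 255.0) ** gamma * 255.0, 0.0, 255.0)

def super_resolve(img, scale=2):
    """Toy stand-in for a CNN super-resolution model: nearest neighbour."""
    return np.repeat(np.repeat(img, scale, axis=0), scale, axis=1)

def detect(img, thresh=128.0):
    """Toy 'detector': mask of sufficiently bright pixels."""
    return img > thresh

def cascade(img):
    # The thesis finds this ordering boosts accuracy, at extra compute cost:
    # each added stage runs a full model pass before detection.
    return detect(super_resolve(enhance_low_light(img)))
```

Each stage could be swapped for a real model with the same array-in, array-out interface, which is what makes the cascade easy to evaluate stage by stage.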
30

Image generation through feature extraction and learning using a deep learning approach

Bruneel, Tibo January 2023 (has links)
With recent advancements, image generation has become increasingly feasible thanks to the introduction of stronger generative artificial intelligence (AI) models. The ability to generate non-existent images that closely resemble real-world images is interesting for many use cases. Generated images could be used, for example, to augment, extend or replace real data sets for training AI models, thereby minimising the costs of data collection and similar processes. Deep learning, a sub-field of AI, has been at the forefront of such methodologies due to its ability to capture and learn from highly complex and feature-rich data. This work focuses on deep generative learning approaches within a forestry application, with the goal of generating tree log end images in order to enhance an AI model that uses such images. This approach would reduce data collection costs not only for this model but also for many other information extraction models within the forestry field. This thesis includes research on the state of the art in deep generative modelling and experiments using a full pipeline from a deep generative modelling stage to a log end recognition model. On top of this, a variant architecture and an image sampling algorithm are proposed as additions to this pipeline, and their performance is evaluated. The experiments show that the applied generative model approaches exhibit good feature learning but lack high-quality, realistic generation, resulting in blurrier outputs. The variant approach resulted in slightly better feature learning with a trade-off in generation quality. The proposed sampling algorithm proved to work well on a qualitative basis. The problems found in the generative models propagated further into the training of the recognition model, making the improvement of another AI model based on purely generated data impossible at this point in the research.
The results of this research show that more work is needed to improve the application and generation quality so that the outputs resemble real-world data more closely and other models can be trained on artificial data. The variant approach does not improve results by much, and its findings contribute to the field by establishing its strengths and weaknesses, as do those of the proposed image sampling algorithm. Finally, this study provides a good starting point for research within this application area, with many different directions and opportunities for future work.
