11

AI-based image generation: The impact of fine-tuning on fake image detection

Hagström, Nick, Rydberg, Anders January 2024 (has links)
Machine learning-based image generation models such as Stable Diffusion are now capable of generating synthetic images that are difficult to distinguish from real images, which gives rise to a number of legal and ethical concerns. As a potential mitigation measure, it is possible to train neural networks to detect the digital artifacts present in the images synthesized by many generative models. However, as the artifacts in question are often rather model-specific, these so-called detectors usually perform poorly when presented with images from models they have not been trained on. In this thesis we study DreamBooth and LoRA, two recently emerged fine-tuning methods, and their impact on the performance of fake image detectors. DreamBooth and LoRA can be used to fine-tune a Stable Diffusion foundation model, which has the effect of creating an altered version of the base model. The ease with which this can be done has led to a proliferation of community-generated synthetic images. However, the effect of model fine-tuning on the detectability of images has not yet been studied in a scientific context. We therefore formulate the following research question: Does fine-tuning a Stable Diffusion base model using DreamBooth or LoRA affect the performance metrics of detectors trained only on base-model images? We employ an experimental approach, using the pretrained VGG16 architecture for binary classification as the detector. We train the detector on real images from the ImageNet dataset together with images synthesized by three different Stable Diffusion foundation models, resulting in three trained detectors. We then test their performance on images generated by fine-tuned versions of these models. We find that detector accuracy on images generated by fine-tuned models is lower than on images generated by the base models on which the detectors were trained. Within the former category, DreamBooth-generated images have a greater negative impact on detector accuracy than LoRA-generated images. Our study suggests a need to treat DreamBooth fine-tuned models in particular as distinct entities in the context of fake image detector training.
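
As an illustration of the detector setup described in this abstract, the sketch below builds a binary classifier on an ImageNet-pretrained VGG16 backbone in PyTorch. It is a minimal reconstruction under our own assumptions (single-logit head, Adam, binary cross-entropy loss), not the authors' code.

```python
import torch
import torch.nn as nn
from torchvision import models

def build_detector() -> nn.Module:
    # Start from ImageNet-pretrained VGG16 and swap the final classifier
    # layer for a single logit: real (0) vs. synthetic (1).
    model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
    model.classifier[6] = nn.Linear(4096, 1)
    return model

detector = build_detector()
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(detector.parameters(), lr=1e-4)

# One training step on a placeholder batch of 224x224 RGB images.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 2, (8, 1)).float()  # 1 = generated image
optimizer.zero_grad()
loss = criterion(detector(images), labels)
loss.backward()
optimizer.step()
```
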
12

Generative Data Augmentation: Using DCGAN To Expand Training Datasets For Chest X-Ray Pneumonia Detection

Maier, Ryan D 01 June 2024 (has links) (PDF)
Recent advancements in computer vision have demonstrated remarkable success in image classification tasks, particularly when provided with an ample supply of accurately labeled images for training. These techniques have also exhibited significant potential in revolutionizing computer-aided medical diagnosis by enabling the segmentation and classification of medical images, leveraging Convolutional Neural Networks (CNNs) and similar models. However, the integration of such technologies into clinical practice faces notable challenges. Chief among these is the obstacle of acquiring high-quality medical imaging data for training purposes. Patient privacy concerns often hinder researchers from accessing large datasets, while less common medical conditions pose additional hurdles due to scarcity of relevant data. This study aims to address the issue of insufficient data availability in medical imaging analysis. We present experiments employing Deep Convolutional Generative Adversarial Networks (DCGANs) to augment training datasets of chest X-ray images, specifically targeting the identification of pneumonia-affected lungs using CNNs. Our findings demonstrate that DCGAN-based generative data augmentation consistently enhances classification performance, even when training sets are severely limited in size.
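
A minimal sketch of the generator half of a DCGAN, as used here for augmentation: transposed convolutions map a noise vector to a 64x64 single-channel image. The layer sizes follow the standard DCGAN layout and are illustrative, not values from the thesis.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    # Standard DCGAN generator: a 100-d noise vector is upsampled through
    # transposed convolutions to a 64x64 grayscale (1-channel) image.
    def __init__(self, nz: int = 100, ngf: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8), nn.ReLU(True),              # 4x4
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4), nn.ReLU(True),              # 8x8
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2), nn.ReLU(True),              # 16x16
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf), nn.ReLU(True),                  # 32x32
            nn.ConvTranspose2d(ngf, 1, 4, 2, 1, bias=False),
            nn.Tanh(),                                           # 64x64 in [-1, 1]
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z)

# After adversarial training, sample synthetic X-ray-like images to
# append to a small training set.
g = Generator()
fake = g(torch.randn(16, 100, 1, 1))  # -> (16, 1, 64, 64)
```
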
13

MORP: Monocular Orientation Regression Pipeline

Gunderson, Jacob 01 June 2024 (has links) (PDF)
Orientation estimation of objects plays a pivotal role in robotics, self-driving cars, and augmented reality. Beyond mere position, accurately determining the orientation of objects is essential for constructing precise models of the physical world. While 2D object detection has made significant strides, the field of orientation estimation still faces several challenges. Our research addresses these hurdles by proposing an efficient pipeline which facilitates rapid creation of labeled training data and enables direct regression of object orientation from a single image. We start by creating a digital twin of a physical object using an iPhone, followed by generating synthetic images using the Unity game engine and domain randomization. Our deep learning model, trained exclusively on these synthetic images, demonstrates promising results in estimating the orientations of common objects. Notably, our model achieves a median geodesic distance error of 3.9 degrees and operates at a brisk 15 frames per second.
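
The reported error metric, the geodesic distance between rotations, is standard; the sketch below shows the usual way to compute it from 3x3 rotation matrices (a textbook formula, not code from the thesis).

```python
import numpy as np

def geodesic_degrees(R_pred: np.ndarray, R_true: np.ndarray) -> float:
    # The relative rotation R_pred^T @ R_true rotates by exactly the error
    # angle theta, and its trace equals 1 + 2*cos(theta).
    R = R_pred.T @ R_true
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_theta)))
```
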
14

Geração de imagens artificiais e quantização aplicadas a problemas de classificação / Artificial images generation and quantization applied to classification problems

Thumé, Gabriela Salvador 29 April 2016 (has links)
Each image can be represented as a combination of several features, such as a color intensity histogram or texture properties. These features compose a multidimensional vector that represents the image. Commonly this vector is given as input to a pattern classification method that, after learning from many examples, can build a decision model. Studies suggest that image preparation, through careful specification of acquisition, preprocessing, and segmentation, can significantly impact classification. Besides the lack of image treatment before feature extraction, class imbalance is also an obstacle to satisfactory classification. Images have properties that can be exploited to improve the description of the objects of interest and, therefore, their classification. Possible improvements include: reducing the number of image intensities before feature extraction, instead of applying quantization methods to the already extracted vector; and generating images from the original ones, so as to balance datasets in which the number of examples per class is uneven. This dissertation therefore proposes to improve image classification by applying image processing methods before feature extraction, specifically analyzing the influence of dataset balancing and quantization on classification. The study also analyzes the visualization of the feature space after artificial image generation and after interpolation of the features extracted from the original images (SMOTE), compared with the original space, with emphasis on the importance of class rebalancing. The results indicate that quantization simplifies the images before feature extraction and subsequent dimensionality reduction, producing more compact vectors; and that rebalancing image classes through artificial image generation can improve classification of the image dataset, relative both to the original classification and to applying methods in the already extracted feature space.
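
A minimal sketch of the core idea, reducing the number of intensities before feature extraction instead of quantizing an already extracted vector. The level count and the histogram feature are illustrative assumptions, not the dissertation's exact pipeline.

```python
import numpy as np

def quantize_intensities(image: np.ndarray, levels: int = 16) -> np.ndarray:
    # Map 8-bit intensities [0, 255] onto `levels` evenly spaced bins.
    bins = np.linspace(0, 256, levels + 1)
    return (np.digitize(image, bins) - 1).astype(np.uint8)

def intensity_histogram(image: np.ndarray, levels: int = 16) -> np.ndarray:
    # Feature vector: normalized histogram over the reduced intensity set,
    # shorter than a 256-bin histogram of the raw image.
    hist = np.bincount(quantize_intensities(image, levels).ravel(),
                       minlength=levels)
    return hist / hist.sum()
```
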
16

Real-time image based lighting with streaming HDR-light probe sequences

Hajisharif, Saghi January 2012 (has links)
This work presents a framework for shading virtual objects using high dynamic range (HDR) light probe sequences in real time. The method uses an HDR environment map of the scene, captured online with an HDR video camera, as a light probe. In each frame of the HDR video, an optimized CUDA kernel projects the incident lighting onto spherical harmonics in real time. Transfer coefficients are calculated in an offline process. Using precomputed radiance transfer, the radiance calculation reduces to a low-order dot product between the lighting and transfer coefficients. We exploit temporal coherence between frames to further smooth lighting variation over time. Our results show that the framework can achieve consistent illumination effects in real time, with the flexibility to respond to dynamic changes in the real environment. We use low-order spherical harmonics to represent both the lighting and the transfer functions in order to avoid aliasing.
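
The precomputed radiance transfer step described above reduces shading to a coefficient dot product; below is a minimal sketch under our own assumptions (9 coefficients from 3 SH bands, exponential smoothing for temporal coherence), not the thesis implementation.

```python
import numpy as np

N_BANDS = 3                     # low-order SH: 3 bands -> 9 coefficients
N_COEFFS = N_BANDS * N_BANDS

def shade(light_coeffs: np.ndarray, transfer_coeffs: np.ndarray) -> np.ndarray:
    # light_coeffs:    (9,)   SH projection of the current light probe frame
    # transfer_coeffs: (V, 9) precomputed per-vertex transfer functions
    return transfer_coeffs @ light_coeffs  # -> (V,) radiance per vertex

def smooth(prev: np.ndarray, current: np.ndarray, alpha: float = 0.2) -> np.ndarray:
    # Blend each frame's lighting coefficients with the previous frame's
    # to damp flicker between HDR video frames.
    return (1 - alpha) * prev + alpha * current
```
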
17

Vytváření umělých dat pro sestavování policejních fotorekognic / Generating synthetic data for an assembly of police lineups

Dokoupil, Patrik January 2021 (has links)
Eyewitness identification plays an important role during criminal proceedings and may lead to prosecution and conviction of a suspect. One method of eyewitness identification is the police photo lineup, in which a collection of photographs is presented to the witness in order to identify the perpetrator of the crime. A lineup contains at most one photograph (typically exactly one) of the suspect; the remaining photographs are the so-called fillers, i.e. photographs of innocent people. Positive identification of the suspect by the witness may result in the suspect being charged or convicted. Assembling a lineup is challenging, because a poor selection of fillers can produce a biased lineup in which the suspect stands out and is easily identifiable even by a highly uncertain witness. It is also tedious, because the process is still done manually or only semi-automatically. This thesis addresses both issues by proposing a model capable of generating synthetic data, together with an application that allows users to obtain fillers for a given suspect's photograph.
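
One plausible shape for the filler-retrieval step (an assumption for illustration, not the thesis model) is nearest-neighbor search in a face embedding space, so that the selected fillers resemble the suspect and the lineup is not biased.

```python
import numpy as np

def select_fillers(suspect_emb: np.ndarray,
                   candidate_embs: np.ndarray,
                   k: int = 5) -> np.ndarray:
    # Cosine similarity between the suspect embedding and each candidate.
    a = suspect_emb / np.linalg.norm(suspect_emb)
    b = candidate_embs / np.linalg.norm(candidate_embs, axis=1, keepdims=True)
    sims = b @ a
    # Indices of the k most similar candidates, most similar first.
    return np.argsort(-sims)[:k]
```
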
18

Three-Dimensional Fluorescence Microscopy Image Synthesis and Analysis Using Machine Learning

Liming Wu 07 February 2023 (has links)
Recent advances in fluorescence microscopy enable deeper cellular imaging in living tissues with near-infrared excitation light. High-quality fluorescence microscopy images provide useful information for analyzing biological structures and diagnosing diseases. Nuclei detection and segmentation are two fundamental steps for quantitative analysis of microscopy images. However, existing machine learning-based approaches are hampered by three main challenges: (1) hand-annotated ground truth is difficult to obtain, especially for 3D volumes; (2) most object detection methods work only on 2D images and are difficult to extend to 3D volumes; (3) segmentation-based approaches typically cannot distinguish different object instances without proper post-processing steps. In this thesis, we propose various new methods for microscopy image analysis, including nuclei synthesis, detection, and segmentation. Due to the limited availability of manually annotated ground truth masks, we first describe how we generate 2D/3D synthetic microscopy images using SpCycleGAN and use them as a data augmentation technique for our detection and segmentation networks. For nuclei detection, we describe our RCNN-SliceNet for nuclei counting and centroid detection using a slice-and-cluster strategy. Then we introduce our 3D CentroidNet for nuclei centroid estimation using a vector flow voting mechanism that does not require any post-processing steps. For nuclei segmentation, we first describe our EMR-CNN for nuclei instance segmentation using ensemble learning and a slice fusion strategy. Then we present the 3D Nuclei Instance Segmentation Network (NISNet3D) for nuclei instance segmentation using a gradient vector field array. Extensive experiments have been conducted on a variety of challenging microscopy volumes to demonstrate that our approach can accurately detect and segment cell nuclei and outperforms other compared methods. Finally, we describe the Distributed and Networked Analysis of Volumetric Image Data (DINAVID) system we developed for biologists to remotely analyze large microscopy volumes using machine learning.
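
The slice-and-cluster strategy named in this abstract can be sketched as follows: detect 2D nuclei centroids on every slice of a volume, then cluster the detections across slices into 3D centroids. The agglomerative clustering used here is an assumption for illustration, not the RCNN-SliceNet implementation, and the 2D detector is left abstract.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def centroids_3d(per_slice_xy: list, max_dist: float = 5.0) -> np.ndarray:
    # per_slice_xy[z] is an (n_z, 2) array of (x, y) detections on slice z.
    points = np.vstack([
        np.column_stack([xy, np.full(len(xy), z)])
        for z, xy in enumerate(per_slice_xy) if len(xy)
    ])
    # Single-linkage clustering in (x, y, z): detections of the same
    # nucleus on neighboring slices merge into one cluster.
    labels = fcluster(linkage(points, method="single"),
                      max_dist, criterion="distance")
    return np.array([points[labels == c].mean(axis=0)
                     for c in np.unique(labels)])
```
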
19

Guiding generation of 2D pixel art characters using text-image similarity models : A comparative study of generating 2D pixel art characters using PixelDraw and Diffusion Model guided by text-image similarity models / Guidad bildgeneration med använding av text-bild-likhetsmodeller för generation av 2D-pixel art karaktärer : En komparativ studie mellan bildgenerering av 2D-pixel art karaktärer med använding av PixelDraw och Diffusion model guidad av text-bild-likhetsmodeller

Löwenström, Paul January 2024 (has links)
Image generation has taken large strides, and new models showing great potential have been created. One continuing struggle with image generation is controlling the output: there has been no practical way of guiding the generation toward what the user wants. This has improved with the creation of text-image similarity models, which can be used together with an image generation model to guide the generation. This thesis examines this method and evaluates how well it can generate pixel art of humanoid characters, comparing the popular diffusion model approach with a simple image generation method that relies solely on the guidance of a text-image similarity model. The results show that combining a diffusion model with a text-image similarity model improves the output in almost every regard over using the text-image similarity model alone. A text-image similarity model allows the user to guide the generation, although the model sometimes misinterprets the request.
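
The guidance mechanism discussed here can be sketched with OpenAI's CLIP as the text-image similarity model: repeatedly adjust an image to raise its similarity to the prompt. Optimizing raw pixels, as below, is purely illustrative; PixelDraw and diffusion guidance steer different representations.

```python
import torch
import clip  # OpenAI CLIP: https://github.com/openai/CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)

text = clip.tokenize(["a pixel art knight character"]).to(device)
image = torch.rand(1, 3, 224, 224, device=device, requires_grad=True)
optimizer = torch.optim.Adam([image], lr=0.05)

with torch.no_grad():
    txt_feat = model.encode_text(text)  # prompt embedding, fixed

for _ in range(200):
    optimizer.zero_grad()
    img_feat = model.encode_image(image)
    sim = torch.cosine_similarity(img_feat, txt_feat).mean()
    (-sim).backward()          # gradient ascent on text-image similarity
    optimizer.step()
    image.data.clamp_(0, 1)    # keep pixel values in a valid range
```
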
20

Assisted Prompt Engineering : Making Text-to-Image Models Available Through Intuitive Prompt Applications / Assisterad Prompt Engineering : Gör Text-till-Bild Modeller Tillgängliga Med Intuitiva Prompt Applikationer

Björnler, Zimone January 2024 (has links)
This thesis explores the application of prompt engineering combined with human-AI interaction (HAII) to make text-to-image (TTI) models more accessible and intuitive for non-expert users. The research focuses on developing an application with an intuitive interface that enables users to generate images without extensive knowledge of prompt engineering. A pre-post study was conducted to evaluate the application, demonstrating significant improvements in user satisfaction and ease of use. The findings suggest that such tailored interfaces can make AI technologies more accessible, empowering users to engage creatively with minimal technical barriers. This study contributes to the fields of media technology and AI by showcasing how simplifying prompt engineering can enhance the accessibility of generative AI tools.
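
A hedged sketch of the kind of assisted prompt construction such an application might perform: the user picks a few simple options and the tool expands them into a full text-to-image prompt. All fields and keyword lists are illustrative assumptions, not the thesis interface.

```python
# Hypothetical style vocabulary; a real application would curate these.
STYLE_KEYWORDS = {
    "photo": "photorealistic, 35mm, natural lighting",
    "painting": "oil painting, textured brush strokes",
    "cartoon": "flat colors, bold outlines, cel shading",
}

def build_prompt(subject: str, style: str, mood: str = "") -> str:
    # Expand the user's simple choices into a richer TTI prompt.
    parts = [subject, STYLE_KEYWORDS.get(style, style)]
    if mood:
        parts.append(f"{mood} atmosphere")
    return ", ".join(parts)

# build_prompt("a lighthouse on a cliff", "painting", "stormy")
# -> "a lighthouse on a cliff, oil painting, textured brush strokes,
#     stormy atmosphere"
```
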
