81

MODEL BASED TRANSFER LEARNING ACROSS NANOMANUFACTURING PROCESSES AND BAYESIAN OPTIMIZATION FOR ADVANCED MODELING OF MIXTURE DATA

Yueyun Zhang (18183583) 24 June 2024 (has links)
<p dir="ltr">Broadly, the focus of this work is on efficient statistical estimation and optimization of data arising from experimental data, particularly motivated by nanomanufacturing experiments on the material tellurene. Tellurene is a novel material for transistors with reliable attributes that enhance the performance of electronics (e.g., nanochip). As a solution-grown product, two-dimensional (2D) tellurene can be manufactured through a scalable process at a low cost. There are three main throughlines to this work, data augmentation, optimization, and equality constraint, and three distinct methodological projects, each of which addresses a subset of these throughlines. For the first project, I apply transfer learning in the analysis of data from a new tellurene experiment (process B) using the established linear regression model from a prior experiment (process A) from a similar study to combine the information from both experiments. The key of this approach is to incorporate the total equivalent amounts (TEA) of a lurking variable (experimental process changes) in terms of an observed (base) factor that appears in both experimental designs into the prespecified linear regression model. The results of the experimental data are presented including the optimal PVP chain length for scaling up production through a larger autoclave size. For the second project, I develop a multi-armed bandit Bayesian optimization (BO) approach to incorporate the equality constraint that comes from a mixture experiment on tellurium nanoproduct and account for factors with categorical levels. A more complex optimization approach was necessitated by the experimenters’ use of a neural network regression model to estimate the response surface. Results are presented on synthetic data to validate the ability of BO to recover the optimal response and its efficiency is compared to Monte Carlo random sampling to understand the level of experimental design complexity at which BO begins to pay off. The third project examines the potential enhancement of parameter estimation by utilizing synthetic data generated through Generative Adversarial Networks (GANs) to augment experimental data coming from a mixture experiment with a small to moderate number of runs. Transfer learning shows high promise for aiding in tellurene experiments, BO’s value increases with the complexity of the experiment, and GANs performed poorly on smaller experiments introducing bias to parameter estimates.</p>
82

FCGAN: SPECTRAL CONVOLUTIONS VIA FFT FOR CHANNEL-WIDE RECEPTIVE FIELD IN GENERATIVE ADVERSARIAL NETWORKS

PEDRO HENRIQUE BARROSO GOMES 23 May 2024 (has links)
This thesis proposes the Fast Fourier Convolution Generative Adversarial Network (FCGAN). This novel approach employs convolutions in the frequency domain to enable the network to operate with a channel-wide receptive field. Due to their small receptive fields, traditional convolution-based GANs struggle to capture structural and geometric patterns. Our method uses Fast Fourier Convolutions (FFCs), which use Fourier Transforms to operate in the spectral domain, affecting the feature input globally. Thus, FCGAN can generate images considering information from all feature locations. This characteristic of the network can lead to erratic and unstable training. We show that employing spectral normalization and noise injections stabilizes adversarial training. The use of spectral convolutions in convolutional networks has been explored for tasks such as image inpainting and super-resolution. This work focuses on their potential for image generation. Our experiments further support the claim that Fourier features are lightweight replacements for self-attention, allowing the network to learn global information from early layers. We present qualitative and quantitative results demonstrating that the proposed FCGAN achieves results comparable to state-of-the-art approaches of similar depth and parameter count, reaching an FID of 18.98 on CIFAR-10 and 38.71 on STL-10, a reduction of 4.98 and 1.40, respectively. Moreover, at larger image dimensions, using FFCs instead of self-attention allows for batch sizes up to twice as large and iterations up to 26 percent faster.
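The core of an FFC can be sketched compactly: transform the feature map with a real FFT, apply a pointwise convolution in the frequency domain (where every spectral coefficient depends on all spatial locations), and transform back. A minimal sketch of that spectral path (layer choices are illustrative, not the thesis's exact architecture):

```python
import torch
import torch.nn as nn

class SpectralConv2d(nn.Module):
    """Fourier-domain convolution: every output pixel sees every input pixel."""
    def __init__(self, channels):
        super().__init__()
        # 1x1 convolution over real/imaginary parts stacked as channels
        self.conv = nn.Conv2d(2 * channels, 2 * channels, kernel_size=1)
        self.bn = nn.BatchNorm2d(2 * channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        b, c, h, w = x.shape
        z = torch.fft.rfft2(x, norm="ortho")            # (b, c, h, w//2+1), complex
        z = torch.cat([z.real, z.imag], dim=1)          # (b, 2c, h, w//2+1), real
        z = self.act(self.bn(self.conv(z)))             # pointwise spectral mixing
        real, imag = torch.chunk(z, 2, dim=1)
        return torch.fft.irfft2(torch.complex(real, imag), s=(h, w), norm="ortho")

y = SpectralConv2d(8)(torch.randn(1, 8, 32, 32))        # same shape out as in
```

Stacking such a spectral branch alongside a conventional local-convolution branch yields the global receptive field the abstract describes.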
83

DATA DRIVEN TECHNIQUES FOR THE ANALYSIS OF ORAL DOSAGE DRUG FORMULATIONS

Ziyi Cao (16986465) 20 September 2024 (has links)
<p dir="ltr">This thesis focusses on developing novel data driven oral drug formulation analysis methods by employing technologies such as Fourier transform analysis and generative adversarial learning. Data driven measurements have been addressing challenges in advanced manufacturing and analysis for pharmaceutical development for the last two decade. Data science combined with analytical chemistry holds the future to solving key problems in the next wave of industrial research and development. Data acquisition is expensive in the realm of pharmaceutical development, and how to leverage the capability of data science to extract information in data deprived circumstances is a key aspect for improving such data driven measurements. Among multiple measurement techniques, chemical imaging is an informative tool for analyzing oral drug formulations. However, chemical imaging can often fall into data deprived situations, where data could be limited from the time-consuming sample preparation or related chemical synthesis. An integrated imaging approach, which folds data science techniques into chemical measurements, could lead to a future of informative and cost-effective data driven measurements. In this thesis, the development of data driven chemical imaging techniques for the analysis of oral drug formulations via Fourier transformation and generative adversarial learning are elaborated. Chapter 1 begins with a brief introduction of current techniques commonly implemented within the pharmaceutical industry, their limitations, and how the limitations are being addressed. Chapter 2 discusses how Fourier transform fluorescence recovery after photobleaching (FT-FRAP) technique can be used for monitoring the phase separated drug-polymer aggregation. Chapter 3 follows the innovation presented in Chapter 1 and illustrates how analysis can be improved by incorporating diffractive optical elements in the patterned illumination. While previous chapters discuss dynamic analysis aspects of drug product formulation, Chapter 4 elaborates on the innovation in composition analysis of oral drug products via use of novel generative adversarial learning methods for linear analyses.</p>
84

Analyzing the Negative Log-Likelihood Loss in Generative Modeling

Espuña I Fontcuberta, Aleix January 2022 (has links)
Maximum-Likelihood Estimation (MLE) is a classic model-fitting method from probability theory. However, it has been argued repeatedly that MLE is inappropriate for synthesis applications, since its priorities are at odds with important principles of human perception, and that, e.g., Generative Adversarial Networks (GANs) are a more appropriate choice. In this thesis, we put these ideas to the test and explore the effect of MLE in deep generative modelling, using image generation as our example application. Unlike previous studies, we apply a new methodology that allows us to isolate the effects of the training paradigm from several common confounding factors of variation, such as the model architecture and the properties of the true data distribution. The thesis addresses two main questions. First, we ask if models trained via Non-Saturating Generative Adversarial Networks (NSGANs) are capable of producing more realistic images than the exact same architecture trained by directly minimizing the Negative Log-Likelihood (NLL) loss function instead (which is equivalent to MLE). We compare the two training paradigms using the MNIST dataset and a normalizing-flow architecture known as Real NVP, which can explicitly represent a very broad family of density functions. We use the Fréchet Inception Distance (FID) as an algorithmic estimate of subjective image quality. Second, we also analyze how the NLL loss behaves in the presence of model misspecification, which is when the model architecture is not capable of representing the true data distribution, and compare the resulting training curves and performance to those produced by models without misspecification. In order to control for and study different degrees of model misspecification, we create a realistic-looking – but actually synthetic – toy version of the classic MNIST dataset. By this we mean that we create a machine-learning problem where the examples in the dataset look like MNIST, but in fact they have been generated by a Real NVP architecture with known weights, and therefore the true distribution that generated the image data is known. We are not aware of this type of large-scale, realistic-looking toy problem having been used in prior work. Our results show that, first, models trained via NLL perform unexpectedly well in terms of FID, and that a Real NVP trained via an NSGAN approach is unstable during training – even at the Nash equilibrium, which is the global optimum onto which the NSGAN training updates are supposed to converge. Second, the experiments on synthetic data show that models with different degrees of misspecification reach different NLL losses on the training set, but all of them exhibit qualitatively similar convergence behavior. However, looking at the validation NLL loss reveals an important overfitting effect due to the finite size of the synthetic dataset: the models that in theory are able to perfectly describe the true data distribution achieve worse validation NLL losses in practice than some misspecified models, whose reduced complexity acts as a regularizer that helps them generalize better. At the same time, we observe that overfitting has a much stronger negative effect on the validation NLL loss than on the image quality as measured by the FID score.
We also conclude that models with too many parameters and degrees of freedom (overparameterized models) should be avoided, as they are not only slow and frequently unstable to train, even using the NLL loss, but they also overfit heavily and produce poorer images. Throughout the thesis, our results highlight the complex and non-intuitive relationship between the NLL loss and the perceptual image quality as measured by the FID score.
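Because minimizing the NLL is equivalent to MLE, and Real NVP gives exact likelihoods via the change-of-variables formula, the training loss reduces to a few lines. A minimal sketch (the function name is hypothetical; the flow is assumed to supply the latent z and the log-determinant of its Jacobian):

```python
import math
import torch

def flow_nll(z, log_det_jacobian):
    """Per-batch negative log-likelihood for a normalizing flow.

    log p(x) = log N(z; 0, I) + log|det dz/dx|, so minimizing this
    quantity is exactly maximum-likelihood estimation.
    """
    # z: (N, D) latents; log_det_jacobian: (N,) from the coupling layers
    log_pz = -0.5 * (z ** 2 + math.log(2 * math.pi)).sum(dim=1)
    return -(log_pz + log_det_jacobian).mean()
```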
85

Advances in deep learning with limited supervision and computational resources

Almahairi, Amjad 12 1900 (has links)
Deep neural networks are the cornerstone of state-of-the-art systems for a wide range of tasks, including object recognition, language modelling and machine translation. In the last decade, research in the field of deep learning has led to numerous key advances in designing novel architectures and training algorithms for neural networks. However, most success stories in deep learning heavily relied on two main factors: the availability of large amounts of labelled data and massive computational resources. This thesis by articles makes several contributions to advancing deep learning, specifically in problems with limited or no labelled data, or with constrained computational resources. The first article addresses the sparsity of labelled data that emerges in the application field of recommender systems. We propose a multi-task learning framework that leverages natural language reviews to improve recommendation. Specifically, we apply neural-network-based methods for learning representations of products from review text while learning from rating data. We demonstrate that the proposed method can achieve state-of-the-art performance on the Amazon Reviews dataset. The second article tackles computational challenges in training large-scale deep neural networks. We propose a conditional computation network architecture which can adaptively assign its capacity, and hence computations, across different regions of the input. We demonstrate the effectiveness of our model on visual recognition tasks where objects are spatially localized within the input, while maintaining much lower computational overhead than standard network architectures. The third article contributes to the domain of unsupervised learning with the generative adversarial networks paradigm. We introduce a flexible adversarial training framework, in which not only the generator converges to the true data distribution, but the discriminator also recovers the relative density of the data at the optimum. We validate our framework empirically by showing that the discriminator is able to accurately estimate the true energy of data while obtaining state-of-the-art quality of samples. Finally, in the fourth article, we address the problem of unsupervised domain translation. We propose a model which can learn flexible, many-to-many mappings across domains from unpaired data. We validate our approach on several image datasets, and we show that it can be effectively applied in semi-supervised learning settings.
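The unpaired domain translation in the fourth article builds on reconstruction constraints across domains. A minimal sketch of the cycle-consistency term such models typically use (generator names are hypothetical; the article's many-to-many extension with latent codes is not reproduced here):

```python
import torch.nn.functional as F

def cycle_loss(g_ab, g_ba, a, b):
    """L1 cycle-consistency for unpaired domain translation.

    Mapping A -> B -> A (and B -> A -> B) should reconstruct the
    input, which constrains the generators without paired examples.
    """
    return (F.l1_loss(g_ba(g_ab(a)), a) +
            F.l1_loss(g_ab(g_ba(b)), b))
```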
86

Fast Simulations of Radio Neutrino Detectors: Using Generative Adversarial Networks and Artificial Neural Networks

Holmberg, Anton January 2022 (has links)
Neutrino astronomy is expanding into the ultra-high-energy (>10¹⁷ eV) frontier with the use of in-ice detection of Askaryan radio emission from neutrino-induced particle showers. There are already pilot arrays for validating the technology, and the next few years will see the planning and construction of IceCube-Gen2, an upgrade to the current neutrino telescope IceCube. This thesis aims to facilitate that planning by providing faster simulations using deep learning surrogate models. Faster simulations could enable proper optimisation of the antenna stations, providing better sensitivity and reconstruction of neutrino properties. The surrogates are made for two parts of the end-to-end simulations: the signal generation and the signal propagation. These two steps are the most time-consuming parts of the simulations. The signal propagation is modelled with a standard fully connected neural network, whereas for the signal generation a conditional Wasserstein generative adversarial network is used. There are multiple reasons for using these types of models. For both problems the neural networks provide the necessary speed as well as being differentiable, both important factors for optimisation. Generative adversarial networks are used for the signal generation because of the inherent stochasticity in the particle-shower development that leads to the Askaryan radio signal. A more standard neural network is used for the signal propagation as it is a regression task. Promising results are obtained for both tasks. The signal-propagation surrogate model can predict the parameters of interest at the desired accuracy, except for the travel time, which needs further optimisation to reduce the uncertainty from 0.5 ns to 0.1 ns. The signal-generation surrogate model predicts the Askaryan emission well for the limited parameter space of hadronic showers and within 5° of the Cherenkov cone. The two models provide a first step and a proof of concept. It is believed that the models can reach the required accuracies with more work.
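The signal-propagation surrogate described above is a plain regression network from propagation inputs to quantities such as travel time. A minimal sketch of such a fully connected model (the layer sizes and the choice of six inputs and three outputs are illustrative assumptions, not the thesis's configuration):

```python
import torch.nn as nn

# Regression surrogate: propagation geometry in, signal properties out,
# e.g. (source x, y, z, antenna x, y, z) -> (travel time, path length, attenuation)
surrogate = nn.Sequential(
    nn.Linear(6, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 3),
)
```

Being a small feed-forward network, the surrogate is both fast to evaluate and differentiable, the two properties the abstract identifies as important for station optimisation.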
87

Building Information Extraction and Refinement from VHR Satellite Imagery using Deep Learning Techniques

Bittner, Ksenia 26 March 2020 (has links)
Building information extraction and reconstruction from satellite images is an essential task for many applications related to 3D city modeling, planning, disaster management, navigation, and decision-making. Building information can be obtained and interpreted from several kinds of data, such as terrestrial measurements, airplane surveys, and space-borne imagery. However, the latter acquisition method outperforms the others in terms of cost and worldwide coverage: space-borne platforms can provide imagery of remote places, which are inaccessible to other missions, at any time. Because the manual interpretation of high-resolution satellite images is tedious and time-consuming, their automatic analysis continues to be an intense field of research. At times, however, it is difficult to understand complex scenes with dense placement of buildings, where parts of buildings may be occluded by vegetation or other surrounding constructions, making their extraction or reconstruction even more difficult. Incorporating several data sources representing different modalities may ease the problem. The goal of this dissertation is to integrate multiple high-resolution remote sensing data sources for automatic satellite imagery interpretation, with emphasis on building information extraction and refinement, whose challenges are addressed in the following: Building footprint extraction from Very High-Resolution (VHR) satellite images is an important but highly challenging task, due to the large diversity of building appearances and the relatively low spatial resolution of satellite data compared to airborne data. Many algorithms are built on spectral-based or appearance-based criteria from single or fused data sources to perform building footprint extraction. The input features for these algorithms are usually manually extracted, which limits their accuracy. Building on the advantages of recently developed Fully Convolutional Networks (FCNs), i.e., the automatic extraction of relevant features and dense classification of images, an end-to-end framework is proposed which effectively combines the spectral and height information from red, green, and blue (RGB), pan-chromatic (PAN), and normalized Digital Surface Model (nDSM) image data and automatically generates a full-resolution binary building mask. The proposed architecture consists of three parallel networks merged at a late stage, which helps in propagating fine detailed information from earlier layers to higher levels, in order to produce an output with high-quality building outlines. The performance of the model is examined on new unseen data to demonstrate its generalization capacity. The availability of detailed Digital Surface Models (DSMs) generated by dense matching and representing the elevation surface of the Earth can improve the analysis and interpretation of complex urban scenarios. The generation of DSMs from VHR optical stereo satellite imagery leads to high-resolution DSMs which often suffer from mismatches, missing values, or blunders, resulting in coarse building shape representation. To overcome these problems, a methodology based on a conditional Generative Adversarial Network (cGAN) is developed for generating a good-quality Level of Detail (LoD) 2-like DSM with enhanced 3D object shapes directly from the low-quality photogrammetric half-meter-resolution satellite DSM input.
Various deep learning applications benefit from multi-task learning with multiple regression and classification objectives by taking advantage of the similarities between individual tasks. Therefore, such influences are demonstrated in this work for important remote sensing applications, such as realistic elevation model generation and roof type classification from stereo half-meter-resolution satellite DSMs. Recently published deep learning architectures for both tasks are investigated, and a new end-to-end cGAN-based network is developed, which combines different models that provide the best results for their individual tasks. To benefit from the information provided by multiple data sources, a different cGAN-based workflow is proposed, where the generative part consists of two encoders and a common decoder which blends the intensity and height information within one network for the DSM refinement task. The inputs to the introduced network are single-channel photogrammetric DSMs with continuous values and pan-chromatic half-meter-resolution satellite images. Information fusion from different modalities helps in propagating fine details, completes inaccurate or missing 3D information about building forms, and improves the building boundaries, making them more rectilinear. Lastly, an additional comparison between the proposed methodologies for DSM enhancement is made to discuss and verify the most beneficial workflow and the applicability of the resulting DSMs to different remote sensing approaches.
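The two-encoder, common-decoder generator described above can be sketched compactly: the DSM and PAN streams are encoded separately, their features concatenated, and one decoder emits the refined DSM. A minimal sketch (layer sizes are illustrative assumptions, not the dissertation's architecture):

```python
import torch
import torch.nn as nn

class TwoStreamGenerator(nn.Module):
    """Fuse a photogrammetric DSM and a PAN image in one generator:
    two encoders, concatenated features, one shared decoder."""
    def __init__(self):
        super().__init__()
        def enc():  # downsample a single-channel input by 4x
            return nn.Sequential(
                nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            )
        self.enc_dsm, self.enc_pan = enc(), enc()
        self.dec = nn.Sequential(  # upsample fused features back to input size
            nn.ConvTranspose2d(128, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),  # refined DSM
        )

    def forward(self, dsm, pan):
        fused = torch.cat([self.enc_dsm(dsm), self.enc_pan(pan)], dim=1)
        return self.dec(fused)
```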
88

Image generation through feature extraction and learning using a deep learning approach

Bruneel, Tibo January 2023 (has links)
With recent advancements, image generation has become increasingly feasible thanks to stronger generative artificial intelligence (AI) models. The ability to generate non-existing images that closely resemble real-world images is interesting for many use cases. Generated images could be used, for example, to augment, extend or replace real data sets for training AI models, thereby reducing the cost of data collection and similar processes. Deep learning, a sub-field of AI, has been at the forefront of such methodologies due to its ability to capture and learn from highly complex and feature-rich data. This work focuses on deep generative learning approaches within a forestry application, with the goal of generating tree log end images in order to enhance an AI model that uses such images. This approach would not only reduce the cost of data collection for this model, but also for many other information extraction models within the forestry field. This thesis includes research on the state of the art in deep generative modelling and experiments using a full pipeline from a deep generative modelling stage to a log end recognition model. On top of this, a variant architecture and an image sampling algorithm are proposed to add to this pipeline and evaluate its performance. The experiments and findings show that the applied generative model approaches exhibit good feature learning but lack high-quality, realistic generation, leading to blurry results. The variant approach resulted in slightly better feature learning at a trade-off in generation quality. The proposed sampling algorithm proved to work well on a qualitative basis. The problems found in the generative models propagated further into the training of the recognition model, making the improvement of another AI model based on purely generated data impossible at this point in the research. The results of this research show that more work is needed on improving the application and generation quality to make it resemble real-world data more closely, so that other models can be trained on artificial data. The variant approach does not improve much, and its findings contribute to the field by establishing its strengths and weaknesses, as does the proposed image sampling algorithm. Finally, this study provides a good starting point for research within this application, with many different directions and opportunities for future work.
89

Automatic Question Paraphrasing in Swedish with Deep Generative Models

Lindqvist, Niklas January 2021 (has links)
Paraphrase generation refers to the task of automatically generating a paraphrase given an input sentence or text. Paraphrase generation is a fundamental yet challenging natural language processing (NLP) task and is utilized in a variety of applications such as question answering, information retrieval, conversational systems, etc. In this study, we address the problem of paraphrase generation of questions in Swedish by evaluating two different deep generative models that have shown promising results on paraphrase generation of questions in English. The first model is a Conditional Variational Autoencoder (C-VAE); the second extends the first by introducing a discriminator network into the model to form a Generative Adversarial Network (GAN) architecture. In addition to these models, a method not based on machine learning was implemented to act as a baseline. The models were evaluated using both quantitative and qualitative measures, including grammatical correctness and equivalence to the source question. The results show that the deep generative models outperformed the baseline across all quantitative metrics. Furthermore, the qualitative evaluation showed that the deep generative models outperformed the baseline at generating grammatically correct sentences, but there was no noticeable difference in equivalence to the source question between the models.
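A C-VAE of the kind evaluated here is trained on a conditioned evidence lower bound: a reconstruction term over the generated paraphrase tokens plus a KL term pulling the latent posterior toward the prior. A minimal sketch of that loss (tensor shapes and names are illustrative assumptions, not the study's implementation):

```python
import torch
import torch.nn.functional as F

def cvae_loss(recon_logits, target_ids, mu, logvar, pad_id=0):
    """Conditional-VAE objective: reconstruction + KL(q(z|x,c) || N(0, I)).

    recon_logits: (batch, seq_len, vocab) decoder outputs
    target_ids:   (batch, seq_len) gold paraphrase token ids
    mu, logvar:   (batch, latent_dim) encoder posterior parameters
    """
    rec = F.cross_entropy(recon_logits.transpose(1, 2), target_ids,
                          ignore_index=pad_id)
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1).mean()
    return rec + kl
```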
90

Navigating the Metric Zoo: Towards a More Coherent Model For Quantitative Evaluation of Generative ML Models

Dozier, Robbie 26 August 2022 (has links)
No description available.
