Global ETD Search

241	Detection and Classification of Cancer and Other Noncommunicable Diseases Using Neural Network Models Gore, Steven Lee 07 1900 Here, we show that training with multiple noncommunicable diseases (NCDs) is both feasible and beneficial to modeling this class of diseases. We first use data from the Cancer Genome Atlas (TCGA) to train a pan-cancer model, and then characterize the information the model has learned about the cancers. In doing this we show that the model has learned concepts that are relevant to the task of cancer classification. We also test the model on datasets derived independently of the TCGA cohort and show that the model is robust to data outside of its training distribution, such as precancerous lesions and metastatic samples. We then use the cancer model as the basis of a transfer learning study in which we retrain it on other, non-cancer NCDs. In doing so we show that NCDs with very different underlying biology contain extractable information relevant to each other, allowing a broader model of NCDs to be developed with existing datasets. We then test the importance of the sample's source tissue in the model and find that the NCD class and tissue source may not be independent in our model. To address this, we use the tissue encodings to create augmented samples. We test how successfully we can use these augmented samples to remove or diminish the importance of tissue source to NCD class by retraining the model. In doing this we make key observations about the nature of concept importance and its usefulness in future neural network explainability efforts. Cancer Neural network VAE generative augmented data methylation variational autoencoder CpG island TCGA schizophrenia asthma arthritis transfer learning TCAV Biology, Bioinformatics Computer Science
242	Neural maskinöversättning av gawarbati / Neural machine translation for Gawarbati Gillholm, Katarina January 2023 Nya neurala modeller har lett till stora framsteg inom maskinöversättning, men fungerar fortfarande sämre på språk som saknar stora mängder parallella data, så kallade lågresursspråk. Gawarbati är ett litet, hotat lågresursspråk där endast 5000 parallella meningar finns tillgängliga. Denna uppsats använder överföringsinlärning och hyperparametrar optimerade för små datamängder för att undersöka möjligheter och begränsningar för neural maskinöversättning från gawarbati till engelska. Genom att använda överföringsinlärning där en föräldramodell först tränades på hindi-engelska förbättrades översättningar med 1.8 BLEU och 1.3 chrF. Hyperparametrar optimerade för små datamängder ökade BLEU med 0.6 men minskade chrF med 1. Att kombinera överföringsinlärning och hyperparametrar optimerade för små datamängder försämrade resultatet med 0.5 BLEU och 2.2 chrF. De neurala modellerna jämförs med och presterar bättre än ordbaserad statistisk maskinöversättning och GPT-3. Den bäst presterande modellen uppnådde endast 2.8 BLEU och 19 chrF, vilket belyser begränsningarna av maskinöversättning på lågresursspråk samt det kritiska behovet av mer data. / Recent neural models have led to huge improvements in machine translation, but performance is still suboptimal for languages without large parallel datasets, so-called low-resource languages. Gawarbati is a small, threatened low-resource language with only 5000 parallel sentences available. This thesis uses transfer learning and hyperparameters optimized for small datasets to explore the possibilities and limitations of neural machine translation from Gawarbati to English. Transfer learning, where the parent model was first trained on Hindi-English parallel data, improved results by 1.8 BLEU and 1.3 chrF. Hyperparameters optimized for small datasets increased BLEU by 0.6 but decreased chrF by 1. 
Combining transfer learning and hyperparameters optimized for small datasets led to a decrease in performance by 0.5 BLEU and 2.2 chrF. The neural models outperform a word based statistical machine translation and GPT-3. The highest performing model only achieved 2.8 BLEU and 19 chrF, which illustrates the limitations of machine translation for low resource languages and the critical need for more data. / VR 2020-01500 Machine translation neural machine translation NMT low resource language Gawarbati transfer learning GPT Maskinöversättning neural maskinöversättning NMT lågresursspråk gawarbati överföringsinlärning GPT
243	Développement et validation d’un modèle d’apprentissage machine pour la détection de potentiels donneurs d’organes Sauthier, Nicolas 08 1900 Le processus du don d’organes, crucial pour la survie de nombreux patients, ne répond pas à la demande croissante. Il dépend d’une identification, par les cliniciens, des potentiels donneurs d’organes. Cette étape est imparfaite et manque entre 30% et 60% des potentiels donneurs d’organes et ce indépendamment des pays étudiés. Améliorer ce processus est un impératif à la fois moral et économique. L’objectif de ce mémoire était de développer et valider un modèle afin de détecter automatiquement les potentiels donneurs d’organes. Pour ce faire, les données cliniques de l’ensemble des patients adultes hospitalisés aux soins intensifs du CHUM entre 2012 et 2019 ont été utilisées. 103 valeurs de laboratoires temporelles différentes et 2 valeurs statiques ont été utilisées pour développer un modèle de réseaux de neurones convolutifs entrainé à prédire les potentiels donneurs d’organes. Ce modèle a été comparé à un modèle fréquentiste linéaire non temporel. Le modèle a par la suite été validé dans une population externe cliniquement distincte. Différentes stratégies ont été comparées pour peaufiner le modèle dans cette population externe et améliorer les performances. Un total de 19 463 patients, dont 397 donneurs potentiels, ont été utilisés pour développer le modèle et 4 669, dont 36 donneurs potentiels, ont été utilisés pour la validation externe. Le modèle démontrait une aire sous la courbe ROC (AUROC) de 0.966 (IC95% 0.949-0.981), supérieure au modèle fréquentiste linéaire (AUROC de 0.940 IC95% 0.908-0.969, p=0.014). Le modèle était aussi supérieur dans certaines sous populations d’intérêt clinique. Dans le groupe de validation externe, l’AUROC du modèle de réseaux de neurones était de 0.820 (0.682-0.948) augmentant à 0.874 (0.731-0.974) à l’aide d’un ré-entrainement. 
Ce modèle prometteur a le potentiel de modifier et d’améliorer la détection des potentiels donneurs d’organes. D’autres étapes de validation prospectives et d’amélioration du modèle, notamment l’ajout de données spécifiques, sont nécessaires avant une utilisation clinique de routine. / The organ donation process, although crucial for many patients’ survival, does not meet the increasing demand. Its efficiency depends on clinicians identifying potential organ donors. This imperfect step misses between 30% and 60% of potential organ donors. Improving that process is a moral and economic imperative. The main goal of this work was to address that limiting step by developing and validating a predictive model that could automatically detect potential organ donors. The clinical data from all adult patients hospitalized in the CHUM critical care units between 2012 and 2019 were extracted. The temporal evolution of 103 types of laboratory values and 2 static clinical variables was used to develop and test a convolutional neural network (CNN) trained to predict potential organ donors. This model was compared to a non-temporal logistic model as a baseline. The CNN model was validated in a clinically distinct external population. To improve performance in this external cohort, strategies to fine-tune the network were compared. A total of 19 463 patients, including 397 potential organ donors, were used to create the model, and 4 669 patients, including 36 potential organ donors, served as the external validation cohort. The CNN model performed better, with an AUROC of 0.966 (95% CI 0.949-0.981), compared to the logistic model (AUROC of 0.940, 95% CI 0.908-0.969, p=0.014). The CNN model was also superior in specific subpopulations of increased clinical interest. In the external validation cohort, the CNN model’s AUROC was 0.820 (0.682-0.948) and could be improved to 0.874 (0.731-0.974) after fine-tuning. This promising model could change the detection of potential organ donors for the better. 
More studies are, however, required to improve the model by adding more types of data and to validate the model prospectively before routine clinical usage. don d’organes transplantation modèle prédictif autoencodeur réseaux de neurones transfert de connaissance organ donation transplant predictive model machine learning autoencoder neural networks transfer learning Medicine / Médecine (UMI : 0564)
244	Head-to-head Transfer Learning Comparisons made Possible : A Comparative Study of Transfer Learning Methods for Neural Machine Translation of the Baltic Languages Stenlund, Mathias January 2023 The struggle of training adequate MT models using data-hungry NMT frameworks for low-resource language pairs has created a need to alleviate the scarcity of sufficiently large parallel corpora. Different transfer learning methods have been introduced as possible solutions to this problem, where a new model for a target task is initialized using parameters learned from some other high-resource task. Many of these methods are claimed to increase the translation quality of NMT systems in some low-resource environments; however, they are often shown to do so using different parent and child language pairs, data sizes, NMT frameworks, and training hyperparameters, which makes comparing them impossible. In this thesis project, three such transfer learning methods are put head-to-head in a controlled environment where the target task is to translate from the under-resourced Baltic languages Lithuanian and Latvian to English. In this controlled environment, the same parent language pairs, data sizes, data domains, transformer framework, and training parameters are used to ensure fair comparisons between the three transfer learning methods. The experiments involve training and testing models using all different combinations of transfer learning methods, parent language pairs, and either in-domain or out-domain data for an extensive study where different strengths and weaknesses are observed. The results show that Multi-Round Transfer Learning improves the overall translation quality the most but, at the same time, requires the longest training time by far. 
The parameter freezing method provides a marginally lower overall improvement in translation quality but requires only half the training time, while Trivial Transfer Learning improves quality the least. Both Polish and Russian work well as parents for the Baltic languages, while web-crawled data improves out-domain translations the most. The results suggest that all transfer learning methods are effective in a simulated low-resource environment; however, none of them can compete with simply having a larger target language pair data set, since none of them surpasses the strong higher-resource baseline. machine translation transfer learning Latvian Lithuanian low-resource languages transformers parent language child language comparative study
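The parameter-freezing idea mentioned in entry 244 can be illustrated with a minimal sketch. It is not the thesis's implementation: the parameter-group names (`embeddings`, `encoder`, `decoder`), the toy gradients, and the helper functions are all hypothetical, and a real NMT system would use a deep learning framework rather than plain lists.

```python
import copy


def init_child_from_parent(parent_params):
    """Start the child model as a deep copy of the parent's learned parameters."""
    return copy.deepcopy(parent_params)


def sgd_step(params, grads, frozen, lr=0.1):
    """Apply one SGD update, skipping every parameter group named in `frozen`."""
    for name, values in params.items():
        if name in frozen:
            continue  # frozen groups keep the parent's values
        params[name] = [v - lr * g for v, g in zip(values, grads[name])]
    return params


# Hypothetical parent model trained on a high-resource language pair.
parent = {"embeddings": [0.5, -0.2], "encoder": [1.0, 2.0], "decoder": [0.3, 0.7]}

# Child training on the low-resource pair: the encoder is frozen, the rest is updated.
child = init_child_from_parent(parent)
grads = {"embeddings": [1.0, 1.0], "encoder": [1.0, 1.0], "decoder": [1.0, 1.0]}
child = sgd_step(child, grads, frozen={"encoder"})
```

Freezing part of the network reduces the number of parameters updated per step, which is one plausible reason such a method could train faster than updating every parameter.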
245	Deep Learning Methods for Recovering Trading Strategies Emtell, Erik, Spjuth, Oliver January 2022 The first aim of this paper is to determine whether deep learning methods can recover trading strategies based on historical price and volume data, with the scarcity of real data in mind. The second aim is to evaluate the methods to generate a deep learning blueprint for strategy extraction. Trading strategies can be built on many different types of data, often combined from different areas. In this paper, we focus on trading strategies based solely on historical price and volume data to limit the scope of the problem. Combinations of different deep learning architectures and methods such as transfer learning and ensemble methods were evaluated. The results clearly show that deep learning models can recover relatively complex trading strategies to some extent. Models leveraging transfer learning outperform other models when data is scarce, and ensemble methods improve performance in certain regards. / Målet med denna rapport är i första hand att ta reda på om djupinlärningsmetoder kan återskapa handelsstrategier baserat på historiska priser och volymdata, med vetskapen att datan är begränsad. Det andra målet är att utvärdera metoder för att skapa en djupinlärningsmall för att utvinna handelsstrategier. Handelsstrategier kan vara byggda på många olika datatyper, ofta i kombination från olika områden. I denna rapport fokuserar vi på strategier som enbart är baserade på historiska priser och volymdata för att begränsa problemet. Kombinationer av olika djupinlärningsarkitekturer tillsammans med metoder som till exempel överföringsinlärning och ensembleinlärning utvärderades. Resultaten visar tydligt att djupinlärningsmodeller kan återskapa relativt komplexa handelsstrategier. 
Modeller som utnyttjade överföringsinlärning presterade bättre än andra modeller när datan var begränsad och ensembleinlärning ökade prestandan ytterligare i vissa sammanhang. / Kandidatexjobb i elektroteknik 2022, KTH, Stockholm Deep Learning Recurrent Neural Network Convolutional Neural Network WaveNet Ensemble Methods Stacking Bagging Transfer Learning Algorithmic Trading Elektroteknik och elektronik
246	Predicting Digital Porous Media Properties Using Machine Learning Methods Elmorsy, Mohamed January 2023 Subsurface porous media, like aquifers, petroleum reservoirs, and geothermal systems, are vital for natural resources and environmental management. Extensive research has been conducted to understand flow and transport in these media, addressing challenges in hydrocarbon extraction, carbon storage and waste management. Classifying the type of porous media (e.g., sandstone, carbonate) is often the first step in the rock characterization process, and it provides critical information regarding the physical properties of the porous media. Therefore, we use multivariate statistical methods with discriminant analysis to categorize porous media samples. This approach achieved excellent classification accuracy on testing datasets and served as a surrogate tool to study key porous media characteristics. While recent advances in three-dimensional (3D) imaging of core samples have enabled digital subsurface characterization, the exorbitant computational cost associated with direct numerical simulation in 3D remains a persistent challenge. In contrast, machine learning (ML) models are much more efficient, though their use in subsurface characterization is still in its infancy. Therefore, we introduce a novel 3D convolutional neural network (CNN) for end-to-end prediction of permeability. By increasing dataset size and diversity and optimizing the network architecture, our model surpasses the accuracy of existing 3D CNN models for permeability prediction. It demonstrates excellent generalizability, accurately predicting permeability in previously unseen samples. However, despite the efficiency of the developed 3D CNN model for accurate and fast permeability prediction, its utility remains limited to small subdomains of the digital rock samples. 
Therefore, we introduce an upscaling technique using a new analytical solution to calculate effective permeability in a 3D digital rock composed of 2 × 2 × 2 anisotropic cells. By incorporating this solution into physics-informed neural network (PINN) models, we achieve highly accurate results. Even when upscaling previously unseen samples at multiple levels, the PINN with the physics-informed module maintains excellent accuracy. This advancement enhances the capability of ML models, like 3D CNN, for efficient and accurate digital rock analysis at the core scale. After successfully applying ML models in permeability prediction, we now extend their application to another important parameter in subsurface engineering projects: effective thermal conductivity, which is a key parameter in engineering projects like radioactive waste repositories, geothermal energy production, and underground energy storage. To address the need for large training data and processing power in ML models, we propose a novel framework based on transfer learning. This approach allows prior knowledge from previous applications to be transferred, resulting in faster and more efficient implementation of new relevant applications. We introduce CNN models trained on various porous media samples that leverage transfer learning to predict porous media sample thermal conductivity accurately. Our approach reduces training time, processing power, and data requirements, enabling effective prediction and analysis of porous media properties such as permeability and thermal conductivity. It also facilitates the application of ML to other properties, improving efficiency and accuracy. / Thesis / Doctor of Philosophy (PhD) Digital Rocks Physics Machine learning Computer Vision Deep Learning 3D CNN Convolutional Neural Networks PINN Physics informed Neural Networks Transfer Learning Digital Porous Media
247	Efficient Sentiment Analysis and Topic Modeling in NLP using Knowledge Distillation and Transfer Learning / Effektiv sentimentanalys och ämnesmodellering inom NLP med användning av kunskapsdestillation och överföringsinlärning Malki, George January 2023 This abstract presents a study in which knowledge distillation techniques were applied to a Large Language Model (LLM) to create smaller, more efficient models without sacrificing performance. Three configurations of the RoBERTa model were selected as “student” models to gain knowledge from a pre-trained “teacher” model. The study also measures the models' performance on two downstream tasks, sentiment analysis and topic modeling. Multiple steps were used to improve the knowledge distillation process, such as copying some weights from the teacher to the student model and defining a custom loss function. The selected task for the knowledge distillation process was sentiment analysis on the Amazon Reviews for Sentiment Analysis dataset. The resulting student models showed promising performance on the sentiment analysis task, capturing sentiment-related information from text. The smallest of the student models managed to obtain 98% of the performance of the teacher model while being 45% lighter and taking less than a third of the time to analyze the entire IMDB Dataset of 50K Movie Reviews. However, the student models struggled to produce meaningful results on the topic modeling task. These results were consistent with the topic modeling results from the teacher model. In conclusion, the study showcases the efficacy of knowledge distillation techniques in enhancing the performance of LLMs on specific downstream tasks. While the model excelled in sentiment analysis, further improvements are needed to achieve desirable outcomes in topic modeling. These findings highlight the complexity of language understanding tasks and emphasize the importance of ongoing research and development to further advance the capabilities of NLP models. 
/ Denna sammanfattning presenterar en studie där kunskapsdestilleringstekniker tillämpades på en stor språkmodell (Large Language Model, LLM) för att skapa mindre och mer effektiva modeller utan att kompromissa på prestandan. Tre konfigurationer av RoBERTa-modellen valdes som ”student”-modeller för att inhämta kunskap från en förtränad ”teacher”-modell. Studien mäter även modellernas prestanda på två ”DOWNSTREAM” uppgifter, sentimentanalys och ämnesmodellering. Flera steg användes för att förbättra kunskapsdestilleringsprocessen, såsom att kopiera vissa vikter från lärarmodellen till studentmodellen och definiera en anpassad förlustfunktion. Uppgiften som valdes för kunskapsdestilleringen var sentimentanalys på datamängden Amazon Reviews for Sentiment Analysis. De resulterande studentmodellerna visade lovande prestanda på sentimentanalysuppgiften genom att fånga upp information relaterad till sentiment från texten. Den minsta av studentmodellerna lyckades erhålla 98% av prestandan hos lärarmodellen samtidigt som den var 45% lättare och tog mindre än en tredjedel av tiden att analysera hela IMDB Dataset of 50K Movie Reviews datasettet. Dock hade studentmodellerna svårt att producera meningsfulla resultat på ämnesmodelleringsuppgiften. Dessa resultat överensstämde med ämnesmodelleringsresultaten från lärarmodellen. Large Language Model RoBERTa Knowledge distillation Transfer learning Sentiment analysis Topic modeling Stor språkmodell RoBERTa Kunskapsdestillation överföringsinlärning Sentimentanalys Ämnesmodellering Computer and Information Sciences Data- och informationsvetenskap
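Entry 247 mentions a custom loss function for knowledge distillation without describing it. As a rough sketch of what such a loss commonly looks like, the block below implements the widely used temperature-scaled soft-target loss combined with a hard-label term. This is an assumption for illustration, not the thesis's actual loss; `temperature`, `alpha`, and the function names are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with optional temperature scaling."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label cross-entropy term."""
    # Soft-target term: cross-entropy between softened teacher and student outputs.
    p_teacher = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    soft = -sum(t * math.log(s) for t, s in zip(p_teacher, p_student_soft))
    # Hard-label term: ordinary cross-entropy against the true class index.
    hard = -math.log(softmax(student_logits)[label])
    # The T^2 factor keeps the soft term's gradient scale comparable across temperatures.
    return alpha * temperature ** 2 * soft + (1 - alpha) * hard


# A student that agrees with the teacher should incur a lower loss than one that disagrees.
agreeing = distillation_loss([2.0, 0.0], [2.0, 0.0], label=0)
disagreeing = distillation_loss([0.0, 2.0], [2.0, 0.0], label=0)
```

The `alpha` weight trades off imitation of the teacher against fitting the labelled data; the value used here is arbitrary.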
248	[pt] APRIMORANDO A SÍNTESE DE IMAGENS A PARTIR DE TEXTO UTILIZANDO TRANSFERÊNCIA DE APRENDIZADO U2C / [en] IMPROVING TEXT-TO-IMAGE SYNTHESIS WITH U2C - TRANSFER LEARNING VINICIUS GOMES PEREIRA 06 February 2024 [pt] As Redes Generativas Adversariais (GANs) são modelos não supervisionados capazes de aprender a partir de um número indefinidamente grande de imagens. Entretanto, modelos que geram imagens a partir de linguagem dependem de dados rotulados de alta qualidade, que são escassos. A transferência de aprendizado é uma técnica conhecida que alivia a necessidade de dados rotulados, embora transformar um modelo gerativo incondicional em um modelo condicionado a texto não seja uma tarefa trivial. Este trabalho propõe uma abordagem de ajuste simples, porém eficaz, chamada U2C transfer. Esta abordagem é capaz de aproveitar modelos pré-treinados não condicionados enquanto aprende a respeitar as condições textuais fornecidas. Avaliamos a eficiência do U2C transfer ao ajustar o StyleGAN2 em duas das fontes de dados mais utilizadas para a geração de imagens a partir de texto, resultando na arquitetura Text-Conditioned StyleGAN2 (TC-StyleGAN2). Nossos modelos alcançaram rapidamente o estado da arte nas bases de dados CUB-200 e Oxford-102, com valores de FID de 7.49 e 9.47, respectivamente. Esses valores representam ganhos relativos de 7 por cento e 68 por cento, respectivamente, em comparação com trabalhos anteriores. Demonstramos que nosso método é capaz de aprender detalhes refinados a partir de consultas de texto, produzindo imagens fotorrealistas e detalhadas. Além disso, mostramos que os modelos organizam o espaço intermediário de maneira semanticamente significativa. Nossas descobertas revelam que as imagens sintetizadas usando nossa técnica proposta não são apenas críveis, mas também exibem forte alinhamento com suas descrições textuais correspondentes. 
De fato, os escores de alinhamento textual alcançados por nosso método são impressionantes e comparáveis aos das imagens reais. / [en] Generative Adversarial Networks (GANs) are unsupervised models that can learn from an indefinitely large number of images. On the other hand, models that generate images from language queries depend on high-quality labeled data, which is scarce. Transfer learning is a known technique that alleviates the need for labeled data, though it is not trivial to turn an unconditional generative model into a text-conditioned one. This work proposes a simple yet effective fine-tuning approach called Unconditional-to-Conditional Transfer Learning (U2C transfer). It can leverage well-established pre-trained models while learning to respect the given textual conditions. We evaluate U2C transfer efficiency by fine-tuning StyleGAN2 on two of the most widely used text-to-image data sources, generating the Text-Conditioned StyleGAN2 (TC-StyleGAN2). Our models quickly achieved state-of-the-art results on the CUB-200 and Oxford-102 datasets, with FID values of 7.49 and 9.47, respectively. These values represent relative gains of 7 percent and 68 percent, respectively, compared to prior work. We show that our method is capable of learning fine-grained details from text queries while producing photorealistic and detailed images. Our findings highlight that the images created using our proposed technique are credible and display a robust alignment with their corresponding textual descriptions. [pt] REDES GENERATIVAS ADVERSARIAIS [pt] APRENDIZADO MULTIMODAL [pt] TRANSFERENCIA DE APRENDIZADO [pt] SINTESE DE IMAGENS [en] GENERATIVE ADVERSARIAL NETWORKS [en] MULTIMODAL LEARNING [en] TRANSFER LEARNING [en] IMAGE SYNTHESIS
249	Machine Learning for Automation of Chromosome based Genetic Diagnostics / Maskininlärning för automatisering av kromosombaserad genetisk diagnostik Chu, Gongchang January 2020 Chromosome based genetic diagnostics, the detection of specific chromosomes, plays an increasingly important role in medicine as the molecular basis of human disease is defined. The current diagnostic process is performed mainly by karyotyping specialists. They first put chromosomes in pairs and generate an image listing all the chromosome pairs in order. This process is called karyotyping, and the generated image is called a karyogram. Then they analyze the images based on the shapes, sizes, and relationships of different image segments and make diagnostic decisions. Manual inspection is time-consuming, labor-intensive, and error-prone. This thesis investigates supervised methods for genetic diagnostics on karyograms. Mainly, the theory targets abnormality detection and gives the confidence of the result in the chromosome domain. This thesis aims to divide chromosome pictures into normal and abnormal categories and give the confidence level. The main contributions of this thesis are (1) an empirical study of chromosomes and karyotyping; (2) appropriate data preprocessing; (3) neural networks built using transfer learning; (4) experiments on different systems and conditions and a comparison of them; (5) a suitable choice for our requirements and a way to improve the model; (6) a method to calculate the confidence level of the result by uncertainty estimation. Empirical research shows that the karyogram is ordered as a whole, so preprocessing such as rotation and folding is not appropriate. It is more reasonable to choose noise or blur. In the experiment, two neural networks based on VGG16 and InceptionV3 were established using transfer learning, and their effects were compared under different conditions. 
When choosing evaluation metrics, we aim to minimize the error of predicting abnormal cases as normal, because we cannot accept abnormal chromosomes being predicted as normal. This thesis describes how to use Monte Carlo Dropout to do uncertainty estimation like a non-Bayesian model [1]. / Kromosombaserad genetisk diagnostik, detektering av specifika kromosomer, kommer att spela en allt viktigare roll inom medicin eftersom den molekylära grunden för mänsklig sjukdom definieras. Den nuvarande diagnostiska processen utförs huvudsakligen av specialister på karyotypning. De sätter först kromosomer i par och genererar en bild som listar alla kromosompar i ordning. Denna process kallas karyotypning, och den genererade bilden kallas karyogram. Därefter analyserar de bilderna baserat på former, storlek och förhållanden för olika bildsegment och fattar sedan diagnostiska beslut. Denna avhandling undersöker övervakade metoder för genetisk diagnostik på karyogram. Huvudsakligen riktar teorin sig mot onormal detektion och ger förtroendet för resultatet i kromosomdomänen. Manuell inspektion är tidskrävande, arbetskrävande och felbenägen. Denna uppsats syftar till att dela in kromosombilder i normala och onormala kategorier och ge konfidensnivån. Dess huvudsakliga bidrag är (1) en empirisk studie av kromosom och karyotypning; (2) lämplig förbehandling av data; (3) Neurala nätverk byggs med hjälp av transfer learning; (4) experiment på olika system och förhållanden och jämförelse av dem; (5) ett rätt val för vårt krav och ett sätt att förbättra modellen; (6) en metod för att beräkna resultatets konfidensnivå genom osäkerhetsuppskattning. Empirisk forskning visar att karyogrammet är ordnat som en helhet, så förbehandling som rotation och vikning är inte lämpligt. Det är rimligare att välja brus, oskärpa etc. I experimentet upprättades två neurala nätverk baserade på VGG16 och InceptionV3 med hjälp av transfer learning och jämförde deras effekter under olika förhållanden. 
När vi väljer utvärderingsindikatorer, eftersom vi inte kan acceptera att onormala kromosomer bedöms förväntas, hoppas vi att minimera felet att anta som vanligt. Denna avhandling beskriver hur man använder Monte Carlo Dropout för att göra osäkerhetsberäkningar som en icke-Bayesisk modell [1]. Genetic Diagnostics Abnormality Detection Transfer Learning Deep Learning Uncertainty Estimation Genetisk diagnos onormal detektion överföringsinlärning djupinlärning osäkerhetsuppskattning Computer and Information Sciences Data- och informationsvetenskap
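Entry 249 refers to Monte Carlo Dropout for uncertainty estimation. The general technique keeps dropout active at prediction time, runs several stochastic forward passes, and reads the spread of the outputs as a confidence signal. The sketch below shows this on a toy logistic model with input dropout; the model, weights, and parameter names are hypothetical and stand in for the VGG16 and InceptionV3 networks of the thesis.

```python
import math
import random
import statistics


def forward_with_dropout(features, weights, bias, drop_prob, rng):
    """One stochastic forward pass of a tiny logistic model with input dropout."""
    keep = 1.0 - drop_prob
    total = bias
    for x, w in zip(features, weights):
        if rng.random() < keep:
            total += (x / keep) * w  # inverted-dropout scaling of kept inputs
    return 1.0 / (1.0 + math.exp(-total))


def mc_dropout_predict(features, weights, bias, drop_prob=0.5, passes=100, seed=0):
    """Return the mean prediction and its standard deviation over many passes."""
    rng = random.Random(seed)
    samples = [forward_with_dropout(features, weights, bias, drop_prob, rng)
               for _ in range(passes)]
    return statistics.mean(samples), statistics.pstdev(samples)


# Hypothetical input and weights; a larger spread indicates a less certain prediction.
mean_prob, spread = mc_dropout_predict([1.0, 2.0, -1.0], [0.8, -0.4, 0.3], bias=0.1)
_, no_dropout_spread = mc_dropout_predict([1.0, 2.0, -1.0], [0.8, -0.4, 0.3],
                                          bias=0.1, drop_prob=0.0)
```

With dropout disabled every pass is identical, so the spread collapses to zero; the uncertainty estimate comes entirely from the random masks.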
250	Leveraging Adult Fashion to Enhance Children’s Fashion Recognition Igareta Herráiz, Angel Luis January 2021 The future of the fashion industry is expected to be online, thus a significant amount of research is being conducted in the field of fashion image analysis. Currently, a task that places a heavy workload on online stores is manually tagging new garments, including attributes such as category, color, pattern, or style. To this end, extensive research has targeted the automatic prediction of clothing categories and attributes, achieving promising results. Nevertheless, no previous study has been found in the literature that specifically reflects the performance of clothing attribute recognition with children’s clothing. This work intends to fill this gap and present, in the same fashion analysis task, how a model trained on adult fashion performs compared with a model trained exclusively on children’s fashion. When examining the global understanding of children’s fashion apparel, the experiments show that the best performance is obtained when leveraging the domain knowledge of adult fashion, specifically from the iMaterialist dataset: with the best model, a difference in overall performance of about 3% was achieved compared to pre-training on the ImageNet dataset, or 12% compared to when only children’s fashion was considered for training. / Modebranschen förväntas i framtiden vara online, och därför bedrivs det mycket forskning inom området bildanalys av modebilder. En uppgift som för närvarande innebär en stor arbetsbörda för nätbutiker är att manuellt tagga nya plagg med attribut som kategori, färg, mönster eller stil. Därför har omfattande forskning genomförts om automatisk förutsägelse av klädkategorier och attribut, och man har uppnått lovande resultat. Trots detta har ingen tidigare studie hittats i litteraturen som specifikt speglar prestandan för igenkänning av klädattribut för barnkläder. 
Syftet med det här arbetet är att fylla denna lucka och, som en del i en analys av mode, på ett effektivt sätt visa hur en modell som tränats för vuxenmode presterar jämfört med en modell som enbart tränats för barnmode. När man undersöker den globala förståelsen för barnkläder visar experimenten att den bästa prestandan uppnås när man utnyttjar domänkunskapen om vuxenmode, särskilt från iMaterialist-dataset, där man med den bästa modellen uppnådde en skillnad i den totala prestandan på cirka 3% jämfört med förträning på ImageNet-dataset eller 12% när endast barnmode beaktades vid träningen. CNN Machine Learning Transfer Learning Fashion Analysis Children’s Fashion Clothing Attribute Recognition CNN maskininlärning överföringsinlärning modeanalys barnmode igenkänning av klädesattribut Computer and Information Sciences Data- och informationsvetenskap
