131

Synthesis of Tabular Financial Data using Generative Adversarial Networks / Syntes av tabulär finansiell data med generativa motstridande nätverk

Karlsson, Anton, Sjöberg, Torbjörn January 2020 (has links)
Digitalization has led to vast amounts of available customer data and possibilities for data-driven innovation. However, the data needs to be handled carefully to protect the privacy of the customers. Generative Adversarial Networks (GANs) are a promising recent development in generative modeling. They can be used to create synthetic data that facilitate analysis while ensuring customer privacy is maintained. Prior research on GANs has shown impressive results on image data. In this thesis, we investigate the viability of using GANs within the financial industry. We examine two state-of-the-art GAN models for synthesizing tabular data, TGAN and CTGAN, along with a simpler GAN model that we call WGAN. A comprehensive evaluation framework is developed to facilitate comparison of the synthetic datasets. The results indicate that GANs are able to generate quality synthetic datasets that preserve the statistical properties of the underlying data and enable a viable and reproducible subsequent analysis. However, all of the investigated models were found to have problems reproducing numerical data.
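As a concrete illustration of the kind of pipeline evaluated above, here is a minimal sketch of fitting a CTGAN-style synthesizer to a toy customer table, assuming the open-source `ctgan` package; the table, column names, and epoch count are hypothetical placeholders, not the thesis's data or settings.

```python
import numpy as np
import pandas as pd
from ctgan import CTGAN

# Toy stand-in for a customer table (the thesis uses real financial data)
rng = np.random.default_rng(0)
real = pd.DataFrame({
    "age": rng.integers(18, 90, 1000),
    "balance": rng.lognormal(8, 1, 1000),
    "segment": rng.choice(["retail", "premium", "corporate"], 1000),
})

model = CTGAN(epochs=100)                      # conditional tabular GAN
model.fit(real, discrete_columns=["segment"])  # mode-specific normalization inside

synthetic = model.sample(len(real))            # synthetic rows, same schema as `real`

# Crude fidelity check on numerical marginals
print(real[["age", "balance"]].mean())
print(synthetic[["age", "balance"]].mean())
```

A full evaluation framework, as the abstract notes, would go beyond marginal means to joint statistics and the reproducibility of downstream analyses.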
132

Some phenomenological investigations in deep learning

Baratin, Aristide 12 1900 (has links)
The striking empirical success of deep neural networks in machine learning raises a number of theoretical puzzles. For example, why can they generalize to unseen data despite their capacity to fully memorize the training examples? Such puzzles have been the subject of intense research efforts in the past few years, combining rigorous analysis of simplified systems with empirical studies of phenomenological properties shown to correlate with generalization. The first two articles presented in this thesis contribute to this line of work. They highlight and discuss mechanisms that allow large models to prioritize learning 'simple' functions during training and to adapt their capacity to the complexity of the problem. The third article addresses the long-standing problem of estimating mutual information in high dimension by leveraging the expressivity and scalability of deep neural networks. It introduces and studies a new class of estimators and presents several applications in unsupervised learning, notably to the improvement of neural generative models.
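To make the estimator idea concrete, below is a minimal PyTorch sketch of a neural mutual-information estimator based on the Donsker-Varadhan lower bound, the MINE-style family the abstract alludes to; the network size, optimizer settings, and toy data are illustrative assumptions, not the thesis's exact construction.

```python
import math
import torch
import torch.nn as nn

class Critic(nn.Module):
    """Scores (x, z) pairs; trained to score joint samples above shuffled pairs."""
    def __init__(self, dim_x, dim_z):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_x + dim_z, 128), nn.ReLU(),
            nn.Linear(128, 1))

    def forward(self, x, z):
        return self.net(torch.cat([x, z], dim=1))

def dv_lower_bound(critic, x, z):
    # Donsker-Varadhan: I(X;Z) >= E_joint[T] - log E_marginals[exp(T)]
    joint = critic(x, z).mean()
    z_perm = z[torch.randperm(z.size(0))]        # break the pairing -> marginals
    scores = critic(x, z_perm).squeeze(1)
    marg = torch.logsumexp(scores, dim=0) - math.log(x.size(0))
    return joint - marg                          # lower bound, in nats

# Toy example: Z = X + noise, so the true MI is strictly positive
x = torch.randn(512, 1)
z = x + 0.5 * torch.randn(512, 1)
critic = Critic(1, 1)
opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
for _ in range(500):
    opt.zero_grad()
    (-dv_lower_bound(critic, x, z)).backward()   # maximize the bound
    opt.step()
print("MI lower bound (nats):", dv_lower_bound(critic, x, z).item())
```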
133

Applying the Shadow Rating Approach: A Practical Review / Tillämpning av skuggrating-modellen: En praktisk studie

Barry, Viktor, Stenfelt, Carl January 2023 (has links)
The combination of regulatory pressure and rare but impactful defaults together comprises the domain of low-default portfolios, a central and complex topic that lacks clear industry standards. A novel approach that utilizes external data to create a Shadow Rating model has been proposed by Ulrich Erlenmaier. It addresses the lack of data by estimating a probability-of-default curve from an external rating scale and subsequently training a statistical model to estimate the credit rating of obligors. The thesis first explores the capabilities of the Cohort model and the Pluto and Tasche model to estimate the probability of default associated with banks and financial institutions through the use of external data. Second, the thesis implements a multinomial logistic regression model, an ordinal logistic regression model, Classification and Regression Trees, and a Random Forest model, and evaluates their ability to correctly estimate the credit rating of companies in a portfolio of banks and financial institutions using financial data. Results suggest that the Cohort model is superior in modelling the underlying data, given a Gini coefficient of 0.730 for the base case, as opposed to Pluto and Tasche's 0.260. Moreover, the Random Forest model displays marginally higher performance across all metrics (such as an accuracy of 57%, a mean absolute error of 0.67 and a multiclass receiver operating characteristic of 0.83). However, given its lower degree of interpretability, the simpler ordinal logistic regression model (50%, 0.80 and 0.81, respectively) can be preferred due to its clear interpretability and explainability.
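As a worked illustration of low-default PD estimation, the following sketch computes the Pluto and Tasche "most prudent" estimate: the largest default probability still consistent, at confidence level gamma, with observing at most k defaults among n obligors. The portfolio numbers are hypothetical.

```python
# Most prudent PD for a low-default rating grade: solve
# P(X <= k | n, p) = 1 - gamma for p, where X ~ Binomial(n, p).
from scipy.optimize import brentq
from scipy.stats import binom

def most_prudent_pd(n_obligors, n_defaults, gamma=0.9):
    f = lambda p: binom.cdf(n_defaults, n_obligors, p) - (1.0 - gamma)
    return brentq(f, 1e-12, 1 - 1e-12)  # f changes sign on (0, 1)

# Example: 200 banks in a grade, zero observed defaults.
# For k = 0 the closed form 1 - (1 - gamma)**(1/n) applies,
# which the root-finder reproduces (~0.0114 here).
print(most_prudent_pd(200, 0))
```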
134

[en] PORTFOLIO SELECTION USING ROBUST OPTIMIZATION AND SUPPORT VECTOR MACHINE (SVM) / [pt] SELEÇÃO DE PORTFÓLIO USANDO OTIMIZAÇÃO ROBUSTA E MÁQUINAS DE SUPORTE VETORIAL

ROBERTO PEREIRA GARCIA JUNIOR 26 October 2021 (has links)
[en] The difficulty of predicting the movement of financial assets is the subject of study by several authors. In order to obtain gains, it is necessary to estimate the direction (rise or fall) and the magnitude of the return on the asset one intends to buy or sell. The purpose of this work is to develop a mathematical optimization model with binary variables capable of predicting up and down movements of financial assets, and to use a portfolio optimization model to evaluate the results obtained. The prediction model is based on the Support Vector Machine (SVM), in which we modify the regularization of the traditional model. Robust optimization is used for portfolio management; robust optimization techniques are being increasingly applied there, since they are able to deal with the uncertainties introduced in the estimation of parameters. It is noteworthy that the developed model is data-driven, i.e., the predictions are made using nonlinear signals based on past historical price/return data without any human intervention. As prices depend on many factors, it is to be expected that a single set of parameters can only describe the dynamics of financial asset prices for a small interval of days. In order to capture this change in dynamics more accurately, the estimation of model parameters is done in a moving window. To test the accuracy of the models and the gains obtained, a case study was made using 6 financial assets from the currency, fixed income, equity and commodity classes. The data cover the period from 01/01/2004 until 05/30/2018, totaling 3623 daily quotations. Considering transaction costs and the out-of-sample results obtained in the analyzed period, the investment portfolio developed in this work shows higher returns than the traditional indexes, with limited risk.
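A minimal sketch of the moving-window idea follows: an SVM is refit on a trailing window of lagged returns and predicts the next day's direction. The feature construction, window length, and simulated returns are illustrative assumptions, not the thesis's modified regularization scheme.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
returns = rng.normal(0, 0.01, 1000)           # stand-in for daily asset returns

lags, window = 5, 250                         # ~1 trading year of history
# Row t holds returns[t .. t+lags-1]; the target is the sign of returns[t+lags]
X = np.column_stack([returns[i:len(returns) - lags + i] for i in range(lags)])
y = np.sign(returns[lags:])                   # direction: +1 up, -1 down
y[y == 0] = 1

hits = []
for t in range(window, len(y)):
    clf = SVC(kernel="rbf", C=1.0)            # refit on the trailing window only
    clf.fit(X[t - window:t], y[t - window:t])
    hits.append(clf.predict(X[t:t + 1])[0] == y[t])
print("out-of-sample hit rate:", np.mean(hits))  # ~0.5 on pure noise
```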
135

Perception et apprentissage des structures musicales et langagières : études des ressources cognitives partagées et des effets attentionnels / Musical and linguistic structure perception and learning: investigation of shared cognitive resources and attentional effects

Hoch, Lisianne 09 July 2010 (has links)
Music and language are structured materials based on combinatorial principles. Listeners acquire knowledge about these structural regularities via mere exposure. This knowledge allows them to develop expectations about future events in music and language perception. My PhD investigated two aspects of domain-specificity versus domain-generality of cognitive functions in music and language processing: perception and statistical learning. In the first part (perception), musical structure processing was shown to influence the processing of spoken and visually presented language (Studies 1 to 4), reflecting dynamic attending mechanisms (Jones, 1976). More specifically, musical structure processing interacted with linguistic-syntactic processing, but not with linguistic-semantic processing (Study 3), supporting the hypothesis of shared syntactic integration resources for music and language (Patel, 2003). Together with previous studies of simultaneous musical and linguistic (syntactic and semantic) structure processing, these results led us to propose that the shared resources might extend to the processing of other structured information that requires structural and temporal integration. This hypothesis was tested and supported by interactive influences between simultaneous musical and arithmetic structure processing (Study 4). In the second part (learning), statistical learning was directly compared for verbal and nonverbal materials. In particular, we investigated the influence of dynamic attention driven by non-acoustic (Studies 5 and 6) and acoustic (Study 7) temporal cues on statistical learning. Non-acoustic temporal cues influenced statistical learning of both verbal and nonverbal artificial languages. In agreement with the dynamic attending theory (Jones, 1976), we proposed that non-acoustic temporal cues guide attention over time and thereby influence statistical learning. Based on the influence of dynamic attending mechanisms on perception and learning, and on evidence of shared structural and temporal integration resources for processing musical structures and other structured information, this PhD opens new questions about the potential influence of tonal and temporal auditory structure processing on general cognitive sequencing abilities, notably those required for structured sequence perception and learning.
Jones, M. R. (1976). Time, our lost dimension: Toward a new theory of perception, attention, and memory. Psychological Review, 83(5), 323-355. doi:10.1037/0033-295X.83.5.323
Patel, A. D. (2003). Language, music, syntax and the brain. Nature Neuroscience, 6(7), 674-681. doi:10.1038/nn1082
136

Reconstruction de profils protéiques pour la recherche de biomarqueurs / Reconstruction of proteomic profiles for biomarker discovery

Szacherski, Pascal 21 December 2012 (has links)
This thesis was prepared at CEA Leti, Minatec Campus (Grenoble, France) and at IMS (Bordeaux, France) in the context of information and signal processing for proteomic data. The aim is to reconstruct proteomic profiles from data provided by a complex analytical workflow combining liquid chromatography and mass spectrometry. The signals are measurements of peptide traces with low amplitude within a complex and noisy background, so adapted statistical signal processing methods are required. The uncertainty can be of technical nature (instruments, measurements) or of biological nature (individuals, "patients"). A hierarchical model describing the forward problem of data acquisition allows these variability sources to be included explicitly in the probabilistic model. The inverse problem methodology then leads to the estimation of the parameters of interest. In this thesis, we studied three types of inverse problems for the following applications:
1. quantification of targeted proteins, seen as estimation of the protein concentration;
2. supervised training from a labelled cohort, seen as estimation of the distribution parameters of each class;
3. classification given knowledge about the classes, seen as estimation of the class a biological sample belongs to.
We solve these inverse problems within a Bayesian framework, resorting to stochastic sampling methods (Markov Chain Monte Carlo) for computation.
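As a toy illustration of the Bayesian computation involved, the sketch below infers a single protein concentration from noisy simulated peptide signals with a random-walk Metropolis sampler; the Gaussian likelihood and priors are deliberate simplifications of the full hierarchical model.

```python
import numpy as np

rng = np.random.default_rng(42)
true_conc = 3.0
data = true_conc + rng.normal(0, 0.5, size=20)    # simulated peptide signals

def log_post(c):
    if c <= 0:
        return -np.inf                             # concentrations are positive
    loglik = -0.5 * np.sum((data - c) ** 2) / 0.5 ** 2   # Gaussian noise model
    logprior = -0.5 * (c - 1.0) ** 2 / 10.0              # weak Gaussian prior
    return loglik + logprior

samples, c = [], 1.0
for _ in range(20000):
    prop = c + rng.normal(0, 0.2)                  # random-walk proposal
    if np.log(rng.uniform()) < log_post(prop) - log_post(c):
        c = prop                                   # Metropolis accept step
    samples.append(c)
post = np.array(samples[5000:])                    # discard burn-in
print("posterior mean:", post.mean(), "+/-", post.std())
```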
137

Contributions à l’apprentissage automatique pour l’analyse d’images cérébrales anatomiques / Contributions to statistical learning for structural neuroimaging data

Cuingnet, Rémi 29 March 2011 (has links)
Brain image analyses have widely relied on univariate voxel-wise methods: brain images are first spatially registered to a common stereotaxic space, and mass univariate statistical tests are then performed in each voxel to detect significant group differences. However, the sensitivity of these approaches is limited when the differences involve a combination of different brain structures. Recently, there has been a growing interest in support vector machine methods to overcome the limits of these analyses. This thesis focuses on machine learning methods for population analysis and patient classification in neuroimaging. We first evaluated the performance of different classification strategies for the identification of patients with Alzheimer's disease based on T1-weighted MRI of 509 subjects from the ADNI database. However, these methods do not take full advantage of the spatial distribution of the features. As a consequence, the optimal margin hyperplane is often scattered and lacks spatial coherence, making its anatomical interpretation difficult. We therefore introduced a framework to spatially regularize support vector machines for volumetric and surface-based neuroimaging data, based on Laplacian regularization operators. The proposed framework was applied to the analysis of stroke and of Alzheimer's disease. The results demonstrate that the proposed classifier generates less noisy and consequently more interpretable feature maps, with no loss of classification performance.
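A minimal sketch of the spatial-regularization idea: replacing the ridge penalty w'w with w'(I + beta*L)w, where L is a graph Laplacian over voxels, is equivalent to training a standard linear SVM on features whitened by (I + beta*L)^(-1/2). A 1-D chain of "voxels" stands in for a brain volume here; the thesis's actual operators and data differ.

```python
import numpy as np
from scipy.linalg import sqrtm
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
p, n, beta = 50, 200, 10.0

# Chain-graph Laplacian L = D - A for neighbors (i, i+1)
A = np.diag(np.ones(p - 1), 1) + np.diag(np.ones(p - 1), -1)
L = np.diag(A.sum(1)) - A
M_inv_half = np.linalg.inv(sqrtm(np.eye(p) + beta * L)).real

# Synthetic data: a contiguous "active region" separates the two classes
w_true = np.zeros(p)
w_true[20:30] = 1.0
X = rng.normal(size=(n, p))
y = np.sign(X @ w_true + 0.5 * rng.normal(size=n))

clf = LinearSVC(C=1.0).fit(X @ M_inv_half, y)   # standard SVM in whitened space
w_spatial = M_inv_half @ clf.coef_.ravel()      # map weights back: w = M^(-1/2) v
print("mean |adjacent weight difference|:", np.abs(np.diff(w_spatial)).mean())
```

The whitening trick is what makes the penalty practical: the spatial prior is absorbed into the features, so any off-the-shelf linear SVM solver can be reused unchanged.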
138

Унапређење top down методологије за хијерархијско прогнозирање логистичких захтева у ланцима снабдевања / Unapređenje top down metodologije za hijerarhijsko prognoziranje logističkih zahteva u lancima snabdevanja / Boosting the performance of top down methodology for forecasting in supplychains via a new approach for determining disaggregating proportions

Mirčetić Dejan 05 July 2018 (has links)
In this thesis, a new approach for determining disaggregation proportions in the top-down hierarchical forecasting methodology is proposed. In order to estimate the accuracy of the proposed approach, a simulation study and an empirical case study of a multi-echelon distribution chain were performed. Results demonstrate that the approach significantly outperforms standard top-down approaches. The thesis also examines the impact of hierarchical forecasts on logistics indicators (average stock and stock shortages). The results show that the new model achieved the smallest stock shortage when applied in inventory management strategies. In addition, the thesis tests the combination of different forecasts and investigates the influence of time-series characteristics on the accuracy of hierarchical forecasting models. The results are encouraging, and further research is needed to reveal the full benefits of the proposed ideas.
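For concreteness, the sketch below shows the standard top-down baseline the thesis improves upon: forecast the aggregate series, then split the forecast among items using historical-average proportions. The demand data are synthetic placeholders.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
skus = pd.DataFrame({f"sku_{i}": rng.poisson(lam, 104)
                     for i, lam in enumerate([20, 50, 80])})  # weekly demand
total = skus.sum(axis=1)

# Aggregate forecast: simple exponential smoothing, one step ahead
alpha, level = 0.3, total.iloc[0]
for x in total:
    level = alpha * x + (1 - alpha) * level
total_forecast = level

# Disaggregate with average historical proportions p_j = mean(y_j / total)
proportions = skus.div(total, axis=0).mean()    # sums to 1 across SKUs
sku_forecasts = proportions * total_forecast
print(sku_forecasts.round(1))
```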
139

RAMBLE: robust acoustic modeling for Brazilian learners of English / RAMBLE: modelagem acústica robusta para estudantes brasileiros de Inglês

Shulby, Christopher Dane 08 August 2018 (has links)
The gains made by current deep-learning techniques have often come with the price tag of big data, and where that data is not available, a new solution must be found. Such is the case for accented and noisy speech, where large databases do not exist and data augmentation techniques, which are less than perfect, present an even larger obstacle. Another problem is that state-of-the-art results are rarely reproducible because they use proprietary datasets, pretrained networks and/or weight initializations from other larger networks. An example of a low-resource scenario exists even in the fifth-largest country in the world, home to most of the speakers of the seventh most spoken language on earth. Brazil is the leader in the Latin American economy and, as a BRIC country, aspires to become an ever-stronger player in the global marketplace. Still, English proficiency is low, even for professionals in businesses and universities. Low intelligibility and strong accents can damage professional credibility. The foreign-language-teaching literature has established that it is important for adult learners to be made aware of their errors, as outlined by the Noticing Theory, which holds that a learner is more successful when he is able to learn from his own mistakes. An essential objective of this dissertation is to classify phonemes in the acoustic model, which is needed to properly identify phonemic errors automatically. A common belief in the community is that deep learning requires large datasets to be effective. This happens because brute-force methods create a highly complex hypothesis space, which requires large and complex networks, which in turn demand a great number of data samples in order to generate useful networks. Besides that, the loss functions used in neural learning do not provide statistical learning guarantees; they only guarantee that the network can memorize the training space well. In the case of accented or noisy speech, where a new sample can differ greatly from the training samples, the generalization of such models suffers. The main objective of this dissertation is to investigate how more robust acoustic generalizations can be made, even with little data and noisy accented-speech data. The approach here is to take advantage of the raw feature extraction provided by deep learning techniques and instead focus on how learning guarantees can be provided for small datasets to produce robust results for acoustic modeling, without the dependency on big data. This has been done by careful and intelligent parameter and architecture selection within the framework of statistical learning theory. Here, an intelligently defined CNN architecture, together with context windows and a knowledge-driven hierarchical tree of SVM classifiers, achieves nearly state-of-the-art frame-wise phoneme recognition results with absolutely no pretraining or external weight initialization. A goal of this thesis is to produce transparent and reproducible architectures with high frame-level accuracy, comparable to the state of the art. Additionally, a convergence analysis based on the learning guarantees of statistical learning theory is performed in order to evidence the generalization capacity of the model. The model achieves 39.7% error in frame-wise classification and a 43.5% phone error rate using deep feature extraction and SVM classification, even with little data (less than 7 hours). These results are comparable to studies which use well over ten times that amount of data. Beyond the intrinsic evaluation, the model also achieves an accuracy of 88% in the identification of epenthesis, the error which is most difficult for Brazilian speakers of English. This is a 69% relative gain over previous values in the literature. The results are significant because they show how deep feature extraction can be applied to little-data scenarios, contrary to popular belief. The extrinsic, task-based results also show how this approach could be useful in tasks like automatic error diagnosis. Another contribution is the publication of a number of freely available resources which previously did not exist, meant to aid future research in dataset creation.
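A minimal sketch of the hybrid pipeline follows: a small CNN maps a context window of spectral frames to a compact embedding, and an SVM classifies the center frame's phoneme. Synthetic features stand in for real speech, the CNN is left untrained here for brevity (the thesis trains it first), and all sizes are illustrative rather than the thesis's configuration.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.svm import SVC

n_frames, n_mels, context, n_phones = 2000, 40, 11, 10
rng = np.random.default_rng(0)
frames = rng.normal(size=(n_frames, n_mels)).astype("float32")
labels = rng.integers(0, n_phones, size=n_frames)

# Context windows: each sample is `context` consecutive frames (1 x 11 x 40)
half = context // 2
idx = np.arange(half, n_frames - half)
windows = np.stack([frames[i - half:i + half + 1] for i in idx])[:, None]
y = labels[idx]

# Deep feature extractor: conv -> pool -> linear embedding
cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(16 * (context // 2) * (n_mels // 2), 64),
)
with torch.no_grad():
    feats = cnn(torch.from_numpy(windows)).numpy()

# SVM on the extracted embeddings (one flat classifier here,
# not the thesis's knowledge-driven hierarchical tree)
split = len(feats) * 4 // 5
svm = SVC(kernel="rbf").fit(feats[:split], y[:split])
print("frame accuracy:", svm.score(feats[split:], y[split:]))  # chance on noise
```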
140

Évaluation de modèles computationnels de la vision humaine en imagerie par résonance magnétique fonctionnelle / Evaluating Computational Models of Vision with Functional Magnetic Resonance Imaging

Eickenberg, Michael 21 September 2015 (has links)
Blood-oxygen-level dependent (BOLD) functional magnetic resonance imaging (fMRI) makes it possible to measure brain activity through blood flow to areas with metabolically active neurons. In this thesis we use these measurements to evaluate the capacity of biologically inspired models of vision coming from computer vision to represent image content in a way similar to the human brain. The main vision models used are convolutional networks. Deep neural networks have made unprecedented progress in many fields in recent years. Even strongholds of biological systems, such as scene analysis and object detection, have been addressed with enormous success. A body of prior work established firm links between the first and last layers of deep convolutional nets and brain regions: the first layer and V1 essentially perform edge detection, and the last layer, like inferotemporal cortex, permits a linear read-out of object category. In this work we generalized this correspondence to all intermediate layers of a convolutional net. We found that each layer of a convnet maps to a stage of processing along the ventral stream, following the hierarchy of biological processing: along the ventral stream we observe a stage-by-stage increase in complexity. Between edge detection and object detection, for the first time we are given a toolbox to study the intermediate processing steps. A preliminary result was obtained by studying the response of the visual areas to the presentation of visual textures and analysing it using convolutional scattering networks. The other global aspect of this thesis is "decoding" models: in the preceding part, we predicted brain activity from the stimulus presented (this is called "encoding"). Predicting a stimulus from brain activity is the inverse inference mechanism and can be used as an omnibus test for the presence of this information in the brain signal. Most often, generalized linear models such as linear or logistic regression or SVMs are used for this task, giving access to a coefficient vector the same size as a brain sample, which can thus be visualized as a brain map. However, interpretation of these maps is difficult, because the underlying linear system is either ill-posed and ill-conditioned or non-adequately regularized, resulting in non-informative maps. Supposing a sparse and spatially contiguous organization of coefficient maps, we build on the convex penalty consisting of the sum of the total variation (TV) seminorm and the L1 norm ("TV+L1") to develop a penalty grouping an activation term with a spatial derivative. This penalty sets most coefficients to zero but permits free smooth variations in active zones, as opposed to TV+L1, which creates flat active zones. This method improves the interpretability of brain maps obtained through cross-validation to determine the best hyperparameter. In the context of encoding and decoding models, we also work on improving data preprocessing in order to obtain the best performance. We study the impulse response of the BOLD signal: the hemodynamic response function. To generate activation maps, instead of using a classical linear model with a fixed canonical response function, we use a bilinear model with a spatially variable hemodynamic response (but fixed across events). We propose an efficient optimization algorithm and show a gain in predictive capacity for encoding and decoding models on different datasets.
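As a sketch of the encoding-model logic described above, the following fits a ridge regression from each layer's features to simulated voxel responses and compares held-out R-squared per layer; all data and layer names are synthetic placeholders.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n_stim, n_vox = 300, 100
layer_feats = {"conv1": rng.normal(size=(n_stim, 50)),
               "conv5": rng.normal(size=(n_stim, 200))}

# Synthetic "brain": these voxels are driven by conv5-like features
B = rng.normal(size=(200, n_vox))
bold = layer_feats["conv5"] @ B + rng.normal(size=(n_stim, n_vox))

train, test = slice(0, 200), slice(200, 300)
for name, X in layer_feats.items():
    model = RidgeCV(alphas=np.logspace(-2, 4, 13)).fit(X[train], bold[train])
    score = r2_score(bold[test], model.predict(X[test]),
                     multioutput="uniform_average")
    print(f"{name}: mean held-out R^2 = {score:.2f}")   # conv5 should win here
```

Repeating this comparison voxel by voxel is what yields the layer-to-region correspondence maps: each voxel is assigned the layer whose features best predict it on held-out stimuli.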
