1

[pt] ENSAIOS SOBRE NOWCASTING COM DADOS EM ALTA DIMENSÃO / [en] ESSAYS ON NOWCASTING WITH HIGH DIMENSIONAL DATA

HENRIQUE FERNANDES PIRES 02 June 2022
In economics, nowcasting is the prediction of the present, the recent past, or the very near future of a given indicator. A nowcast model is generally useful when the value of a target variable is released with a significant delay relative to its reference period, and/or when its initial release is notably revised over time and stabilizes only after a while. In this thesis, we develop and analyze several nowcasting methods using high-dimensional (big) data in different contexts, from forecasting economic series to nowcasting COVID-19 deaths.
In one of our studies, we compare the performance of different machine learning algorithms with more naive models in predicting many economic variables in real time, and we show that, most of the time, machine learning beats the benchmark models. In the rest of our exercises, we combine several nowcasting techniques with a large dataset (including high-frequency variables such as Google Trends) to track the pandemic in Brazil, showing that we were able to nowcast the true numbers of deaths and cases well before they became officially available to everyone.
2

Analyse intégrative de données de grande dimension appliquée à la recherche vaccinale / Integrative analysis of high-dimensional data applied to vaccine research

Hejblum, Boris 06 March 2015
Gene expression data is recognized as high-dimensional data that requires specific statistical tools for its analysis. In the context of vaccine trials, however, other measurements, such as flow-cytometry measurements, are also high-dimensional. In addition, such measurements are often repeated over time. This work is built on the idea that using as much of the available information as possible, by modeling prior knowledge and integrating all the data at hand, improves inference and the interpretation of biological results from high-dimensional data. First, we present an original methodological development, Time-course Gene Set Analysis (TcGSA), for the analysis of longitudinal gene expression data, taking into account prior biological knowledge in the form of predefined gene sets. Second, we describe two integrative analyses of two different vaccine studies. The first study reveals lower expression of inflammatory pathways consistently associated with lower viral rebound following an HIV therapeutic vaccine. The second study highlights the role of a testosterone-mediated group of genes linked to lipid metabolism in sex differences in the immunological response to a flu vaccine. Finally, we introduce a new model-based clustering approach for the automated identification of cell populations from flow-cytometry data, namely a Dirichlet process mixture of skew t-distributions, with a sequential posterior approximation strategy for dealing with repeated measurements. The automatic recognition of cell populations could thus bring a practical improvement to the daily work of immunologists, as well as a better interpretation of gene expression data once the frequency of all cell populations is taken into account.
