Global ETD Search

1	Ensembles des modeles en fMRI : l'apprentissage stable à grande échelle / Ensembles of models in fMRI : stable learning in large-scale settings
Hoyos-Idrobo, Andrés, 20 January 2017

In medical imaging, collaborative worldwide initiatives have begun the acquisition of hundreds of terabytes of data, in particular functional Magnetic Resonance Imaging (fMRI) data, that are made available to the scientific community. However, extracting useful information from this signal requires extensive preprocessing and noise reduction steps. The complexity of these analysis pipelines yields results that are highly dependent on the chosen parameters. The computation cost of this data deluge grows faster than linearly: as datasets no longer fit in cache, standard computational architectures cannot be used efficiently.

To reduce computation time, we considered dimensionality reduction by feature grouping, performed with clustering methods. We introduce a linear-time agglomerative clustering scheme, Recursive Nearest Agglomeration (ReNA). Unlike existing fast agglomerative schemes, it avoids the creation of giant clusters. We then show empirically that this clustering algorithm yields very fast and accurate models, making it possible to process large datasets with limited resources.

In neuroimaging, machine learning can be used to understand the cognitive organization of the brain. The idea is to build predictive models that identify the brain regions involved in the cognitive processing of an external stimulus. However, training such estimators is a high-dimensional problem, and one needs to impose a prior to find a suitable model.

To handle large datasets and increase the stability of results, we propose to use ensembles of models in combination with clustering. We study the empirical performance of this pipeline on a large number of brain imaging datasets. This method is highly parallelizable, has a lower computation time than state-of-the-art methods, and, as we show, requires fewer data samples to achieve better prediction accuracy. Finally, we show that ensembles of models improve the stability of the weight maps and reduce the variance of prediction accuracy.

Keywords: fMRI; clustering; dimensionality reduction; decoding
2	Réduction de dimension via Sliced Inverse Regression : Idées et nouvelles propositions / Dimension reduction via Sliced Inverse Regression : ideas and extensions
Chiancone, Alessandro, 28 October 2016

This thesis proposes three extensions of Sliced Inverse Regression (SIR): Collaborative SIR, Student SIR and Knockoff SIR.

One of the weak points of SIR is the impossibility of checking whether the Linearity Design Condition (LDC) holds. It is known that the condition holds if X follows an elliptic distribution; for a mixture of elliptic distributions there is no guarantee that the condition is satisfied globally, but it holds locally. Starting from this consideration, an extension is proposed. Given the predictor variable X, Collaborative SIR first performs a clustering. In each cluster, SIR is applied independently. The results from the individual components are combined to give the final solution.

The second contribution, Student SIR, comes from the need to robustify SIR. Since SIR is based on the estimation of the covariance and contains a PCA step, it is sensitive to noise. To extend SIR, an approach based on an inverse formulation of SIR proposed by R. D. Cook is used.

Finally, Knockoff SIR is an extension of SIR that performs variable selection and gives sparse solutions. It has its foundations in a paper by R. F. Barber and E. J. Candès that focuses on the false discovery rate in the regression framework. The underlying idea is to construct copies of the original variables that have certain properties. It is shown that SIR is robust to these copies, and a strategy is proposed to use this result for variable selection and to generate sparse solutions.

Keywords: Sliced Inverse Regression; dimension reduction; variable selection
510
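
As a rough illustration of the covariance estimation and PCA step mentioned in the abstract above, the following is a minimal sketch of plain SIR, not of the thesis's Collaborative, Student or Knockoff variants. The function name `sir_directions` and its parameters are hypothetical, and the slicing scheme (equal-size slices of the sorted response) is one common choice assumed here.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, n_directions=1):
    """Illustrative sketch of basic Sliced Inverse Regression.

    Returns a (p, n_directions) array whose columns are unit-norm
    estimates of the dimension-reduction directions.
    """
    n, p = X.shape
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)

    # Whiten the predictors: Z = (X - mean) @ cov^{-1/2}
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mean) @ inv_sqrt

    # Slice the observations by sorted response into near-equal groups
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)

    # Weighted covariance of the slice means of Z
    M = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)

    # PCA step: keep the leading eigenvectors of M
    _, v = np.linalg.eigh(M)            # eigenvalues in ascending order
    top = v[:, ::-1][:, :n_directions]  # reorder to descending, keep first

    # Map back to the original predictor scale and normalize columns
    B = inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)
```

On simulated data where the response depends on X only through one linear combination, the returned column should be close to that combination up to sign. Because the sketch relies on the sample covariance, it inherits the noise sensitivity the abstract describes.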
3	Bayesovská optimalizace / Bayesian optimization
Kostovčík, Peter, January 2017

Optimization is an important part of mathematics and is mostly used for practical applications. Many different methods exist for specific types of objective functions, but it can be difficult to determine which method to use when the objective is unknown and/or expensive to evaluate. One answer is Bayesian optimization, which, instead of optimizing the objective directly, builds a probabilistic model and uses it to construct an easily optimizable auxiliary function. It is an iterative method that uses information from previous iterations to choose the next point at which the objective is evaluated, and it tries to find the optimum in fewer iterations. This thesis introduces Bayesian optimization, summarizes its different approaches in lower and higher dimensions, and shows when it is suitable to use. An important part of the thesis is my own optimization algorithm, which is applied to different practical problems, e.g. parameter optimization in a machine learning algorithm.
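
The loop described in the abstract above (fit a probabilistic model, optimize an auxiliary function, evaluate the objective at the chosen point, repeat) can be sketched as follows. This is not the thesis's algorithm: it assumes a zero-mean Gaussian process with an RBF kernel as the model, expected improvement as the auxiliary function, a one-dimensional search on a fixed grid over [0, 1], and minimization. All function names and parameter values are hypothetical.

```python
import math
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    # Squared-exponential kernel with unit prior variance
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    # Posterior mean and standard deviation of a zero-mean GP
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    solved = np.linalg.solve(K, K_s)            # K^{-1} K_s
    mu = solved.T @ y_train
    var = 1.0 - np.sum(K_s * solved, axis=0)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    # Expected improvement over the best value seen so far (minimization)
    z = (best - mu) / sigma
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mu) * cdf + sigma * pdf

def bayes_opt(objective, n_init=3, n_iter=15, seed=0):
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 201)
    # Start from a few randomly chosen grid points
    x = rng.choice(grid, size=n_init, replace=False)
    y = np.array([objective(v) for v in x])
    for _ in range(n_iter):
        mu, sigma = gp_posterior(x, y, grid)
        ei = expected_improvement(mu, sigma, y.min())
        x_next = grid[np.argmax(ei)]            # cheap to optimize on a grid
        x = np.append(x, x_next)
        y = np.append(y, objective(x_next))
    return x[np.argmin(y)], y.min()
```

A call such as `bayes_opt(lambda v: (v - 0.3) ** 2)` evaluates the objective only 18 times in total. The grid search over the acquisition function is a simplification that would not scale to the higher-dimensional settings the abstract mentions.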
4	Zwangsmobilität und Verkehrsmittelorientierung junger Erwachsener / Forced mobility and orientation towards transport modes of young adults: Creation of a typology
Wittwer, Rico, 23 January 2015

Over the last few decades of mobility research, a wide base of knowledge for understanding travel determinants and causal relationships in mobility behavior has been established. The development of travel models was at first of interest primarily to economists and econometricians as well as transportation engineers. They were soon joined by other scientific areas such as psychology or the geosciences, which as a result increasingly addressed the theme of mobility and used quite different methodologies and criteria for explaining human behavior. Today, activity-oriented approaches generally attempt to determine individual-level factors that provide information on behavioral variability within the population, thereby contributing as much as possible to explaining variance. If explanatory factors can be properly identified and quantified, then deficiencies and opportunities can be recognized and measures for influencing behavior can be designed. With their help, undesirable developments can be counteracted.

Because of their highly differing stages in life, e.g. upcoming or recently completed education, moving into their own apartment, starting a family, becoming oriented in a work routine or adapting to a new environment in a different city, young adults are intuitively a very heterogeneous group. Modeling the behavior of this age group is particularly difficult. The complexity of this problem makes it clear that well-founded analyses of the mobility of young adults are necessary in order to recognize deficiencies and opportunities in transportation planning.

The methodological focus of this work is on creating a typology of young adults’ travel behavior. The base data is from the “Deutsches Mobilitätspanel – MOP” (German Mobility Panel). An attempt is made to gather variables for all relevant dimensions of decision-oriented, activity-based travel behavior and to prepare them for a corresponding analysis. Afterward, appropriate and proven methods from the social sciences are used to measure similarity in order to identify groups of persons that are as behaviorally homogeneous as possible. In addition, confirmatory data analysis is used to explain determinants of behavior and test them through inferential statistics. The resulting typology from the cluster analysis is presented, followed by a description using sociodemographic indicators and spatial criteria of accessibility. The findings make it possible to use objective and, ideally, quantifiable and therefore forecastable characteristics for identifying sociological population groups within which similar travel behavior is displayed.

Keywords: young adults; sociology of transport; mobility; travel behavior; forced mobility; choice of transport modes; typology; classification; exploratory data analysis; confirmatory data analysis; reduction of dimension; factor analysis; cluster analysis; discriminant analysis; logistic regression; homogeneous groups of persons
ddc:620 rvk:ZO 3300 rvk:QR 800
5	Zwangsmobilität und Verkehrsmittelorientierung junger Erwachsener: Eine Typologisierung
Wittwer, Rico, 12 December 2014

Over the last few decades of mobility research, a wide base of knowledge for understanding travel determinants and causal relationships in mobility behavior has been established. The development of travel models was at first of interest primarily to economists and econometricians as well as transportation engineers. They were soon joined by other scientific areas such as psychology or the geosciences, which as a result increasingly addressed the theme of mobility and used quite different methodologies and criteria for explaining human behavior. Today, activity-oriented approaches generally attempt to determine individual-level factors that provide information on behavioral variability within the population, thereby contributing as much as possible to explaining variance. If explanatory factors can be properly identified and quantified, then deficiencies and opportunities can be recognized and measures for influencing behavior can be designed. With their help, undesirable developments can be counteracted.

Because of their highly differing stages in life, e.g. upcoming or recently completed education, moving into their own apartment, starting a family, becoming oriented in a work routine or adapting to a new environment in a different city, young adults are intuitively a very heterogeneous group. Modeling the behavior of this age group is particularly difficult. The complexity of this problem makes it clear that well-founded analyses of the mobility of young adults are necessary in order to recognize deficiencies and opportunities in transportation planning.

The methodological focus of this work is on creating a typology of young adults’ travel behavior. The base data is from the “Deutsches Mobilitätspanel – MOP” (German Mobility Panel). An attempt is made to gather variables for all relevant dimensions of decision-oriented, activity-based travel behavior and to prepare them for a corresponding analysis. Afterward, appropriate and proven methods from the social sciences are used to measure similarity in order to identify groups of persons that are as behaviorally homogeneous as possible. In addition, confirmatory data analysis is used to explain determinants of behavior and test them through inferential statistics. The resulting typology from the cluster analysis is presented, followed by a description using sociodemographic indicators and spatial criteria of accessibility. The findings make it possible to use objective and, ideally, quantifiable and therefore forecastable characteristics for identifying sociological population groups within which similar travel behavior is displayed.

info:eu-repo/classification/ddc/620 ddc:620
