Global ETD Search

1	Analyse de données de cytometrie de flux pour un grand nombre d'échantillons / Automated flow cytometric analysis across a large number of samples
Chen, Xiaoyi, 06 October 2015
In the course of my Ph.D. work, I developed and applied two new computational approaches for the automatic identification of cell populations in multi-parameter flow cytometry across a large number of samples, each sample being drawn from a single donor. Both approaches were motivated by, and carried out within, the LabEX "Milieu Intérieur" study (hereafter the MI study). In this project, ten 8-color flow cytometry panels were standardized for assessment of the major and minor cell populations present in peripheral whole blood, and data were collected and analyzed from a cohort of 1,000 healthy donors.
First, we aim at a robust characterization of the major cellular components of the immune system. We report a computational pipeline, called FlowGM, which minimizes operator input, is insensitive to compensation settings, and can be adapted to different analytic panels. A Gaussian Mixture Model (GMM)-based approach was used for initial clustering, with the number of clusters determined using the Bayesian Information Criterion (BIC). Meta-clustering in a reference donor, by which we mean labeling clusters and merging those with the same label in a pre-selected representative donor, permitted automated identification of 24 cell populations across four panels. Cluster labels were then integrated into Flow Cytometry Standard (FCS) files, thus permitting comparison with manual analysis by human experts.
We show that cell numbers and coefficients of variation (CV) are similar between FlowGM and conventional manual analysis of lymphocyte populations, but notably FlowGM provided improved discrimination of "hard-to-gate" monocyte and dendritic cell (DC) subsets. FlowGM thus provides rapid, high-dimensional analysis of cell phenotypes and is amenable to cohort studies.
Once cell counts are available across a large number of cohort donors, further analyses (for example, of agreement with other methods and of age and gender effects) are naturally required for comprehensive evaluation, diagnosis and discovery. In the context of the MI project, the 1,000 healthy donors were stratified by gender (50% women and 50% men) and age (20-69 years of age). Analysis was streamlined using our established FlowGM approach, and the results were highly concordant with manual gating, the state-of-the-art gold standard. More importantly, FlowGM achieved greater precision for the CD16+ monocyte and cDC1 populations: CD14loCD16hi monocytes and HLADRhi cDC1 cells were consistently identified. We demonstrate that the counts of these two populations show a significant correlation with age. For the cell populations that are well known to be related to age, a multiple linear regression model was considered, and our results yielded a higher regression coefficient. These findings establish a strong foundation for comprehensive evaluation of our previous work.
When extending the FlowGM method to the detailed characterization of certain subpopulations that vary more across a large number of samples, for example the T cells, we find that the conventional EM algorithm initialized with the reference clustering is insufficient to guarantee the alignment of clusters between all samples, owing to technical and biological variation. We therefore improved FlowGM and present the FlowGMP pipeline to address this specific panel.
We introduce a Bayesian mixture model by assuming a prior distribution on the component parameters and derive a penalized EM algorithm. Finally, an evaluation of FlowGMP on this difficult T cell panel, comparing automated and manual analysis, shows that our method provides reliable and efficient identification of eleven T cell subpopulations across a large number of samples.
Keywords: Flow cytometry; Multiparametric data analysis; High-dimensional data analysis; Clustering; Cohort data; Mixture model
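The abstract above mentions fitting a Gaussian mixture and choosing the number of clusters with the Bayesian Information Criterion. As a rough illustration only, the sketch below does this for one-dimensional data in plain Python. It is not the FlowGM implementation: the thesis works on multi-marker cytometry data, and the function names, the quantile-based initialization, the variance floor and the iteration count are all assumptions made for this example.

```python
import math
import random


def log_normal_pdf(x, mean, var):
    """Log density of a 1-D normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def log_sum_exp(terms):
    """Numerically stable log(sum(exp(t))) over a list of log values."""
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def fit_gmm_1d(xs, k, n_iter=100, var_floor=1e-3):
    """Fit a k-component 1-D Gaussian mixture by EM (illustrative sketch).

    Returns (weights, means, variances, log_likelihood).
    """
    n = len(xs)
    ordered = sorted(xs)
    # Assumed deterministic start: means at evenly spaced quantiles.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(xs) / n
    start_var = max(sum((x - centre) ** 2 for x in xs) / n, var_floor)
    variances = [start_var] * k
    weights = [1.0 / k] * k

    def point_log_terms(x):
        # log(weight_j) + log N(x | mean_j, var_j) for every component j
        return [math.log(max(w, 1e-300)) + log_normal_pdf(x, m, v)
                for w, m, v in zip(weights, means, variances)]

    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            terms = point_log_terms(x)
            norm = log_sum_exp(terms)
            resp.append([math.exp(t - norm) for t in terms])
        # M-step: weighted updates; the floor avoids collapsed components.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs))
            variances[j] = max(spread / nj, var_floor)
            weights[j] = nj / n

    log_lik = sum(log_sum_exp(point_log_terms(x)) for x in xs)
    return weights, means, variances, log_lik


def bic(log_lik, k, n):
    """BIC for a 1-D mixture: k means, k variances, k - 1 free weights."""
    n_params = 3 * k - 1
    return -2.0 * log_lik + n_params * math.log(n)


def select_k_by_bic(xs, k_values):
    """Fit one mixture per candidate k and return (best_k, {k: BIC})."""
    scores = {}
    for k in k_values:
        log_lik = fit_gmm_1d(xs, k)[3]
        scores[k] = bic(log_lik, k, len(xs))
    best_k = min(scores, key=scores.get)
    return best_k, scores


if __name__ == "__main__":
    rng = random.Random(0)  # synthetic data, not cytometry measurements
    data = ([rng.gauss(0.0, 1.0) for _ in range(100)]
            + [rng.gauss(10.0, 1.0) for _ in range(100)])
    print(select_k_by_bic(data, [1, 2, 3]))
```

A lower BIC is better, so the candidate with the smallest score is kept. The penalized EM of FlowGMP would add a prior term to the M-step updates; that step is not shown here because the abstract does not specify the prior.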
2	Learner mobility in Johannesburg-Soweto, South Africa : dimensions and determinants.
De Kadt, Julia Ruth, 07 March 2012
Many South African school children are known to travel fairly long distances to school each day, in pursuit of the best possible educational opportunities in a schooling system that is known to vary greatly in quality. This thesis documents the dimensions and determinants of the daily, education-related travel of primary school aged children in Johannesburg-Soweto, South Africa. It uses data on a sample of 1428 children drawn from the Birth to Twenty cohort study to provide the first population-based data on the extent of learner mobility in contemporary urban South Africa. Learner mobility is measured in three different ways: firstly, by the straight-line distance between a child's home and his or her school; secondly, by whether the child's school falls into the same geographical area as his or her home; and thirdly, by whether the child attends his or her nearest grade-appropriate school. The thesis provides clear evidence for extensive mobility using all three of these approaches to measurement. Over 25% of children were found to be travelling more than 5 km each way between home and school on a daily basis. Almost 60% of children attended a school outside of the Census 2001 Sub-Place (roughly equivalent to a suburb) in which they lived, and fewer than 20% of children attended the grade-appropriate school nearest to their home. Counter to expectations, these figures were fairly stable over time, suggesting that educational mobility does not increase substantially as children age or transition to high school. Mobile children attended significantly better-resourced and better-performing schools than their non-mobile peers, and the quality of schools attended increased with distance travelled.
This substantiates the assumption that children and families make use of educational mobility to improve the quality of education that they are able to access. The analyses presented in the thesis suggest that two distinct patterns of mobility, with different determinants, are in use in the Johannesburg-Soweto area. The first relates primarily to travel from townships to historically advantaged schools in suburban Johannesburg, and typically requires substantial economic investment and extensive parental involvement. The second form of mobility operates at a more local level, and relates to children and families making choices between a number of relatively local schools. This form of mobility is less resource intensive. Children engaging in the first form of mobility were more likely to attend a particularly advantaged school, and to have a well-educated mother. By contrast, children engaged in the second form of mobility were more likely to live in a disadvantaged area, and come from households with moderate SES levels.
The findings of this thesis provide important insights into the nature of school choice in South Africa, which have implications for educational policy, and the understanding of the nature of urban poverty as experienced by South African children. They also contribute to the international school choice literature, by providing novel information about the implications of relatively unregulated school choice for educational inequality and segregation in the South African context.
Keywords: Birth to Twenty; Cohort data; Johannesburg; Learner migration; Learner mobility; Primary school; Quantitative analysis; School choice; Travel to school; Secondary analysis; South Africa; Soweto
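The first mobility measure in this abstract is the straight-line distance between home and school, with 5 km used as a reporting threshold. The abstract does not say how that distance was computed, so the sketch below assumes the haversine great-circle formula on latitude and longitude in degrees. The function names, the Earth radius constant and the example coordinates are hypothetical and not taken from the thesis.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumed constant


def straight_line_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in km between two points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2.0) ** 2)
    # min() guards against rounding pushing sqrt(a) slightly above 1.
    return 2.0 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def travels_more_than(home, school, threshold_km=5.0):
    """True if the home-to-school distance exceeds threshold_km.

    home and school are (latitude, longitude) tuples in degrees.
    """
    return straight_line_km(home[0], home[1], school[0], school[1]) > threshold_km


if __name__ == "__main__":
    # Hypothetical coordinates, used only to exercise the functions.
    home = (-26.25, 27.85)
    school = (-26.20, 28.00)
    print(round(straight_line_km(*home, *school), 2), travels_more_than(home, school))
```

Over distances of a few kilometres, a planar approximation on projected coordinates would give nearly the same answer; haversine is used here only because it needs no projection step.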
