81

Phenotyping of chronic respiratory diseases in the South of Vietnam

Chu Thi, Ha 25 June 2019 (has links) (PDF)
Chronic respiratory diseases (CRDs) include chronic diseases involving the airways and other structures of the lung. In Vietnam today, people are exposed to numerous risk factors for CRD, such as heavy smoking, a high frequency of pulmonary tuberculosis, chronic helminthiasis, allergic factors, migration and urbanization (the last associated with traffic-related pollution). Phenotype diagnosis should take into account each individual's risk factors in addition to the clinical features, while differential diagnosis mostly depends on the techniques available in each healthcare center. Our aim was to improve the differential diagnosis of the 3 most frequent CRDs in Vietnam: chronic obstructive pulmonary disease (COPD), asthma and COPD-asthma overlap syndrome (ACOS). In the first part, we evaluated the prevalence of allergen sensitization among patients with CRD with regard to urban and rural areas in the South of Vietnam. House dust mites and cockroach droppings were the most frequent sensitizers. Compared with participants born in an urban setting, those born in a rural environment were less frequently sensitized, and this protective effect disappeared in the case of migration from rural to urban areas. In the second part, we evaluated the skin prick test as a method to screen for dust mite sensitization in CRD in southern Vietnam. The data suggested that, in the present circumstances, the skin prick test can be used to screen for mite sensitization. In the third part, we evaluated the risk of mite sensitization in the native and migrant populations with regard to several environmental factors. Consistent with the hygiene hypothesis, exposure to the high endotoxin concentrations of rural areas, compared with urban areas, was a protective factor against allergic sensitization. We reported for the first time that this effect was reversible among migrants from rural to urban settings, in association with lower endotoxin exposure.
In the fourth part, we defined asthma, COPD and ACOS based on clinical symptoms, cumulative smoking and airway expiratory flow with reversibility, on the one hand, and examined the age-related prevalence of the different phenotypes, on the other. We hypothesized that cumulative exposure to noxious particles should increase the age-related prevalence of COPD, that the prevalence of IgE-mediated asthma should decrease with age because of immunosenescence, and that ACOS prevalence should be unrelated to age because both mechanisms combine. In conclusion, we showed in the South of Vietnam that: 1) mite and cockroach allergens were the most frequent sensitizers in chronic respiratory diseases; 2) the skin prick test for mites was validated as a screen for mite sensitization; 3) migration from a rural to an urban setting, associated with a reduced endotoxin level, was a risk factor for mite sensitization in chronic respiratory diseases; 4) based on clinical symptoms, spirometric values and cumulative smoking, diagnoses of asthma, COPD and ACOS were made, with prevalences of 25%, 42% and 33%, respectively. / Doctorat en Sciences biomédicales et pharmaceutiques (Médecine) / info:eu-repo/semantics/nonPublished
82

Phenomics enabled genetic dissection of complex traits in wheat breeding

Singh, Daljit January 1900 (has links)
Doctor of Philosophy / Genetics Interdepartmental Program / Jesse A. Poland / A central question in modern biology is to understand the genotype-to-phenotype (G2P) link, that is, how the genetics of an organism results in specific characteristics. However, predicting phenotypes from genotypes is a difficult problem because of the complex nature of genomes, the environment, and their interactions. While recent advances in genome sequencing technologies have provided almost unlimited access to high-density genetic markers, large-scale, rapid and accurate phenotyping of complex plant traits remains a major bottleneck. Here, we demonstrate field-based approaches to complex trait assessment using a commercially available lightweight unmanned aerial system (UAS). By deploying novel data acquisition and processing pipelines, we quantified lodging, ground cover, and crop growth rate of 1745 advanced spring wheat lines at multiple time points over the course of three field seasons at three field sites in South Asia. High correlations of digital measures with visual estimates, together with superior broad-sense heritability, demonstrate that these approaches are amenable to reproducible assessment of complex plant traits in large breeding nurseries. Using these validated high-throughput measurements, we applied genome-wide association and prediction models to assess the underlying genetic architecture and genetic control. Our results suggest a diffuse genetic architecture for lodging and ground cover in wheat, but heritable genetic variation for prediction and selection in breeding programs. The logistic regression-derived parameters of dynamic plant height exhibited strong physiological linkages with several developmental and agronomic traits, suggesting potential targets of selection and the associated tradeoffs.
Taken together, our highly reproducible approaches provide a proof-of-concept application of UAS-based phenomics that is scalable to tens of thousands of plots in breeding and genetic studies, as will be needed to understand the G2P link and increase the rate of gain for complex traits in crop breeding.
83

ESTIMATING PLANT PHENOTYPIC TRAITS FROM RGB IMAGERY

Yuhao Chen (7870844) 20 November 2019 (has links)
Plant phenotyping is a set of methodologies for measuring and analyzing characteristic traits of a plant. While traditional plant phenotyping techniques are labor-intensive and destructive, modern imaging technologies have provided faster, non-invasive, and more cost-effective capabilities for plant phenotyping. Among different image-based phenotyping platforms, I focus on phenotyping with image data captured by Unmanned Aerial Vehicle (UAV) and ground vehicles. The crop plant used in my study is sorghum [Sorghum bicolor (L.) Moench]. In this thesis, I present multiple methods to estimate plot-level and plant-level plant traits from data collected by various platforms, including UAV and ground vehicles. I propose an image plant phenotyping system that provides end-to-end RGB data analysis for plant scientists. I describe a plant segmentation method using HSV color information. I introduce two methods to locate the center of the plants using Multiple Instance Learning (MIL) and Convolutional Neural Networks (CNN). I present three methods to segment individual leaves by shape-based approaches in both Cartesian coordinates and Polar coordinates. I propose a method to estimate leaf length and width for overhead leaf images. I describe a method to estimate leaf angle from data collected by a modified wheel-based sprayer with a sensor boom vehicle, Phenorover. Methods are tested and verified on image data collected by UAV and ground vehicle platforms in sorghum fields in West Lafayette, Indiana, USA. Estimated phenotypic traits include plant locations, the number of plants per plot, leaf area, canopy cover, Leaf Area Index (LAI), leaf count, leaf angle, leaf length, and leaf width.
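The abstract above mentions plant segmentation from HSV color information and canopy cover as an estimated trait. As a rough illustration of that general idea only, the following sketch thresholds pixels in HSV space using Python's standard `colorsys` module. The function names, the image representation (nested lists of RGB tuples) and every threshold value are assumptions made for this example; the thesis's actual method and parameters are not given in the abstract.

```python
import colorsys


def segment_plant_pixels(image, hue_range=(0.17, 0.45),
                         min_saturation=0.25, min_value=0.15):
    """Return a boolean mask marking green-ish pixels.

    `image` is a list of rows; each pixel is an (r, g, b) tuple of 0-255 ints.
    The hue/saturation/value thresholds are illustrative assumptions,
    not values taken from the thesis.
    """
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            # colorsys expects channel values in [0, 1] and returns h, s, v in [0, 1].
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            is_plant = (hue_range[0] <= h <= hue_range[1]
                        and s >= min_saturation
                        and v >= min_value)
            mask_row.append(is_plant)
        mask.append(mask_row)
    return mask


def canopy_cover(mask):
    """Fraction of pixels flagged as plant (0.0 for an empty mask)."""
    total = sum(len(row) for row in mask)
    plant = sum(sum(row) for row in mask)
    return plant / total if total else 0.0


if __name__ == "__main__":
    # Tiny hypothetical 2x2 image: two green pixels, one brown, one grey.
    tiny_image = [
        [(30, 160, 40), (120, 90, 60)],
        [(200, 200, 200), (20, 120, 30)],
    ]
    mask = segment_plant_pixels(tiny_image)
    print(mask, canopy_cover(mask))
```

Thresholding in HSV rather than RGB is a common choice because hue is less sensitive to illumination changes than the raw color channels; in practice the thresholds would need tuning per camera and field condition.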
84

Group-wise 3D MR Image Registration of Mouse Embryos

Zamyadi, Mojdeh 15 March 2010 (has links)
This dissertation provides the foundations of computer-based automated phenotyping methods for analyzing 3D images of mouse embryos. A group-wise registration technique was utilized and optimized and computerized methods were employed for analysis of 3D MRI images of mouse embryos. The assumption that embryo anatomy is highly conserved among genetically identical specimens was verified. The group-wise registration approach was used to align a group of embryos from the 129S1/SvImJ (129Sv) strain as well as a group of C57BL/6J (C57) embryos. Finally, we shed some light on some of the morphological differences between the 129Sv and C57 strains using automated techniques.
86

Monogenic Traits Associated with Structural Variants in Chicken and Horse : Allelic and Phenotypic Diversity of Visually Appealing Traits

Imsland, Freyja January 2015 (has links)
Domestic animals have rich phenotypic diversity that can be explored to advance our understanding of the relationship between molecular genetics and phenotypic variation. Since the advent of second generation sequencing, it has become easier to identify structural variants and associate them with phenotypic outcomes. This thesis details studies on three such variants associated with monogenic traits. The first studies on Rose-comb in the chicken were published over a century ago, seminally describing Mendelian inheritance and epistatic interaction in animals. Homozygosity for the otherwise dominant Rose-comb allele was later associated with reduced rooster fertility. We show that a 7.38 Mb inversion is causal for Rose-comb, and that two alleles exist for Rose-comb, R1 and R2. A novel genomic context for the gene MNR2 is causative for the comb phenotype, and the bisection of the gene CCDC108 is associated with fertility issues. The recombined R2 allele has intact CCDC108, and normal fertility. The dominant phenotype Greying with Age in horses was previously associated with an intronic duplication in STX17. By utilising second generation sequencing we have examined the genomic region surrounding the duplication in detail, and excluded all other discovered variants as causative for Grey. Dun is the ancestral coat colour of equids, where the individual is mostly pale in colour, but carries intensely pigmented primitive markings, most notably a dorsal stripe. Dun is a dominant trait, and yet most domestic horses are non-dun in colour and intensely pigmented. We show that Dun colour is established by radially asymmetric expression of the transcription factor TBX3 in hair follicles. This results in a microscopic spotting phenotype on the level of the individual hair, giving the impression of pigment dilution. Non-dun colour is caused by two different alleles, non-dun1 and non-dun2, both of which disrupt the TBX3-mediated regulation of pigmentation. 
Non-dun1 is associated with a SNP variant 5 kb downstream of TBX3, and non-dun2 with a 1.6 kb deletion that overlaps the non-dun1 SNP. Homozygotes for non-dun2 show a more intensely pigmented appearance than horses with one or two non-dun1 alleles. We have also shown by genotyping of ancient DNA that non-dun1 predates domestication.
87

Couplage entre modélisation opto-physique des scènes de végétation complexes et chimiométrie : application au phénotypage par imagerie hyperspectrale de proximité / Coupling between opto-physical modeling of complex vegetation scenes and chemometry : application to phenotyping by short range hyperspectral imaging

Makdessi, Nathalie al 16 November 2017 (has links)
Short-range hyperspectral imaging is a promising tool for plant phenotyping and vegetation monitoring. Combined with partial least squares regression (PLS-R), it allows high-spatial-resolution mapping of plant chemical content at the canopy scale. However, several optical phenomena must be taken into account when applying this approach to vegetation scenes under natural conditions. In particular, the additive and multiplicative factors due respectively to specular reflection and leaf inclination can be overcome by spectral preprocessing. The most challenging phenomenon is multiple scattering. It occurs when a leaf is lit partly by direct light and partly by light reflected or transmitted by neighboring leaves, which induces strong nonlinear effects in its apparent reflectance spectrum. Although this effect can be taken into account in some canopy-scale remote sensing models, no study has so far addressed its impact on the spectral prediction of vegetation chemical content by short-range imaging. The objective of this PhD work was to analyze these effects in the context of hyperspectral imaging for plant phenotyping and to propose chemometric methods to overcome them.
The methodological development was based on simulation tools included in the open source platform OpenAlea (http://openalea.gforge.inria.fr/dokuwiki/doku.php). A typical wheat canopy scene was modeled using Adel-Wheat and combined with the light propagation model Caribu. The proposed tool simulates the apparent reflectance of every visible leaf in the canopy for a given actual reflectance and transmittance, which makes it possible to synthesize realistic hyperspectral images.
This simulation approach allowed us first to analyze the distribution, in spectral space, of the deviations caused by multiple scattering, and then to derive a correction method applicable to PLS regression. The method relies on building two subspaces, EW and EB, generated respectively by the analytic formulation of multiple scattering and by the variable of interest. This lets us define a projection onto EB along the direction EW (an oblique projection) that removes multiple scattering effects while preserving useful information. The projection is then applied to every spectrum, both when training the PLS model and when using it.
The method was first developed and tuned on simulated data, in the context of predicting the leaf nitrogen content (LNC) of wheat leaves. For this purpose, reflectance spectra (450-1100 nm) of 57 wheat leaves were collected using an ASD field spectrometer (FieldSpec®, Analytical Spectral Devices, Inc., Boulder, Colorado, USA), while their LNC was measured through reference chemical analyses. Regression models with and without oblique projection were then built from the ASD spectra and applied to the simulated data. The model with oblique projection gave excellent results (R² = 0.931; RMSEP = 0.29% DM) compared to the classical one (R² = 0.915; RMSEP = 0.42% DM).
The same method was then applied under real conditions to wheat plants grown in pots and in the field. For this purpose, leaves were collected and imaged flat on a black paper background to build PLS models, which were then applied to in-situ plants. These experiments confirmed that classical PLS-R leads to a strong overestimation of LNC on leaves surrounded by other leaves, and that oblique projection corrects this overestimation (the prediction is the same for a leaf when surrounded and then isolated).
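The abstract above does not give the formula for its oblique projection. For orientation only, one standard textbook construction is sketched here, under the assumption that the two subspaces are spanned by the columns of two matrices B (useful information) and W (multiple scattering) and intersect only at zero; the symbols B, W, P and x are notation chosen for this sketch, not taken from the thesis.

```latex
% Orthogonal projector onto the complement of span(W):
P_W^{\perp} = I - W \left( W^{\top} W \right)^{-1} W^{\top}

% Oblique projector onto span(B) along span(W):
P_{B \mid W} = B \left( B^{\top} P_W^{\perp} B \right)^{-1} B^{\top} P_W^{\perp}

% Defining properties: the useful subspace is kept, the nuisance subspace is removed.
P_{B \mid W}\, B = B, \qquad P_{B \mid W}\, W = 0

% Correction applied to a spectrum x (column vector) before PLS training or prediction:
x_{\mathrm{corr}} = P_{B \mid W}\, x
```

Unlike an orthogonal projection, this operator removes the W component without distorting the part of the signal that lies in span(B), which matches the abstract's stated goal of removing multiple scattering effects while preserving useful information.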
88

An In-vivo Analysis of SLMAP Function in the Postnatal Mouse Myocardium

Rehmani, Taha January 2017 (has links)
SLMAP is a tail-anchored membrane protein that is alternatively spliced to generate three isoforms, SLMAP1, SLMAP2 and SLMAP3. Previous studies in our lab have shown that postnatal cardiac-specific overexpression of SLMAP1 results in intracellular vesicle expansion and enhanced endosomal recycling. I generated a postnatal cardiac-specific knockout model using the Cre-Lox system to nullify all three SLMAP isoforms and further evaluate their role in the mouse myocardium. SLMAP knockdown and knockout mouse hearts were analyzed with western blotting and qPCR. I found that only SLMAP3 was nullified, and phenotypic evaluation through echocardiography indicated that young and old SLMAP3 knockout animals showed no remarkable changes in cardiac function. Furthermore, challenge with the stressor isoproterenol elicited a similar response in wildtype and knockout mice in terms of cardiac structure and function. Surprisingly, the level of expression of SLMAP1 and SLMAP2 was maintained in the myocardium of SLMAP3-deficient mice. Interestingly, the machinery involved in endosomal recycling was not impacted by the loss of SLMAP3. These data indicate that loss of SLMAP3 does not alter cardiac structure and function in the postnatal myocardium in the presence of SLMAP1 and SLMAP2.
89

Étude des stratégies de mouvement chez les parasitoïdes du genre Trichogramma : apports des techniques d’analyse d’images automatiques / Movement strategies of parasitoids from the genus Trichogramma : using automated image analysis methods

Burte, Victor 14 December 2018 (has links)
Parasitoids of the genus Trichogramma are oophagous micro-hymenoptera widely used as biological control agents. My thesis concerns the phenotypic characterization of the movement strategies of these agents, specifically the movements involved in exploring space and searching for host eggs. These phenotypes are of great importance in the life cycle of Trichogramma and are also traits of interest for evaluating their effectiveness in biological control programs. Because Trichogramma are very small (less than 0.5 mm) and difficult to observe, the study of their movement can take advantage of technological advances in automatic image acquisition and analysis. This is the strategy I followed, combining a methodological development component with an experimental component. In the first, methodological part, I present three main types of image analysis methods that I used and helped to develop during my thesis. In the second part, I present three applications of these methods to the study of movement in Trichogramma.
First, we characterized in the laboratory the orientation preferences (phototaxis, geotaxis and their interaction) during egg laying in 22 Trichogramma strains belonging to 6 species. Because this type of study requires counting a very large number of eggs (healthy and parasitized), a new dedicated tool was developed in the form of an ImageJ/FIJI plugin made available to the community. This flexible plugin automates the tasks of counting eggs and evaluating parasitism rates and makes them more productive, which makes larger screenings possible. Great variability was found within the genus, including between strains of the same species. This suggests that, depending on the plant layer to be protected (herbaceous, shrub, tree), it would be possible to select Trichogramma strains to optimize their exploitation of the targeted area.
Second, we characterized the exploration strategies (velocities, trajectories, ...) of a set of 22 strains from 7 Trichogramma species to look for traits specific to each strain or species. To do so, I implemented a method for tracking groups of Trichogramma in video recordings over short time scales, using the Ctrax software and R scripts. The aim was to develop a protocol for high-throughput characterization of the movement of Trichogramma strains and to study the variability of these traits within the genus.
Finally, we studied the spatial propagation dynamics of groups of Trichogramma of the species T. cacoeciae, developing an innovative experimental device that covers scales of time and space greater than those usually imposed by laboratory constraints. Through the use of very high resolution, low frequency imaging and a dedicated analysis pipeline, the diffusion of individuals can be followed in a tunnel more than 6 meters long over a whole day. In particular, I was able to identify an effect of population density and of the distribution of resources on the propagation dynamics (diffusion coefficient) and the parasitism efficiency of the tested strain.
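The abstract above summarizes propagation dynamics in a tunnel with a diffusion coefficient but does not say how it was estimated. As a minimal sketch, assuming unbiased one-dimensional Brownian motion (for which the mean squared displacement after time t equals 2·D·t), a coefficient could be estimated from start and end positions as follows. The function names and the sample data are hypothetical; the thesis's dedicated analysis pipeline may work quite differently.

```python
def mean_squared_displacement(start_positions, end_positions):
    """Mean of squared displacements along the tunnel axis (same length units squared)."""
    if not start_positions or len(start_positions) != len(end_positions):
        raise ValueError("need two non-empty position lists of equal length")
    squared = [(end - start) ** 2
               for start, end in zip(start_positions, end_positions)]
    return sum(squared) / len(squared)


def diffusion_coefficient_1d(start_positions, end_positions, elapsed_time):
    """Estimate D from MSD = 2 * D * t, assuming unbiased 1D Brownian motion.

    This is a simplifying assumption for illustration; drift, boundaries and
    interactions between individuals are all ignored.
    """
    if elapsed_time <= 0:
        raise ValueError("elapsed_time must be positive")
    msd = mean_squared_displacement(start_positions, end_positions)
    return msd / (2.0 * elapsed_time)


if __name__ == "__main__":
    # Hypothetical positions (in meters) of four individuals released at 0.
    starts = [0.0, 0.0, 0.0, 0.0]
    ends = [1.0, -1.0, 2.0, -2.0]
    print(diffusion_coefficient_1d(starts, ends, elapsed_time=5.0))
```

With positions sampled at several times, a more robust estimate would fit the slope of mean squared displacement against time rather than rely on a single interval.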
90

Problématique des entrepôts de données textuelles : dr Warehouse et la recherche translationnelle sur les maladies rares / Textual data Warehouse challenge : Dr. Warehouse and translational research on rare diseases

Garcelon, Nicolas 29 November 2017 (has links)
The reuse of clinical care data for research has become widespread with the development of clinical data warehouses. These data warehouses are modeled to integrate and explore structured data linked to thesauri. The data come mainly from automated analyzers (biology, genetics, cardiology, etc.) but also from manually completed structured data forms. Care delivery also produces large amounts of textual data from hospital reports (hospitalization, surgery, imaging, anatomical pathology, etc.) and from free-text fields in electronic forms. This mass of data, little or not at all used by conventional warehouses, is an indispensable source of information in the context of rare diseases. Free text makes it possible to describe a patient's clinical picture with more precision and to express the absence of signs and uncertainty. Particularly for patients who are still undiagnosed, the physician describes the patient's medical history outside any nosological framework. This wealth of information makes clinical text a valuable source for translational research. However, it requires appropriate algorithms and tools to enable optimized reuse by physicians and researchers. In this thesis we present the data warehouse centered on the clinical document, which we modeled, implemented and evaluated. Through three use cases for translational research in the context of rare diseases, we attempted to address the problems inherent in textual data: (i) patient recruitment through a search engine adapted to textual data (handling of negation and family history), (ii) automated phenotyping from textual data, and (iii) diagnostic support through similarity between patients based on phenotyping.
We were able to evaluate these methods on the Necker-Enfants Malades data warehouse, created and populated during this thesis, which integrates about 490,000 patients and 4 million reports. These methods and algorithms were integrated into the Dr Warehouse software, developed during the thesis and distributed as open source since September 2017.
