271

Approche intégrative du développement musculaire afin de décrire le processus de maturation en lien avec la survie néonatale / Integrative approach of muscular development to describe the maturation process related to the neonatal survival

Voillet, Valentin 29 September 2016
In recent years, omics data integration projects have been developed, notably to contribute to the detailed description of complex traits of socio-economic interest. In this context, the aim of this thesis is to combine heterogeneous omics data to better describe and understand the last third of gestation in pigs, a period that influences piglet mortality at birth. In this thesis, we identified the molecular and cellular bases underlying the end of gestation, with a focus on skeletal muscle. This tissue is decisive at birth because it is involved in the efficiency of several physiological functions, such as thermoregulation and the ability to move. In the experimental design, tissues were collected from fetuses at 90 or 110 days of gestation (birth occurs at 114 days) from four fetal genotypes: two breeds that are extreme for mortality at birth (Large White and Meishan) and their two reciprocal crosses. Through statistical and computational analyses (multidimensional analyses, network inference, clustering and data integration), we showed the existence of biological mechanisms regulating muscle maturity in piglets, and also in other livestock species (cattle and sheep). A few genes and proteins were identified as strongly linked to the establishment of muscle energy metabolism during the last third of gestation. Piglets with an immature muscle metabolism would be subject to a higher risk of mortality at birth. A second part of the thesis concerns the imputation of missing data (a whole group of variables missing for an individual) in multidimensional analysis methods such as multiple factor analysis (MFA; analyse factorielle multiple, AFM, in French). In our context, MFA was particularly useful for integrating data measured on the same individuals in different tissues (two or more). To retain individuals that are missing for a whole group of variables, we developed a method called MI-MFA (multiple imputation - MFA), which estimates the MFA components for these missing individuals.
272

A Generic BI Application for Real-time Monitoring of Care Processes

Baffoe, Shirley A. January 2013
Patient wait times and care service times are key performance measures for care processes in hospitals. Managing the quality of care delivered by these processes in real time is challenging. A key challenge is to correlate source medical events to infer the care process states that define patient wait times and care service times. Commercially available complex event processing engines do not have built-in support for the concept of care process state. This makes it unnecessarily complex to define and maintain rules for inferring states from source medical events in a care process. Another challenge is how to present the data in a real-time BI dashboard and which underlying data model to use to support this dashboard. The data representation architecture can potentially lead to delays in processing and presenting the data in the BI dashboard. In this research, we have investigated the problem of real-time monitoring of care processes, performed a gap analysis of current information system support for it, researched and assessed available technologies, and shown how to most effectively leverage event-driven and BI architectures when building information support for real-time monitoring of care processes. We introduce a state monitoring engine for inferring and managing states based on an application model for care process monitoring. A BI architecture is also leveraged for the data model to support the real-time data processing and reporting requirements of the application’s portal. The research is validated with a case study to create a real-time care process monitoring application for an Acute Coronary Syndrome (ACS) clinical pathway in collaboration with IBM and Osler hospital. The research methodology is based on design-oriented research.
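The idea of inferring care process states from source medical events, and deriving wait and service times from them, can be illustrated with a minimal Python sketch. It is not the state monitoring engine described in the abstract: the event names, the state names, and the `EVENT_TO_STATE` mapping are hypothetical and chosen only for illustration.

```python
from datetime import datetime

# Hypothetical mapping from source medical events to care process states.
# Neither the event names nor the state names come from the thesis.
EVENT_TO_STATE = {
    "triage_completed": "waiting_for_physician",
    "physician_assessment_started": "in_assessment",
    "physician_assessment_ended": "assessment_done",
}


def state_durations(events):
    """Return the minutes one patient spent in each inferred state.

    `events` is a list of (ISO timestamp, event name) pairs. Events that do
    not map to a state are ignored. The last inferred state has no end time
    yet, so it gets no duration.
    """
    transitions = sorted(
        (datetime.fromisoformat(ts), EVENT_TO_STATE[name])
        for ts, name in events
        if name in EVENT_TO_STATE
    )
    durations = {}
    # Each state lasts from its own transition until the next transition.
    for (start, state), (end, _next_state) in zip(transitions, transitions[1:]):
        minutes = (end - start).total_seconds() / 60
        durations[state] = durations.get(state, 0.0) + minutes
    return durations


events = [
    ("2013-01-07T09:00:00", "triage_completed"),
    ("2013-01-07T09:05:00", "lab_order_placed"),  # no state change
    ("2013-01-07T09:40:00", "physician_assessment_started"),
    ("2013-01-07T10:05:00", "physician_assessment_ended"),
]
durations = state_durations(events)
wait_time = durations["waiting_for_physician"]  # patient wait time, in minutes
service_time = durations["in_assessment"]       # care service time, in minutes
```

Keeping the event-to-state mapping in one table is what makes the rules easy to maintain in this sketch; a real engine would also need to handle out-of-order events and concurrent states.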
273

Mediation on XQuery Views

Peng, Xiaobo 12 1900
The major goal of information integration is to provide efficient and easy-to-use access to multiple heterogeneous data sources with a single query. At the same time, one of the current trends is to use standard technologies for implementing solutions to complex software problems. In this dissertation, I used XML and XQuery as the standard technologies and developed an extended projection algorithm to provide a solution to the information integration problem. To demonstrate my solution, I implemented a prototype mediation system called Omphalos based on XML-related technologies. The dissertation describes the architecture of the system, its metadata, and the process it uses to answer queries. The system uses XQuery expressions (termed metaqueries) to capture complex mappings between global schemas and data source schemas. The system then applies these metaqueries to rewrite a user query on a virtual global database (representing the integrated view of the heterogeneous data sources) into a query (termed an outsourced query) on the real data sources. An extended XML document projection algorithm was developed to increase the efficiency of selecting the relevant subset of data from an individual data source to answer the user query. The system applies the projection algorithm to decompose an outsourced query into atomic queries, each of which is executed on a single data source. I also developed an algorithm to generate integrating queries, which the system uses to compose the answers from the atomic queries into a single answer to the original user query. I present a proof of both the extended XML document projection algorithm and the query integration algorithm. An analysis of the efficiency of the new extended algorithm is also presented. Finally, I describe a collaborative schema-matching tool that was implemented to facilitate maintaining metadata.
274

Uma abordagem de integração de dados de redes PPI e expressão gênica para priorizar genes relacionados a doenças complexas / An integrative approach combining PPI networks and gene expression to prioritize genes related to complex diseases

Sérgio Nery Simões 30 June 2015
Complex diseases are polygenic and multifactorial, which poses a challenge for the search for genes related to them. With the advent of high-throughput technologies for genome sequencing and gene expression measurement (transcriptome), together with knowledge of protein-protein interactions, complex diseases have been systematically investigated. In particular, following the Network Medicine paradigm, protein-protein interaction (PPI) networks have been used to prioritize genes related to complex diseases according to their topological features. However, PPI networks are affected by literature (ascertainment) bias, in which the most studied proteins tend to have more connections, degrading the quality of the results. Additionally, methods that use only PPI networks provide static and non-specific results, since the topologies of these networks are not specific to a given disease. In this work, we developed a methodology to prioritize genes and biological pathways related to a given complex disease through an approach that integrates PPI network, transcriptomic and genomic data, aiming to increase replicability across studies and to discover new genes associated with the disease. The methodology integrates PPI network and gene expression data, and then applies the Network Medicine hypotheses to the resulting network in order to connect seed genes (disease-related genes obtained from association studies) through shortest paths with higher co-expression among their genes. Gene expression data from two conditions (control and disease) are used separately to obtain two networks, in which each node (gene) is rated according to topological and co-expression factors. Based on this rating, we developed two ranking scores: one that prioritizes genes with the largest change between their ratings in the two conditions, and another that favors genes with the largest sum of these ratings. Applying the method to three studies involving schizophrenia expression data successfully recovered genes that are differentially co-expressed between the two conditions, while avoiding the literature bias. Furthermore, when applied to the three studies, the method substantially improved the replication of results, whereas conventional methods did not reach satisfactory replicability.
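The two ranking scores mentioned in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not give the formulas, so the "change" score is taken here to be the absolute difference between the two ratings, genes missing from one condition are rated 0.0, and the gene names and rating values are invented.

```python
def ranking_scores(control, disease):
    """Rank genes by change and by sum of their per-condition ratings.

    `control` and `disease` map gene name -> node rating in each network.
    Assumptions (not from the source): change is the absolute difference,
    and a gene absent from one condition is rated 0.0 there.
    """
    genes = set(control) | set(disease)
    change = {g: abs(disease.get(g, 0.0) - control.get(g, 0.0)) for g in genes}
    total = {g: disease.get(g, 0.0) + control.get(g, 0.0) for g in genes}
    # Sort descending by score; ties are broken alphabetically.
    by_change = sorted(genes, key=lambda g: (-change[g], g))
    by_sum = sorted(genes, key=lambda g: (-total[g], g))
    return by_change, by_sum


# Invented ratings for three placeholder genes.
control = {"GENE_A": 0.9, "GENE_B": 0.2, "GENE_C": 0.5}
disease = {"GENE_A": 0.8, "GENE_B": 0.7, "GENE_C": 0.5}
by_change, by_sum = ranking_scores(control, disease)
```

In this toy input, `GENE_B` leads the change-based ranking because its rating shifts most between conditions, while `GENE_A` leads the sum-based ranking because it is highly rated in both.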
275

Desenvolvimento da plataforma CaneRegNet para anotação funcional e análises do transcriptoma da cana-de-açúcar / Development of CaneRegNet platform for functional annotation and analysis of sugarcane transcriptome

Milton Yutaka Nishiyama Junior 13 April 2015
The target genes, signaling pathways and metabolic pathways associated with traits of interest for sugarcane improvement are still poorly known and studied. Some transcriptome studies using microarray platforms have sought to identify lists of genes for tissue-specific experiments or for plants subjected to biotic and abiotic stress conditions. Specific studies of these data have been associated with metabolic or signaling pathways already described in the literature, in order to identify changes related to gene expression patterns. However, these relations are still poorly known and studied in sugarcane. Studying and understanding sugarcane through its genetic diversity and its adaptation to the environment is a major challenge, mainly because of the absence of a sequenced genome and because its genome is complex. We present our results in an attempt to overcome these limitations and challenges for gene expression studies. Methodologies were developed for functional annotation of the transcriptome, centered on annotation transfer, identification of metabolic pathways and enzymes by the bidirectional similarity method, prediction of full-length genes, orthology analysis and oligonucleotide (probe) design for customized microarrays, resulting in the sugarcane ORFeome, the identification and classification of transcription factor families, and the identification of orthologous genes among grasses. In addition, we developed a platform for automated processing and analysis of microarray experiments, covering storage, retrieval and integration with the functional annotation. We also developed and implemented methods for selecting differentially and significantly expressed genes, approaches for category enrichment (over-representation) analysis, and metabolic pathway activity scores (functional class scoring, FCS). To integrate the functional annotation of the transcriptome with gene expression studies, we developed the CaneRegNet platform and an interface for integrating this network of biological data and knowledge, composed of tools for querying and mining data through clustering and correlation analyses between microarray experiments, enabling the generation of new hypotheses and predictions about the organization of cellular regulation.
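Over-representation analysis is commonly implemented as a one-sided hypergeometric test. The Python sketch below shows that common formulation. The abstract does not say which test CaneRegNet uses, so this is an assumption, and the gene counts in the example are invented.

```python
from math import comb


def ora_p_value(n_universe, n_category, n_selected, n_overlap):
    """One-sided hypergeometric p-value for over-representation.

    Returns the probability of seeing at least `n_overlap` category genes
    when `n_selected` genes are drawn without replacement from a universe
    of `n_universe` genes, of which `n_category` belong to the category.
    """
    max_overlap = min(n_category, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_category, k) * comb(n_universe - n_category, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Invented example: 20 genes on the array, 5 annotated to a pathway,
# 4 differentially expressed genes, all 4 of them in the pathway.
p_all_four = ora_p_value(20, 5, 4, 4)
p_at_least_three = ora_p_value(20, 5, 4, 3)
```

A small p-value suggests that the category appears among the selected genes more often than chance would explain. In practice, the p-values from many categories would also need a multiple-testing correction, which this sketch omits.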
276

La pertinence du transport pour promouvoir l'activité physique : une prise en compte des défis liés à la mesure, à l'analyse empirique et à la simulation des changements de modes de transport / The relevance of transport to promote physical activity : addressing challenges related to the measurements and the observational analysis of transport-related physical activity, and the simulation of shifts in transportation mode

Brondeel, Ruben 16 December 2016
Background: Physical activity has an important impact on population health, and transport behaviour accounts for a substantial part of total physical activity. This PhD work aimed to improve measures of transport-related physical activity and to use these new measures in empirical case studies of the transport-related physical activity of adults aged 35 to 83 years living in Ile-de-France. Methods: The RECORD GPS Study collected GPS and accelerometer data for 236 participants over a 7-day period, resulting in the observation of 7425 trips. The Enquête Globale Transport collected data over one day, resulting in the observation of 82084 trips for 21332 participants. The methods used include random forest prediction models, geographical information systems, and negative binomial regressions. Results: Shorter epochs (time units) resulted in considerably larger estimates of moderate-to-vigorous physical activity (MVPA). This finding supports calls from the literature for further harmonization of accelerometer-based indicators of physical activity. We observed an average of 18.9 minutes of daily transport-related MVPA (T-MVPA; 95% confidence interval: 18.6 to 19.2 minutes) in this sample, which is representative of Ile-de-France. Participants with a higher level of education did more T-MVPA than their less educated counterparts. In contrast, people with a higher household income did less T-MVPA per day. Conclusion: This PhD work was the first study to combine a very detailed dataset, including GPS, accelerometer, and mobility behaviour data, with a large-scale transport survey. Transport interventions could have an important impact on physical activity for this population.
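The reported effect of epoch length on MVPA estimates can be illustrated with a small Python sketch on synthetic data. It shows only one mechanism by which shorter epochs can give larger estimates: short bursts of movement are averaged out in long epochs. The signal, the cut-point of 2000 counts per minute, and the linear scaling of the cut-point to the epoch length are all assumptions, not values or methods from the thesis.

```python
def mvpa_minutes(counts_per_second, epoch_seconds, cutpoint_per_minute):
    """Minutes classified as MVPA when counts are summed over fixed epochs.

    An epoch counts as MVPA when its summed counts reach the cut-point,
    scaled linearly from one minute to the epoch length (an assumption).
    Trailing samples that do not fill a whole epoch are ignored.
    """
    cutpoint = cutpoint_per_minute * epoch_seconds / 60
    mvpa_seconds = 0
    last_start = len(counts_per_second) - epoch_seconds
    for start in range(0, last_start + 1, epoch_seconds):
        epoch_total = sum(counts_per_second[start:start + epoch_seconds])
        if epoch_total >= cutpoint:
            mvpa_seconds += epoch_seconds
    return mvpa_seconds / 60


# Synthetic signal: 10 minutes, each made of 15 s of brisk movement
# (50 counts per second) followed by 45 s of rest.
one_minute = [50] * 15 + [0] * 45
signal = one_minute * 10
CUTPOINT = 2000  # hypothetical counts-per-minute threshold

short_epochs = mvpa_minutes(signal, 5, CUTPOINT)   # 5-second epochs
long_epochs = mvpa_minutes(signal, 60, CUTPOINT)   # 60-second epochs
```

With 5-second epochs, every burst is counted, whereas with 60-second epochs each minute's total stays below the cut-point and no MVPA is recorded. Real accelerometer data would show a less extreme gap than this contrived signal.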
277

Frontiers in Crowdsourced Data Integration

Braunschweig, Katrin, Eberius, Julian, Thiele, Maik, Lehner, Wolfgang 26 November 2020
There is an ever-increasing amount and variety of open web data available that is insufficiently examined or not considered at all in decision-making processes. This is because of the lack of end-user-friendly tools that help to reuse this public data and to create knowledge out of it. Therefore, we propose a schema-optional data repository that provides the flexibility necessary to store and gradually integrate heterogeneous web data. Based on this repository, we propose a semi-automatic schema enrichment approach that efficiently augments the data in a “pay-as-you-go” fashion. Because ambiguities inherently arise in this process, we further propose a crowd-based verification component that is able to resolve such conflicts in a scalable manner.
278

Systém pro integraci webových datových zdrojů / System for Web Data Source Integration

Kolečkář, David January 2020
The thesis aims at designing and implementing a web application that will be used for the integration of web data sources. For data integration, a method using a domain model of the target information system was applied. The work describes individual methods used for extracting information from web pages. The text describes the process of designing the architecture of the system, including a description of the chosen technologies and tools. The main part of the work is the implementation and testing of the final web application, which is written in Java and the Angular framework. The outcome of the work is a web application that allows its users to define web data sources and save data in the target database.
279

Semantic Enrichment of Ontology Mappings

Arnold, Patrick 15 December 2015
Schema and ontology matching play an important part in the field of data integration and semantic web. Given two heterogeneous data sources, meta data matching usually constitutes the first step in the data integration workflow, which refers to the analysis and comparison of two input resources like schemas or ontologies. The result is a list of correspondences between the two schemas or ontologies, which is often called mapping or alignment. Many tools and research approaches have been proposed to automatically determine those correspondences. However, most match tools do not provide any information about the relation type that holds between matching concepts, for the simple but important reason that most common match strategies are too simple and heuristic to allow any sophisticated relation type determination. Knowing the specific type holding between two concepts, e.g., whether they are in an equality, subsumption (is-a) or part-of relation, is very important for advanced data integration tasks, such as ontology merging or ontology evolution. It is also very important for mappings in the biological or biomedical domain, where is-a and part-of relations may exceed the number of equality correspondences by far. Such more expressive mappings allow much better integration results and have scarcely been in the focus of research so far. In this doctoral thesis, the determination of the correspondence types in a given mapping is the focus of interest, which is referred to as semantic mapping enrichment. We introduce and present the mapping enrichment tool STROMA, which obtains a pre-calculated schema or ontology mapping and for each correspondence determines a semantic relation type. In contrast to previous approaches, we will strongly focus on linguistic laws and linguistic insights. By and large, linguistics is the key for precise matching and for the determination of relation types. 
We will introduce various strategies that make use of these linguistic laws and are able to calculate the semantic type between two matching concepts. The observations and insights gained from this research go far beyond the field of mapping enrichment and can be also applied to schema and ontology matching in general. Since generic strategies have certain limits and may not be able to determine the relation type between more complex concepts, like a laptop and a personal computer, background knowledge plays an important role in this research as well. For example, a thesaurus can help to recognize that these two concepts are in an is-a relation. We will show how background knowledge can be effectively used in this instance, how it is possible to draw conclusions even if a concept is not contained in it, how the relation types in complex paths can be resolved and how time complexity can be reduced by a so-called bidirectional search. The developed techniques go far beyond the background knowledge exploitation of previous approaches, and are now part of the semantic repository SemRep, a flexible and extendable system that combines different lexicographic resources. Further on, we will show how additional lexicographic resources can be developed automatically by parsing Wikipedia articles. The proposed Wikipedia relation extraction approach yields some millions of additional relations, which constitute significant additional knowledge for mapping enrichment. The extracted relations were also added to SemRep, which thus became a comprehensive background knowledge resource. To augment the quality of the repository, different techniques were used to discover and delete irrelevant semantic relations. We could show in several experiments that STROMA obtains very good results w.r.t. relation type detection. In a comparative evaluation, it was able to achieve considerably better results than related applications. 
This corroborates the overall usefulness and strengths of the implemented strategies, which were developed with particular emphasis on the principles and laws of linguistics.
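The bidirectional search mentioned in the abstract can be illustrated with a generic bidirectional breadth-first search over a concept graph. This Python sketch is not SemRep's implementation. It finds only the length of the shortest path between two concepts and does not resolve the relation type along the path. The thesaurus links in the example are hypothetical, and treating relations as undirected edges is an assumption.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (concept, concept) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def bidirectional_path_length(graph, source, target):
    """Shortest path length (in edges) between two concepts, or None.

    Searches from both ends at once, always expanding one full level of
    the smaller frontier, and stops at the first level where they meet.
    """
    if source not in graph or target not in graph:
        return None
    if source == target:
        return 0
    dist_s, dist_t = {source: 0}, {target: 0}
    frontier_s, frontier_t = deque([source]), deque([target])
    while frontier_s and frontier_t:
        if len(frontier_s) <= len(frontier_t):
            frontier, dist, other = frontier_s, dist_s, dist_t
        else:
            frontier, dist, other = frontier_t, dist_t, dist_s
        best = None
        for _ in range(len(frontier)):  # expand exactly one level
            node = frontier.popleft()
            for neighbour in graph[node]:
                if neighbour in other:  # the two searches meet here
                    candidate = dist[node] + 1 + other[neighbour]
                    best = candidate if best is None else min(best, candidate)
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    frontier.append(neighbour)
        if best is not None:
            return best
    return None


# Hypothetical thesaurus links, for illustration only.
thesaurus = build_graph([
    ("laptop", "notebook"),
    ("notebook", "portable computer"),
    ("portable computer", "personal computer"),
    ("personal computer", "computer"),
])
hops = bidirectional_path_length(thesaurus, "laptop", "personal computer")
```

Searching from both ends tends to visit far fewer nodes than a one-sided search when concepts have many neighbours, which is the usual motivation for this design.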
280

Query-Time Data Integration

Eberius, Julian 10 December 2015
Today, data is collected in ever increasing scale and variety, opening up enormous potential for new insights and data-centric products. However, in many cases the volume and heterogeneity of new data sources precludes up-front integration using traditional ETL processes and data warehouses. In some cases, it is even unclear if and in what context the collected data will be utilized. Therefore, there is a need for agile methods that defer the effort of integration until the usage context is established. This thesis introduces Query-Time Data Integration as an alternative concept to traditional up-front integration. It aims at enabling users to issue ad-hoc queries on their own data as if all potential other data sources were already integrated, without declaring specific sources and mappings to use. Automated data search and integration methods are then coupled directly with query processing on the available data. The ambiguity and uncertainty introduced through fully automated retrieval and mapping methods is compensated by answering those queries with ranked lists of alternative results. Each result is then based on different data sources or query interpretations, allowing users to pick the result most suitable to their information need. To this end, this thesis makes three main contributions. Firstly, we introduce a novel method for Top-k Entity Augmentation, which is able to construct a top-k list of consistent integration results from a large corpus of heterogeneous data sources. It improves on the state of the art by producing a set of individually consistent, but mutually diverse, alternative solutions, while minimizing the number of data sources used. Secondly, based on this novel augmentation method, we introduce the DrillBeyond system, which is able to process Open World SQL queries, i.e., queries referencing arbitrary attributes not defined in the queried database. 
The original database is then augmented at query time with Web data sources providing those attributes. Its hybrid augmentation/relational query processing enables the use of ad-hoc data search and integration in data analysis queries, and improves both performance and quality when compared to using separate systems for the two tasks. Finally, we studied the management of large-scale dataset corpora such as data lakes or Open Data platforms, which are used as data sources for our augmentation methods. We introduce Publish-time Data Integration as a new technique for data curation systems managing such corpora, which aims at improving the individual reusability of datasets without requiring up-front global integration. This is achieved by automatically generating metadata and format recommendations, allowing publishers to enhance their datasets with minimal effort. Collectively, these three contributions are the foundation of a Query-time Data Integration architecture that enables ad-hoc data search and integration queries over large heterogeneous dataset collections.
