311 |
Dopad procesní a datové integrace na efektivitu reportingu / Impact of process and data integration on reporting efficiency. Sys, Bohuslav. January 2013.
Nowadays, when the difference between failure and success is the amount of information available, the exponential growth of data on the web leads to a rising need to track data quality. This trend is not only global; it affects individuals and companies in particular. Compared with the past, these companies produce a larger amount of data, which is at the same time more complex, all to get a better picture of the real world. This leads to the main problem: we not only need to gather the data, we also have to present them in such a way that they serve the purpose for which they were gathered. The purpose of this thesis is therefore to focus on the processes that follow data gathering: data quality and transformation processes. In the first part of the thesis we define the basic concepts and issues, followed by the methods necessary for acquiring the requested data in the expected quality, including the required infrastructure. In the second part we define a real-life example and use the knowledge from the previous part to design a usable solution and deploy it into use. In conclusion, we evaluate the design against the results acquired from its real-life use.
|
312 |
Systém statistických informací o trhu práce / System of Labour Market Information. Duspivová, Kateřina. January 2010.
The main aim of this dissertation thesis is to present a new system of statistical information on the labour market in the Czech Republic that respects both the theoretical background and the latest trends in labour market statistics. The structure of the thesis is as follows. The first chapter introduces a theoretical framework of the labour market. This framework interlinks the relations between employees and employers and is neutral with respect to all schools of economic thought. I also describe the current state of labour market statistics in the Czech Republic and evaluate its compliance with the theoretical research. The second chapter surveys the state of the art in labour market statistics from the point of view of both data integration and comprehensive systems of labour market indicators. The first part of the third chapter proposes the new system of statistical information on the labour market, which complies with both economic theory and the latest trends in labour market statistics. Its main advantage is that all the key aspects of the labour market (i.e. employment as well as remuneration) are surveyed and evaluated together. In addition to the generally known indicators of economic activity and remuneration, I propose new indicators of job creation, job destruction, hires, separations, job reallocation and worker reallocation. The second part of the third chapter proposes an integrated data source that allows all the indicators to be quantified, provided that there are no legal restrictions on data integration in the Czech Republic. The last two parts of the third chapter discuss the main issues concerning the implementation of the system and its pros and cons. Indicators of job and worker flows based on linked employer-employee data have never before been quantified in the Czech Republic, so pilot results are introduced in the fourth chapter. Using the new system, we could prove several hypotheses that were impossible to prove with the standard set of indicators. In the last part of the fourth chapter, worker and job flows are balanced against stock information on the Czech labour market. In the fifth chapter, I investigate the possibility of a wider use of the new system in order to identify and analyse an array of labour market phenomena in more detail. Compared to the generally known basic set of labour market indicators, the thesis brings a new insight into the dynamics of the labour market. The systematic approach, based on a wider use of linked employer-employee microdata combined with new indicators, offers higher information capability and complies with the requirements of academic research.
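The flow indicators named above are conventionally defined as simple aggregates of firm-level employment changes, in the tradition of Davis and Haltiwanger; whether the thesis uses exactly these definitions is not stated in the abstract. A minimal sketch of the job-flow side, with a hypothetical input format standing in for the linked employer-employee data:

    def job_flow_rates(employment):
        """employment: dict firm_id -> (employees_before, employees_after)."""
        creation = sum(max(n1 - n0, 0) for n0, n1 in employment.values())
        destruction = sum(max(n0 - n1, 0) for n0, n1 in employment.values())
        # Denominator: average employment over the two periods, summed over firms.
        size = sum((n0 + n1) / 2 for n0, n1 in employment.values())
        jc, jd = creation / size, destruction / size
        return {"job_creation_rate": jc,
                "job_destruction_rate": jd,
                "job_reallocation_rate": jc + jd,   # gross job reallocation
                "net_employment_growth": jc - jd}

    # Invented firm-level figures for two consecutive years:
    print(job_flow_rates({"firm_A": (100, 120), "firm_B": (80, 60), "firm_C": (50, 50)}))

Worker reallocation would analogously aggregate hires and separations, which requires the worker-level half of the linked employer-employee data.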
|
313 |
La problématique de l'information territoriale et ses enjeux majeurs dans les pays du Sud : stratégie, méthodologie et projet pilote dans un pays en développement, le Mali / The problematic of territorial information and its major challenges in the countries of the South: strategy, methodology and pilot project in a developing country, Mali. Dakouo, Alain Bessiba. 26 June 2019.
In Africa, decentralization takes place in contexts that vary from country to country: the need to reform the government following a crisis, the desire to establish local democracy to compensate for central or even dictatorial power, sometimes even the government's inability to provide basic socio-economic services such as health, education and drinking water. In West Africa, decentralization was often accompanied by a redrawing of territories in the 1990s. Most West African countries have created three levels of local authorities: the Region, the Department (Cercle in Mali) and the Commune. This leads to a need for territory management and planning at several scales (inventories, monitoring of the spatial footprint of development, sanitation, natural resource management, rural economic development, health, education, hydraulics and risk management). In the context of a growing need for information, development partners recognize the usefulness of Geographic Information Systems (GIS) as decision-support tools. The creation of different ministries connected with geographic information in Mali requires a pooling of skills centred on geomatics. Indeed, while each institution and sectoral ministry has its own thematic data, exploiting those data is hampered by the high dispersion and disparity of geographic and cartographic data. How can a town hall, a local authority, an NGO, the government and other partners working on the same territory share a common geospatial and territorial system? What territorial information strategy suits a country like Mali? The aim of this thesis is to create, according to the concepts, methods and technologies of current geography and statistics, a decision-support tool in a context of overlapping responsibilities and decentralization, designed to take a decisive step forward for the benefit of local territorial planning by making coherent and available the geolocated data necessary for an effective spatial planning policy. This strategic perspective implies going back to the distribution of powers, the relation between free and commercial software, participative information (societal, social and ethnic aspects) and the development of geographic information in Mali. An information strategy is in fact an essential prerequisite for any planning and development strategy. This thesis is an innovative project that aims to provide answers on the implementation of such a multi-source, multi-stakeholder spatial information management strategy in a developing country.
|
314 |
[en] OLAP2DATACUBE: AN ON-DEMAND TRANSFORMATION FRAMEWORK FROM OLAP TO RDF DATA CUBES / [pt] OLAP2DATACUBE: UM FRAMEWORK PARA TRANSFORMAÇÕES EM TEMPO DE EXECUÇÃO DE OLAP PARA CUBOS DE DADOS EM RDF. Rivera Salas, Percy Enrique. 13 April 2016.
[en] Statistical data is one of the most important sources of information, relevant to a large number of stakeholders in the governmental, scientific and business domains alike. A statistical data set comprises a collection of observations made at some points across a logical space and is often organized as what is called a data cube. The proper definition of the data cubes, especially of their dimensions, helps processing the observations and, more importantly, helps combining observations from different data cubes. In this context, the Linked Data principles can be profitably applied to the definition of data cubes, in the sense that the principles offer a strategy to provide the missing semantics of the dimensions, including their values. In this thesis we describe the process and the implementation of a mediation architecture, called OLAP2DataCube On Demand, which helps describe and consume statistical data, exposed as RDF triples but stored in relational databases. The tool features a catalogue of Linked Data Cube descriptions, created according to the Linked Data principles. The catalogue has a standardized description for each data cube actually stored in each statistical (relational) database known to the tool. The tool offers an interface to browse the Linked Data Cube descriptions and to export the data cubes as RDF triples, generated on demand from the underlying data sources. We also discuss the implementation of sophisticated metadata search operations, OLAP data cube operations, such as slice and dice, and data cube mashup operations that create new cubes by combining other cubes.
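Exporting a cube as RDF triples typically means expressing each cell as a qb:Observation in the W3C RDF Data Cube vocabulary, on which this line of work builds. A minimal sketch of what such an export could produce (using rdflib; the dataset name, dimensions and values are invented, and this is not OLAP2DataCube's actual code):

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF, XSD

    QB = Namespace("http://purl.org/linked-data/cube#")
    EX = Namespace("http://example.org/stats/")  # hypothetical namespace

    g = Graph()
    g.bind("qb", QB)

    # One data cube and a single observation (one "cell" of the cube).
    g.add((EX.unemployment, RDF.type, QB.DataSet))

    obs = EX["obs/2015-region01"]
    g.add((obs, RDF.type, QB.Observation))
    g.add((obs, QB.dataSet, EX.unemployment))
    g.add((obs, EX.refPeriod, Literal("2015", datatype=XSD.gYear)))  # dimension
    g.add((obs, EX.refArea, EX.region01))                            # dimension
    g.add((obs, EX.rate, Literal(7.3, datatype=XSD.decimal)))        # measure

    print(g.serialize(format="turtle"))

A slice or dice operation then amounts to selecting the observations that share fixed values on some of the dimensions.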
|
315 |
Une base de connaissance personnelle intégrant les données d'un utilisateur et une chronologie de ses activités / A personal knowledge base integrating user data and activity timeline. Montoya, David. 6 March 2017.
Typical Internet users today have their data scattered over several devices, applications, and services. Managing and controlling one's data is increasingly difficult. In this thesis, we adopt the viewpoint that the user should be given the means to gather and integrate her data, under her full control. In that direction, we designed a system that integrates and enriches the data of a user from multiple heterogeneous sources of personal information into an RDF knowledge base. The system is open-source and implements a novel, extensible framework that facilitates the integration of new data sources and the development of new modules for deriving knowledge. We first show how user activity can be inferred from smartphone sensor data. We introduce a time-based clustering algorithm to extract stay points from location history data. Using data from additional mobile phone sensors, geographic information from OpenStreetMap, and public transportation schedules, we introduce a transportation mode recognition algorithm to derive the different modes and routes taken by the user when traveling. The algorithm derives the itinerary followed by the user by finding the most likely sequence in a linear-chain conditional random field whose feature functions are based on the output of a neural network. We also show how the system can integrate information from the user's email messages, calendars, address books, social network services, and location history into a coherent whole. To do so, it uses entity resolution to find the set of avatars used by each real-world contact and performs spatiotemporal alignment to connect each stay point with the event it corresponds to in the user's calendar. Finally, we show that such a system can also be used for multi-device and multi-system synchronization and allows knowledge to be pushed to the sources. We present extensive experiments.
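Time-based stay-point extraction is commonly formulated as follows (a sketch in the spirit of the algorithm described; the thesis's exact thresholds and criteria may differ): a stay point is the centroid of a maximal run of location fixes that remain within a distance threshold of the run's first fix for at least a minimum duration.

    import math

    def haversine_m(p, q):
        """Great-circle distance in metres between (lat, lon) points in degrees."""
        lat1, lon1, lat2, lon2 = map(math.radians, (*p[:2], *q[:2]))
        a = (math.sin((lat2 - lat1) / 2) ** 2
             + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
        return 6371000 * 2 * math.asin(math.sqrt(a))

    def stay_points(fixes, max_dist_m=200, min_duration_s=20 * 60):
        """fixes: time-ordered list of (lat, lon, unix_time). Returns stay points."""
        points, i, n = [], 0, len(fixes)
        while i < n:
            j = i + 1
            while j < n and haversine_m(fixes[i], fixes[j]) <= max_dist_m:
                j += 1
            if fixes[j - 1][2] - fixes[i][2] >= min_duration_s:
                cluster = fixes[i:j]
                lat = sum(f[0] for f in cluster) / len(cluster)
                lon = sum(f[1] for f in cluster) / len(cluster)
                points.append((lat, lon, fixes[i][2], fixes[j - 1][2]))
                i = j          # continue after the detected stay
            else:
                i += 1
        return points

The transport-mode step then labels the segments between stay points; in the approach described, a linear-chain conditional random field scores candidate mode sequences using features produced by a neural network.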
|
316 |
Ein Integrations- und Darstellungsmodell für verteilte und heterogene kontextbezogene Informationen / An integration and representation model for distributed and heterogeneous context-related information. Goslar, Kevin. 23 November 2006.
Context-awareness, the systematic consideration of information from the environment of applications, can provide significant benefits across many areas of business and technology. To be really useful, i.e. to support real-world processes as harmoniously as a thinking human assistant does, practical context-aware applications need a comprehensive and detailed contextual information base that describes all relevant aspects of the real world. As a matter of principle, comprehensive contextual information arises in many places and data sources, e.g. in context-aware infrastructures and end devices as well as in "normal" applications, which may have knowledge about the context based on their role in supporting a certain real-world process. This thesis facilitates the use of contextual information by reducing the complexity of the procurement process for distributed and heterogeneous contextual information. In particular, it addresses two problems: a consumer of comprehensive contextual information must be aware of, and able to access, several different data sources, and must know how to combine the contextual information taken from different and isolated data sources into a meaningful representation of the context. Especially the latter cannot be modelled using the current state of the art. These problems are addressed by the development of an integration and representation model for contextual information that allows comprehensive context models to be composed from information held in distributed and heterogeneous data sources. This model combines an information integration model for distributed and heterogeneous information (which consists of an access model for heterogeneous data sources, an integration model and an information relation model) with a representation model for context that formalizes the representation of the respective real-world domain, i.e. of the real-world objects and their semantic relations, in an intuitive, reusable and modular way based on ontologies. The resulting model consists of five layers that represent different aspects of the information integration solution. The achievement of the objectives is rated on the basis of a requirement analysis of the problem domain. The technical feasibility and usefulness of the model are demonstrated by the implementation of an engine supporting the approach as well as by a complex application scenario consisting of a user profile that integrates information from several data sources and a number of context-aware applications, e.g. a context-aware car navigation system, a restaurant finder and an enhanced tourist guide that use the user profile. Problems regarding privacy and social effects, the integration of this solution into existing environments and infrastructures, as well as technical issues such as the scalability and performance of the model, are also discussed.
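The division of labour described, an access layer that wraps each heterogeneous source behind a uniform interface and an integration layer that merges their statements about one real-world entity, can be pictured with a small sketch; the class names and toy sources below are invented for illustration and do not reproduce the thesis's five-layer model:

    from abc import ABC, abstractmethod

    class ContextSource(ABC):
        """Access layer: a uniform interface over one heterogeneous data source."""
        @abstractmethod
        def facts(self, entity_id: str) -> dict: ...

    class CalendarSource(ContextSource):
        def facts(self, entity_id):
            return {"current_activity": "meeting"}      # toy data

    class PositioningSource(ContextSource):
        def facts(self, entity_id):
            return {"location": "building A, room 12"}  # toy data

    def integrate(entity_id: str, sources: list[ContextSource]) -> dict:
        """Integration layer: merge per-source facts into one context model."""
        context = {"entity": entity_id}
        for src in sources:
            context.update(src.facts(entity_id))        # naive merge, no conflict handling
        return context

    print(integrate("user:kevin", [CalendarSource(), PositioningSource()]))

The semantic-relation and ontology layers of the actual model would additionally type these facts and link them to real-world concepts, which a flat dictionary cannot express.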
|
317 |
Digital Twin Knowledge Graphs for IoT Platforms: Towards a Virtual Model for Real-Time Knowledge Representation in IoT Platforms / Digital Twin Kunskapsgrafer för IoT-Plattformar: Mot en Virtuell Modell för Kunskapsrepresentation i Realtid i IoT-Plattformar. Jarabo Peñas, Alejandro. January 2023.
This thesis presents the design and prototype implementation of a digital twin based on a knowledge graph for Internet of Things (IoT) platforms. The digital twin is a virtual representation of a physical object or system that must continually integrate and update knowledge in rapidly changing environments. The proposed knowledge graph is designed to store and efficiently query a large number of IoT devices in a complex logical structure, to use rule-based reasoning to infer new facts, and to integrate unanticipated devices into the existing logical structure in order to adapt to changing environments. The digital twin is implemented using the open-source TypeDB knowledge graph and tested in a simplified automobile production line environment. The main focus of the work is the integration of unanticipated devices, for which a similarity metric is implemented to identify similar existing devices and determine the appropriate integration into the knowledge graph. The proposed digital twin knowledge graph is a promising solution for managing and integrating knowledge in rapidly changing IoT environments, providing valuable insights and support for decision-making.
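The abstract does not specify the similarity metric; one plausible reading, sketched here purely as an illustration, compares the capability set of an unanticipated device with those of known device types (Jaccard similarity) and integrates the device under the best match above a threshold:

    def jaccard(a: set, b: set) -> float:
        return len(a & b) / len(a | b) if a | b else 0.0

    def closest_device_type(new_caps: set, known: dict, threshold: float = 0.5):
        """known: device-type name -> capability set. Returns best match or None."""
        best = max(known, key=lambda t: jaccard(new_caps, known[t]))
        score = jaccard(new_caps, known[best])
        return (best, score) if score >= threshold else (None, score)

    known_types = {  # hypothetical device types already in the knowledge graph
        "temperature-sensor": {"reads-temperature", "reports-interval", "has-unit"},
        "conveyor-motor": {"has-speed", "accepts-commands", "reports-status"},
    }
    new_device = {"reads-temperature", "has-unit", "reports-status"}
    print(closest_device_type(new_device, known_types))

In the actual system the winning match would then determine where in the TypeDB schema the new device is attached, so that existing rules and queries apply to it.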
|
318 |
Protein interaction networks and their applications to protein characterization and cancer genes prediction. Aragüés Peleato, Ramón. 13 July 2007.
The importance of understanding cellular processes prompted the development of experimental approaches that detect protein-protein interactions. Here, we describe a software platform called PIANA (Protein Interactions And Network Analysis) that integrates interaction data from multiple sources and automates the analysis of protein interaction networks. Moreover, we describe a method that delineates interacting motifs by relying on the observation that proteins with common interaction partners tend to interact with those partners through the same interacting motif. We find that highly connected proteins (i.e., hubs) with multiple interacting motifs are more likely to be essential for cellular viability than hubs with one or two interacting motifs. Furthermore, we present a method that predicts cancer genes by integrating protein interaction networks, differential expression studies and structural, functional and evolutionary properties. For a sensitivity of 1%, the positive predictive value is 71%, which outperforms the use of any of the methods independently.
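The motif-delineation idea rests on a purely graph-level signal: proteins that share interaction partners are candidates for binding those partners through the same motif. A minimal sketch of that network side (using networkx; the toy edges and the degree cutoff for calling a protein a hub are arbitrary choices, not PIANA's):

    import networkx as nx
    from itertools import combinations

    # Toy protein-protein interaction network (edges are invented).
    G = nx.Graph([("H", "A"), ("H", "B"), ("H", "C"), ("H", "D"),
                  ("A", "X"), ("B", "X")])

    HUB_DEGREE = 4  # arbitrary illustrative cutoff for calling a protein a hub
    hubs = [p for p, deg in G.degree() if deg >= HUB_DEGREE]
    print("hubs:", hubs)

    # Pairs of proteins with common interaction partners are candidates for
    # binding those shared partners through the same interacting motif.
    for a, b in combinations(sorted(G.nodes()), 2):
        shared = set(G.neighbors(a)) & set(G.neighbors(b))
        if len(shared) >= 2:  # require at least two shared partners
            print(f"{a} and {b} share partners {sorted(shared)}")

Counting how many such partner groups converge on a hub gives a rough proxy for its number of interacting motifs, the quantity related to essentiality in the abstract.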
|
319 |
Integrative approaches to investigate the molecular basis of diseases and adverse drug reactions: from multivariate statistical analysis to systems biology. Bauer-Mehren, Anna. 8 November 2010.
Despite some great successes, many human diseases cannot yet be effectively treated, prevented or cured. Moreover, prescribed drugs are often not very efficient and cause undesired side effects. Hence, there is a need to investigate the molecular basis of diseases and adverse drug reactions in more detail. For this purpose, relevant biomedical data needs to be gathered, integrated and analysed in a meaningful way. In this regard, we have developed novel integrative analysis approaches based on two perspectives: classical multivariate statistics and systems biology. A novel multilevel statistical method has been developed for exploiting molecular and pharmacological information for a set of drugs in order to investigate undesired side effects. Systems biology approaches have been used to study the genetic basis of human diseases at a global scale. For this purpose, we have developed an integrated gene-disease association database and tools for user-friendly access and analysis. We showed that modularity applies to mendelian, complex and environmental diseases and identified disease-related core biological processes. We have constructed a workflow to investigate adverse drug reactions using our gene-disease association database. A detailed study of currently available pathway data has been performed to evaluate its applicability to building network models. Finally, a strategy to integrate information about sequence variations with biological pathways has been implemented to study the effect of the sequence variations on biological processes. In summary, the developed methods are of immense practical value for other biomedical researchers and can help improve the understanding of the molecular basis of diseases and adverse drug reactions.
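The performance figures quoted in the previous entry and in work of this kind (e.g. a positive predictive value of 71% at a sensitivity of 1%) are the standard confusion-matrix quantities; a tiny sketch with invented counts, only to make the two definitions concrete:

    def ppv_and_sensitivity(tp: int, fp: int, fn: int):
        """Positive predictive value (precision) and sensitivity (recall)."""
        ppv = tp / (tp + fp)          # fraction of predicted genes that are correct
        sensitivity = tp / (tp + fn)  # fraction of true disease genes recovered
        return ppv, sensitivity

    # Invented counts: 10 true positives, 4 false positives, 990 missed genes.
    print(ppv_and_sensitivity(tp=10, fp=4, fn=990))  # -> (~0.714, 0.01)

A low sensitivity paired with a high PPV is a deliberate trade-off here: the predictor names few genes, but the ones it names are usually right.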
|
320 |
Geometrische und stochastische Modelle zur Verarbeitung von 3D-Kameradaten am Beispiel menschlicher Bewegungsanalysen / Geometric and stochastic models for the processing of 3D camera data within the context of human motion analyses. Westfeld, Patrick. 15 June 2012.
The three-dimensional documentation of the form and location of any type of object using flexible photogrammetric methods and procedures plays a key role in a wide range of technical-industrial and scientific areas of application. Potential applications include measurement tasks in the automotive, machine building and shipbuilding sectors, the compilation of complex 3D models in the fields of architecture, archaeology and monument preservation, and motion analyses in the fields of flow measurement technology, ballistics and medicine. In close-range photogrammetry a variety of optical 3D measurement systems are used. Area sensor cameras arranged in single or multi-image configurations are used alongside active triangulation procedures for surface measurement (e.g. using structured light or laser scanner systems).
The use of modulation techniques enables 3D cameras based on photomix detectors or similar principles to simultaneously produce both a grey value image and a range image. Functioning as single-image sensors, they deliver spatially resolved surface data at video rate without the need for stereoscopic image matching. In the case of 3D motion analyses in particular, this leads to considerable reductions in complexity and computing time. 3D cameras combine the practicality of a digital camera with the 3D data acquisition potential of conventional surface measurement systems. Despite the relatively low spatial resolution currently achievable, as monosensory real-time depth image acquisition systems they represent an interesting alternative in the field of 3D motion analysis.
The use of 3D cameras as measuring instruments requires the modelling of deviations from the ideal projection model, and the processing of the 3D camera data generated requires the targeted adaptation and further development of procedures from computer vision and photogrammetry. This Ph.D. thesis therefore focuses on the development of methods for sensor calibration and 3D motion analysis in the context of investigations into inter-human motion behaviour. As a result of its intrinsic design and measurement principle, a 3D camera simultaneously provides amplitude and range data reconstructed from a measurement signal. The simultaneous integration of all data obtained using a 3D camera into one integrated approach is a logical consequence and represents the focus of the procedural developments. On the one hand, the complementary characteristics of the observations support each other through the functional connection of the measurement channels, which can be expected to lead to increases in accuracy and reliability. On the other, the expansion of the stochastic model to include variance component estimation ensures that the heterogeneous information pool is fully exploited.
The integrated bundle adjustment developed facilitates the determination of the exact 3D camera geometry and the estimation of the range-measurement-specific correction parameters required for modelling the linear, cyclic and signal-path-related error components of a distance measurement made with a 3D camera. The integrated calibration routine jointly adjusts quantities measured in both information channels while automatically estimating optimum observation weights. The method is based on the flexible principle of self-calibration and does not require spatial object data, thereby dispensing in particular with the time-consuming determination of reference distances of superior accuracy. The accuracy analyses carried out confirm the correctness of the proposed functional contexts, but also reveal weaknesses in the form of not-yet-parameterized range-measurement-specific errors. The future expansion of the mathematical model developed is nevertheless guaranteed by its adaptivity and modular implementation. The accuracy of new 3D point coordinates after calibration can be stated as 5 mm. For a depth imaging technology influenced by a range of usually simultaneously occurring noise sources, this level of accuracy is very promising, especially with regard to the development of evaluation algorithms based on corrected 3D camera data.
2.5D Least Squares Tracking (LST) is an integrated spatial and temporal matching method developed within the framework of this Ph.D. thesis for the purpose of evaluating 3D camera image sequences. The algorithm is based on the least squares image matching method already established in photogrammetry and maps small surface segments of consecutive 3D camera data sets onto one another. The mapping rule has been adapted to the data structure of a 3D camera on the basis of a 2D affine transformation. The closed parameterization combines both grey values and range values in an integrated model. In addition to the affine parameters used to capture translation and rotation effects, scale and inclination parameters model perspective-related changes in the size of the image patch caused by distance changes along the line of sight. In a pre-processing step, the input data are corrected for their optical and range-measurement-specific errors using the calibration routine developed, and the measured slope distances are reduced to horizontal distances. As an integrated approach, 2.5D LST delivers complete 3D displacement vectors. Furthermore, the accuracy and reliability measures resulting from the adjustment can be used as decision criteria for integration into an application-specific processing chain. Validation showed that the introduction of complementary information yields a more accurate and more reliable solution to the correspondence problem, especially under difficult contrast conditions in one channel. The accuracy of the scale and inclination parameters, which are directly linked to the distance correction terms, improved markedly. Moreover, the extension of the geometric model brought significant benefits, particularly for the matching of natural, not entirely planar surface segments.
The area-based object matching and tracking method developed works on contactlessly acquired 3D camera data. It is therefore particularly suited to 3D motion analysis tasks that seek to avoid the extra effort of a multi-ocular experimental setup and the necessity of signalling objects with target marks. The potential of the 3D camera matching approach was demonstrated in two application scenarios from research into human behaviour. 2.5D LST was used to determine interpersonal distance and body orientation in an educational-science study of conflict regulation between pairs of child-age friends, as well as to mark and subsequently classify movement units of speech-accompanying hand gestures. The implementation of 2.5D LST in the proposed procedures enabled an automatic, effective, objective and temporally and spatially high-resolution acquisition and evaluation of behaviourally relevant data.
This Ph.D. thesis proposes the use of a novel 3D range imaging camera for gathering data on human behaviour, and presents both a calibration tool developed for data preparation and a method for the contact-free determination of dense 3D motion vector fields. It shows that photogrammetric methods can deliver valuable results for motion analysis tasks in the as yet relatively untapped field of behavioural research, and thereby contributes to current efforts in the automated videographic documentation of body movements in dyadic interactions.
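2.5D LST builds on least squares image matching, whose core is compact enough to state. In the textbook formulation (sketched here; the thesis extends it with analogous range observation equations plus the scale and inclination terms mentioned above), a template patch g_1 is mapped onto a search patch g_2 through a 2D affine transformation, and the parameters are estimated by minimising the grey-value residuals e:

    \begin{align*}
      x' &= a_0 + a_1 x + a_2 y, \qquad y' = b_0 + b_1 x + b_2 y,\\
      g_1(x, y) - e(x, y) &= g_2(x', y').
    \end{align*}

Linearising g_2 around initial parameter values gives one observation equation per pixel, with the image gradients g_x, g_y of g_2 as coefficients:

    \[
      g_1(x, y) - g_2^{0}(x', y')
      \approx g_x\,\mathrm{d}a_0 + g_x x\,\mathrm{d}a_1 + g_x y\,\mathrm{d}a_2
            + g_y\,\mathrm{d}b_0 + g_y x\,\mathrm{d}b_1 + g_y y\,\mathrm{d}b_2
            + e.
    \]

In the integrated 2.5D variant, range values contribute a second set of such equations, and variance component estimation settles the relative weighting of the grey-value and range channels.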
|