81

Extracting a Relational Database Schema from a Document Database

Wheeler, Jared Thomas 01 January 2017 (has links)
As NoSQL databases become increasingly used, more methodologies emerge for migrating from relational databases to NoSQL databases. Meanwhile, there is a lack of methodologies that assist in migration in the opposite direction, from NoSQL to relational. As software is iterated upon, use cases may change. A system originally developed with a NoSQL database may accrue needs that require the Atomicity, Consistency, Isolation, and Durability (ACID) features that NoSQL systems lack, such as consistency across nodes or consistency across re-used domain objects. Shifting requirements could result in the system being changed to use a relational database. While there are some tools available to transfer data between an existing document database and an existing relational database, there has been no work on automatically generating the relational database schema based upon the data already in the NoSQL system. Not taking the existing data into account can lead to inconsistencies during data migration. This thesis describes a methodology to automatically generate a relational database schema from the implicit schema of a document database. It also details how the methodology is implemented and what could be enhanced in future work.
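Not part of the record above — purely as an illustration of what extracting an implicit schema can look like, the sketch below infers relational DDL from a handful of JSON-like documents standing in for a document-database collection. All names and type mappings are invented, and the handling of nested documents is deliberately naive.

```python
# Minimal sketch: infer relational DDL from the implicit schema of a document
# collection. The documents below stand in for a MongoDB collection; nested
# documents are split into a child table keyed by the parent id.
# (Illustrative only -- not the methodology from the thesis.)

docs = [
    {"_id": 1, "name": "Ada", "age": 36, "address": {"city": "Oslo", "zip": "0150"}},
    {"_id": 2, "name": "Grace", "email": "grace@example.org"},
]

PY_TO_SQL = {int: "INTEGER", float: "REAL", str: "VARCHAR(255)", bool: "BOOLEAN"}

def infer_columns(rows):
    """Union all field names seen in the documents and guess a SQL type for each."""
    cols = {}
    for row in rows:
        for key, value in row.items():
            if isinstance(value, dict):
                continue  # nested documents become child tables
            cols.setdefault(key, PY_TO_SQL.get(type(value), "TEXT"))
    return cols

def emit_ddl(table, rows):
    ddl = [f"CREATE TABLE {table} (\n  "
           + ",\n  ".join(f"{c} {t}" for c, t in infer_columns(rows).items())
           + "\n);"]
    # One child table per nested-document field, linked back to the parent.
    nested = {k for r in rows for k, v in r.items() if isinstance(v, dict)}
    for field in nested:
        children = [{**r[field], f"{table}_id": r["_id"]} for r in rows if field in r]
        ddl.extend(emit_ddl(f"{table}_{field}", children))
    return ddl

print("\n".join(emit_ddl("person", docs)))
```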
82

Proposta de um modelo para projetos lógicos gráficos para BDOR com implementação no ArgoUML. / Proposal of a model for logical graphic projects for ORDB with implementation in the ArgoUML.

Castro, Thiago Rais de 18 April 2011 (has links)
Investigou-se neste trabalho a proposta de um Modelo Lógico Gráfico para suporte à fase do Projeto Lógico em BDORs. O Modelo Lógico Gráfico proposto é uma extensão da UML para o Diagrama de Classes. A extensão deu-se por meio da elaboração de um Perfil UML, o qual foi disponibilizado em XMI para ser empregado em ferramentas CASE de diferentes fabricantes. Desenvolveu-se dois módulos para a ferramenta CASE ArgoUML. Esses módulos têm por finalidade a automação do desenvolvimento em BDORs, onde, a partir de um esquema gráfico, projetado utilizando-se o Modelo Lógico Gráfico proposto, gera-se código no padrão da SQL:2003 e no dialeto SQL utilizado pelo Oracle 11g. Foi proposta uma arquitetura baseada na ANSI/SPARC e na MDA para o projeto em BDORs que relaciona as fases do Projeto com as tecnologias empregadas para suportá-las. Por meio dessa arquitetura, destacam-se os pontos onde houve contribuição deste trabalho e os pontos que serão alvos de futuras pesquisas. Esta dissertação difunde os recursos existentes em BDORs e facilita a elaboração do Projeto Lógico em BDORs ao disponibilizar o modelo gráfico proposto e ao automatizar seu desenvolvimento na ferramenta CASE ArgoUML. / A Logical Graphic Model was proposed to support the Logical Design phase in ORDBs. The proposed Logical Graphic Model is an extension of the UML Class Diagram. The extension was obtained by elaborating a UML Profile, which was released in XMI so that it can be used in CASE tools from different manufacturers. Two modules were developed for the ArgoUML CASE tool. These modules automate development in ORDBs: from a graphical schema designed with the proposed Logical Graphic Model, they generate code in standard SQL:2003 and in the SQL dialect used by Oracle 11g. An architecture based on ANSI/SPARC and MDA was proposed for ORDB design, associating the design phases with the technologies used to support them. Through this architecture, the contributions of this work and the topics targeted for future research are highlighted. This dissertation disseminates the existing resources of ORDBs and facilitates Logical Design in ORDBs by providing the proposed graphical model and automating its development in the ArgoUML CASE tool.
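As a rough illustration of the kind of code generation such modules perform (this is not the ArgoUML module itself), the sketch below maps a tiny, hand-written class description — the sort of thing a UML profile would capture — to SQL:2003 object-relational DDL. The class and attribute names are invented.

```python
# Illustrative sketch (not the dissertation's ArgoUML module): map a small
# class description to SQL:2003 object-relational DDL, i.e. a structured type
# plus a typed table for each class.

classes = {
    "Endereco": {"attributes": {"rua": "VARCHAR(80)", "cidade": "VARCHAR(40)"}},
    "Cliente":  {"attributes": {"nome": "VARCHAR(60)", "endereco": "REF(Endereco_t)"}},
}

def to_sql2003(name, spec):
    attrs = ",\n  ".join(f"{a} {t}" for a, t in spec["attributes"].items())
    yield f"CREATE TYPE {name}_t AS (\n  {attrs}\n) NOT FINAL;"
    yield f"CREATE TABLE {name} OF {name}_t;"

for cls, spec in classes.items():
    print("\n".join(to_sql2003(cls, spec)))
```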
83

Semantic knowledge extraction from relational databases

Mogotlane, Kgotatso Desmond 05 1900 (has links)
M. Tech. (Information Technology, Department of Information and Communications Technology, Faculty of Applied and Computer Sciences), Vaal University of Technology / One of the main research topics in the Semantic Web is the semantic extraction of knowledge stored in relational databases through ontologies, because ontologies are core components of the Semantic Web. Several tools, algorithms and frameworks are therefore being developed to enable the automatic conversion of relational databases into ontologies. Ontologies produced with these tools, algorithms and frameworks need to be valid and competent if they are to be useful in Semantic Web applications within the target knowledge domains. However, many existing automatic ontology construction tools, algorithms and frameworks fail to address the issues of ontology verification and ontology competency evaluation. This study investigates possible solutions to these challenges. The study began with a literature review in the Semantic Web field. The review led to the conceptualisation of a framework for semantic knowledge extraction that addresses the above-mentioned challenges. The proposed framework had to be evaluated in a real-life knowledge domain, so a knowledge domain was chosen as a case study. Data were collected and the business rules of the domain analysed to develop a relational data model, which was then implemented as a test relational database using the Oracle RDBMS. Thereafter, Protégé plugins were applied to automatically construct ontologies from the relational database. The resulting ontologies were validated by matching their structures against existing conceptual database-to-ontology mapping principles; the matching results show the performance and accuracy of the Protégé plugins in automatically converting relational databases into ontologies. Finally, the study evaluated the resulting ontologies against the requirements of the knowledge domain. The requirements of the domain were modelled with competency questions (CQs) and mapped to the ontology through the design, execution and analysis of SPARQL queries against users' views of the CQ answers. Experiments show that, although users have different views of the answers to CQs, executing the SPARQL translations of the CQs against the ontology does produce output instances that satisfy users' expectations. This indicates that the ontology generated from the relational database by the Protégé plugins embodies the domain and semantic features needed to be useful in Semantic Web applications.
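For readers unfamiliar with competency-question validation, the following hedged sketch shows the general shape of that step: one CQ expressed as a SPARQL query and executed against a toy ontology instance. It assumes the rdflib Python package; the namespace, classes and data are made up and unrelated to the study's knowledge domain.

```python
# Minimal sketch, assuming the rdflib package: express one competency question
# (CQ) as a SPARQL query and run it against a toy ontology instance.
# Names and data are hypothetical, not from the study.

from rdflib import Graph, Namespace, Literal, RDF

EX = Namespace("http://example.org/university#")
g = Graph()
g.add((EX.smith, RDF.type, EX.Lecturer))
g.add((EX.smith, EX.teaches, EX.Databases101))
g.add((EX.Databases101, EX.hasTitle, Literal("Introduction to Databases")))

# CQ: "Which courses does each lecturer teach?"
cq = """
PREFIX ex: <http://example.org/university#>
SELECT ?lecturer ?title WHERE {
    ?lecturer a ex:Lecturer ;
              ex:teaches ?course .
    ?course ex:hasTitle ?title .
}
"""
for lecturer, title in g.query(cq):
    print(lecturer, "teaches", title)
```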
85

Modelagem e implementação de banco de dados clínicos e moleculares de pacientes com câncer e seu uso para identificação de marcadores em câncer de pâncreas / Database design and implementation of clinical and molecular data of cancer patients and its application for biomarker discovery in pancreatic cancer

Bertoldi, Ester Risério Matos 20 October 2017 (has links)
O adenocarcinoma pancreático (PDAC) é uma neoplasia de difícil diagnóstico precoce e cujo tratamento não tem apresentado avanços expressivos desde a última década. As tecnologias de sequenciamento de nova geração (next generation sequencing - NGS) podem trazer importantes avanços para a busca de novos marcadores para diagnóstico de PDACs, podendo também contribuir para o desenvolvimento de terapias individualizadas. Bancos de dados são ferramentas poderosas para integração, padronização e armazenamento de grandes volumes de informação. O objetivo do presente estudo foi modelar e implementar um banco de dados relacional (CaRDIGAn - Cancer Relational Database for Integration and Genomic Analysis) que integra dados disponíveis publicamente, provenientes de experimentos de NGS de amostras de diferentes tipos histopatológicos de PDAC, com dados gerados por nosso grupo no IQ-USP, facilitando a comparação entre os mesmos. A funcionalidade do CaRDIGAn foi demonstrada através da recuperação de dados clínicos e dados de expressão gênica de pacientes a partir de listas de genes candidatos, associados com mutação no oncogene KRAS ou diferencialmente expressos em tumores identificados em dados de RNAseq gerados em nosso grupo. Os dados recuperados foram utilizados para a análise de curvas de sobrevida que resultou na identificação de 11 genes com potencial prognóstico no câncer de pâncreas, ilustrando o potencial da ferramenta para facilitar a análise, organização e priorização de novos alvos biomarcadores para o diagnóstico molecular do PDAC. / Pancreatic ductal adenocarcinoma (PDAC) is a type of cancer that is difficult to diagnose early, and its treatment has not improved substantially over the last decade. Next-generation sequencing (NGS) technologies may contribute to discovering new biomarkers, developing diagnostic strategies and personalising therapy. Databases are powerful tools for the integration, normalisation and storage of large data volumes. The main objective of this study was the design and implementation of a relational database, CaRDIGAn (Cancer Relational Database for Integration and Genomic Analysis), that integrates publicly available data from NGS experiments on different histopathological types of PDAC with data generated by our group at IQ-USP, allowing comparison between both data sources. The functionality of CaRDIGAn was demonstrated by retrieving clinical and gene expression data for lists of candidate genes, either associated with KRAS mutation or differentially expressed in tumours identified in RNAseq data generated by our group. The retrieved data were used to fit survival curves, which led to the identification of 11 genes with prognostic potential in pancreatic cancer. Thus, CaRDIGAn is a tool for data storage and analysis, with promising applications in the identification and prioritisation of new biomarkers for molecular diagnosis of PDAC.
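The survival analysis mentioned above follows a standard pattern; the sketch below illustrates it in general terms and is not CaRDIGAn code: patients are split by the expression of a single candidate gene and the two groups are compared with a log-rank test. It assumes the lifelines package, and the data are randomly generated.

```python
# Hedged sketch of the kind of survival analysis described in the abstract:
# split patients by expression of one candidate gene (high vs. low, median cut)
# and compare survival with a log-rank test. Assumes the lifelines package;
# the numbers below are invented, not CaRDIGAn data.

import numpy as np
from lifelines.statistics import logrank_test

rng = np.random.default_rng(0)
expression = rng.normal(size=60)                 # one gene, 60 patients
survival_months = rng.exponential(20, size=60)   # follow-up time
death_observed = rng.integers(0, 2, size=60)     # 1 = event, 0 = censored

high = expression > np.median(expression)
result = logrank_test(
    survival_months[high], survival_months[~high],
    event_observed_A=death_observed[high],
    event_observed_B=death_observed[~high],
)
print(f"log-rank p-value: {result.p_value:.3f}")
```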
86

XML και σχεσιακές βάσεις δεδομένων: πλαίσιο αναφοράς και αξιολόγησης / XML and relational databases: a frame of report and evaluation

Παλιανόπουλος, Ιωάννης 16 May 2007 (has links)
Η eXtensible Markup Language (XML) είναι εμφανώς το επικρατέστερο πρότυπο για αναπαράσταση δεδομένων στον Παγκόσμιο Ιστό. Αποτελεί μια γλώσσα περιγραφής δεδομένων, κατανοητή τόσο από τον άνθρωπο, όσο και από τη μηχανή. Η χρήση της σε αρχικό στάδιο περιορίστηκε στην ανταλλαγή δεδομένων, αλλά λόγω της εκφραστικότητάς της (σε αντίθεση με το σχεσιακό μοντέλο) μπορεί να αποτελέσει ένα αποτελεσματικό "όχημα" μεταφοράς και αποθήκευσης πληροφορίας. Οι σύγχρονες εφαρμογές κάνουν χρήση της τεχνολογίας XML εξυπηρετώντας ανάγκες διαλειτουργικότητας και επικοινωνίας. Ωστόσο, θεωρείται βέβαιο ότι η χρήση της σε επίπεδο υποδομής θα ενδυναμώσει περαιτέρω τις σύγχρονες εφαρμογές. Σε επίπεδο υποδομής, μια βάση δεδομένων που διαχειρίζεται την γλώσσα XML είναι σε θέση να πολλαπλασιάσει την αποδοτικότητά της, εφόσον η βάση δεδομένων μετατρέπεται σε βάση πληροφορίας. Έτσι, όσο οι εφαρμογές γίνονται πιο σύνθετες και απαιτητικές, η ενδυνάμωση των βάσεων δεδομένων με τεχνολογίες που φέρουν/εξυπηρετούν τη σημασιολογία των προβλημάτων υπόσχεται αποτελεσματικότερη αντιμετώπιση στο παραπάνω μέτωπο. Αλλά ποιος είναι ο καλύτερος τρόπος αποδοτικού χειρισμού των XML εγγράφων (XML documents); Με μια πρώτη ματιά η απάντηση είναι προφανής. Εφόσον ένα XML έγγραφο αποτελεί παράδειγμα μιας σχετικά νέας τεχνολογίας, γιατί να μη χρησιμοποιηθούν ειδικά συστήματα για το χειρισμό της; Αυτό είναι πράγματι μια βιώσιμη προσέγγιση και υπάρχει σημαντική δραστηριότητα στην κοινότητα των βάσεων δεδομένων που εστιάζει στην εκμετάλλευση αυτής της προσέγγισης. Μάλιστα, για το σκοπό αυτό, έχουν δημιουργηθεί ειδικά συστήματα βάσεων δεδομένων, οι επονομαζόμενες "Εγγενείς XML Βάσεις Δεδομένων" (Native XML Databases). Όμως, το μειονέκτημα της χρήσης τέτοιων συστημάτων είναι ότι αυτή η προσέγγιση δεν αξιοποιεί την πολυετή ερευνητική δραστηριότητα που επενδύθηκε για την τεχνολογία των σχεσιακών βάσεων δεδομένων. Είναι πράγματι γεγονός ότι δεν αρκεί η σχεσιακή τεχνολογία και επιβάλλεται η ανάγκη για νέες τεχνικές; Ή μήπως με την κατάλληλη αξιοποίηση των υπαρχόντων συστημάτων μπορεί να επιτευχθεί ποιοτική ενσωμάτωση της XML; Σε αυτήν την εργασία γίνεται μια μελέτη που αφορά στην πιθανή χρησιμοποίηση των σχεσιακών συστημάτων βάσεων δεδομένων για το χειρισμό των XML εγγράφων. Αφού αναλυθούν θεωρητικά οι τρόποι με τους οποίους γίνεται αυτό, στη συνέχεια εκτιμάται πειραματικά η απόδοση σε δύο από τα πιο δημοφιλή σχεσιακά συστήματα βάσεων δεδομένων. Σκοπός είναι η χάραξη ενός πλαισίου αναφοράς για την αποτίμηση και την αξιολόγηση των σχεσιακών βάσεων δεδομένων που υποστηρίζουν XML (XML-enabled RDBMSs). / The eXtensible Markup Language (XML) is clearly the prevailing standard for data representation on the World Wide Web (WWW). It is a data description language comprehensible to both humans and computers. Its use was initially limited to data exchange, but, owing to its expressiveness (in contrast to the relational model), it can constitute an effective "vehicle" for transporting, handling and storing information. Contemporary applications make heavy use of XML technology to support communication and interoperability. However, supporting XML at the infrastructure level would reduce application development time, would make applications almost automatically compliant with standards, and would make them less error-prone. In terms of infrastructure, a database able to handle XML properly would benefit a wide range of applications, thus multiplying its efficiency. As applications become more complex and demanding, strengthening databases with technologies that serve the semantics of the problems promises a more effective response on this front. But how can XML documents be supported at the infrastructure level? At first glance the answer is obvious: since XML constitutes a relatively new technology, new XML-aware infrastructures can be built from scratch. This is indeed a viable approach, and there is considerable activity in the database research community focused on exploiting it. In fact, special database systems, the so-called "Native XML Databases", have been created for this purpose. However, the disadvantage of such systems is that this approach does not build on the many years of research invested in relational database technology. Is it really the case that relational technology does not suffice and new techniques are required? Or can high-quality integration of XML be achieved by properly exploiting existing systems? This thesis presents a study of whether relational database management systems (RDBMSs) provide suitable ground for handling XML documents. After a theoretical analysis of the ways in which RDBMSs handle XML, the performance of two of the most popular relational database management systems is assessed experimentally. The aim is to draw up a frame of reference for assessing and evaluating relational database management systems that support XML (XML-enabled RDBMSs).
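One of the ways RDBMSs commonly handle XML is known as shredding — decomposing documents into relational tables. The sketch below illustrates the idea only, using the Python standard library (sqlite3 and xml.etree); the schema and document are invented, and no claim is made about how the systems evaluated in the thesis implement it.

```python
# A minimal sketch of one way an XML-enabled RDBMS can handle XML documents:
# "shredding" elements into relational tables (standard-library sqlite3 and
# xml.etree; the schema and document are invented).

import sqlite3
import xml.etree.ElementTree as ET

doc = """<orders>
  <order id="1" customer="ACME">
    <item sku="X42" qty="3"/>
    <item sku="Y7"  qty="1"/>
  </order>
</orders>"""

con = sqlite3.connect(":memory:")
con.executescript("""
  CREATE TABLE orders(id INTEGER PRIMARY KEY, customer TEXT);
  CREATE TABLE items(order_id INTEGER REFERENCES orders(id), sku TEXT, qty INTEGER);
""")

for order in ET.fromstring(doc).iter("order"):
    oid = int(order.get("id"))
    con.execute("INSERT INTO orders VALUES (?, ?)", (oid, order.get("customer")))
    con.executemany(
        "INSERT INTO items VALUES (?, ?, ?)",
        [(oid, it.get("sku"), int(it.get("qty"))) for it in order.iter("item")],
    )

# Once shredded, ordinary SQL can query what used to be nested XML.
print(con.execute(
    "SELECT customer, SUM(qty) FROM orders JOIN items ON id = order_id GROUP BY id"
).fetchall())
```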
87

Μαζική ανάλυση δεδομένων κυτταρομετρίας ροής με τη χρήση σχεσιακών βάσεων δεδομένων / Mass analysis of flow cytometry data using relational databases

Αθανασοπούλου, Πολυξένη 31 August 2012 (has links)
Η κυτταρομετρία ροής (Flow Cytometry–FC), είναι μία σύγχρονη αυτοματοποιημένη τεχνική ανάλυσης των φυσικοχημικών χαρακτηριστικών των κυττάρων και των σωματιδίων, η οποία επιτρέπει την μεμονωμένη μέτρησή τους, καθώς διέρχονται σε νηματική ροή από ένα σταθερό σημείο, που προσπίπτει ακτίνα laser. Η ουσιαστική χρήση της FC είναι η προσφορά της σε διάγνωση και παρακολούθηση ασθενών με νοσήματα που συνοδεύονται από παρουσία παθολογικών κυττάρων σε διάφορα βιολογικά υγρά ή και στερεούς ιστούς κατάλληλα επεξεργασμένους. Το αποτέλεσμα της κυτταρομετρικής ανάλυσης είναι μία πληθώρα μετρήσεων φθορισμού, καθώς και των δύο μετρήσεων πρόσθιου (Forward Scatter–FS) και πλάγιου (Side Scatter-SS) σκεδασμού, που εξαρτώνται από τα φυσικά χαρακτηριστικά κάθε κυττάρου. Μετά την ανάλυση των δεδομένων από τον Ηλεκτρονικό Υπολογιστή (Η/Υ) του κυτταρομετρητή, τα αποτελέσματα παρουσιάζονται υπό τη μορφή μονοπαραμετρικών ή πολυπαραμετρικών κατανομών. Στην ανάλυση που χρησιμοποιήθηκε (με χρήση 5 φθοριοχρωμάτων), ο κυτταρομετρητής ροής παρήγαγε 7 τιμές για κάθε ένα από τα 30.000 κύτταρα περίπου που μετρήθηκαν σε κάθε πρωτόκολλο. Με τη χρήση των Η/Υ μπορούμε να αναλύσουμε γρήγορα και αξιόπιστα όλον αυτό τον μεγάλο όγκο δεδομένων εφαρμόζοντας μοντέλα βάσεων δεδομένων. Η βασική δομή του σχεσιακού μοντέλου δεδομένων αναπαριστάται με ένα πίνακα, στον οποίο αποθηκεύονται δεδομένα, σε στήλες και γραμμές, τα οποία αφορούν μία συγκεκριμένη οντότητα. Οι σχέσεις των πινάκων περιγράφουν τoν τρόπο σύνδεσης διαφορετικών οντοτήτων, οι οποίες συνδυαστικά δημιουργούν λογικούς πίνακες, που με τη σειρά τους περιγράφουν πιο σύνθετες οντότητες. Κατά αυτόν τον τρόπο μπορούμε να κάνουμε περαιτέρω συγκρίσεις μεταξύ των εξετάσεων των ασθενών, που ίσως καταλήξουν σε ευνοϊκά συμπεράσματα, όσον αφορά την πρόγνωση και την θεραπεία κυρίως των νεοπλασματικών νοσημάτων του αίματος. Ο ρόλος της FC σε αιματολογικά νοσήματα όπως τα μυελοδυσπλαστικά σύνδρομα (ΜΔΣ) είναι ακόμα υπό διερεύνηση. Τα ΜΔΣ είναι νοσήματα με σημαντική κλινική και αιματολογική ετερογένεια, κάτι που καθιστά σαφή την ανάγκη μαζικής ανάλυσης των δεδομένων τους, για την αναγνώριση ομοιόμορφων υποομάδων με κοινά γνωρίσματα και άρα ενός πληροφοριακού μοντέλου ανάλυσης που θα διευκολύνει την λήψη των κατάλληλων θεραπευτικών επιλογών. Η παρούσα εργασία ασχολείται με τις πολυπαραμετρικές εξετάσεις των ΜΔΣ, την πληροφορία των οποίων είναι ικανή να παρέχει η FC. Θα γίνει προσπάθεια να καταγραφούν αναλυτικά όλα τα απαραίτητα βήματα, έτσι ώστε σε δεύτερο χρόνο να αναλυθεί μαζικά όλη αυτή η πληροφορία μέσω ενός σχεσιακού μοντέλου βάσεων δεδομένων. / Flow cytometry (FC) is a modern automated technique for analysing the physicochemical characteristics of cells and particles, allowing them to be measured individually as they pass, in a single-file stream, through a fixed point on which a laser beam is incident. The essential use of FC is in the diagnosis and monitoring of patients with diseases accompanied by the presence of abnormal cells in various biological fluids or in suitably processed solid tissues. The result of a cytometric analysis is a plethora of fluorescence measurements, together with the two scatter measurements, forward scatter (FS) and side scatter (SS), which depend on the physical characteristics of each cell. After the data are analysed by the cytometer's computer, the results are presented as single-parameter or multi-parameter distributions. In the analysis used here (with 5 fluorochromes), the flow cytometer produced 7 values for each of the roughly 30,000 cells measured in each protocol. Using computers, this large volume of data can be analysed quickly and reliably by applying database models. The basic structure of the relational data model is a table that stores data, in columns and rows, relating to a specific entity. Relationships between tables describe how different entities are connected; in combination they create logical tables, which in turn describe more complex entities. In this way we can make further comparisons between patients' examinations, which may lead to useful conclusions regarding the prognosis and treatment of neoplastic diseases, especially those of the blood. The role of FC in haematological diseases such as the myelodysplastic syndromes (MDS) is still under investigation. MDS are diseases with significant clinical and haematological heterogeneity, which makes clear the need for mass analysis of their data in order to recognise uniform subgroups with common features, and hence for an information model of analysis that will facilitate the selection of appropriate therapeutic options. The present work deals with the multi-parameter MDS examinations whose information FC is able to provide. It attempts to record in detail all the necessary steps, so that all this information can subsequently be analysed in bulk through a relational database model.
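To make the relational modelling concrete, here is a hedged sketch (an invented schema, not the thesis design) of per-cell cytometry events stored in relational tables, with a crude scatter "gate" expressed as an SQL aggregate.

```python
# Hedged sketch (invented schema, not the thesis design): store per-cell flow
# cytometry events in relational tables and apply a simple "gate" with SQL.

import sqlite3, random

con = sqlite3.connect(":memory:")
con.executescript("""
  CREATE TABLE samples(id INTEGER PRIMARY KEY, patient TEXT, protocol TEXT);
  CREATE TABLE events(        -- one row per measured cell
    sample_id INTEGER REFERENCES samples(id),
    fs REAL, ss REAL,         -- forward / side scatter
    fl1 REAL, fl2 REAL, fl3 REAL, fl4 REAL, fl5 REAL   -- five fluorochromes
  );
""")

con.execute("INSERT INTO samples VALUES (1, 'patient-001', 'MDS-panel')")
rnd = random.Random(0)
con.executemany(
    "INSERT INTO events VALUES (1, ?, ?, ?, ?, ?, ?, ?)",
    [tuple(rnd.uniform(0, 1023) for _ in range(7)) for _ in range(30_000)],
)

# A crude scatter gate: fraction of events falling in a low-FS / low-SS region.
frac = con.execute("""
  SELECT 1.0 * SUM(fs < 200 AND ss < 150) / COUNT(*) FROM events
  WHERE sample_id = 1
""").fetchone()[0]
print(f"events in gate: {frac:.1%}")
```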
88

Návrh databáze pro připojení systému SAP jako zdroje dat pro webovou aplikaci / Database design for connecting SAP as a data source for a Web application

MARHOUN, Lukáš January 2016 (has links)
The thesis deals with connecting the SAP ERP system to a local MS SQL Server database using SAP BI tools, with data synchronization between the systems, and with advanced use of the T-SQL language for preparing data for web applications and reports written in PHP. The thesis contains a brief overview of the SAP system and the possibilities of connecting to it. The general principles of the described solution can be used in conjunction with other systems and programming languages.
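A minimal sketch of the data-preparation idea, under stated assumptions: it uses the pyodbc package against MS SQL Server and a hypothetical dbo.sap_orders table that a synchronization job would populate; the query and column names are invented, and the JSON output merely stands in for what a PHP report would consume.

```python
# Rough sketch, not from the thesis: read SAP-synchronized rows from MS SQL
# Server with T-SQL and hand them to a web layer as JSON. Assumes pyodbc and an
# existing dbo.sap_orders table filled by the synchronization job (hypothetical).

import json
import pyodbc

con = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;"
    "DATABASE=sap_mirror;Trusted_Connection=yes;"
)
cur = con.cursor()
cur.execute("""
    SELECT TOP 10 order_no, customer, SUM(amount) AS total
    FROM dbo.sap_orders
    GROUP BY order_no, customer
    ORDER BY total DESC;
""")
rows = [dict(zip([c[0] for c in cur.description], r)) for r in cur.fetchall()]
print(json.dumps(rows, default=str, indent=2))  # e.g. consumed by a PHP/web report
```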
90

Metadados de Bancos de Dados Relacionais: Extração e Exposição com o Protocolo OAI-PMH / Metadata of Relational Databases: Extraction and Exposition with the OAI-PMH Protocol

KOWATA, Elisabete Tomomi 11 September 2011 (has links)
Information about a particular subject can be stored in different repositories such as databases, digital libraries, spreadsheets, text files, web pages, etc. In this context of heterogeneous data sources, querying (possibly in natural language), integrating information and promoting interoperability are tasks that depend, among other factors, on the prior knowledge a user has about the location, the owner and the content description of each information source. More specifically, in the case of databases, this information is not usually stored in the catalogue of the database management system, and obtaining it requires resorting to the knowledge of the database administrator. Another factor is the absence of web search engines for databases that access and make available the information held in those repositories, since these data remain confined to the organizations themselves. In a shared information environment, it is highly relevant to make accessible the metadata that describe a data source, regardless of the medium and format in which it is stored. This study describes a mechanism to promote the interoperability of relational databases with other sources of information through the extraction and exposure of metadata using the OAI-PMH protocol. / Informações sobre um determinado assunto podem estar armazenadas em diferentes repositórios como banco de dados, bibliotecas digitais, planilhas eletrônicas, arquivos textos, páginas na web etc. Nesse contexto de fontes de dados heterogêneas, consultar, possivelmente em linguagem natural, integrar informações e promover interoperabilidade são tarefas que dependem, dentre outros fatores, do conhecimento prévio que um usuário tem sobre a localização, o proprietário, a descrição do conteúdo de cada fonte de informação. Mais especificamente, no caso de bancos de dados, essas informações não são, em geral, armazenadas no catálogo de um sistema gerenciador de bancos de dados; para obtê-las é necessário recorrer ao conhecimento do administrador desse banco. Outro fator que evidencia essa dependência é a ausência de mecanismos de busca a bancos de dados na web que acessam e tornam disponíveis as informações contidas nesses repositórios, devido ao fato desses dados estarem limitados às próprias organizações. Em um ambiente de compartilhamento de informações, é altamente relevante tornar possível o acesso aos metadados que descrevem uma fonte de dados, independentemente do meio e do formato em que esteja armazenada. Este trabalho tem como objetivo descrever um mecanismo para promover interoperabilidade de bancos de dados relacionais com outras fontes de informações, por meio da extração e exposição dos metadados usando o protocolo OAI-PMH.
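As a loose illustration of the extract-and-expose idea (not the dissertation's mechanism), the sketch below reads table metadata from a relational catalogue and wraps it as Dublin Core, the payload format an OAI-PMH repository would serve under verbs such as ListRecords; SQLite and the table names stand in for a real source database.

```python
# Loose sketch: pull schema metadata out of a relational catalogue and wrap it
# as Dublin Core records. SQLite stands in for the source database; a real
# deployment would expose these records through OAI-PMH verbs such as ListRecords.

import sqlite3
import xml.etree.ElementTree as ET

con = sqlite3.connect(":memory:")
con.executescript("""
  CREATE TABLE patient(id INTEGER PRIMARY KEY, name TEXT, birth_date TEXT);
  CREATE TABLE visit(id INTEGER PRIMARY KEY, patient_id INTEGER, visit_date TEXT);
""")

DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC)

def table_record(table_name):
    """Describe one table as a small Dublin Core record."""
    cols = [row[1] for row in con.execute(f"PRAGMA table_info({table_name})")]
    rec = ET.Element("metadata")
    ET.SubElement(rec, f"{{{DC}}}title").text = table_name
    ET.SubElement(rec, f"{{{DC}}}type").text = "relational table"
    ET.SubElement(rec, f"{{{DC}}}description").text = "columns: " + ", ".join(cols)
    return rec

tables = [r[0] for r in con.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
for t in tables:
    print(ET.tostring(table_record(t), encoding="unicode"))
```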
