1 |
DATA MIGRATION FROM STANDARD SQL TO NoSQL
2013 November 1900
Two major kinds of database management systems are currently in use: the Relational Database Management System (RDBMS), also known as the standard SQL database, and the NoSQL database. RDBMS databases deal with structured data, and NoSQL databases with unstructured or semi-structured data. RDBMS databases have been popular for many years, but NoSQL databases are gaining popularity with the spread of the internet and social media. Given this growing popularity, data flow from SQL to NoSQL, or vice versa, is likely in the near future. The goal of this thesis is to analyze the data structures of RDBMS and NoSQL databases and to suggest a Graphical User Interface (GUI) tool that migrates data from SQL to NoSQL databases. Relational databases have dominated the industry for many years. In contrast, NoSQL databases were introduced with the increased use of the internet, social media, and cloud computing. Traditional relational databases guarantee data integrity, whereas high availability and scalability are the main advantages of NoSQL databases. This thesis compares the data structures and data storage techniques of the two technologies. SQL databases store data differently from NoSQL databases because each serves different demands. Data stored in relational databases is highly structured and, in most environments, normalized, whereas data in NoSQL databases is mostly unstructured. NoSQL databases are scalable and highly available because of their simpler data model, but they do not guarantee data consistency at all times. RDBMS systems, on the other hand, guarantee data consistency, but their more complex data model makes it hard for them to be scalable and available at the same time.
This thesis uses CouchDB and MySQL to represent the NoSQL and standard SQL databases respectively. The aim of the research in this document is to suggest a methodology for data migration from RDBMS databases to document-based NoSQL databases. Data migration between RDBMS and NoSQL systems is anticipated because both are currently in use by many industry leaders. As a starting point, this thesis presents a Graphical User Interface that enables data migration from RDBMS to NoSQL databases, together with an architecture and methodology to achieve this objective.
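The core step of such a migration, turning normalized rows into self-contained documents, can be illustrated with a minimal Python sketch. It is not the thesis's tool: the `customer` and `orders` tables, the document layout, and the use of an in-memory SQLite database as a stand-in for MySQL are all assumptions made for illustration.

```python
import json
import sqlite3


def rows_to_documents(conn):
    """Fold each customer row and its order rows into one JSON-ready document."""
    conn.row_factory = sqlite3.Row  # access columns by name
    docs = []
    for cust in conn.execute("SELECT id, name FROM customer ORDER BY id"):
        orders = conn.execute(
            "SELECT id, total FROM orders WHERE customer_id = ? ORDER BY id",
            (cust["id"],),
        ).fetchall()
        docs.append({
            "_id": f"customer:{cust['id']}",  # CouchDB documents are keyed by _id
            "type": "customer",
            "name": cust["name"],
            # The foreign-key relationship becomes an embedded list.
            "orders": [{"order_id": o["id"], "total": o["total"]} for o in orders],
        })
    return docs


def demo_connection():
    """Hypothetical two-table schema standing in for a real MySQL source."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY,
                             customer_id INTEGER REFERENCES customer(id),
                             total REAL);
        INSERT INTO customer VALUES (1, 'Ada'), (2, 'Linus');
        INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 7.5);
    """)
    return conn


if __name__ == "__main__":
    print(json.dumps(rows_to_documents(demo_connection()), indent=2))
```

In a real migration the resulting list could be sent to CouchDB in one request through its `_bulk_docs` endpoint. Embedding the orders inside the customer document trades the normalization of the relational model for a single-read document, which is the structural difference the abstract describes.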
|
2 |
SQL database design static analysis
Dooms, Joshua Harold 14 October 2014
Static analysis of database design and implementation is not a new idea. Many researchers have covered the topic in detail and defined a number of metrics that are well known within the research community. Unfortunately, unlike metrics for code development, these metrics have not been widely adopted within the development community. There appears to be a disconnect between research into database design metrics and the actual use of databases in industry. This paper describes new metrics that can be used in industry to ensure that a database's current implementation supports long-term scalability, to support easily developed and maintainable code, or to guide developers towards functions or design elements that can be modified to improve the scalability of their data systems. In addition, this paper describes the production of a tool designed to extract these metrics from SQL Server and includes feedback from professionals regarding the usefulness of the tool and the measures contained within its output.
|
3 |
AN IMPLEMENTATION OF A COMPLETE XML SYSTEM FOR TELEMETRY SYSTEM CONFIGURATION
Portnoy, Michael 10 1900
ITC/USA 2005 Conference Proceedings / The Forty-First Annual International Telemetering Conference and Technical Exhibition / October 24-27, 2005 / Riviera Hotel & Convention Center, Las Vegas, Nevada / Creating a generic, multi-vendor data exchange system for transmitting telemetry configurations
between various systems is a daunting task. To date many different systems have been proposed
including relational databases (RDBMS), TMATS, and several different XML schemas.
Although many of these systems have been implemented, a complete, flexible solution has not
been developed.
This paper describes an implementation that is currently in use for exporting and importing a
complete telemetry system via XML. Using this system, an engineer can import an entire
telemetry configuration, a partial telemetry configuration, or even just a single measurement
(parameter). As a result, the gap between user database systems and the airborne instrumentation
vendor’s configuration software (IVCS) is seamlessly bridged. This provides many benefits
including: the ability to rapidly change configurations, data entry error avoidance, version
control, the protection of sensitive information, and configuration reusability. This system
allows for the configuration of all aspects of the telemetry setup including data acquisition
hardware, transmitters, ground stations, and recorders. In addition, the recorder settings and the
definition of the data that are to be recorded are coupled and linked to the rest of the telemetry
configuration, which facilitates future data recovery.
|
4 |
Management of Big Annotations in Relational Database Management Systems
Ibrahim, Karim 24 April 2014
Annotations play a key role in understanding and describing data, and annotation management has become an integral component of many emerging applications such as scientific databases. Scientists need to exchange not only data but also their thoughts, comments, and annotations on the data. Annotations represent comments, data lineage, descriptions, and much more. Several annotation management techniques have therefore been proposed to handle annotations efficiently and abstractly. However, with the increasing scale of collaboration and the extensive use of annotations among users and scientists, the number and size of the annotations may far exceed the size of the original data itself, and current annotation management techniques do not address annotation management at this scale. In this work, we present three chapters that tackle big annotations from three different perspectives: (1) User-Centric Annotation Propagation, (2) Proactive Annotation Management, and (3) InsightNotes Summary-Based Querying. We capture users' preferences in profiles and personalize annotation propagation at query time by reporting the most relevant annotations (per tuple) for each user based on a time plan. We provide three time-based plans and support static and dynamic profiles for each user. We support proactive annotation management, which suggests data tuples to be annotated when a new annotation refers to a data value and the user has not annotated the data precisely. Moreover, we provide an extension to InsightNotes: Summary-Based Annotation Management in Relational Databases by adding a query language that enables the user to query the annotation summaries and add predicates on the summaries themselves. Our system is implemented inside PostgreSQL.
|
5 |
What can the .NET RDBMS developer do? A brief survey of impedance mismatch solutions for the .NET developer
Fiduk, Kenneth Walter, 1980- 26 August 2010
Nearly all modern software applications, from the simplest website user account system to the most complex, enterprise-level, completely-integrated infrastructure, utilize some sort of backend data storage and business logic that interacts with the backend. The ubiquitous nature of this backend/business dichotomy makes sense as the need to both store and manipulate data can be traced as far back as the Turing Machine in Computer Science. The most commonly used technologies for these two aspects are Relational Database Management Systems (RDBMS) for backend and Object-Oriented Programming (OOP) for business logic. However, these two methodologies are not immediately compatible and the inherent differences between data represented in RDBMS and data represented in OOP are not trivial.
Taking a .NET developer’s perspective, this report explores the RDBMS/OO dichotomy and its inherent issues. Schema management theory and algebra are discussed to gain a better perspective on the domain, and a survey of existing solutions for the .NET environment is presented. Additionally, methods outside the mainstream are discussed. The advantages and disadvantages of each are weighed and presented to the reader to aid future design decisions.
|
6 |
Jämförelse av JavaScript och PHP : När data lagras som JSONObjekt i relationsdatabaser eller NoSql-databaser / Comparison of JavaScript and PHP : When data is stored as JSONObject in relational databases or NoSQL-databases
Honkavaara Dahl, Anton January 2016
Traffic to web applications is increasing, which raises the demand that response times for the user stay low even as traffic grows and more data has to be loaded into the web applications. The most common language for retrieving data from databases is PHP, a language that has been around for many years. Now that all code can be written in a single language, JavaScript with Node.js, the question is how PHP compares with JavaScript in response times for the user. NoSQL databases are also becoming more common as an alternative to RDBMS. This work conducts an experiment in which JavaScript and PHP are pitted against each other to see how they affect response times, with both programming languages using an RDBMS and a NoSQL database. The results show that the two cannot be told apart in terms of response times.
|
7 |
Time Series databaser för sensorsystem : En experimentell studie av prestanda för Time Series databaser för sensorsystem som grundas på: NoSQL eller RDBMS. / Time Series databases for sensor systems
Warrén, Linus, Tallkvist, Daniel January 2019
Purpose – The purpose of this study is to recommend a database, and its associated database model, that is optimized for a sensor system. Comparisons of databases and data models for larger sensor systems are lacking. The study also provides scientific support for anyone who wishes to build a sensor system like the one covered in this paper. Method – The paper starts with a literature study whose purpose is to choose the databases and database models to be included in the comparison. To achieve the purpose of the study, a quantitative approach was chosen. The study follows the steps that define an experimental study in software development according to Shari Lawrence Pfleeger. Four predefined cases are used to compare the databases and the different database models obtained from the literature study. Findings – The literature study shows that a Time Series DBMS is the recommended database model for implementing sensor systems. The findings also show that TimescaleDB performs better than InfluxDB in four of four predefined cases. The null hypothesis that was posed is rejected, and the alternative hypothesis is accepted at the 1% significance level. Implications – The paper enhances knowledge about Time Series DBMS, specifically TimescaleDB and InfluxDB, for sensor systems. The result can be applied when similar sensor systems are created. According to the result of the experiment, TimescaleDB is better than InfluxDB for sensor systems with a similar data structure. Limitations – Two Time Series DBMS (TimescaleDB and InfluxDB) were used in the experiments in this paper. The experiments were carried out in Azure and were limited to the 10 vCPUs that a standard account has access to. Few beacons were available for creating test data, so files with data corresponding to what a beacon sends out were created to simulate beacons.
Keywords – Time Series DBMS, NoSQL, RDBMS, TimescaleDB, InfluxDB, Sensor systems / Syfte – I problembeskrivningen framgår att det finns brist på vetenskapligt underlag för vilken sorts databas som är optimal att använda för ett sensorsystem. Det saknas jämförelser av prestanda mellan olika databaser och datamodeller i större sensorsystem. Studiens syfte är: ”Att rekommendera en databas och tillhörande databasmodell som är optimerad för ett sensorsystem” Metod – Studien inleds med en litteraturstudie för att genom teorin välja databas och databasmodeller som ska ingå i studien. För att uppnå syftet har en kvantitativ ansats valts. Studien följer de steg som Shari Lawrence Pfleeger definierar som en experimentell studie inom mjukvaruutveckling. Fyra fördefinierade fall används för att jämföra databaserna med olika databasmodeller som erhållits i litteraturstudien. Resultat - Litteraturstudien visar att Time Series DBMS är den databasmodell som rekommenderas att användas i ett sensorsystem. Studiens resultat visar att TimescaleDB presterar bättre än InfluxDB i fyra av fyra fördefinierade fall. Nollhypotesen som har ställts upp förkastas och en mothypotes antas vid 1% signifikansnivå. Implikationer - Studiens implikationer är att öka och fylla vissa kunskapshål kring Time Series DBMS, specifikt TimescaleDB och InfluxDB för sensorsystem. Resultatet kan tillämpas och användas när liknande sensorsystem skall implementeras. Enligt experimentets resultat visar det att TimescaleDB är bättre än InfluxDB för sensorsystem med liknande struktur. Begränsningar – Två Time Series DBMS (TimescaleDB och InfluxDB) ingår i denna studie som experimenten utfördes på. Experimenten utföres i Azure och var begränsade av de 10 vCPU:erna ett standardkonto har tillgång till att använda. Det fanns inte tillgång till ett stort antal beacons för att generera data till experimenten, så filer med motsvarande data skapades för att simulera beacons. 
Nyckelord - Time Series DBMS, NoSQL, RDBMS, TimescaleDB, InfluxDB, Sensorsystem
|
8 |
Desenvolvimento de operadores de agrupamento por similaridade em SGBD relacionais / Development of similarity group operators in Relational DBMS
Laverde, Natan de Almeida 16 May 2018
O operador de agrupamento e as funções de agregação são as principais ferramentas utilizadas para sumarizar dados em um Sistema de Gerenciamento de Base de Dados Relacionais (SGBDR). O operador de agrupamento funciona criando partições nos dados utilizando comparações por identidade, e permite que sejam aplicadas funções de agregação que retornam um único valor representando o grupo como um todo. Entretanto, para dados métricos, agrupamento utilizando identidade tem pouca utilidade. Neste caso, adotar o conceito de similaridade é frequentemente uma abordagem mais promissora. A literatura apresenta alguns operadores que podem agrupar os dados utilizando similaridade. Todos eles utilizam um limiar de valor de distância para atribuir os elementos aos grupos. No entanto, estes operadores não obtêm resultados satisfatórios quando a distribuição dos dados apresenta variações significativas na densidade de objetos em diferentes regiões do espaço. Para alcançar melhores resultados nestas situações, propusemos um novo operador que atribui os grupos utilizando uma eleição envolvendo grupos já atribuídos. Também propusemos generalizações, para os operadores existentes e propostos, para trabalhar com uma quantidade de vizinhos mais próximos e aproximação dos vizinhos mais próximos ao invés de um limiar de distância. Para possibilitar a inclusão destes operadores em SGBDR, propusemos uma extensão à Structured Query Language (SQL) e novas funções de agregação. Implementamos estes operadores em nosso framework em C++ usando a biblioteca Arboretum. Para avaliar os métodos propostos, analisamos tanto qualidade dos resultados quanto tempo de execução, utilizando conjuntos de dados reais e sintéticos. Os operadores propostos alcançaram melhores resultados quanto à qualidade de resultados, e mantiveram os tempos de execução similares. 
Os operadores que utilizam aproximação aos vizinhos mais próximos produziram resultados de qualidade similar quando comparados aos operadores que utilizam os vizinhos mais próximos, podendo ser executados em menor tempo que estes. / The grouping operator and aggregation functions are the primary tools used to summarize data inside a Relational Database Management System (RDBMS). The grouping operator works by partitioning the data using identity comparisons, and it allows applying aggregation functions that return a single value representing the entire group. However, for metric data, grouping by identity is seldom useful. In this case, adopting the concept of similarity is often a more promising approach. The literature presents a few operators that can group data using similarity. All of them use a distance threshold to assign the elements to groups. However, these operators do not achieve satisfactory results when the data distribution presents significant variation in the density of objects across different regions of the space. To achieve better results in these situations, we propose a novel operator that assigns groups using an election involving already assigned groups. We also propose generalizations of the existing and proposed operators that work with a number of nearest neighbors, or approximate nearest neighbors, instead of a distance threshold. To support these operators in an RDBMS, we propose an extension to the Structured Query Language (SQL) and new aggregation functions. Our proposed algorithms can run both the proposed and the existing operators. We implemented these operators in our C++ framework using the Arboretum library. To evaluate the proposed methods, we assess both result quality and execution time, using real and synthetic datasets. The proposed operators achieved better result quality while maintaining similar execution times.
The operators that use approximate nearest neighbors produced results of similar quality to the operators that use exact nearest neighbors, and can execute faster than them.
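The contrast between identity grouping and threshold-based similarity grouping can be made concrete with a small Python sketch. This is a generic greedy illustration, not one of the operators from the thesis or the literature it surveys; the function name, the use of the first member as group representative, and the Euclidean distance are all assumptions.

```python
import math


def group_by_threshold(points, max_dist):
    """Greedy similarity grouping: a point joins the first group whose
    representative (its first member) lies within max_dist; otherwise it
    starts a new group."""
    groups = []  # list of (representative, members) pairs
    for p in points:
        for rep, members in groups:
            if math.dist(rep, p) <= max_dist:
                members.append(p)
                break
        else:
            # No existing group is close enough.
            groups.append((p, [p]))
    return [members for _, members in groups]


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.5, 0.0), (10.0, 10.0), (10.2, 10.0), (0.0, 0.4)]
    groups = group_by_threshold(data, max_dist=1.0)
    # An aggregation function then reduces each group to a single value,
    # here a COUNT-like aggregate.
    print([len(g) for g in groups])
```

Grouping by identity would put each of the five distinct points in its own group, while the threshold collapses them into two. A single fixed threshold is also what makes such operators sensitive to regions of differing density, the limitation the abstract sets out to address.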
|
9 |
Uma experiência de consultas com palavras-chave em fontes de dados heterogêneas na web / An experience of keywords searching in heterogeneous data sources on the web
Filgueiras, Alison Carlos 29 July 2013
Submitted by Erika Demachki (erikademachki@gmail.com) on 2014-10-17T17:58:28Z
No. of bitstreams: 2
Dissertação - Alison Carlos Filgueiras - 2013.pdf: 3916567 bytes, checksum: 312992aa8f3f3d2a95d036654378912e (MD5)
license_rdf: 23148 bytes, checksum: 9da0b6dfac957114c6a7714714b86306 (MD5) / Approved for entry into archive by Jaqueline Silva (jtas29@gmail.com) on 2014-10-17T20:31:36Z (GMT) / Made available in DSpace on 2014-10-17T20:31:36Z (GMT)
Previous issue date: 2013-07-29 / Context: Keyword search is a widely used feature for retrieving information through the search engines available on the Internet. Much of the world's information, however, is not reached by conventional search because it is stored in databases, most of them relational. Integrated search over information from different data sources has been explored by several studies; still, no studies were found that offer effective solutions when relational databases are included among these data sources. Objective: The emphasis of this study is to present a solution for retrieving information stored in heterogeneous data sources, using the OAI-PMH protocol as a mechanism to enable interoperability. Method: Implementation of a system that runs keyword queries over heterogeneous data sources, based on harvesting metadata exposed through OAI-PMH by data providers. In addition, a web service is proposed that uses public methods to allow information from relational databases to be returned without additional effort, such as knowledge of the database structure or the use of SQL. Results: The simulations returned information from the metadata of digital objects and from relational databases, obtained from data providers. The example queries succeeded in retrieving information from all the data sources surveyed. Conclusion: This work proposes a solution for retrieving information stored in heterogeneous data sources. The proposed solution proved feasible, allowing keyword queries over digital libraries and relational databases using OAI-PMH. The proposed web service allowed information from relational databases to be obtained by external applications, without requiring them to know the structure of the queried databases or a query language such as SQL.
/ Contexto: Consulta com palavras-chave é um recurso altamente utilizado para
recuperação de informações através dos motores de busca disponíveis na Internet.
Grande parte da informação existente no mundo, no entanto, não é alcançada pelos
processos convencionais de busca por estar armazenada em bancos de dados, na
maioria relacionais. A busca integrada de informações de diferentes fontes de dados
é explorada por diversos trabalhos, entretanto, não foram encontrados estudos que
trouxessem soluções efetivas quando se inclui, dentre essas fontes de dados, bancos
de dados relacionais. Objetivo: A ênfase deste estudo é apresentar uma solução para
recuperação de informação armazenada em fontes de dados heterogêneas, utilizando
o protocolo OAI-PMH como mecanismo para viabilizar interoperabilidade. Método:
Implementação de um sistema que executa consultas por palavras-chave em fontes
de dados heterogêneas a partir da coleta de metadados expostos com o protocolo
OAI-PMH em provedores de dados. Além disso, é apresentada uma proposta de um
web service que utiliza métodos públicos para permitir que as informações de bancos
de dados relacionais sejam retornadas sem a necessidade de esforços adicionais, tais
como conhecimento da estrutura do banco de dados ou uso de SQL. Resultados: As
simulações produziram o retorno de informações a partir de metadados de objetos
digitais e bancos de dados relacionais, obtidos a partir de provedores de dados. A
execução de consultas exemplos foi bem sucedida na recuperação de informações
em todas as fontes de dados pesquisadas. Conclusão: Este trabalho apresenta
uma proposta de solução para recuperação de informação armazenada em fontes
de dados heterogêneas. A solução proposta mostrou-se viável ao permitir a consulta
por palavras-chave em bibliotecas digitais e bancos de dados relacionais utilizando o
protocolo OAI-PMH. O web service proposto permitiu que informações de bancos de
dados relacionais fossem obtidas por aplicações externas, sem que estas necessitem
conhecer a estrutura dos bancos de dados consultados ou uma linguagem de consulta
como SQL.
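The harvesting step that the abstract builds on can be sketched in Python as follows. This is an illustrative sketch only, not the dissertation's system: the repository URL, the sample response, and both function names are hypothetical. The `ListRecords` verb, the `oai_dc` metadata prefix, and the XML namespaces are those defined by OAI-PMH and Dublin Core.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace of the Dublin Core elements carried inside oai_dc records.
NS = {"dc": "http://purl.org/dc/elements/1.1/"}

# Hypothetical, heavily trimmed ListRecords response.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><metadata>
      <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                 xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title>Keyword search in relational databases</dc:title>
      </oai_dc:dc>
    </metadata></record>
    <record><metadata>
      <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                 xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title>Digital library metadata harvesting</dc:title>
      </oai_dc:dc>
    </metadata></record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the harvesting request a client would send to a data provider."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def search_titles(xml_text, keyword):
    """Return the harvested titles that contain the keyword, ignoring case."""
    root = ET.fromstring(xml_text)
    titles = [t.text or "" for t in root.iterfind(".//dc:title", NS)]
    return [t for t in titles if keyword.lower() in t.lower()]


if __name__ == "__main__":
    print(list_records_url("https://repository.example.org/oai"))
    print(search_titles(SAMPLE_RESPONSE, "relational"))
```

Because every source answers in the same metadata format, whether it is a digital library or a relational database behind a data provider, the keyword match itself needs no knowledge of table structures or SQL.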
|
10 |
Strategies for Encoding XML Documents in Relational Databases: Comparisons and Contrasts.
Leonard, Jonathan Lee 06 May 2006
The rise of XML as a de facto standard for document and data exchange has created a need to store and query XML documents in relational databases, today's de facto standard for data storage. Two common strategies for storing XML documents in relational databases, a process known as document shredding, are Interval encoding and ORDPATH encoding. Interval encoding, which uses a fixed mapping for shredding XML documents, tends to favor selection queries, at a potential cost of O(N) for supporting insertion queries. ORDPATH encoding, which uses a looser mapping for shredding XML, supports fixed-cost insertions, at a potential cost of longer-running selection queries. Experiments conducted for this research suggest that the breakeven point between the two algorithms occurs when users issue an average of 1 insertion for every 5.6 queries, for documents between 1.5 MB and 4 MB in size. However, heterogeneous tests with varying mixes of selects and inserts indicate that Interval always outperforms ORDPATH for mixes ranging from 76% selects to 88% selects. Queries and sample documents for this experiment were drawn from the XMark benchmark suite.
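Why Interval encoding favors selections but can cost O(N) on insertion can be seen in a minimal Python sketch. It is a textbook-style illustration under stated assumptions rather than the thesis's implementation: the function names, the `(tag, start, end, depth)` row layout, and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def interval_encode(root):
    """Shred an XML tree into (tag, start, end, depth) rows using one
    depth-first pass; start and end come from a single running counter."""
    rows = []
    counter = 0

    def visit(elem, depth):
        nonlocal counter
        counter += 1
        start = counter
        slot = len(rows)
        rows.append(None)  # reserve the row so output stays in document order
        for child in elem:
            visit(child, depth + 1)
        counter += 1
        rows[slot] = (elem.tag, start, counter, depth)

    visit(root, 0)
    return rows


def descendants(rows, tag):
    """Tags of all nodes whose interval lies strictly inside the interval of
    the first node named `tag`; in SQL this is a plain range predicate."""
    _, start, end, _ = next(r for r in rows if r[0] == tag)
    return [r[0] for r in rows if start < r[1] and r[2] < end]


if __name__ == "__main__":
    doc = ET.fromstring(
        "<site><people><person><name/></person></people><regions/></site>"
    )
    rows = interval_encode(doc)
    print(rows)
    print(descendants(rows, "people"))
```

A descendant lookup is a simple comparison of interval bounds, which is why selections are cheap. Inserting a new node, however, shifts the `start` and `end` values of every node that follows it in document order, which is the source of the potential O(N) insertion cost mentioned in the abstract.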
|