• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 1
  • Tagged with
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Storage and Transformation for Data Analysis Using NoSQL / Lagring och transformation för dataanalys med hjälp av NoSQL

Nilsson, Christoffer, Bengtson, John January 2017 (has links)
It can be difficult to choose the right NoSQL DBMS, and some systems lack sufficient research and evaluation. There are also tools for moving and transforming data between DBMS' in order to combine or use different systems for different use cases. We have described a use case, based on requirements related to the quality attributes Consistency, Scalability, and Performance. For the Performance attribute, focus is fast insertions and full-text search queries on a large dataset of forum posts. The evaluation was performed on two NoSQL DBMS' and two tools for transforming data between them. The DBMS' are MongoDB and Elasticsearch, and the transformation tools are NotaQL and Compose's Transporter. The purpose is to evaluate three different NoSQL systems, pure MongoDB, pure Elasticsearch and a combination of the two. The results show that MongoDB is faster when performing simple full-text search queries, but otherwise slower. This means that Elasticsearch is the primary choice regarding insertion and complex full-text search query performance. MongoDB is however regarded as a more stable and well-tested system. When it comes to scalability, MongoDB is better suited for a system where the dataset increases over time due to its simple addition of more shards. While Elasticsearch is better for a system which starts off with a large amount of data since it has faster insertion speeds and a more effective process for data distribution among existing shards. In general NotaQL is not as fast as Transporter, but can handle aggregations and nested fields which Transporter does not support. A combined system using MongoDB as primary data store and Elasticsearch as secondary data store could be used to achieve fast full-text search queries for all types of expressions, simple and complex.

Page generated in 0.024 seconds