1

Performance Measurement and Analysis of Transactional Web Archiving

Maharshi, Shivam 19 July 2017
Web archiving is necessary to retain the history of the World Wide Web and to study its evolution. It is important for the cultural heritage community. Some organizations are legally obligated to capture and archive Web content. The advent of transactional Web archiving makes the archiving process more efficient, thereby helping organizations archive their Web content. This study measures and analyzes the performance of transactional Web archiving systems. To conduct a detailed analysis, we construct a meaningful design space defined by the system specifications that determine the performance of these systems. SiteStory, a state-of-the-art transactional Web archiving system, and local archiving, an alternative archiving technique, are used in this research. We experimentally evaluate the performance of these systems using the Greek version of Wikipedia deployed on dedicated hardware on a private network. Our benchmarking results show that the local archiving technique uses a Web server's resources more efficiently than SiteStory for one data point in our design space. Its better performance than SiteStory in such scenarios makes our archiving solution a favorable choice for transactional archiving. We also show that SiteStory does not impose any significant performance overhead on the Web server for the rest of the data points in our design space. / Master of Science
2

Možnosti In-memory reportingových nástrojů / Possibilities of In-memory reporting tools

Cígler, Lukáš January 2013
This diploma thesis focuses on in-memory data processing and its use in reporting and Business Intelligence (BI) in general. The main goal of the theoretical part is to introduce the in-memory principles, highlight the differences from hard-drive data processing, and give an overview of possible implementations of in-memory technology in BI solutions. The output of this section is an analysis of the advantages and disadvantages of in-memory solutions from various perspectives. The practical part of the thesis consists of a performance benchmark that compares data processing using the in-memory principles with conventional hard-drive methods. The comparison is carried out in reporting tools: QlikView for the in-memory approach and Reporting Services for the hard-drive-based method. Several data sets are used for testing in both tools. The end of the chapter assesses the test results and discusses the strengths and weaknesses of both principles of data processing. The conclusion of this work discusses the advantages and disadvantages of in-memory data processing and defines the key questions that company management should ask before investing in innovation of the present BI solution. Moreover, the conclusion contains recommendations for possible follow-up work.
3

A COMPARISON OF DATA INGESTION PLATFORMS IN REAL-TIME STREAM PROCESSING PIPELINES

Tallberg, Sebastian January 2020
In recent years there has been an increasing demand for real-time streaming applications that handle large volumes of data with low latency. Examples of such applications include real-time monitoring and analytics, electronic trading, advertising, fraud detection, and more. In a streaming pipeline the first step is ingesting the incoming data events, after which they can be sent off for processing. Choosing a tool that satisfies the application requirements is an important technical decision. This thesis focuses entirely on the data ingestion part by evaluating three different platforms: Apache Kafka, Apache Pulsar and Redis Streams. The platforms are compared on both characteristics and performance. Architectural and design differences reveal that Kafka and Pulsar are more suited for use cases involving long-term persistent storage of events, whereas Redis is a potential solution when only short-term persistence is required. They all provide means for scalability and fault tolerance, ensuring high availability and reliable service. Two metrics, throughput and latency, were used in evaluating performance in a single-node cluster. Kafka proves to be the most consistent in throughput but performs the worst in latency. Pulsar manages high throughput with small message sizes but struggles with larger message sizes. Pulsar performs the best in overall average latency across all message sizes tested, followed by Redis. The tests also show Redis being the most inconsistent in terms of throughput potential between different message sizes.
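
The two metrics named in the abstract above, throughput and latency, can be illustrated with a minimal sketch. This is not the benchmark harness used in the thesis. The function name `summarize`, its parameters, and the choice of a nearest-rank 99th percentile are assumptions made for illustration only, and the timestamps would in practice come from whichever platform client is under test.

```python
import math
import statistics


def summarize(send_times, recv_times, message_bytes):
    """Summarize one benchmark run (hypothetical helper, not from the thesis).

    send_times, recv_times: per-message timestamps in seconds, in the same order.
    message_bytes: size of every message, assuming a fixed-size workload.
    """
    if not send_times or len(send_times) != len(recv_times):
        raise ValueError("need equal-length, non-empty timestamp lists")

    # End-to-end latency of each message, sorted for percentile lookup.
    latencies = sorted(r - s for s, r in zip(send_times, recv_times))
    count = len(latencies)

    # Wall-clock span from the first send to the last receive.
    elapsed = max(recv_times) - min(send_times)
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")

    # Nearest-rank 99th percentile (an assumed convention).
    p99_index = max(0, math.ceil(0.99 * count) - 1)

    return {
        "messages_per_second": count / elapsed,
        "megabytes_per_second": count * message_bytes / elapsed / 1e6,
        "mean_latency_ms": statistics.mean(latencies) * 1000,
        "p99_latency_ms": latencies[p99_index] * 1000,
    }
```

Running such a summary once per message size would give the kind of per-size comparison the abstract describes, since throughput consistency and latency are both reported across message sizes.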
