Global ETD Search

1	Characterizing Popularity Dynamics of User-generated Videos: A Category-based Study of YouTube 2013 August 1900 Understanding the growth pattern of content popularity has become a subject of immense interest to Internet service providers, content makers and on-line advertisers. This understanding is also important for the sustainable development of content distribution systems. To characterize this growth pattern, a significant amount of research has analyzed the popularity growth of YouTube videos. However, no prior work has intensively investigated the popularity patterns of YouTube videos by video category. In this thesis, an in-depth analysis of the popularity pattern of YouTube videos is performed, considering the categories of videos. Metadata and request patterns were collected by employing category-specific YouTube crawlers. The request patterns were observed for a period of five months. Results confirm that the time-varying popularity of different YouTube categories is conspicuously different, although some sets of categories have very similar viewing patterns. In particular, News and Sports exhibit similar growth curves, as do Music and Film. While for some categories views at early ages can be used to predict future popularity, for others predicting future popularity is a challenging task that requires more sophisticated techniques, e.g., time-series clustering. The outcomes of these analyses are instrumental in designing a reliable workload generator, which can in turn be used to evaluate different caching policies for YouTube and similar sites. In this thesis, workload generators for four of the YouTube categories are developed. The performance of these workload generators suggests that a complete category-specific workload generator can be developed using time-series clustering.
Patterns of users' interaction with YouTube videos are also analyzed using a dataset collected in a local network. This analysis suggests possible ways of improving the performance of Peer-to-Peer video distribution techniques, along with a new video recommendation method. YouTube categories; growth patterns of on-line content; clustering algorithms; K-SC algorithm; workload generation
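The abstract above names time-series clustering and the K-SC algorithm but gives no formulas. As a rough illustration only, the sketch below computes a scale-invariant distance between two view-count curves, which is the kind of measure such clustering relies on. The function name and the sample data are hypothetical, and the time-shift alignment that K-SC also performs is deliberately left out.

```python
import math


def scale_invariant_distance(x, y):
    """Distance between two equal-length time series, ignoring overall scale.

    Finds the scaling factor alpha that minimizes ||x - alpha * y|| and
    returns the remaining residual relative to ||x||.
    Assumption: no time-shift alignment is applied (a simplification).
    """
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    dot_xy = sum(a * b for a, b in zip(x, y))
    norm_y_sq = sum(b * b for b in y)
    norm_x = math.sqrt(sum(a * a for a in x))
    if norm_x == 0 or norm_y_sq == 0:
        raise ValueError("series must not be all zeros")
    alpha = dot_xy / norm_y_sq  # least-squares optimal scaling of y onto x
    residual = math.sqrt(sum((a - alpha * b) ** 2 for a, b in zip(x, y)))
    return residual / norm_x


# Hypothetical daily view counts: same shape, different magnitude.
small_video = [1, 2, 3]
large_video = [10, 20, 30]
same_shape = scale_invariant_distance(small_video, large_video)
```

Two videos whose curves differ only in magnitude get a distance near zero, so they would fall into the same cluster regardless of absolute popularity.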
2	Generating and Analyzing Synthetic Workloads using Iterative Distillation Kurmas, Zachary Alan 14 May 2004 The exponential growth in computing capability and use has produced a high demand for large, high-performance storage systems. Unfortunately, advances in storage system research have been limited by (1) a lack of evaluation workloads, and (2) a limited understanding of the interactions between workloads and storage systems. We have developed a tool, the Distiller, that helps address both limitations. Our thesis is as follows: Given a storage system and a workload for that system, one can automatically identify a set of workload characteristics that describes a set of synthetic workloads with the same performance as the workload they model. These representative synthetic workloads increase the number of available workloads with which storage systems can be evaluated. More importantly, the characteristics also identify those workload properties that affect disk array performance, thereby highlighting the interactions between workloads and storage systems. This dissertation presents the design and evaluation of the Distiller. Specifically, our contributions are as follows. (1) We demonstrate that the Distiller finds synthetic workloads with at most 10% error for six out of the eight workloads we tested. (2) We also find that all of the potential error metrics we use to compare workload performance have limitations. Additionally, although the internal threshold that determines which attributes the Distiller chooses has a small effect on the accuracy of the final synthetic workloads, it has a large effect on the Distiller's running time. Similarly, (3) we find that we can reduce the precision with which we measure attributes and only moderately reduce the resulting synthetic workload's accuracy.
Finally, (4) we show how to use the information contained in the chosen attributes to predict the performance effects of modifying the storage system's prefetch length and stripe unit size. Workload characterization; Synthetic workload generation; Performance measurement; Disk array; Storage system
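The abstract describes choosing workload attributes until a synthetic workload matches the original within an error bound, but it does not spell out the procedure. The sketch below shows one way a greedy loop of that general kind might be organized. It is not the Distiller's actual algorithm: `select_attributes`, `evaluate_error`, the attribute names, and the toy error model are all hypothetical.

```python
def select_attributes(candidates, evaluate_error, threshold=0.10):
    """Greedily keep attributes that reduce the synthetic workload's error.

    candidates: attribute names to try, in order (hypothetical).
    evaluate_error: callable mapping a list of chosen attributes to the
        relative performance error of the resulting synthetic workload
        (hypothetical stand-in for generating and replaying a workload).
    threshold: stop once the error is at or below this value.
    """
    chosen = []
    error = evaluate_error(chosen)
    for attribute in candidates:
        if error <= threshold:
            break
        trial = chosen + [attribute]
        trial_error = evaluate_error(trial)
        if trial_error < error:  # keep only attributes that help
            chosen, error = trial, trial_error
    return chosen, error


def toy_error(chosen):
    """Made-up error model: two attributes matter, one does not."""
    error = 0.5
    if "arrival_pattern" in chosen:
        error -= 0.2
    if "locality" in chosen:
        error -= 0.25
    return error


chosen, final_error = select_attributes(
    ["arrival_pattern", "request_size", "locality"], toy_error
)
```

In this toy run, `request_size` is tried and discarded because it does not lower the error, which mirrors the idea that the retained attributes point to the workload properties that affect performance.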
3	MACHINE LEARNING-ASSISTED LOAD TESTING Isaku, Erblin January 2021 The increasing worldwide demand for software systems has created a need in which functionality alone is not enough: end-user satisfaction in terms of availability, throughput, and response time is also essential and should be preserved. Thus, systems must be evaluated at preset load levels to assess non-functional quality problems from a perspective as close as possible to real application use. In this context, where the problem involves a large and complex search space, a search-based approach for load test generation is required. This thesis proposes and evaluates an evolutionary search-based approach for load test generation using multi-objective optimization methods consisting of selection, crossover, and mutation operators. In this thesis, load testing is addressed as a multi-objective optimization problem by using four different evolutionary algorithms: Non-dominated Sorting Genetic Algorithm II (NSGA-II), Pareto Archived Evolution Strategy (PAES), the Strength Pareto Evolutionary Algorithm 2 (SPEA2), and Multi-Objective Cellular Genetic Algorithm (MOCell), as well as a Random Search algorithm. Additionally, this study demonstrates the applicability of the proposed approach by running several experiments that compare the algorithms' efficiency based on different quality indicators such as hypervolume, spread, and epsilon. Experimental results show that evolutionary search-based methods can be used to generate effective workloads. Since all algorithms found the optimal workload, with hypervolume values equal to zero, we believe that the objectives of the problem could be combined into a single objective, and hence scalarization techniques may be applicable. Based on the other quality indicators (Spread and Epsilon, respectively), NSGA-II and MOCell tend to perform better than the other algorithms.
Finally, the study concludes that multi-objective evolutionary algorithms can be used for load testing purposes, obtaining better results in generating optimal workloads than an existing (adapted) model-free reinforcement learning approach. Performance testing; load testing; search-based testing; workload generation; machine learning; evolutionary algorithms; reinforcement learning; Computer Sciences; Datavetenskap (datalogi)
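The multi-objective algorithms listed above all build on Pareto dominance. The sketch below illustrates that concept only, assuming every objective is minimized. The objective tuples are made-up values, not results from the thesis, and the function names are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up candidate workloads scored on two objectives to minimize.
conflicting = [(1, 5), (2, 2), (3, 3), (5, 1)]
front = pareto_front(conflicting)

# When one candidate is best on every objective, the front collapses
# to that single point.
non_conflicting = [(1, 1), (2, 3), (4, 2)]
single = pareto_front(non_conflicting)
```

The second case is the situation the abstract hints at: if a single workload is optimal on all objectives, there is no trade-off to explore, and combining the objectives into one becomes a reasonable option.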
4	Automated Performance Test Generation and Comparison for Complex Data Structures - Exemplified on High-Dimensional Spatio-Temporal Indices Menninghaus, Mathias 23 August 2018 There exist numerous approaches to index either spatio-temporal or high-dimensional data. None of them is able to efficiently index hybrid data types, i.e., data that is both spatio-temporal and high-dimensional. As the best high-dimensional indexing techniques are only able to index point data and not now-relative data, and the best spatio-temporal indexing techniques suffer from the curse of dimensionality, this thesis introduces the Spatio-Temporal Pyramid Adapter (STPA). The STPA maps spatio-temporal data to points and now-values to the median of the data set, and indexes them with the pyramid technique. For high-dimensional and spatio-temporal index structures, no generally accepted benchmark exists. Most index structures are only evaluated by custom benchmarks and compared to a tiny set of competitors. Benchmarks may be biased, as a structure may be created to perform well in a certain benchmark, or a benchmark may not cover a certain speciality of the investigated structures. In this thesis, the Interface Based Performance Comparison (IBPC) technique is introduced. It automatically generates test sets with a high code coverage on the system under test (SUT) on the basis of all functions defined by a certain interface which all competitors support. Every test set is executed on every SUT, and the performance results are weighted by the achieved coverage and summed up. These weighted performance results are then used to compare the structures. An implementation of the IBPC, the Performance Test Automation Framework (PTAF), is compared to a classic custom benchmark, a workload generator whose parameters are optimized by a genetic algorithm, and a specific PTAF alternative which incorporates the specific behavior of the systems under test.
This is done for a set of two high-dimensional spatio-temporal indices and twelve variants of the R-tree. The evaluation indicates that PTAF performs at least as well as the other approaches in terms of minimal test cases with maximized coverage. Several case studies on PTAF demonstrate its wide range of capabilities. performance tests; test case generation; benchmark generation; workload generation; indexing; access methods; high-dimensional; spatio-temporal; subway-track planning; performance comparison; algorithm engineering; performance measurement; java; software engineering; 54.62 - Datenstrukturen; 54.52 - Software engineering; D.2.5 - Testing and Debugging; D.2.8 - Metrics; E.1 - DATA STRUCTURES; D.2.2 - Design Tools and Techniques; ddc:004
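The abstract states that performance results are weighted by the achieved coverage and summed up, without giving the exact formula. The sketch below assumes the simplest reading, a sum of runtime times coverage per test set. The function name, the structure names, and all numbers are hypothetical.

```python
def weighted_score(results):
    """Sum per-test-set runtimes, each weighted by its coverage.

    results: list of (runtime_seconds, coverage) pairs with coverage
        in [0, 1]. Assumption: weight = coverage, applied as a plain
        product; the thesis may define the weighting differently.
    """
    for _, coverage in results:
        if not 0.0 <= coverage <= 1.0:
            raise ValueError("coverage must lie in [0, 1]")
    return sum(runtime * coverage for runtime, coverage in results)


# Made-up measurements for two hypothetical index structures, each run
# against the same two generated test sets.
measurements = {
    "index_a": [(2.0, 0.5), (4.0, 0.25)],
    "index_b": [(1.0, 0.5), (8.0, 0.25)],
}
scores = {name: weighted_score(runs) for name, runs in measurements.items()}
```

Under this reading, a test set that exercises little of a structure's code contributes little to its score, so slow runs on low-coverage tests weigh less in the comparison.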
