1 |
Workload characterization and customer interaction at e-commerce web servers
Wang, Qing 27 October 2004
Electronic commerce servers have a significant presence in today's Internet. Corporations want to maintain high availability, sufficient capacity, and satisfactory performance for their E-commerce Web systems, and want to provide satisfactory services to customers. Workload characterization and the analysis of customers' interactions with Web sites are the bases upon which to analyze server performance, plan system capacity, manage system resources, and personalize services at the Web site. To date, little empirical evidence has been discovered that identifies the characteristics for Web workloads of E-commerce systems and the behaviours of customers.
This thesis analyzes the Web access logs at public Web sites for three organizations: a car rental company, an IT company, and the Computer Science department of the University of Saskatchewan. In these case studies, the characteristics of Web workloads are explored at the request level, function level, resource level, and session level; customers' interactions with Web sites are analyzed by identifying and characterizing session groups.
The main E-commerce Web workload characteristics and performance implications are:
i) The requests for dynamic Web objects are an important part of the workload. These requests should be characterized separately since the system processes them differently;
ii) Some popular image files, which are embedded in the same Web page, are always requested together. If these files are requested and sent in a bundle, a system will greatly reduce the overheads in processing requests for these files;
iii) The percentage of requests for each Web page category tends to be stable in the workload when the time scale is large enough. This observation is helpful in forecasting workload composition;
iv) The Secure Socket Layer protocol (SSL) is heavily used and most Web objects are either requested primarily through SSL or primarily not through SSL; and
v) Session groups of different characteristics are identified for all logs. The analysis of session groups may be helpful in improving system performance, maximizing revenue throughput of the system, providing better services to customers, and managing and planning system resources.
A hybrid clustering algorithm, which combines the minimum spanning tree method and the k-means clustering algorithm, is proposed to identify session clusters. Session clusters obtained using the three session representations (Pages Requested, Navigation Pattern, and Resource Usage) are similar enough that different session representations can be used interchangeably to produce similar groupings. The grouping based on one session representation is believed to be sufficient to answer questions in server performance, resource management, capacity planning and Web site personalization, which previously would have required multiple different groupings. Grouping by Pages Requested is recommended since it is the simplest, and data on Web pages requested is relatively easy to obtain from HTTP logs.
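The abstract does not say how the minimum spanning tree method and k-means are combined. As a rough illustration only, the sketch below assumes one plausible combination: the k-1 heaviest MST edges are cut to obtain k initial groups, and those groups seed a standard k-means refinement. The session vectors (counts of pages requested per page category) and all function names are hypothetical and are not taken from the thesis.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete graph; returns (weight, i, j) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [(math.inf, -1)] * n          # (cost to attach, parent index)
    best[0] = (0.0, -1)
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i][0])
        in_tree[u] = True
        if best[u][1] >= 0:
            edges.append((best[u][0], best[u][1], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v][0]:
                    best[v] = (d, u)
    return edges

def mst_seed_labels(points, k):
    """Keep the n-k lightest MST edges; the k connected components are the seeds."""
    kept = sorted(mst_edges(points))[: len(points) - k]
    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _, i, j in kept:
        parent[find(i)] = find(j)
    roots = sorted({find(i) for i in range(len(points))})
    return [roots.index(find(i)) for i in range(len(points))]

def centroid(members):
    return tuple(sum(c) / len(members) for c in zip(*members))

def hybrid_cluster(points, k, max_iter=100):
    """Assumed hybrid: MST-derived seeds, then k-means reassignment until stable."""
    labels = mst_seed_labels(points, k)
    for _ in range(max_iter):
        groups = {}
        for p, lab in zip(points, labels):
            groups.setdefault(lab, []).append(p)
        centers = {lab: centroid(m) for lab, m in groups.items()}
        new_labels = [min(centers, key=lambda lab: math.dist(p, centers[lab]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
    return labels

# Hypothetical sessions: (browse pages, checkout pages) requested per session.
sessions = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = hybrid_cluster(sessions, k=2)
```

Seeding k-means from MST components avoids picking random initial centers, which is one common reason to pair the two methods; whether the thesis uses the MST for that purpose is an assumption here.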
|
3 |
Automated Storage Layout for Database Systems
Ozmen, Oguzhan 08 1900
Modern storage systems are complex. Simple direct-attached storage devices are giving way to storage systems that are flexible, network-attached, consolidated and virtualized. Today, storage systems have their own administrators, who use specialized tools and expertise to configure and manage storage resources. As a result, database administrators are no longer in direct control of the design and configuration of their database systems' underlying storage resources.
This introduces problems because database physical design and storage configuration are closely related tasks, and the separation makes it more difficult to achieve a good end-to-end design. For instance, the performance of a database system depends strongly on the storage layout of database objects, such as tables and indexes, and the separation makes it hard to design a storage layout that is tuned to the I/O workload generated by the database system. In this thesis we attempt to close the information gap between the database and storage tiers by predicting the storage (I/O) workload that will be generated by a database management system. Specifically, we show how to translate a database workload description, together with a database physical design, into a characterization of the I/O workload that will result. Such a characterization can be used directly by a storage configuration tool and thus enables effective end-to-end design and configuration spanning both the database and storage tiers.
We then introduce our storage layout optimization tool, which leverages such workload characterizations to generate an optimized layout for a given set of database objects. We formulate the layout problem as a non-linear programming (NLP) problem and use the I/O characterization as input to an NLP solver. We have incorporated our I/O estimation technique into the PostgreSQL database management system and our layout optimization technique into a database layout advisor. We present an empirical assessment of the cost of both tools as well as the efficacy and accuracy of their results.
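To make the layout problem concrete, here is a toy sketch. It is not the thesis's NLP formulation and uses no NLP solver: it assumes each database object has a single estimated I/O rate, each storage target has a single capacity, objects are placed whole on one target, and the goal is to minimize the highest target utilization. It simply enumerates every assignment, which is only feasible for a handful of objects. All names and numbers are hypothetical.

```python
from itertools import product

def max_utilization(assignment, object_load, device_capacity):
    """Highest load/capacity ratio over all devices for one assignment.

    assignment[i] is the index of the device that holds object i.
    """
    load = [0.0] * len(device_capacity)
    for obj, dev in enumerate(assignment):
        load[dev] += object_load[obj]
    return max(l / c for l, c in zip(load, device_capacity))

def best_layout(object_load, device_capacity):
    """Exhaustive search over all object-to-device assignments (toy sizes only)."""
    devices = range(len(device_capacity))
    return min(product(devices, repeat=len(object_load)),
               key=lambda a: max_utilization(a, object_load, device_capacity))

# Hypothetical I/O characterization: requests per second for a table and two indexes.
object_load = [300.0, 200.0, 100.0]
device_capacity = [400.0, 400.0]      # requests per second each target can sustain

layout = best_layout(object_load, device_capacity)
worst = max_utilization(layout, object_load, device_capacity)
```

In this example the busiest object ends up alone on one target and the other two share the second. A solver-based formulation would typically also allow an object to be spread across targets and would model device behavior in more detail than a single capacity number.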
|
5 |
Data Movement Energy Characterization of Emerging Smartphone Workloads for Mobile Platforms
January 2014
A benchmark suite that is representative of the programs a processor typically executes is necessary to understand a processor's performance or energy consumption characteristics. The first contribution of this work addresses this need for mobile platforms with MobileBench, a selection of representative smartphone applications. In smartphones, as in any other portable computing system, energy is a limited resource. Based on the energy characterization of a widely used commercial smartphone, application cores are found to consume a significant part of the total energy of the device. With this insight, the subsequent part of this thesis focuses on the portion of energy that is spent to move data from the memory system to the application core's internal registers. The primary motivation for this work comes from the relatively higher power consumption associated with a data movement instruction compared to that of an arithmetic instruction. The data movement energy cost is especially severe in a System on Chip (SoC) because the amount of data received and exchanged in an SoC-based smartphone increases at an explosive rate. A detailed investigation is performed to quantify the impact of data movement on the overall energy consumption of a smartphone device. To aid this study, microbenchmarks that generate desired data movement patterns between different levels of the memory hierarchy are designed. Energy costs of data movement are then computed by measuring the instantaneous power consumption of the device when the microbenchmarks are executed. This work makes extensive use of hardware performance counters to validate the memory access behavior of the microbenchmarks and to characterize the energy consumed in moving data. Finally, the calculated energy costs of data movement are used to characterize the portion of energy that MobileBench applications spend in moving data. The results of this study show that a significant 35% of the total device energy is spent in data movement alone. Energy is an increasingly important criterion in the context of designing architectures for future smartphones, and this thesis offers insights into data movement energy consumption. / Dissertation/Thesis / Masters Thesis Computer Science 2014
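The step from instantaneous power measurements to an energy cost per data movement can be sketched as follows. This is an illustration under stated assumptions, not the thesis's procedure: power is assumed to be sampled at a fixed interval, an idle baseline is assumed to be subtracted, and the access count is assumed to come from a hardware performance counter. The function names and numbers are hypothetical.

```python
def energy_joules(power_samples_w, sample_interval_s):
    """Approximate energy as the sum of power samples times the sampling interval."""
    return sum(power_samples_w) * sample_interval_s

def energy_per_access(power_samples_w, sample_interval_s, idle_power_w, n_accesses):
    """Energy attributed to each memory access after removing the idle baseline."""
    total = energy_joules(power_samples_w, sample_interval_s)
    idle = idle_power_w * len(power_samples_w) * sample_interval_s
    return (total - idle) / n_accesses

# Hypothetical run: four samples of 2.0 W taken 0.5 s apart, 1.0 W idle power,
# and one million accesses reported by a performance counter.
samples = [2.0, 2.0, 2.0, 2.0]
total_j = energy_joules(samples, 0.5)
per_access_j = energy_per_access(samples, 0.5, idle_power_w=1.0, n_accesses=1_000_000)
```

Multiplying a per-access cost like this by the access counts observed for an application is one way to estimate the share of that application's energy that goes to data movement.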
|
6 |
Filecules: A New Granularity for Resource Management in Grids
Doraimani, Shyamala 26 March 2007
Grids provide an infrastructure for seamless, secure access to a globally distributed set of shared computing resources. Grid computing has reached the stage where deployments are run in production mode. In the most active Grid community, the scientific community, jobs are data and compute intensive. Scientific Grid deployments offer the opportunity for revisiting and perhaps updating traditional beliefs related to workload models and hence reevaluate traditional resource management techniques.
In this thesis, we study usage patterns from a large-scale scientific Grid collaboration in high-energy physics. We focus mainly on data usage, since data is the major resource for this class of applications. We perform a detailed workload characterization, which led us to propose a new data abstraction, the filecule, that groups correlated files. We characterize filecules and show that they are an appropriate data granularity for resource management.
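The abstract says only that a filecule groups correlated files. As a hedged illustration, the sketch below assumes the strictest form of correlation: two files belong to the same filecule when exactly the same set of jobs requested them. The trace format and function name are hypothetical.

```python
from collections import defaultdict

def find_filecules(job_inputs):
    """Group files that are requested by exactly the same set of jobs.

    job_inputs maps a job id to the file names that job reads.
    Returns a sorted list of filecules, each a sorted list of file names.
    """
    jobs_of_file = defaultdict(set)
    for job, files in job_inputs.items():
        for name in files:
            jobs_of_file[name].add(job)

    groups = defaultdict(set)
    for name, jobs in jobs_of_file.items():
        groups[frozenset(jobs)].add(name)     # identical job sets share a key
    return sorted(sorted(g) for g in groups.values())

# Hypothetical trace: three jobs and the files each one reads.
trace = {"j1": ["a", "b", "c"], "j2": ["a", "b"], "j3": ["c", "d"]}
filecules = find_filecules(trace)
```

Here files a and b always appear together and form one filecule, while c and d each form their own. Under this assumed definition, every file belongs to exactly one filecule.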
In scientific applications, job scheduling and data staging are tightly coupled. The only algorithm previously proposed for this class of applications, Greedy Request Value (GRV), uses a function that assigns a relative value to a job. We wrote a cache simulator that uses the same technique of combining cache replacement with job reordering to evaluate and compare quantitatively a set of alternative solutions. These solutions are combinations of Least Recently Used (LRU) and GRV from the cache replacement space with First-Come First-Served (FCFS) and the GRV-specific job reordering from the scheduling space. Using real workload from the DZero Experiment at Fermi National Accelerator Laboratory, we measure and compare performance based on byte hit rate, cache change, job waiting time, job waiting queue length, and scheduling overhead.
Based on our experimental investigations, we propose a new technique that combines LRU for cache replacement and job scheduling based on the relative request value. This technique incurs lower data transfer costs than the GRV algorithm and shorter job processing delays than FCFS. We also propose using filecules for data management to further improve the results obtained from the above LRU and GRV combination.
We show that filecules can be identified in practical situations and demonstrate how the accuracy of filecule identification influences caching performance.
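Byte hit rate under LRU replacement, one of the metrics named above, can be illustrated with a very small simulator. This is a sketch, not the thesis's simulator: it covers only LRU cache replacement with requests served in arrival order, it does not model GRV or job reordering, and it assumes a file larger than the cache is never stored. The request format is hypothetical.

```python
from collections import OrderedDict

def lru_byte_hit_rate(requests, capacity_bytes):
    """Replay (file_name, size_bytes) requests through an LRU cache.

    Returns the fraction of requested bytes that were served from the cache.
    """
    cache = OrderedDict()                 # file name -> size, oldest entry first
    used = hit_bytes = total_bytes = 0
    for name, size in requests:
        total_bytes += size
        if name in cache:
            hit_bytes += size
            cache.move_to_end(name)       # mark as most recently used
            continue
        if size > capacity_bytes:
            continue                      # assumption: oversized files bypass the cache
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)   # evict least recently used
            used -= evicted_size
        cache[name] = size
        used += size
    return hit_bytes / total_bytes if total_bytes else 0.0

# Hypothetical request stream against an 8-byte cache.
stream = [("a", 4), ("b", 4), ("a", 4), ("c", 4), ("b", 4)]
rate = lru_byte_hit_rate(stream, capacity_bytes=8)
```

Only the second request for file a hits, so 4 of the 20 requested bytes come from the cache. Caching at filecule granularity would correspond to treating a whole group of files as a single cache entry.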
|
7 |
Generating and Analyzing Synthetic Workloads using Iterative Distillation
Kurmas, Zachary Alan 14 May 2004
The exponential growth in computing capability and use has produced a
high demand for large, high-performance storage systems.
Unfortunately, advances in storage system research have been limited
by (1) a lack of evaluation workloads, and (2) a limited understanding
of the interactions between workloads and storage systems. We have
developed a tool, the Distiller, that helps address both
limitations.
Our thesis is as follows: Given a storage system and a workload for
that system, one can automatically identify a set of workload
characteristics that describes a set of synthetic workloads with the
same performance as the workload they model. These representative
synthetic workloads increase the number of available workloads with
which storage systems can be evaluated. More importantly, the
characteristics also identify those workload properties that affect
disk array performance, thereby highlighting the interactions between
workloads and storage systems.
This dissertation presents the design and evaluation of the Distiller.
Specifically, our contributions are as follows. (1) We demonstrate
that the Distiller finds synthetic workloads with at most 10% error
for six out of the eight workloads we tested. (2) We also find that
all of the potential error metrics we use to compare workload
performance have limitations. Additionally, although the internal
threshold that determines which attributes the Distiller chooses has a
small effect on the accuracy of the final synthetic workloads, it has
a large effect on the Distiller's running time. Similarly, (3) we find
that we can reduce the precision with which we measure attributes and
only moderately reduce the resulting synthetic workload's
accuracy. Finally, (4) we show how to use the information contained in
the chosen attributes to predict the performance effects of modifying
the storage system's prefetch length and stripe unit size.
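The abstract does not describe the Distiller's internal algorithm, so the sketch below is not it. It only illustrates the general idea of iteratively choosing workload attributes until a synthetic workload is accurate enough, using a greedy loop with a stopping threshold. The attribute names, the `evaluate_error` callback, and the `min_gain` threshold are all hypothetical; only the 10% error target echoes a figure from the text.

```python
def distill(candidates, evaluate_error, target_error=0.10, min_gain=0.01):
    """Greedily add attributes until the synthetic workload is accurate enough.

    evaluate_error(attrs) is assumed to build a synthetic workload that
    preserves `attrs` and return its relative performance error.
    """
    chosen = []
    error = evaluate_error(chosen)
    remaining = list(candidates)
    while error > target_error and remaining:
        best = min(remaining, key=lambda a: evaluate_error(chosen + [a]))
        best_error = evaluate_error(chosen + [best])
        if error - best_error < min_gain:
            break                         # no candidate helps enough; stop early
        chosen.append(best)
        remaining.remove(best)
        error = best_error
    return chosen, error

# Toy stand-in for replaying a synthetic workload on a storage system:
# each attribute removes a fixed amount of error from a 50% starting point.
gains = {"read_fraction": 0.25, "request_size": 0.20, "interarrival_time": 0.005}

def toy_error(attrs):
    return 0.5 - sum(gains[a] for a in attrs)

attrs, final_error = distill(list(gains), toy_error)
```

In this toy run, two attributes are enough to fall below the target, so the third is never added. A threshold such as `min_gain` trades accuracy against running time, since every candidate evaluation stands for a full workload replay.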
|
8 |
Characterization and optimization of JavaScript programs for mobile systems
Srikanth, Aditya 09 October 2013
JavaScript has permeated every aspect of the web experience in today's world, making it crucial to process it as quickly as possible. With the proliferation of HTML5 and its associated mobile web applications, the world is slowly but surely moving into an age where the majority of webpages will involve complex computations and manipulations within the JavaScript engine. Recent techniques like Just-in-Time (JIT) compilation have become commonplace in popular browsers like Chrome and Firefox, and there is an ongoing effort to further optimize them in the context of mobile systems.
In order to take full advantage of JavaScript-heavy webpages, it is important to first characterize the interaction of these webpages (both existing pages and modern HTML5 pages) with the different components of the JavaScript engine, viz. the interpreter, the method JIT, the optimizing compiler and the garbage collector. In this thesis, this characterization work was used to identify the limits of JavaScript optimizations. Subsequently, one particular optimization, register allocation heuristics, was explored in detail on different types of JavaScript programs. This was primarily because the majority of the time spent in the optimizing compiler (an average of 52.81%) is spent in the register allocation stage alone. By varying the heuristics for register assignment, interval priority and spill selection, a clear idea is obtained of how they impact certain types of programs more than others. This thesis also gives a preliminary insight into JavaScript applications and benchmarks, showing that these applications tend to be register-intensive, with large live intervals and sparse uses, and sensitive to array and string manipulations. A statically-selected optimal register allocation scheme outperforms the default register allocation scheme, resulting in a 9.1% performance improvement and an 11.23% reduction in execution time on a representative mobile system.
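To show where a spill-selection heuristic plugs into register allocation, here is a minimal linear-scan allocator over live intervals. It is a generic textbook-style sketch, not the allocator of any particular JavaScript engine and not the scheme evaluated in the thesis. The interval format, the `spill_key` parameter, and the default heuristic (spill the interval that ends last) are assumptions for illustration.

```python
def linear_scan(intervals, num_regs, spill_key=lambda iv: iv[2]):
    """Assign registers to live intervals given as (name, start, end) tuples.

    When no register is free, the interval with the largest spill_key among
    the active intervals and the current one is spilled.
    Returns (assignment, spilled): a name -> register map and a list of names.
    """
    assignment, spilled = {}, []
    free = list(range(num_regs))
    active = []                                   # intervals holding a register
    for iv in sorted(intervals, key=lambda iv: iv[1]):
        name, start, _ = iv
        # Release registers of intervals that ended before this one starts.
        for old in [a for a in active if a[2] < start]:
            active.remove(old)
            free.append(assignment[old[0]])
        if free:
            assignment[name] = free.pop(0)
            active.append(iv)
            continue
        victim = max(active + [iv], key=spill_key)
        if victim is iv:
            spilled.append(name)                  # current interval stays in memory
        else:
            assignment[name] = assignment.pop(victim[0])   # take the victim's register
            active.remove(victim)
            active.append(iv)
            spilled.append(victim[0])
    return assignment, spilled

# Hypothetical live intervals competing for two registers.
live = [("a", 0, 10), ("b", 1, 3), ("c", 2, 8), ("d", 4, 6)]
regs, spills = linear_scan(live, num_regs=2)
```

With two registers, the long-lived interval a is spilled when c arrives, and d later reuses the register freed by b. Passing a different `spill_key`, for example one based on use density, changes which interval is evicted, which is the kind of heuristic variation the abstract refers to.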
|
9 |
Resource Allocation using Adaptive Characterization of Online, Data-Intensive Workloads
Kelley, Jaimie 30 October 2017
No description available.
|
10 |
Accurate workload design for web performance evaluation
Peña Ortiz, Raúl 13 February 2013
New web applications and services, ever more popular in our daily lives, have completely changed the way users interact with the Web.
In less than half a decade, the role of users has evolved from mere passive consumers of information to active collaborators in the creation of the dynamic content typical of the current Web.
Moreover, this trend is expected to grow and consolidate over time.
This dynamic user behavior is one of the main keys to defining workloads suitable for accurately estimating the performance of web systems.
However, the intrinsic difficulty of characterizing user dynamism and applying it in a workload model means that many research studies still use workloads that are not representative of current web navigation.
This doctoral thesis focuses on the characterization and reproduction, for performance evaluation studies, of a more realistic type of web workload, capable of mimicking the behavior of current Web users.
The state of the art in workload modeling and generation for Web performance studies has several shortcomings regarding models and software applications that represent the different levels of user dynamism.
This motivates us to propose a more accurate model and to develop a new workload generator based on this model.
Both proposals have been validated against a traditional approach to web workload generation.
To this end, a new experimental environment has been developed that can reproduce traditional and dynamic web workloads by integrating the proposed generator with a commonly used benchmark.
This doctoral thesis also analyzes and evaluates, for the first time to the best of our knowledge, the impact of using dynamic workloads on the metrics / Peña Ortiz, R. (2013). Accurate workload design for web performance evaluation [Doctoral thesis]. Editorial Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/21054
|