11

Effective task assignment strategies for distributed systems under highly variable workloads

Broberg, James Andrew, james@broberg.com.au January 2007 (has links)
Heavy-tailed workload distributions are commonly experienced in many areas of distributed computing. Such workloads are highly variable: a small number of very large tasks make up a large proportion of the workload, making the load very hard to distribute effectively. Traditional task assignment policies are ineffective under these conditions, as they were formulated on the assumption of an exponentially distributed workload. Size-based task assignment policies have been proposed to handle heavy-tailed workloads, but their applicability is limited by their static nature and their assumption of prior knowledge of a task's service requirement. This thesis analyses existing approaches to load distribution under heavy-tailed workloads, and presents a new generalised task assignment policy that significantly improves performance for many distributed applications by intelligently addressing the negative effects that highly variable workloads have on performance. Many problems associated with the modelling and optimisation of systems under highly variable workloads are then addressed by a novel technique that approximates these workloads with simpler mathematical representations, without losing any of their pertinent original properties. Finally, advanced queueing metrics (such as the variance of key measures like waiting time and slowdown, which are difficult to obtain analytically) are obtained through rigorous simulation.
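As a rough illustration of the effect described above (and not the thesis's generalised policy itself), the following Python sketch compares a random split of work across four hosts against a SITA-style size-interval split, using the Pollaczek-Khinchine formula for M/G/1 FCFS queues; the Pareto shape, arrival rate and interval cutoffs are invented for the example.

```python
"""
A minimal sketch (illustrative, not the thesis's actual policy) of why
size-based task assignment helps under heavy-tailed workloads.  For an M/G/1
FCFS queue the Pollaczek-Khinchine formula gives the mean waiting time
    W = lambda * E[S^2] / (2 * (1 - rho)),   rho = lambda * E[S],
so splitting a heavy-tailed size distribution into size intervals (one host
per interval, SITA-style) slashes each host's E[S^2].  The Pareto shape,
arrival rate and cutoffs below are made-up illustrative numbers.
"""
import random

def pareto_samples(n, alpha=2.2, x_min=1.0, seed=0):
    """Draw heavy-tailed task sizes via inverse-CDF sampling."""
    rng = random.Random(seed)
    return [x_min / (1.0 - rng.random()) ** (1.0 / alpha) for _ in range(n)]

def pk_mean_wait(sizes, lam):
    """Pollaczek-Khinchine mean wait for an M/G/1 queue fed at rate lam."""
    es = sum(sizes) / len(sizes)
    es2 = sum(s * s for s in sizes) / len(sizes)
    rho = lam * es
    assert rho < 1.0, "queue must be stable"
    return lam * es2 / (2.0 * (1.0 - rho))

sizes = pareto_samples(500_000)
total_rate = 1.6                      # total task arrival rate across 4 hosts

# Policy 1: random split -- every host sees the full high-variance size
# distribution at a quarter of the arrival rate.
w_random = pk_mean_wait(sizes, total_rate / 4)

# Policy 2: SITA-like size intervals.  Cutoffs are hand-picked so that each
# host carries roughly a quarter of the offered load for this Pareto shape.
cutoffs = [1.27, 1.78, 3.17, float("inf")]
lo, w_sita = 0.0, 0.0
for hi in cutoffs:
    bucket = [s for s in sizes if lo < s <= hi]
    frac = len(bucket) / len(sizes)   # share of tasks routed to this host
    w_sita += frac * pk_mean_wait(bucket, total_rate * frac)
    lo = hi

print(f"mean wait, random split    : {w_random:7.2f}")
print(f"mean wait, size-based split: {w_sita:7.2f}")
```

With these made-up numbers the size-based split noticeably reduces the mean wait, because each host now sees a far lower E[S^2]; the gap widens as the tail gets heavier, which is the regime the thesis targets.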
12

Automatic generation of synthetic workloads for multicore systems

Ganesan, Karthik 11 July 2012 (has links)
When designing a computer system, benchmark programs are used with cycle-accurate performance/power simulators and HDL-level simulators to evaluate novel architectural enhancements, perform design space exploration, understand the worst-case power characteristics of various designs and find performance bottlenecks. This research effort is directed towards automatically generating synthetic benchmarks to tackle three design challenges: 1) for most simulation-related purposes, full runs of modern real-world parallel applications such as the PARSEC and SPLASH suites cannot be used, as they take machine-weeks on cycle-accurate and HDL-level simulators, incurring a prohibitively large time cost; 2) some of these real-world applications are intellectual property and cannot be shared with processor vendors for design studies; 3) the most significant problem in the design stage is the complexity involved in fixing the maximum power consumption of a multicore design, called the Thermal Design Power (TDP). In an effort to fix this maximum power consumption of a system at the optimal point, designers have traditionally hand-crafted code snippets called power viruses, but manually writing such maximum-power-consuming code snippets is very tedious. All of these challenges have led to a resurgence of synthetic benchmarks in the recent past, as they offer a promising solution to all three. During the design stage of a multicore system, the availability of a framework to automatically generate system-level synthetic benchmarks for multicore systems will greatly simplify the design process and result in more confident design decisions. The key idea behind such an adaptable benchmark synthesis framework is to identify the key characteristics of real-world parallel applications that affect the performance and power consumption of a real program, and to create synthetic executable programs by varying the values of these characteristics. Firstly, with such a framework, one can generate miniaturized synthetic clones for large target (current and futuristic) parallel applications, enabling an architect to use them with slow low-level simulation models (e.g., RTL models in VHDL/Verilog) and helping to tailor designs to the targeted applications. Secondly, these synthetic benchmark clones can be distributed to architects and designers even when the original applications are intellectual property and not publicly available. Lastly, such a framework can be used to automatically create maximum-power-consuming code snippets to help in fixing the TDP, heat sinks, cooling system and other power-related features of the system. The workload cloning framework built using the proposed synthetic benchmark generation methodology is evaluated to show its superiority over the existing cloning methodologies for single-core systems by generating miniaturized clones for CPU2006 and ImplantBench workloads with only an average error of 2.9% in performance for up to five orders of magnitude of simulation speedup. The correlation coefficient predicting the sensitivity to design changes is 0.95 for performance and 0.98 for power consumption. The proposed framework is further evaluated by cloning parallel applications implemented with Pthreads and OpenMP from the PARSEC benchmark suite. The average error in predicting performance is 4.87% and that of power consumption is 2.73%. The correlation coefficient predicting the sensitivity to design changes is 0.92 for performance. The efficacy of the proposed synthetic benchmark generation framework for power virus generation is evaluated on the SPARC, Alpha and x86 ISAs using full-system simulators and also on real hardware. The results show that the power viruses generated for single-core systems consume 14-41% more power compared to MPrime on the SPARC ISA. Similarly, the power viruses generated for multicore systems consume 45-98%, 40-89% and 41-56% more power than PARSEC workloads, multiple copies of MPrime and multithreaded SPECjbb, respectively. / text
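To make the idea of "varying the values of these characteristics" concrete, here is a toy Python sketch (far simpler than the framework described above, with an invented profile and no power model) that turns a small workload profile into a synthetic C kernel whose memory footprint, access stride and compute-to-memory ratio follow the profile.

```python
"""
A toy illustration (not the thesis's actual framework) of profile-driven
benchmark synthesis: a small workload profile -- memory footprint, access
stride, compute-to-memory ratio -- is turned into a synthetic C kernel whose
behaviour is governed by those knobs.  All profile numbers are invented.
"""

profile = {
    "name":            "synthetic_clone",
    "footprint_kb":    4096,        # working-set size swept by the kernel
    "stride_bytes":    64,          # spatial-locality knob (one cache line)
    "alu_ops_per_mem": 6,           # compute-to-memory ratio
    "iterations":      10_000_000,
}

def emit_kernel(p):
    """Emit C source whose loop body reflects the profile's characteristics."""
    elems = p["footprint_kb"] * 1024 // 8        # array of doubles
    stride = max(1, p["stride_bytes"] // 8)      # stride in array elements
    alu = "\n        ".join(
        f"acc = acc * 1.000001 + {i};" for i in range(p["alu_ops_per_mem"])
    )
    return f"""#include <stdio.h>

static double buf[{elems}];

int main(void) {{
    double acc = 0.0;
    unsigned long idx = 0;
    for (long it = 0; it < {p['iterations']}L; it++) {{
        acc += buf[idx];                  /* memory access over the footprint */
        {alu}
        idx = (idx + {stride}) % {elems}; /* strided walk, wraps at footprint */
    }}
    printf("%f\\n", acc);                 /* keep the result observable */
    return 0;
}}
"""

with open(profile["name"] + ".c", "w") as f:
    f.write(emit_kernel(profile))
print("wrote", profile["name"] + ".c")
```

A real framework would drive many more knobs (thread counts, sharing patterns, branch behaviour, instruction mix) and calibrate them against measurements of the target application; this sketch only shows the generation step.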
13

Karių ir ištvermę ugdančių lengvaatlečių širdies ir kraujagyslių funkcijos ypatybės atliekant aerobinius krūvius / Peculiarities of cardiovascular function during aerobic workloads in soldiers and well-trained endurance runners

Juodeškienė, Inga 15 May 2006 (has links)
SUMMARY The mission of Lithuanian army physical training is to form a healthy, strong, physically well-prepared soldier, ready to fulfil all assigned tactical tasks. The aim of this research was to evaluate the cardiovascular function of soldiers developing endurance while performing aerobic loads. The investigation was carried out at the stadium of the "Vytautas Magnus Keepers" battalion in Kaunas, and permission to perform the biomedical research was requested from, and granted by, the battalion commander. Six highly trained endurance athletes also participated in this research. The participants voluntarily agreed to perform the testing. After an individual warm-up, all the participants (soldiers and runners) ran a 5 km cross-country course (the task of the exercise was endurance training: even-paced running in the aerobic zone). In the second study, the runners ran a 12 km cross-country course (the task of the exercise was endurance training: even-paced running in the aerobic zone). Changes in heart rate (HR) were measured using a "Polar S810" HR monitor: the instantaneous and average HR (HRmax), recorded for each chosen 1000 m segment, together with the duration (t) of each 1000 m running segment. The index of biological load value was calculated from these registered indicators. In the third study, the participants performed fitness testing in the LKKA Laboratory of Kinesiology: the Roufier physical load test, while... [to full text]
14

Harnessing Data Parallel Hardware for Server Workloads

Agrawal, Sandeep R. January 2015 (has links)
Trends in increasing web traffic demand an increase in server throughput while preserving energy efficiency and total cost of ownership. Present work in optimizing data center efficiency primarily focuses on using general purpose processors; however, these might not be the most efficient platforms for server workloads. Data parallel hardware achieves high energy efficiency by amortizing instruction costs across multiple data streams, and high throughput by enabling massive parallelism across independent threads. These benefits are traditionally considered applicable to scientific workloads, and common server tasks like page serving or search are considered unsuitable for a data parallel execution model.

Our work builds on the observation that server workload execution patterns are not completely unique across multiple requests. For a high enough arrival rate, a server has the opportunity to launch cohorts of similar requests on data parallel hardware, improving server performance and power/energy efficiency. We present a framework---called Rhythm---for high throughput servers that can exploit similarity across requests to improve server performance and power/energy efficiency by launching data parallel executions for request cohorts. An implementation of the SPECWeb Banking workload using Rhythm on NVIDIA GPUs provides a basis for evaluation.

Similarity search is another ubiquitous server workload that involves identifying the nearest neighbors to a given query across a large number of points. We explore the performance, power and dollar benefits of using accelerators to perform similarity search for query cohorts in very high dimensions under tight deadlines, and demonstrate an implementation on GPUs that searches across a corpus of billions of documents and is significantly cheaper than commercial deployments. We show that with software and system modifications, data parallel designs can greatly outperform common task parallel implementations. / Dissertation
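A minimal sketch of the cohort idea (not Rhythm itself): batching a cohort of similarity-search queries turns many small lookups into one dense matrix product, the kind of regular, data-parallel work that maps well onto GPUs. The corpus size, dimensionality and cohort size below are illustrative, and NumPy on the CPU stands in for the GPU kernels.

```python
"""
A minimal sketch (not Rhythm itself) of cohort-batched similarity search:
answer a whole cohort of queries with one dense matrix product instead of
one query at a time.  Corpus size, dimensions and cohort size are invented.
"""
import numpy as np

rng = np.random.default_rng(0)
corpus = rng.standard_normal((100_000, 256)).astype(np.float32)  # documents
cohort = rng.standard_normal((64, 256)).astype(np.float32)       # batched queries

# Normalise once so cosine similarity reduces to a dot product.
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)
cohort /= np.linalg.norm(cohort, axis=1, keepdims=True)

# One cohort-wide GEMM: (64 x 256) @ (256 x 100000) -> (64 x 100000) scores.
scores = cohort @ corpus.T

# Top-5 candidate neighbours per query, recovered from the batched scores.
top_k = 5
nearest = np.argpartition(-scores, top_k, axis=1)[:, :top_k]
print(nearest.shape)   # (64, 5): 5 candidate neighbours for each of 64 queries
```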
15

User experience driven CPU frequency scaling on mobile devices : towards better energy efficiency

Seeker, Volker Günter January 2017 (has links)
With the development of modern smartphones, mobile devices have become ubiquitous in our daily lives. With high processing capabilities and a vast number of applications, users now need them for both business and personal tasks. Unfortunately, battery technology has not scaled at the same speed as computational power. Hence, modern smartphone batteries often last for less than a day before they need to be recharged. One of the most power-hungry components is the central processing unit (CPU). Multiple techniques are applied to reduce CPU energy consumption. Among them is dynamic voltage and frequency scaling (DVFS). This technique reduces energy consumption by dynamically changing the CPU supply voltage depending on the currently running workload. Reducing the voltage, however, also makes it necessary to reduce the clock frequency, which can have a significant impact on task performance. Current DVFS algorithms deliver a good user experience; however, as experiments conducted later in this thesis show, they do not deliver optimal energy efficiency for an interactive mobile workload. This thesis presents methods and tools to determine where energy can be saved during mobile workload execution when using DVFS. Furthermore, an improved DVFS technique is developed that achieves higher energy efficiency than the current standard. One important question when developing a DVFS technique is: how much can a task be slowed down to save energy before the negative effect on performance becomes intolerable? The ultimate goal when optimising a mobile system is to provide a high quality of experience (QOE) to the end user. In that context, task slowdowns become intolerable when they have a perceptible effect on QOE. Experiments conducted in this thesis answer this question by identifying workload periods in which performance changes are directly perceptible by the end user and periods where they are imperceptible, namely interaction lags and interaction idle periods. Interaction lags are the time it takes the system to process a user interaction and display a corresponding response. Idle periods are the periods between interactions in which the user perceives the system as idle and ready for the next input. By knowing where those periods are and how they are affected by frequency changes, a more energy-efficient DVFS governor can be developed. This thesis begins by introducing a methodology that measures the duration of interaction lags as perceived by the user and uses them as an indicator to benchmark the quality of experience of a workload execution. A representative benchmark workload is generated comprising 190 minutes of interactions collected from real users. In conjunction with this QOE benchmark, a DVFS Oracle study is conducted, which finds a frequency profile for an interactive mobile workload with the maximum energy savings achievable without a perceptible performance impact on the user. The developed Oracle performance profile achieves a QOE indistinguishable from always running at the fastest frequency while needing 45% less energy. Furthermore, this Oracle is used as a baseline to evaluate how well current mobile frequency governors are performing. It shows that none of these governors perform particularly well and that up to 32% energy savings are possible. Equipped with a benchmark and an optimisation baseline, a user-perception-aware DVFS technique is developed in the second part of this thesis.
Initially, a runtime heuristic is introduced that detects interaction lags as the user would perceive them. Using this heuristic, a reinforcement-learning-driven governor is developed that learns good frequency settings for interaction-lag and idle periods from sample observations. It consumes up to 22% less energy than the current standard governors on mobile devices while maintaining a low impact on QOE.
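The following toy sketch illustrates the reinforcement-learning idea under invented assumptions: a two-state, contextual-bandit simplification in which the agent picks a frequency separately for "lag" and "idle" periods, with a made-up energy model and a 100 ms perceptibility threshold. It is not the governor developed in the thesis.

```python
"""
An illustrative sketch (not the thesis's governor): a tabular, bandit-style
learner picks a CPU frequency level separately for "interaction lag" and
"idle" periods, trading energy against a penalty whenever a lag period
becomes perceptibly slow.  The environment model, energy numbers and the
100 ms perceptibility threshold are invented for this sketch.
"""
import random

FREQS_MHZ = [300, 600, 1200, 1800]       # candidate frequency levels
STATES = ["lag", "idle"]                 # period type reported by a heuristic
ALPHA, EPSILON = 0.1, 0.1                # learning rate, exploration rate

q = {(s, f): 0.0 for s in STATES for f in FREQS_MHZ}

def simulate_period(state, freq):
    """Toy environment: returns (energy_cost, perceptible_lag?)."""
    energy = (freq / 300) ** 2           # energy grows superlinearly with freq
    if state == "idle":
        return energy * 0.2, False       # idle periods: frequency barely matters
    latency_ms = 90 * (1800 / freq)      # lag periods: work scales with 1/freq
    return energy, latency_ms > 100      # >100 ms treated as perceptible

def choose(state, rng):
    if rng.random() < EPSILON:           # explore
        return rng.choice(FREQS_MHZ)
    return max(FREQS_MHZ, key=lambda f: q[(state, f)])   # exploit

rng = random.Random(0)
for step in range(20_000):
    state = rng.choice(STATES)
    freq = choose(state, rng)
    energy, perceptible = simulate_period(state, freq)
    reward = -energy - (50.0 if perceptible else 0.0)    # big penalty for visible lag
    q[(state, freq)] += ALPHA * (reward - q[(state, freq)])

for s in STATES:
    best = max(FREQS_MHZ, key=lambda f: q[(s, f)])
    print(f"learned frequency for {s:>4} periods: {best} MHz")
```

Under this toy model the agent settles on the highest frequency for lag periods and the lowest for idle periods, mirroring the intuition that speed is only worth its energy cost where the user can perceive it.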
16

An Intelligent Framework for Energy-Aware Mobile Computing Subject to Stochastic System Dynamics

January 2017 (has links)
abstract: User satisfaction is pivotal to the success of mobile applications. At the same time, it is imperative to maximize the energy efficiency of the mobile device to ensure optimal usage of the limited energy source available to mobile devices while maintaining the necessary levels of user satisfaction. However, this is complicated by user interactions, numerous shared resources, and network conditions that introduce substantial uncertainty into the mobile device's performance and power characteristics. In this dissertation, a new approach to characterizing and controlling mobile devices is presented that accurately models these uncertainties. The proposed modeling framework is a completely data-driven approach to predicting power and performance. The approach makes no assumptions about the distributions of the underlying sources of uncertainty and is capable of predicting power and performance with over 93% accuracy. Using this data-driven prediction framework, a closed-loop solution to the DEM problem is derived to maximize the energy efficiency of the mobile device subject to various thermal, reliability and deadline constraints. The design of the controller imposes minimal operational overhead and is able to tune the performance and power prediction models to changing system conditions. The proposed controller is implemented on a real mobile platform, the Google Pixel smartphone, and demonstrates a 19% improvement in energy efficiency over the standard frequency governor implemented on all Android devices. / Dissertation/Thesis / Doctoral Dissertation Computer Engineering 2017
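As a small illustration of the data-driven, distribution-free prediction idea (not the dissertation's actual models), the sketch below predicts power from hypothetical runtime features with a plain k-nearest-neighbour regressor over synthetic data; the feature names and coefficients are invented.

```python
"""
A minimal sketch of the data-driven idea (not the dissertation's models):
predict device power from observed runtime features with a nonparametric
k-nearest-neighbour regressor, so no distributional assumptions are baked in.
The feature names and synthetic training data are invented for illustration.
"""
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical runtime features: [cpu_util, gpu_util, net_util, screen_brightness]
X_train = rng.uniform(0.0, 1.0, size=(5000, 4))
# Synthetic "measured" power in watts, with noise standing in for the
# uncertainty from user interactions, shared resources and network conditions.
true_power = (0.5 + 2.0 * X_train[:, 0] + 1.2 * X_train[:, 1]
              + 0.4 * X_train[:, 2] + 0.8 * X_train[:, 3] ** 2)
y_train = true_power + rng.normal(0.0, 0.15, size=5000)

def predict_power(x, k=25):
    """Average the power of the k most similar previously observed states."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argpartition(dists, k)[:k]
    return float(y_train[nearest].mean())

sample = np.array([0.9, 0.1, 0.3, 0.7])    # a busy-CPU, quiet-network state
print(f"predicted power: {predict_power(sample):.2f} W")
```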
17

Die invloed van die werksomstandighede van hoërskoolonderwysers op hulle houding teenoor hulle werk en hulle motiveringsvlak / The influence of the work circumstances of high school teachers on their attitude towards their work and their levels of motivation

Botha, Jan Jakobus 23 July 2008 (has links)
The influence of the work circumstances of high school teachers on their attitude towards their work and their levels of motivation. 1. Introduction: The working circumstances of high school teachers have changed over the past few years, and more specifically since the implementation of OBE and the rationalisation of teachers. The research presumption is that, as a result of the changes, teachers' workloads have increased and, consequently, they have less time to complete more work. The latter includes preparation for the various learning areas, as well as OBE administration. 2. Background: Certain elements in teachers' work circumstances which influence attitudes include the following: The change in work circumstances has resulted in a greater workload, which consequently led to an increase in stress levels. Higher stress levels are experienced by teachers worldwide, and not only in South Africa. Teachers have to work longer hours to complete their work. Many teachers do not experience job satisfaction in their present circumstances. Changes in education influence the attitudes, achievements and performance of teachers. This causes a decrease in their motivation levels. Statistics show that teachers have low morale countrywide. 3. Research Methodology: Research question: How did the changed work circumstances of teachers influence their attitude towards their work and their motivation levels? This is a qualitative study in which six interviews were conducted with teachers who have more than eight years' experience at parallel-medium (former Model C) and black schools. / Dissertation (MEd (Educational Leadership))--University of Pretoria, 2006. / Education Management and Policy Studies / MEd / unrestricted
18

Cache Characterization and Performance Studies Using Locality Surfaces

Sorenson, Elizabeth Schreiner 14 July 2005 (has links) (PDF)
Today's processors commonly use caches to help overcome the disparity between processor and main memory speeds. Due to the principle of locality, most of the processor's requests for data are satisfied by the fast cache memory, resulting in a significant performance improvement. Methods for evaluating workloads and caches in terms of locality are valuable for cache design. In this dissertation, we present a locality surface which displays both temporal and spatial locality on one three-dimensional graph. We provide a solid, mathematical description of locality data and equations for visualization. We then use the locality surface to examine the locality of a variety of workloads from the SPEC CPU 2000 benchmark suite. These surfaces contain a number of features that represent sequential runs, loops, temporal locality, striding, and other patterns from the input trace. The locality surface can also be used to evaluate methodologies that involve locality. For example, we evaluate six synthetic trace generation methods and find that none of them accurately reproduce an original trace's locality. We then combine a mathematical description of caches with our locality definition to create cache characterization surfaces. These new surfaces visually relate how references with varying degrees of locality function in a given cache. We examine how varying the cache size, line size, and associativity affect a cache's response to different types of locality. We formally prove that the locality surface can predict the miss rate in some types of caches. Our locality surface matches well with cache simulation results, particularly caches with large associativities. We can qualitatively choose prudent values for cache and line size. Further, the locality surface can predict the miss rate with 100% accuracy for some fully associative caches and with some error for set associative caches. One drawback to the locality surface is the time intensity of the stack-based algorithm. We provide a new parallel algorithm that reduces the computation time significantly. With this improvement, the locality surface becomes a viable and valuable tool for characterizing workloads and caches, predicting cache simulation results, and evaluating any procedure involving locality.
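A simplified sketch of the kind of computation behind a locality surface, using an assumed LRU stack-distance definition and coarse logarithmic buckets rather than the dissertation's exact formulation:

```python
"""
A simplified sketch (not the dissertation's exact definition) of building a
locality surface from an address trace: for each reference, record its LRU
stack distance (temporal locality) and its address delta from the previous
reference (spatial locality), then histogram the pairs into a 2-D surface.
The tiny synthetic trace and bucketing scheme are illustrative only.
"""
import math
from collections import Counter

def bucket(v):
    """Coarse power-of-two bucketing for plotting."""
    if v == float("inf"):
        return "inf"
    return 0 if v < 1 else 1 << int(math.log2(v))

def locality_surface(trace):
    """Return a Counter mapping (stack-distance bucket, stride bucket) -> count."""
    stack = []                      # LRU stack of previously seen addresses
    prev = None
    surface = Counter()
    for addr in trace:
        # Temporal axis: LRU stack distance (infinite on a first reference).
        dist = stack.index(addr) if addr in stack else float("inf")
        if addr in stack:
            stack.remove(addr)
        stack.insert(0, addr)
        # Spatial axis: delta from the previous reference.
        stride = 0 if prev is None else addr - prev
        prev = addr
        surface[(bucket(dist), bucket(abs(stride)))] += 1
    return surface

# A toy trace: a sequential run over 16 addresses, looped three times.
trace = list(range(0, 64, 4)) * 3
for key, count in sorted(locality_surface(trace).items(), key=str):
    print(key, count)
```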
19

PERFORMANCE ANALYSIS OF DECISION SUPPORT WORKLOADS FOR THE DESKTOP ENVIRONMENT

KAVALANEKAR, SWAROOP V. 02 September 2003 (has links)
No description available.
20

Power Provisioning for Diverse Datacenter Workloads

Li, Jing 26 September 2011 (has links)
No description available.
