1 |
Dynamic Power-Aware Techniques for Real-Time Multicore Embedded Systems
March Cabrelles, José Luis 30 March 2015
The continuous shrink of transistor sizes has allowed more complex and powerful devices
to be implemented in the same area, which provides new capabilities and functionalities.
However, this complexity increase comes with a considerable rise in power consumption.
This situation is critical in portable devices where the energy budget is limited and,
hence, battery lifetime defines the usefulness of the system. Therefore, power consumption
has become a major concern in the design of real-time multicore embedded systems.
This dissertation proposes several techniques aimed at saving energy without sacrificing
real-time schedulability in this type of system. The proposed techniques deal with
different main components of the system. In particular, the techniques affect the task
partitioner and the scheduler, as well as the memory controller.
Some of the techniques are especially tailored for multicores with shared Dynamic Voltage
and Frequency Scaling (DVFS) domains. Workload balancing among cores in a
given domain has a strong impact on power consumption, since all the cores sharing a
DVFS domain must run at the speed required by the most loaded core.
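
The effect of workload balance on a shared DVFS domain can be illustrated with a minimal sketch. It assumes, purely for illustration, that utilizations are measured at the maximum frequency, that execution time scales inversely with frequency, and that a core is schedulable at normalized speed s whenever its utilization does not exceed s. The function name `domain_frequency` and the frequency levels are hypothetical and do not come from the thesis.

```python
def domain_frequency(core_utilizations, levels):
    """Return the lowest normalized frequency that suits every core in a domain.

    core_utilizations: per-core utilization at maximum frequency (0..1).
    levels: available normalized frequency levels, e.g. [0.25, 0.5, 0.75, 1.0].
    Assumption (not from the thesis): a core with utilization u is
    schedulable at normalized speed s whenever u <= s.
    """
    # All cores in the domain share one speed, so the most loaded core decides.
    required = max(core_utilizations)
    for speed in sorted(levels):
        if required <= speed:
            return speed
    raise ValueError("task set is not schedulable even at maximum frequency")


levels = [0.25, 0.5, 0.75, 1.0]
balanced = domain_frequency([0.45, 0.45], levels)    # same total load, evenly split
unbalanced = domain_frequency([0.7, 0.2], levels)    # same total load, skewed
```

With the same total load of 0.9, the balanced split can run at a lower level than the skewed one, which is the reason balancing matters in a shared domain.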
In this thesis, a novel workload partitioning algorithm is proposed, namely Load-bounded
Resource Balancing (LRB). The proposal allocates tasks to cores so as to balance
the consumption of a given resource (processor or memory) among cores, improving real-time
schedulability by increasing the overlap between processor and memory activity. However, distributing
tasks in this way regardless of the individual core utilizations could lead to unfair
load distributions. That is, one of the cores could become much more loaded than the others.
To avoid this scenario, when a given utilization threshold is exceeded, tasks are assigned
to the least loaded core.
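
The two rules described above (balance one resource, then fall back to the least loaded core once a threshold is exceeded) can be sketched as follows. This is only one possible reading of the abstract, not the LRB algorithm as specified in the thesis: the task tuple layout, the definition of core load as processor plus memory utilization, and the name `partition` are all assumptions made for illustration.

```python
def partition(tasks, n_cores, threshold, resource="mem"):
    """Assign tasks to cores, balancing one resource under a load bound.

    tasks: list of (name, cpu_utilization, mem_utilization) tuples.
    threshold: hypothetical bound on a core's total utilization.
    resource: which consumption to balance, "cpu" or "mem".
    """
    cores = [{"cpu": 0.0, "mem": 0.0, "tasks": []} for _ in range(n_cores)]

    def load(core):
        # Assumption: total core load is processor plus memory utilization.
        return core["cpu"] + core["mem"]

    for name, cpu, mem in tasks:
        # Rule 1: prefer the core that has consumed least of the balanced resource.
        target = min(cores, key=lambda c: c[resource])
        # Rule 2: if that would exceed the bound, use the least loaded core instead.
        if load(target) + cpu + mem > threshold:
            target = min(cores, key=load)
        target["cpu"] += cpu
        target["mem"] += mem
        target["tasks"].append(name)
    return cores


example = partition(
    [("a", 0.5, 0.0), ("b", 0.1, 0.2), ("c", 0.3, 0.0)],
    n_cores=2,
    threshold=0.7,
)
```

In this example, tasks b and c would both land on the first core if only memory consumption were considered, because task a uses no memory; the load bound redirects them to the second core.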
Unfortunately, workload partitioning alone is sometimes not able to achieve a good workload
balance among cores. Therefore, this work also explores novel task migration
approaches. Two task migration heuristics are proposed. The first heuristic, referred to
as Single Option Migration (SOM), attempts to perform only one migration when the
workload changes, in order to improve utilization balance. Three variants of the SOM algorithm
have been devised, depending on the point in time at which the migration attempt is performed:
when a task arrives in the system (SOMin), when a task leaves the system (SOMout), and
in both cases (SOMin−out). The second heuristic, referred to as Multiple Option Migration
(MOM), explores an additional alternative workload partitioning before performing
the migration attempt.
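
A single-migration attempt in the spirit of SOM might look like the sketch below. The abstract does not say how the task or the cores are chosen, so the selection rule used here (move one task from the most loaded core to the least loaded core if doing so narrows the gap between them) is an assumption, as is the name `single_migration`.

```python
def single_migration(cores):
    """Attempt at most one task migration to improve utilization balance.

    cores: list of dicts mapping task name -> utilization (modified in place).
    Returns (task, source_index, destination_index), or None if no single
    migration narrows the gap between the most and least loaded cores.
    """
    loads = [sum(core.values()) for core in cores]
    src = max(range(len(cores)), key=loads.__getitem__)
    dst = min(range(len(cores)), key=loads.__getitem__)
    gap = loads[src] - loads[dst]

    best_task, best_gap = None, gap
    for task, utilization in cores[src].items():
        # Moving a task shifts its utilization from src to dst.
        new_gap = abs(gap - 2 * utilization)
        if new_gap < best_gap:
            best_task, best_gap = task, new_gap

    if best_task is None:
        return None
    cores[dst][best_task] = cores[src].pop(best_task)
    return best_task, src, dst


state = [{"a": 0.5, "b": 0.25}, {"c": 0.125}]
moved = single_migration(state)
```

Calling such a routine on task arrival, on task departure, or on both would correspond to the three variants named above.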
Regarding the memory controller, new scheduling policies are devised.
Conventional policies used in Non Real-Time (NRT) systems are not appropriate
for systems providing support for both Hard Real-Time (HRT) and Soft Real-Time
(SRT) tasks. Those policies can introduce variability in the latencies of the memory
requests and, hence, cause an HRT deadline miss that could lead to a critical failure of
the real-time system. To deal with this drawback, a simple policy, referred to as HR-first,
which prioritizes requests of HRT tasks, is proposed. In addition, a more advanced
approach, namely ATR-first, is presented. ATR-first prioritizes only those requests of
HRT tasks that are necessary to ensure real-time schedulability, improving the Quality
of Service (QoS) of SRT tasks.
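
The simpler of the two policies, strict priority for HRT requests, can be sketched as a two-queue arbiter. The class name `HrtFirstQueue`, the first-come-first-served order within each class, and the request identifiers are assumptions for illustration; a real memory controller would also account for DRAM timing and bank state, which this sketch ignores.

```python
from collections import deque


class HrtFirstQueue:
    """Toy request arbiter: pending HRT requests are always served first."""

    def __init__(self):
        self._hrt = deque()    # requests issued by hard real-time tasks
        self._other = deque()  # requests issued by all other tasks

    def submit(self, request_id, is_hrt):
        # Assumption: requests within one class are served in arrival order.
        (self._hrt if is_hrt else self._other).append(request_id)

    def next_request(self):
        if self._hrt:
            return self._hrt.popleft()
        if self._other:
            return self._other.popleft()
        return None


queue = HrtFirstQueue()
for request_id, is_hrt in [("s1", False), ("h1", True), ("s2", False), ("h2", True)]:
    queue.submit(request_id, is_hrt)
served = [queue.next_request() for _ in range(4)]
```

ATR-first is described as relaxing this strict ordering so that HRT requests are prioritized only when needed for schedulability; the abstract does not say how that need is determined, so it is not sketched here.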
Finally, this thesis also tackles dynamic execution time estimation. The accuracy
of this estimation is important not only to avoid deadline misses of HRT tasks but also to increase
QoS in SRT systems. It can also help to improve the schedulability of the systems
and reduce power consumption. The Processor-Memory (Proc-Mem) model, which
dynamically predicts the execution time of real-time applications for each frequency level,
is proposed. During the first hyperperiod, this model uses Performance
Monitoring Counters (PMCs) at run-time to measure the portion of time that each core spends performing
computation (CPU), waiting for memory (MEM), or both (OVERLAP). This
information is then used to estimate the execution time at any other working frequency.

March Cabrelles, JL. (2014). Dynamic Power-Aware Techniques for Real-Time Multicore Embedded Systems [Tesis doctoral]. Editorial Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/48464
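
To illustrate how a CPU / MEM / OVERLAP breakdown measured at one frequency could inform an estimate at another, the sketch below computes simple lower and upper bounds. It is not the Proc-Mem model itself, whose equations are not given in the abstract. It assumes that compute time scales inversely with core frequency, that memory time is independent of core frequency, and that the amount of overlap at the new frequency is unknown; the function name and parameters are hypothetical.

```python
def execution_time_bounds(cpu, mem, overlap, f_measured, f_target):
    """Bound the execution time at f_target from a breakdown taken at f_measured.

    cpu: time spent only computing; mem: time spent only waiting for memory;
    overlap: time spent doing both at once (all in the same time unit).
    Assumptions (not from the thesis): compute time scales with
    f_measured / f_target, memory time does not depend on core frequency.
    """
    scale = f_measured / f_target
    compute_total = (cpu + overlap) * scale  # all compute work, rescaled
    memory_total = mem + overlap             # all memory work, unchanged
    lower = max(compute_total, memory_total)  # best case: perfect overlap
    upper = compute_total + memory_total      # worst case: no overlap at all
    return lower, upper


# Breakdown measured at 2.0 GHz, estimate requested for 1.0 GHz.
low, high = execution_time_bounds(4.0, 2.0, 1.0, f_measured=2.0, f_target=1.0)
```

At the measured frequency itself, the observed time (cpu + mem + overlap) falls between the two bounds, which is a quick sanity check on the sketch.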
|
2 |
Auto-tuning Hybrid CPU-GPU Execution of Algorithmic Skeletons in SkePU
Öhberg, Tomas January 2018
The trend in computer architectures has for several years been toward heterogeneous systems consisting of a regular CPU and at least one additional, specialized processing unit, such as a GPU. The different characteristics of the processing units and the requirement of multiple tools and programming languages make programming such systems a challenging task. Although there exist tools for programming each processing unit, utilizing the full potential of a heterogeneous computer still requires specialized implementations involving multiple frameworks and hand-tuning of parameters. To fully exploit the performance of heterogeneous systems for a single computation, hybrid execution is needed, i.e., execution where the workload is distributed between multiple, heterogeneous processing units working simultaneously on the computation. This thesis presents the implementation of a new hybrid execution backend in the algorithmic skeleton framework SkePU. The skeleton framework already gives programmers a user-friendly interface to algorithmic templates, executable on different hardware using OpenMP, CUDA and OpenCL. With this extension it is now also possible to divide the computational work of the skeletons between multiple processing units, such as between a CPU and a GPU. The results show an improvement in execution time with the hybrid execution implementation for all skeletons in SkePU. It is also shown that the new implementation results in a lower and more predictable execution time compared to a dynamic scheduling approach based on an earlier implementation of hybrid execution in SkePU.
|