• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 2
  • Tagged with
  • 2
  • 2
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Dynamic Power-Aware Techniques for Real-Time Multicore Embedded Systems

March Cabrelles, José Luis 30 March 2015 (has links)
The continuous shrink of transistor sizes has allowed more complex and powerful devices to be implemented in the same area, which provides new capabilities and functionalities. However, this complexity increase comes with a considerable rise in power consumption. This situation is critical in portable devices where the energy budget is limited and, hence, battery lifetime defines the usefulness of the system. Therefore, power consumption has become a major concern in the design of real-time multicore embedded systems. This dissertation proposes several techniques aimed to save energy without sacrifying real-time schedulability in this type of systems. The proposed techniques deal with different main components of the system. In particular, the techniques affect the task partitioner and the scheduler, as well as the memory controller. Some of the techniques are especially tailored for multicores with shared Dynamic Voltage and Frequency Scaling (DVFS) domains. Workload balancing among cores in a given domain has a strong impact on power consumption, since all the cores sharing a DVFS domain must run at the speed required by the most loaded core. In this thesis, a novel workload partitioning algorithm is proposed, namely Loadbounded Resource Balancing (LRB). The proposal allocates tasks to cores to balance a given resource (processor or memory) consumption among cores, improving real-time schedulability by increasing overlapping between processor and memory. However, distributing tasks in this way regardless the individual core utilizations could lead to unfair load distributions. That is, one of the cores could become much loaded than the others. To avoid this scenario, when a given utilization threshold is exceeded, tasks are assigned to the least loaded core. Unfortunately, workload partitioning alone is sometimes not able to achieve a good workload balance among cores. Therefore, this work also explores novel task migration approaches. Two task migration heuristics are proposed. The first heuristic, referred to as Single Option Migration (SOM ), attempts to perform only one migration when the workload changes to improve utilization balance. Three variants of the SOM algorithm have been devised, depending on the point of time the migration attempt is performed: when a task arrives to the system (SOMin), when a task leaves the system (SOMout), and in both cases (SOMin−out). The second heuristic, referred to as Multiple Option Migration (MOM ) explores an additional alternative workload partitioning before performing the migration attempt. Regarding the memory controller, memory controller scheduling policies are devised. Conventional policies used in Non Real-Time (NRT) systems are not appropriate for systems providing support for both Hard Real-Time (HRT) and Soft Real-Time (SRT) tasks. Those policies can introduce variability in the latencies of the memory requests and, hence, cause an HRT deadline miss that could lead to a critical failure of the real-time system. To deal with this drawback, a simple policy, referred to as HR- first, which prioritizes requests of HRT tasks, is proposed. In addition, a more advanced approach, namely ATR-first, is presented. ATR-first prioritizes only those requests of HRT tasks that are necessary to ensure real-time schedulability, improving the Quality of Service (QoS) of SRT tasks. Finally, this thesis also tackles dynamic execution time estimation. The accuracy of this estimation is important to avoid deadline misses of HRT tasks but also to increase QoS in SRT systems. Besides, it can also help to improve the schedulability of the systems and reduce power consumption. The Processor-Memory (Proc-Mem) model, that dynamically predicts the execution time of real-time application for each frequency level, is proposed. This model measures at the first hyperperiod, making use of Performance Monitoring Counters (PMCs) at run-time, the portion of time that each core is performing computation (CPU ), waiting for memory (MEM ), or both (OVERLAP). This information will be used to estimate the execution time at any other working frequency / March Cabrelles, JL. (2014). Dynamic Power-Aware Techniques for Real-Time Multicore Embedded Systems [Tesis doctoral]. Editorial Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/48464 / TESIS
2

Auto-tuning Hybrid CPU-GPU Execution of Algorithmic Skeletons in SkePU

Öhberg, Tomas January 2018 (has links)
The trend in computer architectures has for several years been heterogeneous systems consisting of a regular CPU and at least one additional, specialized processing unit, such as a GPU.The different characteristics of the processing units and the requirement of multiple tools and programming languages makes programming of such systems a challenging task. Although there exist tools for programming each processing unit, utilizing the full potential of a heterogeneous computer still requires specialized implementations involving multiple frameworks and hand-tuning of parameters.To fully exploit the performance of heterogeneous systems for a single computation, hybrid execution is needed, i.e. execution where the workload is distributed between multiple, heterogeneous processing units, working simultaneously on the computation. This thesis presents the implementation of a new hybrid execution backend in the algorithmic skeleton framework SkePU. The skeleton framework already gives programmers a user-friendly interface to algorithmic templates, executable on different hardware using OpenMP, CUDA and OpenCL. With this extension it is now also possible to divide the computational work of the skeletons between multiple processing units, such as between a CPU and a GPU. The results show an improvement in execution time with the hybrid execution implementation for all skeletons in SkePU. It is also shown that the new implementation results in a lower and more predictable execution time compared to a dynamic scheduling approach based on an earlier implementation of hybrid execution in SkePU.

Page generated in 0.126 seconds