  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Power Profiling of different Heterogeneous Computers

Atla, Prashant January 2017
Context: The use of communication services is increasing. The growth in usage, and in services that rely on the communication network, has increased the energy consumption of all the resources involved, such as computers and other networking components. Energy consumption has become another efficiency metric, so there is a need for efficient networking services in various fields, which can be obtained by using efficient networking components such as computers. For that purpose, we have to know the energy usage behavior of each component. Similarly, the growth in the use of large data centers has created a huge requirement for computation resources. To use these resources efficiently, we need to measure each component of the system and its contribution to the total power consumption of the system. This can be achieved by power profiling of different heterogeneous computers in order to estimate and optimize the usage of the resources.
Objectives: In this study, we investigate the power profiles of different heterogeneous computers at the level of each system component, using a predefined workload. The total power consumption of each system component is measured and evaluated using the Open Energy Monitor (OEM).
Methods: To perform the power profiling, an experimental test bed is implemented. Experiments with a different workload on each component are conducted on all the computers. The power of every system under test (SUT) is measured using the OEM, which is connected to each SUT.
Results: The power profiles of the different SUTs are tabulated and analyzed. The profiles are produced at the component level under different workload scenarios for four different heterogeneous computers. From the results and analysis, it can be stated that the power consumed by each component of a computer varies with the computer's configuration. From the results, we also evaluate the property of the superposition principle.
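One way to read the superposition check mentioned in the results is sketched below. It assumes that the power attributed to a component is its loaded reading minus the idle baseline, and that superposition means the combined-workload reading is close to the idle baseline plus the sum of those per-component increments. That interpretation, the function names, and all wattage values are hypothetical illustrations, not measurements or definitions taken from the thesis.

```python
# Hypothetical sketch: every reading below is an invented wall-power value in watts.

def component_power(loaded_watts, idle_watts):
    """Power attributed to one workload: loaded reading minus the idle baseline."""
    return loaded_watts - idle_watts


def superposition_error(idle, single_loads, combined):
    """Relative gap between the measured combined power and the superposition estimate.

    single_loads maps a component name to the reading taken while only that
    component is stressed; combined is the reading with all of them stressed.
    """
    predicted = idle + sum(component_power(w, idle) for w in single_loads.values())
    return (combined - predicted) / combined


idle = 40.0                                              # assumed idle reading
single_loads = {"cpu": 75.0, "disk": 46.0, "network": 43.0}  # assumed readings
combined = 88.0                                          # assumed all-components reading

err = superposition_error(idle, single_loads, combined)
print(f"superposition error: {err:.1%}")
```

A small error would suggest that, for that machine and workload, the component contributions add up roughly linearly; a large one would suggest interaction between components.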
2

Cooperative Execution of OpenCL Programs on Multiple Heterogeneous Devices

Pandit, Prasanna Vasant January 2013
Computing systems have become heterogeneous with the increasing prevalence of multi-core CPUs, Graphics Processing Units (GPUs) and other accelerators in them. OpenCL has emerged as an attractive programming framework for heterogeneous systems. However, utilizing multiple devices in OpenCL is a challenge, as it requires the programmer to explicitly map data and computation to each device. Utilizing multiple devices simultaneously to speed up execution of a kernel is even more complex, as the relative execution time of the kernel on different devices can vary significantly. Also, after each kernel execution, a coherent version of the data needs to be established. This means that, in order to utilize all devices effectively, the programmer has to spend considerable time and effort to distribute work across all devices, keep track of modified data in these devices and correctly perform a merging step to put the data together. Further, the relative performance of a program may vary across different inputs, which means a statically determined work distribution may not work well. In this work, we present FluidiCL, an OpenCL runtime that takes a program written for a single device and uses multiple heterogeneous devices to execute each kernel. The runtime performs dynamic work distribution and cooperatively executes each kernel on all available devices. Since we consider a setup with devices having discrete address spaces, our solution ensures that execution of OpenCL work-groups on devices is adjusted by taking into account the overheads for data management. The data transfers and data merging needed to ensure coherence are handled transparently without requiring any effort from the programmer. FluidiCL also does not require prior training or profiling and is completely portable across different machines. Because it is dynamic, the runtime is able to adapt to system load. We have developed several optimizations for improving the performance of FluidiCL.
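The general idea of dynamic work distribution can be illustrated with a small simulation. The sketch below is not FluidiCL's algorithm: it assumes a generic scheme in which each device pulls a fixed-size chunk of work-groups from a shared pool whenever it becomes free, and it ignores the data-transfer and merging overheads that the abstract says the real runtime accounts for. The function name, device names, and per-work-group costs are hypothetical.

```python
import heapq


def distribute(num_groups, chunk, cost_per_group):
    """Simulate devices pulling chunks of work-groups from a shared pool.

    cost_per_group maps a device name to an assumed time per work-group.
    Returns (finish_time, work_groups_done_per_device).
    """
    done = {dev: 0 for dev in cost_per_group}
    # Priority queue of (time at which the device becomes free, device name).
    free_at = [(0.0, dev) for dev in sorted(cost_per_group)]
    heapq.heapify(free_at)
    next_group = 0
    finish = 0.0
    while next_group < num_groups:
        now, dev = heapq.heappop(free_at)            # earliest-free device
        take = min(chunk, num_groups - next_group)   # last chunk may be short
        next_group += take
        done[dev] += take
        end = now + take * cost_per_group[dev]
        finish = max(finish, end)
        heapq.heappush(free_at, (end, dev))
    return finish, done


# Assumed costs: the GPU is three times faster per work-group than the CPU.
finish, done = distribute(num_groups=120, chunk=10,
                          cost_per_group={"cpu": 3.0, "gpu": 1.0})
print(finish, done)
```

Because faster devices return for more chunks sooner, the split follows the devices' relative speed without any prior profiling, which is the property a static split lacks when that relative speed changes with the input.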
We evaluate the runtime across different sets of devices. On a machine with an Intel quad-core processor and an NVidia Fermi GPU, FluidiCL shows a geomean speedup of nearly 64% over the GPU, 88% over the CPU and 14% over the best of the two devices in each benchmark. In all benchmarks, performance of our runtime comes to within 13% of the best of the two devices. FluidiCL shows similar results on a machine with a quad-core CPU and an NVidia Kepler GPU, with up to 26% speedup over the best of the two. We also present results considering an Intel Xeon Phi accelerator and a CPU and find that FluidiCL performs up to 45% faster than the best of the two devices. We extend FluidiCL from a CPU–GPU scenario to a three-device setup having a quad-core CPU, an NVidia Kepler GPU and an Intel Xeon Phi accelerator and find that FluidiCL obtains a geomean improvement of 6% in kernel execution time over the best of the three devices considered in each case.
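For readers unfamiliar with the metric, a geomean speedup over "the best of the two devices" can be computed roughly as in the sketch below. It assumes that speedup is the ratio of baseline time to new time, and that the per-benchmark baseline is the faster of the single-device runs; the abstract does not spell out either convention. All timings are invented for illustration and are not results from the thesis.

```python
import math


def geomean(values):
    """Geometric mean of positive values, computed in log space."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def geomean_speedup(baseline_times, new_times):
    """Geometric mean of the per-benchmark ratios baseline_time / new_time."""
    ratios = [b / n for b, n in zip(baseline_times, new_times)]
    return geomean(ratios)


# Hypothetical execution times in seconds for three benchmarks.
cpu_times = [4.0, 9.0, 2.0]
gpu_times = [2.0, 12.0, 1.0]
coop_times = [1.6, 8.0, 1.0]   # assumed multi-device runtime

# Per-benchmark baseline: whichever single device was faster.
best_times = [min(c, g) for c, g in zip(cpu_times, gpu_times)]

print(f"vs GPU:  {geomean_speedup(gpu_times, coop_times):.2f}x")
print(f"vs CPU:  {geomean_speedup(cpu_times, coop_times):.2f}x")
print(f"vs best: {geomean_speedup(best_times, coop_times):.2f}x")
```

The geometric mean is commonly preferred over the arithmetic mean for ratios like these because one benchmark with a very large speedup cannot dominate the summary.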
