21

Design of low-cost multi-thread unified shader architecture

Sun, Ya-hsien 14 February 2011 (has links)
Programmable graphics processing units (GPUs) often stall while waiting for the results of long-latency instructions, so multi-threading is commonly used to increase data-path utilization. This thesis proposes a multi-threaded GPU design with a single unified core that has several key features. First, its processor core can execute not only the vertex and fragment shading programs but also a software rasterization module, which most other GPU designs implement as a separate hardware module. Next, the thread-switching policy is based on non-preemptive blocked scheduling. Normally, whether an instruction will stall cannot be detected until it enters the instruction-decode stage. To achieve zero-penalty thread switching, a single assistant bit is added to each instruction in a thread to indicate whether the next instruction in the same thread will stall. This mechanism achieves a speed-up of 1.4 in some of the benchmarks used in this thesis. The register file of a GPU processor is usually equipped with up to four access ports, so it occupies a significant portion of the entire GPU, especially in multi-threaded designs where the register set has to be duplicated several times. The multi-bank approach proposed in this thesis reduces the implementation cost of the register file by decreasing its number of access ports to two. Experimental results show that this approach reduces the overall gate count by 26.12%. Finally, the remaining fixed-pipeline fragment operations are realized by an iterative time-sharing architecture to further save silicon area. The overall gate count of the proposed GPU is 600K.
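The assistant-bit idea in the abstract above can be illustrated with a small cycle-count model. This is only a sketch under assumptions that are not taken from the thesis: each instruction is represented by its result latency, every instruction depends on the previous one in its thread, a latency above 1 means the next instruction would stall, and a switch without the hint costs one bubble cycle (`decode_penalty`). The function name `simulate` and all parameters are hypothetical.

```python
def simulate(threads, use_hint_bit, decode_penalty=1):
    """Count cycles for blocked (non-preemptive) multi-thread scheduling.

    threads: list of threads, each a list of result latencies in cycles.
    use_hint_bit: if True, a stall of the next instruction is known at
    issue time, so the switch costs nothing. If False, the stall is only
    seen at decode, and decode_penalty bubble cycles are lost.
    """
    n = len(threads)
    pc = [0] * n          # next instruction index per thread
    ready_at = [0] * n    # cycle at which each thread may issue again
    cycle = 0
    current = 0

    while any(pc[t] < len(threads[t]) for t in range(n)):
        unfinished = [t for t in range(n) if pc[t] < len(threads[t])]
        if pc[current] >= len(threads[current]) or ready_at[current] > cycle:
            # Current thread is done or blocked: switch to a ready thread.
            candidates = [t for t in unfinished if ready_at[t] <= cycle]
            if not candidates:
                # Nothing can issue; idle until the earliest thread is ready.
                cycle = min(ready_at[t] for t in unfinished)
                continue
            current = min(candidates, key=lambda t: (t - current) % n)

        issue_cycle = cycle
        latency = threads[current][pc[current]]
        pc[current] += 1
        cycle += 1

        if latency > 1:
            # The dependent next instruction must wait for this result.
            ready_at[current] = issue_cycle + latency
            has_next = pc[current] < len(threads[current])
            if not use_hint_bit and has_next:
                # Without the hint, the stalled instruction is fetched and
                # discarded at decode before the switch happens.
                cycle += decode_penalty
    return cycle


workload = [[1, 4, 1, 4, 1], [1, 4, 1, 4, 1]]
print("with hint bit:   ", simulate(workload, use_hint_bit=True))
print("without hint bit:", simulate(workload, use_hint_bit=False))
```

In this toy model the hint bit only removes the bubble at each switch; the 1.4 speed-up reported in the abstract comes from the thesis's own benchmarks and is not reproduced here.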
22

Design and Implementation of Cloud Data Backup System with Load Balance Strategy

Tsai, Chia-ping 15 August 2012 (has links)
Fast-growing bandwidth has enabled the development of cloud storage, and more and more resources are being placed in it. In this thesis, we propose a new cloud storage system that consists of a single main server and multiple data servers. The main server controls system-wide activities such as data server management. It also periodically communicates with each data server and collects its state. Data servers store data on local disks as Windows files. To respond to a large number of data accesses, server selection is necessary to offer equalized performance. We propose a server selection algorithm that uses several parameters to derive performance metrics, which enables us to balance multiple resources on the server side. We design the new cloud storage system and implement the algorithm. In the upload experiment, the difference between the maximum and minimum free space when using our algorithm is less than 5 GB, whereas in random mode the free-space difference increases over time and reaches a maximum of 30 GB. In the mixed experiment, which adds downloads, the difference with our algorithm is less than 10 GB, and the result for random mode is close to that of the first experiment. Finally, our algorithm obtains 10% and 3% speedups in upload throughput in the upload and mixed experiments, respectively, and a 10% speedup in download throughput in the mixed experiment.
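The abstract above does not list the parameters or the formula used by the server selection algorithm, so the following is only an illustrative sketch of the general idea: the main server keeps a state record per data server and picks the one with the best weighted score. The `DataServer` fields, the weights, and the `max_transfers` cap are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class DataServer:
    # Hypothetical state that a main server might collect periodically.
    name: str
    free_gb: float
    capacity_gb: float
    cpu_load: float        # 0.0 (idle) to 1.0 (saturated)
    active_transfers: int


def score(server, w_space=0.6, w_cpu=0.2, w_conn=0.2, max_transfers=32):
    """Weighted score in [0, 1]; higher means a better upload target.

    The weights are illustrative, not values from the thesis.
    """
    space = server.free_gb / server.capacity_gb
    cpu = 1.0 - server.cpu_load
    busy = min(server.active_transfers, max_transfers) / max_transfers
    return w_space * space + w_cpu * cpu + w_conn * (1.0 - busy)


def select_server(servers, size_gb):
    """Return the best-scoring server that can hold size_gb."""
    eligible = [s for s in servers if s.free_gb >= size_gb]
    if not eligible:
        raise RuntimeError("no data server has enough free space")
    return max(eligible, key=score)


def upload(servers, size_gb):
    """Place one file and update the chosen server's free space."""
    target = select_server(servers, size_gb)
    target.free_gb -= size_gb
    return target


servers = [
    DataServer("ds1", 100.0, 100.0, 0.1, 0),
    DataServer("ds2", 90.0, 100.0, 0.1, 0),
    DataServer("ds3", 80.0, 100.0, 0.1, 0),
]
for _ in range(60):
    upload(servers, 1.0)
free = [s.free_gb for s in servers]
print("free space per server (GB):", free)
print("max-min difference (GB):", max(free) - min(free))
```

With equal load on every server, the score reduces to the free-space fraction, so repeated uploads level out the free space; this mirrors the free-space difference the abstract uses as its balance metric, but the figures printed here are not comparable to the thesis results.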
23

Robust multithreaded applications

Napper, Jeffrey Michael 29 August 2008 (has links)
This thesis discusses techniques for improving the fault tolerance of multithreaded applications. We consider the impact on fault tolerance methods of sharing address space and resources. We develop techniques in two broad categories: conservative multithreaded fault-tolerance (C-MTFT), which recovers an entire application on the failure of a single thread, and optimistic multithreaded fault-tolerance (O-MTFT), which recovers threads independently as necessary. In the latter category, we provide a novel approach to recover hung threads while improving recovery time by managing access to shared resources so that hung threads can be restarted while other threads continue execution.
24

A study of simulation and verification of a many-core architecture on two modern reconfigurable platforms

Krepis, Dimitrij. January 2007 (has links)
Thesis (M.E.E.)--University of Delaware, 2007. / Principal faculty advisor: Guang R. Gao, Dept. of Electrical and Computer Engineering. Includes bibliographical references.
25

Resource management techniques for performance and energy efficiency in multithreaded processors

Sharkey, Joseph James. January 2006 (has links)
Thesis (Ph. D.)--State University of New York at Binghamton, Department of Computer Science, 2006. / Includes bibliographical references (leaves 171-182).
26

Robust multithreaded applications

Napper, Jeffrey Michael. January 1900 (has links)
Thesis (Ph. D.)--University of Texas at Austin, 2008. / Vita. Includes bibliographical references.
27

The dynamic speculation and performance prediction of parallel loops /

Zier, David A. January 1900 (has links)
Thesis (Ph. D.)--Oregon State University, 2009. / Printout. Includes bibliographical references (leaves 103-109). Also available on the World Wide Web.
28

A compiler framework for loop nest software-pipelining

Douillet, Alban. January 2006 (has links)
Thesis (Ph.D.)--University of Delaware, 2006. / Principal faculty advisor: Guang R. Gao, Dept. of Electrical and Computer Engineering. Includes bibliographical references.
29

Instruction fetching, scheduling, and forwarding in a dynamic multithreaded processor /

Browning, Adam W. January 1900 (has links)
Thesis (M.S.)--Oregon State University, 2007. / Printout. Includes bibliographical references (leaves 36-37). Also available on the World Wide Web.
30

Performance analysis of multithreaded sorting algorithms

Nordin, Henrik, Jouper, Kevin January 2015 (has links)
Context. Almost all modern computers today have a CPU with multiple cores, providing extra computational power. In the new age of big data, parallel execution is essential to improve performance to an acceptable level. With parallelisation come new challenges that need to be considered. Objectives. In this work, parallel algorithms are compared and analysed in relation to their sequential counterparts, using the Java platform. Through this, we aim to find the potential speedup for multithreading and what factors affect the performance. In addition, we provide source code for multithreaded algorithms with proven time complexities. Methods. A literature study was conducted to gain knowledge and a deeper understanding of the aspects of sorting algorithms and the area of parallel computing. An experiment followed, implementing a set of algorithms from which data could be gathered through benchmarking and testing. The data gathered was studied and analysed with its corresponding source code to prove the validity of parallelisation. Results. Multithreading does improve performance, with two threads on average providing a speedup of up to 2x and four threads up to 3x. However, the potential speedup is bound to the available physical threads of the CPU and dependent on balancing the workload. Conclusions. The importance of workload balancing and of using the correct number of threads in relation to the problem to be solved needs to be carefully considered in order to utilize the extra resources available to their full potential.
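The structure of a multithreaded sort with a balanced workload, as discussed in the abstract above, might look like the following sketch. It is not the thesis's code, which was written in Java; the helper names `split_evenly` and `parallel_sort` are hypothetical. The sketch uses Python threads only to show the split, sort, and merge steps. On standard CPython builds the global interpreter lock prevents threads from sorting in parallel, so the speedups reported in the abstract should not be expected from it.

```python
import heapq
import random
from concurrent.futures import ThreadPoolExecutor


def split_evenly(data, parts):
    """Split data into `parts` chunks whose sizes differ by at most one."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def parallel_sort(data, num_threads=4):
    """Sort equal-sized chunks in worker threads, then merge the results."""
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    chunks = split_evenly(list(data), num_threads)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        sorted_chunks = list(pool.map(sorted, chunks))
    # k-way merge of the already sorted chunks.
    return list(heapq.merge(*sorted_chunks))


values = [random.randint(0, 10_000) for _ in range(1_000)]
assert parallel_sort(values, num_threads=4) == sorted(values)
```

Giving each worker a chunk of nearly equal size is the simplest form of the workload balancing the abstract names as a condition for good speedup; choosing `num_threads` close to the number of physical threads reflects its other stated condition.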
