101

Backtracking model languages /

Muff, Urs C. January 2000 (has links)
Thesis (M.S.)--University of Colorado, 2000. / Includes bibliographical references (leaf 62).
102

A compiler for parallel execution of numerical Python programs on graphics processing units

Garg, Rahul. January 2009 (has links)
Thesis (M. Sc.)--University of Alberta, 2009. / Title from PDF file main screen (viewed on Oct. 19, 2009). "A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfillment of the requirements for the degree of Master of Science, Department of Computing Science, University of Alberta." Includes bibliographical references.
103

Reducing the costs of comparisons within conditional transfers of control

Kreahling, William C. Whalley, David B. January 2005 (has links)
Thesis (Ph. D.)--Florida State University, 2005. / Advisor: Dr. David Whalley, Florida State University, College of Arts and Sciences, Dept. of Computer Science. Title and description from dissertation home page (viewed Sept. 19, 2005). Document formatted into pages; contains xii, 96 pages. Includes bibliographical references.
104

Reducing the WCET of applications on low end embedded systems

Zhao, Wankang, Whalley, David B. January 2005 (has links)
Thesis (Ph. D.)--Florida State University, 2005. / Advisor: Dr. David Whalley, Florida State University, College of Arts and Sciences, Dept. of Computer Science. Title and description from dissertation home page (viewed Sept. 29, 2005). Document formatted into pages; contains viii, 95 pages. Includes bibliographical references.
105

Compiler optimization of value communication for thread-level speculation /

Zhai, Antonia. January 2005 (has links)
Thesis (Ph. D.)--Carnegie Mellon University, 2005. / "January 13, 2005." Includes bibliographical references.
106

Incremental compilation in language-based environments /

Cook, Philip John. January 2006 (has links) (PDF)
Thesis (Ph. D.)--University of Queensland, 2006. / Includes bibliography.
107

Compiler directed speculation for embedded clustered EPIC machines

Pillai, Satish, Jacome, Margarida F., January 2004 (has links) (PDF)
Thesis (Ph. D.)--University of Texas at Austin, 2004. / Supervisor: Margarida F. Jacome. Vita. Includes bibliographical references.
108

Improving CGRA Utilization by Enabling Multi-threading for Power-efficient Embedded Systems

January 2011 (has links)
abstract: Performance improvements have largely followed Moore's Law with the help of technology scaling. In order to continue improving performance, power efficiency must also be improved. Better process technology has improved power efficiency, but this has a limit. Multi-core architectures have been shown to be an additional aid in this pursuit, and accelerators are growing in popularity as the next means of achieving power-efficient performance. Accelerators such as Intel SSE are highly efficient but prove difficult to program; FPGAs, on the other hand, are less efficient due to their fine-grained reconfigurability. A middle ground is found in CGRAs, which are highly power-efficient yet largely programmable accelerators. Power efficiencies of hundreds of GOPS/W have been estimated, more than two orders of magnitude greater than current processors. Currently, CGRAs are limited in their applicability because they can accelerate only a single thread at a time. This limitation becomes especially apparent as multi-core, multi-threaded processors move into the mainstream. This work removes the limitation by enabling multi-threading on CGRAs through a software-oriented approach. The key capability in this solution is quick run-time transformation of schedules so they execute on targeted portions of the CGRA, which allows the CGRA to be shared among multiple threads simultaneously. Analysis shows that enabling multi-threading has very small costs but provides very large benefits: less than 1% single-threaded performance loss but nearly 300% higher CGRA throughput. By increasing the dynamism of CGRA scheduling, overall performance of an optimized system improves by almost 350% over that of a single-threaded CGRA and is nearly 20x faster than the same system with no CGRA in a highly threaded environment. / Dissertation/Thesis / M.S. Computer Science 2011
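To make the schedule-transformation idea in this abstract concrete, the sketch below is a minimal, hypothetical C example (not taken from the dissertation): a kernel schedule compiled for a small tile of processing elements is shifted at run time onto the column range reserved for one thread, so two threads can occupy disjoint regions of the same CGRA. The structures, fabric size, and field names are assumptions made for illustration only.

```c
/* Hypothetical sketch of run-time schedule remapping for CGRA sharing. */
#include <stdio.h>

#define CGRA_ROWS 4
#define CGRA_COLS 8

typedef struct {
    int cycle;      /* time slot in the (modulo) schedule */
    int row, col;   /* PE coordinates the operation was compiled for */
    int opcode;     /* placeholder for the PE operation */
} SchedOp;

/* Translate a schedule compiled for columns [0, tile_cols) so that it runs
 * on the columns [base_col, base_col + tile_cols) assigned to one thread. */
static int remap_schedule(const SchedOp *in, SchedOp *out, int n,
                          int tile_cols, int base_col)
{
    if (base_col + tile_cols > CGRA_COLS)
        return -1;                           /* requested region does not fit */
    for (int i = 0; i < n; i++) {
        out[i] = in[i];
        out[i].col = in[i].col + base_col;   /* shift onto the thread's region */
    }
    return 0;
}

int main(void)
{
    /* Two-operation toy kernel compiled for a 4x2 tile. */
    SchedOp kernel[] = { {0, 0, 0, 1}, {1, 1, 1, 2} };
    SchedOp thread_a[2], thread_b[2];

    /* Thread A gets columns 0-1, thread B gets columns 2-3 of the same CGRA. */
    remap_schedule(kernel, thread_a, 2, 2, 0);
    remap_schedule(kernel, thread_b, 2, 2, 2);

    printf("thread B op0 now targets PE (%d,%d)\n",
           thread_b[0].row, thread_b[0].col);
    return 0;
}
```

The dissertation's actual transformation must also respect routing and timing constraints; the point of this sketch is only that remapping a precompiled schedule can be a cheap run-time operation, which is what makes sharing the fabric among threads practical.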
109

Paralelização de programas sisal para sistemas MPI / Parallelization of sisal programs for MPI systems

Raul Junji Nakashima 15 March 1996 (has links)
This work implemented a method for the partial parallelization of programs written in the functional language SISAL into programs with calls to routines of the MPI (Message Passing Interface) standard. SISAL programs are transformed by partitioning the parallel forall loop over its index range using the slice partitioning method, and the parallelism is realized as an SPMD (Single Program Multiple Data) program in a master/slave style. The proposal was validated by compiling some simple SISAL programs and comparing their results with those of the unmodified versions.
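The slicing scheme described above can be pictured with a small, self-contained C/MPI sketch (not generated by the thesis's tool): every rank runs the same program, takes one contiguous slice of the forall index range, and rank 0 gathers the partial results. The body f(), the problem size, and the use of MPI_Gather rather than explicit master/slave sends are simplifying assumptions.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define N 1000

static double f(int i) { return (double)i * i; }   /* stand-in forall body */

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Slice the index range [0, N) evenly over all ranks ("slice" partitioning). */
    int chunk = (N + size - 1) / size;
    int lo = rank * chunk;
    int hi = (lo + chunk < N) ? lo + chunk : N;

    double *part = calloc(chunk, sizeof(double));
    for (int i = lo; i < hi; i++)
        part[i - lo] = f(i);                        /* each rank runs its slice */

    /* Rank 0 plays the master and collects the slices into one result array. */
    double *result = NULL;
    if (rank == 0)
        result = malloc((size_t)chunk * size * sizeof(double));
    MPI_Gather(part, chunk, MPI_DOUBLE, result, chunk, MPI_DOUBLE,
               0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("result[10] = %g\n", result[10]);

    free(part);
    free(result);
    MPI_Finalize();
    return 0;
}
```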
110

LALP: uma linguagem para exploração do paralelismo de loops em computação reconfigurável / LALP: a language for parallelism of loops exploitation in reconfigurable computing

Ricardo Menotti 23 June 2010 (has links)
Reconfigurable computing is becoming increasingly important in embedded and high-performance computing systems. It allows performance levels close to those obtained with Application-Specific Integrated Circuits (ASICs), while still keeping design and implementation flexibility. However, programming the devices efficiently requires hardware-development expertise and mastery of hardware description languages (HDLs) such as VHDL or Verilog. Attempts to furnish a high-level compilation flow (e.g., from C programs) still have open issues to resolve before broadly efficient results can be obtained. Many efforts toward a direct mapping of algorithms into hardware concentrate on loops, since they represent the most computationally intensive regions of many application codes. A particularly useful technique for this purpose is loop pipelining, which is usually adapted from software pipelining techniques. The application of this technique is strongly tied to instruction scheduling, which often prevents an optimized use of the resources present in modern FPGAs. This thesis describes an alternative approach for directly mapping loops described in a high-level language onto FPGAs. Unlike other approaches, this technique does not derive from software pipelining: control is distributed over the operations, so no finite state machine is needed to impose their order, which allows efficient hardware implementations. The specification of a hardware block is done by means of LALP, a domain-specific language specially designed to support the application of these techniques. While the language syntax resembles C, it contains constructs that allow programmer interventions to enforce or relax data dependences as needed, and thus optimize the performance of the generated hardware.
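LALP's own syntax is not reproduced here; instead, the small C fragment below shows the kind of loop-carried dependence such programmer annotations target. A conservative compiler must assume that consecutive iterations of the histogram update may touch the same element and therefore serialize them, whereas a programmer who knows the access pattern could mark the dependence as safe to relax (in LALP or with a tool-specific directive) so the loop can be pipelined. The function and array names are illustrative only.

```c
#include <stddef.h>

#define BINS 256

/* hist[x[i]] may alias hist[x[i+1]], so a conservative scheduler must assume
 * a loop-carried read-after-write dependence and cannot fully pipeline the
 * loop. If the programmer knows collisions are impossible (or tolerable),
 * that dependence can be explicitly relaxed to improve the generated hardware. */
void histogram(const unsigned char *x, size_t n, unsigned int hist[BINS])
{
    for (size_t i = 0; i < n; i++)
        hist[x[i]]++;
}
```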
