Spelling suggestions: "subject:"modulo cheduling"" "subject:"modulo ascheduling""
1 |
Genetic Algorithm for Integrated SoftwarePipeliningCai, Zesi January 2012 (has links)
The purpose of the thesis was to study the feasibility of using geneticalgorithm (GA) to do the integrated software pipelining (ISP). Different from phasedcode generation, ISP is a technique which integrates instruction selection, instructionscheduling, and register allocation together when doing code generation. ISP is able toprovide a lager solution space than phased way does, which means that ISP haspotential to generate more optimized code than phased code generation. However,integrated compiling costs more than phased compiling. GA is stochastic beam searchalgorithm which can accelerate the solution searching and find an optimized result.An experiment was designed for verifying feasibility of implementing GA for ISP(GASP). The implemented algorithm analyzed data dependency graphs of loop bodies,created genes for the graphs and evolved, generated schedules, calculated andevaluated fitness, and obtained optimized codes. The fitness calculation wasimplemented by calculating the maximum value between the smallest possibleresource initiation interval and the smallest possible recurrence initiation interval. Theexperiment was conducted by generating codes from data dependency graphsprovided in FFMPEG and comparing the performance between GASP and integerlinear programming (ILP). The results showed that out of eleven cases that ILP hadgenerated code, GASP performed close to ILP in seven cases. In all twelve cases thatILP did not have result, GASP did generate optimized code. To conclude, the studyindicated that GA was feasible of being implemented for ISP. The generated codesfrom GASP performed similar with the codes from ILP. And for the dependencygraphs that ILP could not solve in a limited time, GASP could also generate optimizedresults.
|
2 |
Integrated Software PipeliningEriksson, Mattias January 2009 (has links)
<p>In this thesis we address the problem of integrated software pipelining for clustered VLIW architectures. The phases that are integrated and solved as one combined problem are: cluster assignment, instruction selection, scheduling, register allocation and spilling.</p><p>As a first step we describe two methods for integrated code generation of basic blocks. The first method is optimal and based on integer linear programming. The second method is a heuristic based on genetic algorithms.</p><p>We then extend the integer linear programming model to modulo scheduling. To the best of our knowledge this is the first time anybody has optimally solved the modulo scheduling problem for clustered architectures with instruction selection and cluster assignment integrated.</p><p>We also show that optimal spilling is closely related to optimal register allocation when the register files are clustered. In fact, optimal spilling is as simple as adding an additional virtual register file representing the memory and have transfer instructions to and from this register file corresponding to stores and loads.</p><p>Our algorithm for modulo scheduling iteratively considers schedules with increasing number of schedule slots. A problem with such an iterative method is that if the initiation interval is not equal to the lower bound there is no way to determine whether the found solution is optimal or not. We have proven that for a class of architectures that we call transfer free, we can set an upper bound on the schedule length. I.e., we can prove when a found modulo schedule with initiation interval larger than the lower bound is optimal.</p><p>Experiments have been conducted to show the usefulness and limitations of our optimal methods. For the basic block case we compare the optimal method to the heuristic based on genetic algorithms.<em></em></p><p><em>This work has been supported by The Swedish national graduate school in computer science (CUGS) and Vetenskapsrådet (VR).</em></p>
|
3 |
Suporte especializado de hardware para geração automática de loop pipelining em FPGASSouza, Guilherme Stefano Silva de 19 November 2014 (has links)
Submitted by Daniele Amaral (daniee_ni@hotmail.com) on 2016-09-13T20:06:59Z
No. of bitstreams: 1
DissGSSS.pdf: 12761989 bytes, checksum: 9e4c2b4e76a2502af072064ed081eec1 (MD5) / Approved for entry into archive by Marina Freitas (marinapf@ufscar.br) on 2016-09-15T13:34:53Z (GMT) No. of bitstreams: 1
DissGSSS.pdf: 12761989 bytes, checksum: 9e4c2b4e76a2502af072064ed081eec1 (MD5) / Approved for entry into archive by Marina Freitas (marinapf@ufscar.br) on 2016-09-15T13:35:23Z (GMT) No. of bitstreams: 1
DissGSSS.pdf: 12761989 bytes, checksum: 9e4c2b4e76a2502af072064ed081eec1 (MD5) / Made available in DSpace on 2016-09-15T13:35:30Z (GMT). No. of bitstreams: 1
DissGSSS.pdf: 12761989 bytes, checksum: 9e4c2b4e76a2502af072064ed081eec1 (MD5)
Previous issue date: 2014-11-19 / Não recebi financiamento / Loop pipelining is a technique that may offer significant performance improvements, being employed not only in conventional compilation targeting microprocessors, but also by High Level Synthesis (HLS) tools, targeting heterogeneous architectures and hardware accelerators. This work presents a specialized hardware support aiming at facilitate compilation tasks for HLS tools, along with potential advantages in execution performance and total silicon area employed. Two specialized hardware modules are presented: a queue register file and an instruction predication control module. / O desempenho na execução de programas, que é cada vez mais uma prioridade, pode ter uma melhora significativa por meio do uso de paralelismo em nível de instrução (ILP). Uma técnica que utiliza o ILP e propicia ganhos de desempenho significativos é o loop pipelining, sendo usado não apenas por compiladores para microprocessadores, mas também por
ferramentas de Síntese de Alto Nível (HLS), visando arquiteturas heterogêneas e aceleradores de hardware. Neste trabalho é apresentado o projeto e implementação de estruturas de hardware especializadas, objetivando-se em solucionar o problema de sobreposição de valores que ocorre no loop pipelining, facilitar tarefas de compilaçãoo em ferramentas HLS e diminuir a repetição de código. Além disso, ganhos potenciais de desempenho e área de silício total podem ser alcançados como resultado do uso das estruturas propostas. Serão apresentados: um arquivo de registradores baseado em filas e um módulo de controle para a execução de instruções predicadas.
|
4 |
Integrated Software PipeliningEriksson, Mattias January 2009 (has links)
In this thesis we address the problem of integrated software pipelining for clustered VLIW architectures. The phases that are integrated and solved as one combined problem are: cluster assignment, instruction selection, scheduling, register allocation and spilling. As a first step we describe two methods for integrated code generation of basic blocks. The first method is optimal and based on integer linear programming. The second method is a heuristic based on genetic algorithms. We then extend the integer linear programming model to modulo scheduling. To the best of our knowledge this is the first time anybody has optimally solved the modulo scheduling problem for clustered architectures with instruction selection and cluster assignment integrated. We also show that optimal spilling is closely related to optimal register allocation when the register files are clustered. In fact, optimal spilling is as simple as adding an additional virtual register file representing the memory and have transfer instructions to and from this register file corresponding to stores and loads. Our algorithm for modulo scheduling iteratively considers schedules with increasing number of schedule slots. A problem with such an iterative method is that if the initiation interval is not equal to the lower bound there is no way to determine whether the found solution is optimal or not. We have proven that for a class of architectures that we call transfer free, we can set an upper bound on the schedule length. I.e., we can prove when a found modulo schedule with initiation interval larger than the lower bound is optimal. Experiments have been conducted to show the usefulness and limitations of our optimal methods. For the basic block case we compare the optimal method to the heuristic based on genetic algorithms. This work has been supported by The Swedish national graduate school in computer science (CUGS) and Vetenskapsrådet (VR).
|
Page generated in 0.0673 seconds