Global ETD Search

21	Návrh a implementace prostředků pro zvýšení výkonu procesoru / Design and Implementation of Mechanisms for Enhancing Performance of CPU Zlatohlávková, Lucie January 2007 This master's thesis deals with processor architecture. Its core is the design of a simple processor extended with modern architectural components: pipelining, cache memory, and branch prediction. The processor was implemented in VHDL and simulated in the ModelSim simulation tool.
22	Améliorer la performance séquentielle à l’ère des processeurs massivement multicœurs / Increase Sequential Performance in the Manycore Era Prémillieu, Nathanaël 03 December 2013 Computers are everywhere, and the demand for ever more computing power pushes processor architects to find new ways to increase performance. The current trend is to replicate execution cores on the same die to parallelize execution. If it continues, processors will become manycores featuring several hundred to a thousand cores. However, Amdahl's law reminds us that increasing sequential performance will always be necessary to improve overall performance. An essential way to increase sequential performance is to improve how branches are handled, because they limit instruction-level parallelism. Branch prediction is the most studied solution, and its value depends mainly on the predictor's accuracy. In recent years this accuracy has been continuously improved and has reached a level that seems difficult to exceed. Another solution is to eliminate branches by replacing them with a construct based on predicated instructions. However, executing predicated instructions on out-of-order processors raises several problems, in particular the multiple definition problem. This thesis explores these two aspects of branch handling. The first part addresses branch prediction. One way to improve it without increasing accuracy is to reduce the cost of a branch misprediction. This is possible by exploiting control flow reconvergence and control independence: part of the work done on the wrong path, on instructions common to both paths, is saved and reused on the correct path instead of being redone. The second part addresses predicated instructions. We propose a solution to the multiple definition problem based on selectively predicting predicate values. A selective replay mechanism is used to reduce the cost of a predicate misprediction. Keywords: Processor architecture; Branch prediction; Reconvergence; Control independence; Predicated instructions; Selective replay
23	Grafický simulátor superskalárních procesorů / Graphical Simulator of Superscalar Processors Vávra, Jan January 2021 This thesis deals with the implementation of a superscalar processor simulator. The implementation builds on existing simulators and on the features they lack. The simulator can execute the RISC-V instruction set, and any other RISC instruction set can be added. The simulator uses deterministic branch prediction. Parts of the processor can be modified. A code editor for the given instruction set is also included.
24	Kompiliatorių optimizavimas IA-64 architektūroje / Compiler optimizations on ia-64 architecture Valiukas, Tadas 01 July 2014 As performance optimization of traditional architectures began to reach its limits, Intel started to develop a new architecture, IA-64, based on EPIC (Explicitly Parallel Instruction Computing). This main feature allows up to six instructions to be executed in a single CPU cycle. The architecture also includes further features that allow efficient solutions to the code optimization problems of traditional architectures. However, code optimization algorithms have long been refined for traditional architectures only, so they must be adapted to the new architecture. One way to do that is to tune the internal compiler parameters responsible for code optimizations. That is the primary goal of this work. To reach it, the features of the IA-64 architecture and their impact on execution performance must be explored using real-life code examples. The test results may then be used to select internal parameters and to explore their values further with special compiler performance testing benchmarks. The resulting set of values could be tested with real-life applications in order to prove the efficiency of the IA-64 architecture features. Keywords: IA-64 architecture; Itanium; VLIW (Very Long Instruction Word); Predication; Control and data speculation; Software pipelining; Branch prediction; Prefetching; RSE (rotating register engine); GCC optimization
