Global ETD Search

591	Otimizações para acesso a memoria em tradução binaria dinamica / Optimization for memory acess in dynamic binary translation Attrot, Wesley 12 December 2008 Orientador: Guido Costa Souza de Araujo / Tese (doutorado) - Universidade Estadual de Campinas, Instituto de Computação / Made available in DSpace on 2018-08-12T07:49:18Z (GMT). No. of bitstreams: 1 Attrot_Wesley_D.pdf: 1097052 bytes, checksum: 298445ea7d116f82e1c318d1a5dab324 (MD5) Previous issue date: 2008 / Resumo: Tradutores binários dinâmicos, ou DBTs, são programas projetados para executar, em uma arquitetura-alvo, programas binários de arquiteturas diferentes, realizando assim a tradução do programa binário em tempo de execução. Eles também podem ser utilizados para se melhorar o desempenho de programas nativos de uma dada arquitetura. DBTs podem coletar informação de profile da aplicação em tempo de execução, habilidade essa impossível para um compilador estático. Este tipo de informação pode ser usada pelos DBTs para realizar novos tipos de otimizações, não possíveis em um compilador estático, seja por falta de informação do comportamento do programa, ou por não conhecer que regiões do código são mais importantes para otimizar, em detrimento de outras. Como os DBTs gastam tempo para traduzir o código binário, é muito importante que os processos de tradução e otimização sejam extremamente rápidos, para que o impacto final no tempo total de execução seja o mínimo possível. Desta forma, para um tradutor binário dinâmico é essencial saber onde aplicar as otimizações, isto é, descobrir quais regiões do código traduzido são realmente importantes e que podem resultar em ganhos de desempenho. Uma vez que tais regiões tenham sido identificadas, os DBTs irão aplicar às mesmas, otimizações de código de forma a tentar compensar o tempo gasto na tradução do programa binário e mesmo melhorar o desempenho da aplicação traduzida. 
Como o acesso à memória é algo custoso para um programa, evitá-lo em um ambiente dinâmico pode fazer com que o programa traduzido obtenha ganhos de desempenho, compensando assim parte do tempo gasto no processo de tradução. Com isso, neste trabalho investigou-se o ganho de desempenho que pode ser obtido em um ambiente de tradução dinâmica ao se tentar otimizar os acessos à memória que o programa traduzido realiza dentro das regiões de código selecionadas para otimização. O processo de otimização tenta, tanto quanto possível, evitar acessos à memória principal do computador, transformando-os em acessos a registradores da arquitetura alvo. Como grande parte das otimizações de código necessita de informações de fluxo de dados para poder realizar transformações de código, este trabalho também investigou uma nova forma de se melhorar as análises de fluxo de dados que são executadas em trechos limitados de código pelo tradutor binário dinâmico. Os resultados mostram que otimizar os acessos à memória produz ganhos pequenos, da ordem de 2%. No tocante à melhora da informação de fluxo de dados, descobriu-se que, quando se busca por registradores disponíveis, quase 25% do total dos registradores investigados estão de fato vazios e podem ser utilizados em otimizações. / Abstract: Dynamic binary translators, or DBTs, are programs designed to execute, on a target architecture, binary programs from different architectures, translating the binary program at run time. They can also be used to improve the performance of native programs for a specific architecture. DBTs can collect profile information from the application at run time, a capability that is impossible for a static compiler. 
This kind of information can be used by DBTs to perform new kinds of optimizations that are not possible in a static compiler, either because it lacks information about the program's behavior or because it does not know which regions of the code are more important to optimize than others. Because DBTs spend time translating the binary code, it is very important that both the translation and the optimization processes be as fast as possible, so that the impact on the overall execution time is minimal. For a dynamic binary translator, it is therefore essential to know where to apply the optimizations, that is, to find out which regions of the translated code are really important and can yield performance improvements. Once these regions are identified, the DBT applies code optimizations to them to compensate for the time spent translating the binary program and even to improve the performance of the translated application. Because memory access is an expensive operation for programs, avoiding it in a dynamic environment may improve the performance of the translated program, compensating for part of the time spent translating the binary. In this work, we investigate the performance improvement that can be achieved in a dynamic translation environment by optimizing the memory accesses that the translated program performs inside the regions selected for optimization. The optimization process tries, whenever possible, to avoid accesses to the computer's main memory by transforming them into accesses to registers of the target architecture. Because many code optimizations need data-flow information to perform code transformations, in this work we also investigate a new way to improve the data-flow analyses that the dynamic binary translator performs on limited regions of code. The results show that optimizing memory accesses produces small gains, about 2%. 
When trying to improve the data-flow information, we discovered that, when looking for available registers, almost 25% of the investigated registers are in fact empty and can be used for optimizations. / Doutorado / Sistemas de Computação / Doutor em Ciência da Computação Otimização Compiladores (Computadores) Alocação de recursos Arquitetura de computador Optimization Compilers (Computers) Resource allocation Computer architecture
592	Modeling the performance impact of hot code misprediction in Cross-ISA virtual machines = Modelagem do impacto de erros de predição de código quente no desempenho de máquinas virtuais Lucas, Divino César Soares, 1985- 04 September 2013 Orientadores: Guido Costa Souza de Araújo, Edson Borin / Dissertação (mestrado) - Universidade Estadual de Campinas, Instituto de Computação / Made available in DSpace on 2018-08-23T12:28:12Z (GMT). No. of bitstreams: 1 Lucas_DivinoCesarSoares_M.pdf: 1053361 bytes, checksum: e29ab79838532619ba298ddde8ba0f39 (MD5) Previous issue date: 2013 / Resumo: Máquinas virtuais (MVs) são sistemas que se propõem a eliminar a incompatibilidade entre duas, em geral diferentes, interfaces e dessa forma habilitar a comunicação entre diferentes sistemas. Nesse sentido, atuando como mediadora, uma MV está em um ponto que lhe permite fomentar o desenvolvimento de soluções inovadoras para vários problemas. Tais sistemas geralmente utilizam técnicas de emulação, por exemplo, interpretação ou tradução dinâmica de binários, para executar o código da aplicação cliente. Para determinar qual técnica de emulação é a ideal para um trecho de código geralmente é necessário que a MV empregue algum tipo de predição para determinar se o benefício de compilar o código supera os custos. Este problema, na maioria dos casos, resume-se a predizer se o dado trecho de código será frequentemente executado ou não, problema conhecido pelo nome de Predição de Código Quente. Em geral, se o preditor sinalizar um trecho de código como quente, a MV imediatamente toma a decisão de compilá-lo. Contudo, um problema surge nesta estratégia: a resposta do preditor é apenas a decisão de uma heurística e é, portanto, suscetível a erros. 
Quando o preditor sinaliza como quente um trecho de código que não será frequentemente executado, ou seja, um código que de fato é "frio", ele está fazendo uma predição errônea de código quente. Quando uma predição incorreta é feita, ocorre que a técnica de emulação que a MV utilizará para emular o trecho de código não compensará o seu custo e, portanto, a MV gastará mais tempo executando o seu próprio código do que o código da aplicação cliente. Neste trabalho, foi avaliado o impacto de predições incorretas de código quente no desempenho de MVs emulando vários tipos de aplicações. Na análise realizada foi avaliado o preditor de código quente baseado em limiar, uma técnica frequentemente utilizada para identificar regiões de código que serão frequentemente executadas. Para fazer esta análise foi criado um modelo matemático para simular o comportamento de tal preditor e a partir deste modelo uma série de resultados puderam ser explorados. Inicialmente é mostrado que este preditor frequentemente erra a predição e, como conseqüência, o tempo gasto fazendo compilações torna-se o maior componente do tempo de execução da MV. Também é mostrado como diferentes limiares de predição afetam o número de predições incorretas e qual o impacto disto no desempenho da MV. Também são apresentados resultados indicando qual o impacto do custo de compilação, tradução e velocidade do código traduzido no desempenho da MV. Por fim é mostrado que utilizar apenas o conjunto de aplicações do SPEC CPU 2006 para avaliar o desempenho de MVs que utilizam o preditor de código quente baseado em limiar pode levar a resultados imprecisos / Abstract: Virtual machines are systems that aim to eliminate the compatibility gap between two, possibly distinct, interfaces, thus enabling them to communicate. Acting as a mediator in this way, the VM occupies an important position that enables it to foster innovative solutions for many problems. 
Such systems usually rely on emulation techniques, such as interpretation and dynamic binary translation, to execute guest application code. In order to select the best emulation technique for each code segment, the VM typically needs to predict whether the benefit of compiling the code outweighs its cost. This problem, in the common case, reduces to predicting whether the given code region will be frequently executed or not, a problem called Hot Code Prediction. Generally, if the predictor flags a given code region as hot, the VM immediately decides to compile it. However, a problem arises with this strategy: the predictor's response is only a decision made by a heuristic and can therefore be incorrect. Whenever the predictor flags as hot a code region that will be infrequently executed (cold code), we say that it makes a hotness misprediction. When a misprediction happens, the cost of the technique the VM uses to emulate the code is not amortized by executing the optimized code, and thus the VM in fact spends more time executing its own code than the guest application code. In this work we measure the impact of hotness mispredictions on a VM emulating several kinds of applications. In our analysis we evaluate the threshold-based hot code predictor, a technique commonly used to predict hot code fragments. To do so, we developed a mathematical model to simulate the behavior of such a predictor, and we use it to estimate the impact of mispredictions in several benchmarks. We show that this predictor frequently mispredicts code hotness and, as a result, the VM emulation performance becomes dominated by time spent on compilation. Moreover, we show how the threshold choice affects the number of mispredictions and how this impacts the VM performance. We also show how the compilation, interpretation and steady-state execution costs of translated instructions affect the VM performance. 
Finally, we show that using only the SPEC CPU 2006 benchmarks to measure the performance of a VM that uses the threshold-based predictor can lead to misleading results / Mestrado / Ciência da Computação / Mestre em Ciência da Computação Sistemas de computação virtual Compiladores (Programas de computador) Arquitetura de computador Virtual computer systems Compilers (Computer programs) Computer architecture
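The threshold-based hot code prediction described in record 592 can be illustrated with a minimal sketch. It is not the thesis's mathematical model: the class name, the function name, and the simple per-execution cost parameters (`interp_cost`, `compile_cost`, `native_cost`) are assumptions introduced here only to show when flagging a region as hot fails to pay off.

```python
from collections import Counter


class ThresholdPredictor:
    """Hypothetical threshold-based hot code predictor (illustrative only)."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.counts = Counter()   # executions observed per code region
        self.hot = set()          # regions already flagged as hot

    def record_execution(self, region):
        """Count one execution; return True on the execution that flags the region hot."""
        if region in self.hot:
            return False
        self.counts[region] += 1
        if self.counts[region] >= self.threshold:
            self.hot.add(region)
            return True
        return False


def is_misprediction(total_executions, threshold, interp_cost, compile_cost, native_cost):
    """True when compiling at the threshold costs more than interpreting every execution.

    Assumes constant per-execution costs, which is a simplification.
    """
    if total_executions < threshold:
        return False  # the region is never flagged hot, so nothing is compiled
    remaining = total_executions - threshold
    always_interpret = total_executions * interp_cost
    compile_at_threshold = threshold * interp_cost + compile_cost + remaining * native_cost
    return compile_at_threshold > always_interpret
```

Under these assumed costs, a region that crosses the threshold but runs only a few more times is a misprediction, because the compilation cost is never amortized; a region that keeps running long afterwards is not.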
593	Um simulador compilado dinâmico para o ArchC / Dynamic compiled simulator for ArchC Garcia, Maxiwell Salvador, 1986- 19 August 2018 Orientadores: Sandro Rigo, Rodolfo Jardim de Azevedo / Dissertação (mestrado) - Universidade Estadual de Campinas, Instituto de Computação / Made available in DSpace on 2018-08-19T17:27:58Z (GMT). No. of bitstreams: 1 Garcia_MaxiwellSalvador_M.pdf: 2001408 bytes, checksum: 18a0b7e502a8676d32857b27374a5d77 (MD5) Previous issue date: 2011 / Resumo: O simulador é uma das ferramentas mais importantes para o desenvolvimento de uma nova arquitetura computacional. Entre as vantagens que ele apresenta destacam-se a flexibilidade e o baixo custo. Os primeiros simuladores eram criados manualmente, uma prática muito propensa a erros. Atualmente, Linguagens de Descrição de Arquiteturas (ADLs) facilitam a geração dessas ferramentas. O foco deste trabalho é a pesquisa em técnicas de simulação rápida utilizando a ADL ArchC. Partindo do estado da arte nesta área, a simulação compilada, conseguiu-se melhorar ainda mais o desempenho dos simuladores de conjunto de instruções. Duas abordagens compiladas foram usadas. A primeira é uma abordagem estática, que analisa e decodifica o binário previamente e especializa o simulador para aquela aplicação, deixando a simulação com um alto desempenho. As simulações ficaram apenas 5 vezes mais lentas, na média, que execuções nativas em máquina Intel, com desempenho atingindo 900 milhões de instruções por segundo. A segunda abordagem é a dinâmica, que não exige o conhecimento prévio da aplicação, evitando a sobrecarga inicial de se especializar o simulador. Com essa abordagem é possível, também, simular aplicativos que sofrem modificações em seu próprio código, como boot-loader e sistemas operacionais. A decodificação e compilação do aplicativo são feitas em tempo de execução, fazendo uso da infraestrutura LLVM. 
O desempenho de simulação só não superou o estático, alcançando uma média de 140 milhões de instruções por segundo. Considerando-se a sobrecarga de geração do simulador compilado estático, a abordagem dinâmica torna-se mais rápida, mostrando-se uma excelente alternativa ao projetista que não tem o interesse em ficar simulando repetidas vezes a mesma aplicação / Abstract: The simulator is one of the most important tools for designing a new computer architecture. Among its advantages, the most important are flexibility and low cost. The first simulators were written from scratch, which was an error-prone practice. Nowadays, Architecture Description Languages (ADLs) simplify the generation of these tools. This work focuses on research into fast simulation techniques using the ArchC ADL. Starting from the state of the art in this area, compiled simulation, it was possible to further improve the performance of instruction set simulators. Two compiled approaches were used. The first is static compiled simulation, which analyzes and decodes the binary in advance and specializes the simulator for that application, giving the simulation high performance. The simulations were only 5 times slower, on average, than native execution on an Intel machine, reaching 900 million instructions per second. The second approach is dynamic compiled simulation, which requires no prior knowledge of the application, avoiding the initial overhead of specializing the simulator. With this approach it is also possible to simulate self-modifying code, such as boot loaders and operating systems. The application is decoded and compiled at run time, using the LLVM framework. The simulation performance reaches an average of 140 million instructions per second, falling short only of the static approach. 
However, when the overhead of generating the static compiled simulator is taken into account, the dynamic approach becomes faster, making it an excellent alternative for designers who have no interest in repeatedly simulating the same application / Mestrado / Ciência da Computação / Mestre em Ciência da Computação Arquitetura de computador Simulação (Computadores) Hardware - Linguagens descritivas Computer architecture Computer simulation Computer hardware description languages
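The closing trade-off in record 593 can be made concrete with a small break-even calculation. The throughputs of 900 and 140 million instructions per second come from the abstract; the static generation overhead is not reported there, so `static_overhead_s` is a hypothetical input, and treating both throughputs as constant rates is a simplifying assumption of this sketch.

```python
def break_even_instructions(static_overhead_s, static_ips=900e6, dynamic_ips=140e6):
    """Instruction count below which the dynamic simulator finishes first.

    Static total time:  static_overhead_s + n / static_ips
    Dynamic total time: n / dynamic_ips
    Setting the two equal and solving for n gives the break-even point.
    """
    per_instruction_gap = 1.0 / dynamic_ips - 1.0 / static_ips
    return static_overhead_s / per_instruction_gap


# Example with an assumed 10-second generation overhead (not a figure from the thesis).
n = break_even_instructions(10.0)
```

For workloads shorter than the break-even count the dynamic approach wins overall, and the static approach only pays off when the same specialized simulator is reused or the run is long enough.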
594	Implementação de cache no projeto ArchC / Cache implementation in the ArchC project Almeida, Henrique Dante de, 1982- 20 August 2018 Orientadores: Paulo Cesar Centoducatte, Rodolfo Jardim de Azevedo / Dissertação (mestrado) - Universidade Estadual de Campinas, Instituto de Computação / Made available in DSpace on 2018-08-20T15:21:59Z (GMT). No. of bitstreams: 1 Almeida_HenriqueDantede_M.pdf: 506967 bytes, checksum: ca41d5af5008feeb442f3b9d9322af51 (MD5) Previous issue date: 2012 / Resumo: O projeto ArchC visa criar uma linguagem de descrição de arquiteturas, com o objetivo de se construir simuladores e toolchains de arquiteturas computacionais completas. O objetivo deste trabalho é dotar ArchC com capacidade para gerar simuladores de caches. Para tanto foi realizado um estudo detalhado das caches (tipos, organizações, configurações etc) e do funcionamento e do código do ArchC. O resultado foi a descrição de uma coleção de caches parametrizáveis que podem ser adicionadas às arquiteturas descritas em ArchC. A implementação das caches é modular, possuindo código isolado para a memória de armazenamento da cache e políticas de operação. A corretude da cache foi verificada utilizando uma sequência de simulações de diversas configurações de cache e com comparações com o simulador dinero. A cache resultante apresentou um overhead, no tempo de simulação, que varia entre 10% e 60%, quando comparada a um simulador sem cache / Abstract: The ArchC project aims to create an architecture description language, with the goal of building complete computer architecture simulators and toolchains. The goal of this work is to add support in ArchC for simulating caches. To achieve this, a detailed study of caches (types, organizations, configurations, etc.) and of the operation and code of ArchC was carried out. The result was a collection of parameterizable caches that can be added to the architectures described in ArchC. 
The cache implementation is modular, with separate code for the cache storage and for the operation policies. The correctness of the implementation was verified using a sequence of simulations of many cache configurations and by comparison with the results of the dinero simulator. The resulting cache showed a simulation-time overhead varying between 10% and 60% when compared to a simulator without caches / Mestrado / Ciência da Computação / Mestre em Ciência da Computação Sistema de computação Arquitetura de computador Memória cache Computer systems Computer architecture Cache memory
595	Information infrastructures for manufacturing enterprises Albertyn, Erina Francina 28 August 2012 M.Sc. (Computer Science) / The automation of manufacturing systems is a very important research area. This study is concerned with information infrastructures for manufacturing enterprises and the various methods that could be utilised for enterprise modelling. The objectives of this study are to: • analyse the various enterprise modelling architectures • apply one of the architectures to a manufacturing enterprise and evaluate this architecture • compare the various architectures. Manufacturing processes - Automation. Information technology. Computer architecture
596	Network Processor specific Multithreading tradeoffs Boivie, Victor January 2005 Multithreading is a processor technique that can effectively hide the long latencies caused by memory accesses, coprocessor operations and similar events. While this looks promising, there is an additional hardware cost that varies with, for example, the number of contexts to switch between and the technique used for switching, and this might limit the possible gain of multithreading. Network processors are, traditionally, multiprocessor systems that share many common resources, such as memories and coprocessors, so the potential gain of multithreading could be high for these applications. On the other hand, the additional hardware required will be relatively large, since the rest of the processor is fairly small. Instead of using a multithreaded processor, higher performance gains could be achieved by using more processors. To explore this trade-off, a simulator was built in which a system can be modelled effectively and whose simulation results can give hints about the optimal solution in the early design phase of a network processor system. A theoretical background to multithreading, network processors and more is also provided in the thesis. Datorteknik multithreading network processors computer architecture system level design exploration Datorteknik Computer Engineering Datorteknik
597	Performance Evaluation of Embedded Microcomputers for Avionics Applications Bilen, Celal Can, Alcalde, John January 2010 Embedded microcomputers are used in a wide range of applications nowadays. Avionics is one of these areas and requires extra attention regarding reliability and determinism. Thus, these issues should also be borne in mind, in addition to performance, when evaluating embedded microcomputers. This master thesis suggests a framework for performance evaluation of two members of the PowerPC microprocessor family, namely the MPC5554 from Freescale and the PPC440EPx from AMCC, and analyzes the results within and between these processors. The framework can be generalized to any microprocessor family, if required. Apart from performance evaluation, this thesis also suggests a new terminology by introducing the concept of determinism levels, in order to estimate determinism issues in avionics applications more clearly, which is crucial given the requirements and working conditions of this application. Unlike the performance evaluation, this estimation does not include any practical results but remains theoretical. Similar to Automark™ used by AutoBench™ in the EEMBC Benchmark Suite, we introduce a new performance metric score that we call “Aviomark”, and we carry out a detailed comparison of Aviomark with the traditional Automark™ score to see how the two differ in behavior. Finally, we have developed a graphical user interface (GUI) which works in parallel with the Green Hills MULTI Integrated Development Environment (IDE) in order to simplify and automate the evaluation process. With the help of the GUI, users will be able to easily evaluate their specific PowerPC processors by starting the debugging from the MULTI IDE. Microprocessor Avionics PowerPC Performance Evaluation Determinism Embedded Systems Computer Architecture Computer Engineering Datorteknik
598	From high level architecture descriptions to fast instruction set simulators Wagstaff, Harry January 2015 As computer systems become increasingly complex and diverse, so too do the architectures they implement. This leads to an increase in complexity in the tools used to design new hardware and software. One particularly important tool in hardware and software design is the Instruction Set Simulator, which is used to prototype new architectures and hardware features, verify hardware, and test and debug software. Many Architecture Description Languages exist which facilitate the description of new architectural or hardware features, and generate tools such as simulators. However, these typically suffer from poor performance, are difficult to test effectively, and may be limited in functionality. This thesis considers three objectives when developing Instruction Set Simulators: performance, correctness, and completeness, and presents techniques which contribute to each of these. Performance is obtained by combining Dynamic Binary Translation techniques with a novel analysis of high level architecture descriptions. This makes use of partial evaluation techniques in order to improve both the translation system and the quality of the translated code, leading to a performance improvement of over 2.5x compared to a naïve implementation. This thesis also presents techniques which contribute to the correctness objective. Each possible behaviour of each described instruction is used to guide the generation of a test case. Constraint satisfaction techniques are used to determine the necessary instruction encoding and context for each behaviour to be produced. It is shown that this is a significant improvement over benchmark-driven testing, and this technique has led to the discovery of several bugs and inconsistencies in multiple state of the art instruction set simulators. 
Finally, several challenges in ‘Full System’ simulation are addressed, contributing to both the performance and completeness objectives. Full System simulation generally carries significant performance costs compared with other simulation strategies. Crucially, instructions which access memory require virtual to physical address translation and can now cause exceptions. Both of these processes must be correctly and efficiently handled by the simulator. This thesis presents novel techniques to address this issue which provide up to a 1.65x speedup over a state of the art solution. 004.2
599	ADLOA : an architecture description language for artificial ontogenetic architectures Venter, Jade Anthony 13 October 2014 M.Com. (Information Technology) / ADLOA is an Architecture Description Language (ADL) proposed to describe biologically inspired complex adaptive architectures such as ontogenetic architectures. The need for an ontogenetic ADL stems from the lack of support from existing ADLs. This dissertation further investigates the similarities between existing intelligent architectures and ontogenetic architectures. The research conducted on current ADLs, artificial ontogeny and intelligent architectures reveals that there are similarities between ontogenetic architectures and other intelligent architectures. However, the dynamism of artificial ontogeny indicates a lack of support for architecture description. Therefore, the dissertation proposes two core mechanisms to address ontogenetic architecture description. Firstly, the ADLOA process is defined as a systematisation of artificial ontogeny. The process specifies a uniform approach to defining ontogenetic architectures. Secondly, a demonstration of the implemented ADLOA process is used, in conjunction with the ADLOA model, mechanisms and Graphical User Interface (GUI), to present a workable description environment for software architects. The result of the dissertation is a standalone ADL that has the ability to describe ontogenetic architectures and to produce language-dependent code frameworks using the Extensible Markup Language (XML) and Microsoft Visual Studio platform. Ontologies (Information retrieval) Description logics Ontogeny Intelligent agents (Computer software) Computer architecture Multiagent systems
600	Automated grid fault detection and repair Luyt, Leslie 24 May 2012 With the rise in interest in the field of grid and cloud computing, it is becoming increasingly necessary for the grid to be easily maintainable. This maintenance of the grid and grid services can be made easier by using an automated system to monitor and repair the grid as necessary. We propose a novel system to perform automated monitoring and repair of grid systems. To the best of our knowledge, no such systems exist. The results show that certain faults can be easily detected and repaired. Computer architecture
