Global ETD Search

71	Modeling the performance impact of hot code misprediction in Cross-ISA virtual machines = Modelagem do impacto de erros de predição de código quente no desempenho de máquinas virtuais / Modelagem do impacto de erros de predição de código quente no desempenho de máquinas virtuais Lucas, Divino César Soares, 1985- 04 September 2013 (has links) Orientadores: Guido Costa Souza de Araújo, Edson Borin / Dissertação (mestrado) - Universidade Estadual de Campinas, Instituto de Computação / Made available in DSpace on 2018-08-23T12:28:12Z (GMT). No. of bitstreams: 1 Lucas_DivinoCesarSoares_M.pdf: 1053361 bytes, checksum: e29ab79838532619ba298ddde8ba0f39 (MD5) Previous issue date: 2013 / Resumo: Máquinas virtuais (MVs) são sistemas que se propõem a eliminar a incompatibilidade entre duas, em geral diferentes, interfaces e dessa forma habilitar a comunicação entre diferentes sistemas. Nesse sentido, atuando como mediadores, uma MV está em um ponto que a permite fomentar o desenvolvimento de soluções inovadoras para vários problemas. Tais sistemas geralmente utilizam técnicas de emulação, por exemplo, interpretação ou tradução dinâmica de binários, para executar o código da aplicação cliente. Para determinar qual técnica de emulação é a ideal para um trecho de código geralmente é necessário que a MV empregue algum tipo de predição para determinar se o benefício de compilar o código supera os custos. Este problema, na maioria dos casos, resume-se a predizer se o dado trecho de código será frequentemente executado ou não, problema conhecido pelo nome de Predição de Código Quente. Em geral, se o preditor sinalizar um trecho de código como quente, a MV imediatamente toma a decisão de compilá-lo. Contudo, um problema surge nesta estratégia, à resposta do preditor é apenas a decisão de uma heurística e é, portanto, suscetível a erros. Quando o preditor sinaliza como quente um trecho de código que não será frequentemente executado, ou seja, um código que de fato é "frio", ele está fazendo uma predição errônea de código quente. Quando uma predição incorreta é feita, ocorre que a técnica de emulação que a MV utilizará para emular o trecho de código não compensará o seu custo e, portanto a MV gastará mais tempo executando o seu próprio código do que o código da aplicação cliente. Neste trabalho, foi avaliado o impacto de predições incorretas de código quente no desempenho de MVs emulando vários tipos de aplicações. Na análise realizada foi avaliado o preditor de código quente baseado em limiar, uma técnica frequentemente utilizada para identificar regiões de código que serão frequentemente executadas. Para fazer esta análise foi criado um modelo matemático para simular o comportamento de tal preditor e a partir deste modelo uma série de resultados puderam ser explorados. Inicialmente é mostrado que este preditor frequentemente erra a predição e, como conseqüência, o tempo gasto fazendo compilações torna-se o maior componente do tempo de execução da MV. Também é mostrado como diferentes limiares de predição afetam o número de predições incorretas e qual o impacto disto no desempenho da MV. Também são apresentados resultados indicando qual o impacto do custo de compilação, tradução e velocidade do código traduzido no desempenho da MV. Por fim é mostrado que utilizando apenas o conjunto de aplicações do SPEC CPU 2006 para avaliar o desempenho de MVs que utilizam o preditor de código quente baseado em limiar pode levar a resultados imprecisos / Abstract: Virtual machines are systems that aim to eliminate the compatibility gap between two, possible distinct, interfaces, thus enabling them to communicate. This way, acting like a mediator, the VM lies at an important position that enables it to foster innovative solutions for many problems. Such systems usually rely on emulation techniques, such as interpretation and dynamic binary translation, to execute guest application code. In order to select the best emulation technique for each code segment, the VM typically needs to predict whether the cost of compiling the code overcome its future execution time. This problem, in the common case, reduce to predicting if the given code region will be frequently executed or not, a problem called Hot Code Prediction. Generally, if the predictor flags a given code region as hot the VM instantly takes the decision to compile it. However, a problem came out from this strategy, the predictor response is only a decision made by means of a heuristic and thus it can be incorrect. Whenever the predictor flags a code region that will be infrequently executed (cold code) as hot code, we say that it is doing a hotness misprediction. Whenever a misprediction happens it means that the technique the VM will use to emulate the code will not have its cost amortized by executing the optimized code and thus the VM will, in fact, spend more time executing its own code rather than the guest application code. In this work we measure the impact of hotness mispredictions in a VM emulating several kinds of applications. In our analysis we evaluate the threshold-based hot code predictor, a technique commonly used to predict hot code fragments. To do so we developed a mathematical model to simulate the behavior of such predictor and we use it to estimate the impact of mispredictions in several benchmarks. We show that this predictor frequently mispredicts the code hotness and as a result the VM emulation performance becomes dominated by miscompilations. Moreover, we show how the threshold choice can affect the number of mispredictions and how this impacts the VM performance. We also show how the compilation, interpretation and steady state execution cost of translated instructions affect the VM performance. At the end we show that using SPEC CPU 2006 benchmarks to measure the performance of a VM using the threshold-based predictor can lead to misleading results / Mestrado / Ciência da Computação / Mestre em Ciência da Computação Sistemas de computação virtual Compiladores (Programas de computador) Arquitetura de computador Virtual computer systems Compilers (Computer programs) Computer architecture
72	A Neural Network Configuration Compiler Based on the Adaptrode Neuronal Model McMichael, Lonny D. (Lonny Dean) 12 1900 (has links) A useful compiler has been designed that takes a high level neural network specification and constructs a low level configuration file explicitly specifying all network parameters and connections. The neural network model for which this compiler was designed is the adaptrode neuronal model, and the configuration file created can be used by the Adnet simulation engine to perform network experiments. The specification language is very flexible and provides a general framework from which almost any network wiring configuration may be created. While the compiler was created for the specialized adaptrode model, the wiring specification algorithms could also be used to specify the connections in other types of networks. compilers neural network models computer sciences systems Compilers (Computer programs) Neural networks (Computer science)
73	A parser generator system to handle complete syntax Ossher, Harold Leon January 1982 (has links) To define a language completely, it is necessary to define both its syntax and semantics. If these definitions are in a suitable form, the parser and code-generator of a compiler, respectively, can be generated from them. This thesis addresses the problem of syntax definition and automatic parser generation. Parsing (Computer grammar) Compilers (Computer programs)
74	EXTRACT: Extensible Transformation and Compiler Technology Calnan, III, Paul W. 29 April 2003 (has links) Code transformation is widely used in programming. Most developers are familiar with using a preprocessor to perform syntactic transformations (symbol substitution and macro expansion). However, it is often necessary to perform more complex transformations using semantic information contained in the source code. In this thesis, we developed EXTRACT; a general-purpose code transformation language. Using EXTRACT, it is possible to specify, in a modular and extensible manner, a variety of transformations on Java code such as insertion, removal, and restructuring. In support of this, we also developed JPath, a path language for identifying portions of Java source code. Combined, these two technologies make it possible to identify source code that is to be transformed and then specify how that code is to be transformed. We evaluate our technology using three case studies: a type name qualifier which transforms Java class names into fully-qualified class names; a contract checker which enforces pre- and post-conditions across behavioral subtypes; and a code obfuscator which mangles the names of a class's methods and fields such that they cannot be understood by a human, without breaking the semantic content of the class. code transformation Java programming language compilers Compilers (Computer programs) Java (Computer program language) EXTRACT (Computer program language)
75	Compiling ACE for Distributed-Memory Machines Song, Jun 05 November 1992 (has links) Distributed-memory machines offer a very high level of performance, flexibility and scalability. But the memory organization of this kind of machine determines that processes on different processors must communicate explicitly by sending and receiving messages. As a result, the programmer faces the enormously difficult task of detailed planning of algorithm-irrelevant, low-level communication issues. This level of programming resembles writing assembly programs for a sequential machine. ACE is a message-passing language with abstract communication statements. It was defined by Dr. Jingke Li at Portland State University. The communication in ACE is still explicit, but it is abstracted to a higher level. The abstraction can help balance the needs of ease of programming and high performance. This thesis discusses how those high-level communication abstractions can be transformed into low-level communication routines. It presents the design and implementation of a compiler that transforms an ACE program into a C program with low-level communication routines. The compiler is implemented for the Intel iPSC/2 hypercube multiprocessor machine. Compared to their low-level counterparts, ACE programs are easier to write and are more understandable. Compared to their high level counterparts, more efficient code can be generated since the communication information is expressed explicitly in ACE and the compiler itself is much less complex. ACE also enables the users to fine tune some critical communication segments. Some well known parallel algorithms written in ACE are compiled by the compiler as examples, and experimental results of their performance are included. ACE (Computer program language) Compilers (Computer programs) Computer algorithms Computer Sciences Programming Languages and Compilers
76	Técnicas de formação de regiões para projetos de máquinas virtuais eficientes / Region formation techniques for efficient virtual machines design Zinsly, Raphael Moreira, 1989- 23 August 2018 (has links) Orientadores: Sandro Rigo, Edson Borin / Dissertação (mestrado) - Universidade Estadual de Campinas, Instituto de Computação / Made available in DSpace on 2018-08-23T22:21:33Z (GMT). No. of bitstreams: 1 Zinsly_RaphaelMoreira_M.pdf: 2659662 bytes, checksum: 961bbb4fb596ee0d81d07c51279c44ed (MD5) Previous issue date: 2013 / Resumo: O resumo poderá ser visualizado no texto completo da tese digital / Abstract: The complete abstract is available with the full electronic document / Mestrado / Ciência da Computação / Mestre em Ciência da Computação Arquitetura de computador Compiladores (Programas de computador) Máquinas virtuais Sistemas de computação Computer architecture Compilers (Computer programs) Virtual machines Computing systems
77	Compilador ASN.1 e codificador/decodificador para BER / ASN.1 compiler and encode/decode for BER Restovic Valderrama, Maria Inés 09 March 1992 (has links) Orientador: Manuel de Jesus Mendes / Dissertação (mestrado) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação / Made available in DSpace on 2018-08-18T21:31:47Z (GMT). No. of bitstreams: 1 RestovicValderrama_MariaInes_M.pdf: 1435363 bytes, checksum: 073357e2857410d0a58e683b2837ca3e (MD5) Previous issue date: 1992 / Resumo: Neste trabalho apresenta-se uma ferramenta chamada Compilador ASN.1, cujo objetivo é fornecer uma representação concreta para a sintaxe abstrata ASN.1, de forma que, as especificações das PDU's dos protocolos de aplicação, geralmente escritas em ASN.1, possam ser utilizadas computacionalmente. Uma das funções prioritárias da camada de apresentação de um protocolo de comunicação é produzir uma codificação dos valores destas PDU's, baseando-se nas regras definidas pela norma BER. Assim, o compilador deve fornecer numa segunda tarefa, as rotinas de codificação e decodificação específicas para cada PDU compilada, utilizando um conjunto de funções que se encontram em duas bibliotecas auxiliares que realizam estas conversões / Abstract: This work presents a tool called "Compilador ASN.1", which main objective is to provide a concrete representation for the abstract syntax ASN.1, in order to translate the application protocol PDU's specification, written in ASN.1, to the C language. One of the main functions of the presentation layer is produce an encode-decode for the PDU's data values, based on the BER norm. Therefore, a second compiler task is to provide the specific encode-decode routines for each compiled PDU, using a function set available in two complementary libraries that carry out these conversions / Mestrado / Automação / Mestre em Engenharia Elétrica Protocolos Compiladores (Programas de computador) Tipos abstratos de dados (Computação) Redes de computadores Computers networks Protocols Compilers (Computer programs) Abstract data types (Computer)
78	Mecanismo para execução especulativa de aplicações paralelizadas por técnicas DOPIPE usando replicação de estágios / Mechanism for speculative execution of applications parallelized by DOPIPE techniques using stage replication Baixo, André Oliveira Loureiro do, 1986- 21 August 2018 (has links) Orientador: Guido Costa Souza de Araújo / Dissertação (mestrado) - Universidade Estadual de Campinas, Instituto de Computação / Made available in DSpace on 2018-08-21T04:52:37Z (GMT). No. of bitstreams: 1 Baixo_AndreOliveiraLoureirodo_M.pdf: 1756118 bytes, checksum: 00900e9463b55e1800da080419da53c7 (MD5) Previous issue date: 2012 / Resumo: A utilização máxima dos núcleos de arquiteturas multi-processadas é fundamental para permitir uma utilização completa do paralelismo disponível em processadores modernos. A fim de obter desempenho escalável, técnicas de paralelização requerem um ajuste cuidadoso de: (a) mecanismo arquitetural para especulação; (b) ambiente de execução; e (c) transformações baseadas em software. Mecanismos de hardware e software já foram propostos para tratar esse problema. Estes mecanismos, ou requerem alterações profundas (e arriscadas) nos protocolos de coerência de cache, ou exibem uma baixa escalabilidade de desempenho para uma gama de aplicações. Trabalhos recentes em técnicas de paralelização baseadas em DOPIPE (como DSWP) sugerem que a combinação de versionamento de dados baseado em paginação com especulação em software pode resultar em bons ganhos de desempenho. Embora uma solução apenas em software pareça atrativa do ponto de vista da indústria, essa não utiliza todo o potencial da microarquitetura para detectar e explorar paralelismo. A adição de tags às caches para habilitar o versionamento de dados, conforme recentemente anunciado pela indústria, pode permitir uma melhor exploração de paralelismo no nível da microarquitetura. Neste trabalho, é apresentado um modelo de execução que permite tanto a especulação baseada em DOPIPE, como as técnicas de paralelização especulativas tradicionais. Este modelo é baseado em uma simples abordagem com tags de cache para o versionamento de dados, que interage naturalmente com protocolos de coerência de cache tradicionais, não necessitando que estes sejam alterados. Resultados experimentais, utilizando benchmarks SPEC e PARSEC, revelam um ganho de desempenho geométrico médio de 21.6× para nove programas sequenciais em uma máquina simulada de 24 núcleos, demonstrando uma melhora na escalabilidade quando comparada a uma abordagem apenas em software / Abstract: Maximal utilization of cores in multicore architectures is key to realize the potential performance available from modern microprocessors. In order to achieve scalable performance, parallelization techniques rely on carefully tunning speculative architecture support, runtime environment and software-based transformations. Hardware and software mechanisms have already been proposed to address this problem. They either require deep (and risky) changes on the existing hardware and cache coherence protocols, or exhibit poor performance scalability for a range of applications. Recent work on DOPIPE-based parallelization techniques (e.g. DSWP) has suggested that the combination of page-based data versioning with software speculation can result in good speed-ups. Although a softwareonly solution seems very attractive from an industry point-of-view, it does not enable the whole potential of the microarchitecture in detecting and exploiting parallelism. The addition of cache tags as an enabler for data versioning, as recently announced in the industry, could allow a better exploitation of parallelism at the microarchitecture level. In this paper we present an execution model that supports both DOPIPE-based speculation and traditional speculative parallelization techniques. It is based on a simple cache tagging approach for data versioning, which integrates smoothly with typical cache coherence protocols, and does not require any changes to them. Experimental results, using SPEC and PARSEC benchmarks, reveal a geometric mean speedup of 21.6x for nine sequential programs in a 24-core simulated CMP, while demonstrate improved scalability when compared to a software-only approach / Mestrado / Ciência da Computação / Mestre em Ciência da Computação Arquitetura de computador Processamento paralelo (Computadores) Compiladores (Programas de computador) Computer architecture Compilers (Computer programs)
79	Automatic Compilation Of MATLAB Programs For Synergistic Execution On Heterogeneous Processors Prasad, Ashwin 01 1900 (has links) (PDF) MATLAB is an array language, initially popular for rapid prototyping, but is now being in-creasingly used to develop production code for numerical and scientific applications. Typical MATLAB programs have abundant data parallelism. These programs also have control flow dominated scalar regions that have an impact on the program’s execution time. Today’s com-puter systems have tremendous computing power in the form of traditional CPU cores and also throughput-oriented accelerators such as graphics processing units (GPUs). Thus, an approach that maps the control flow dominated regions of a MATLAB program to the CPU and the data parallel regions to the GPU can significantly improve program performance. In this work, we present the design and implementation of MEGHA, a compiler that auto-matically compiles MATLAB programs to enable synergistic execution on heterogeneous pro-cessors. Our solution is fully automated and does not require programmer input for identifying data parallel regions. Our compiler identifies data parallel regions of the program and com-poses them into kernels. The kernel composition step eliminates a number of intermediate arrays which are otherwise required and also reduces the size of the scheduling and mapping problem the compiler needs to solve subsequently. The problem of combining statements into kernels is formulated as a constrained graph clustering problem. Heuristics are presented to map identified kernels to either the CPU or GPU so that kernel execution on the CPU and the GPU happens synergistically, and the amount of data transfer needed is minimized. A heuristic technique to ensure that memory accesses on the CPU exploit locality and those on the GPU are coalesced is also presented. In order to ensure that data transfers required for dependences across basic blocks are performed, we propose a data flow analysis step and an edge-splitting strategy. Thus our compiler automatically handles kernel composition, mapping of kernels to CPU and GPU, scheduling and insertion of required data transfers. Additionally, we address the problem of identifying what variables can coexist in GPU memory simultaneously under the GPU memory constraints. We formulate this problem as that of identifying maximal cliques in an interference graph. We approximate the interference graph using an interval graph and develop an efficient algorithm to solve the problem. Furthermore, we present two program transformations that optimize memory accesses on the GPU using the software managed scratchpad memory available in GPUs. We have prototyped the proposed compiler using the Octave system. Our experiments using this implementation show a geometric mean speedup of 12X on the GeForce 8800 GTS and 29.2X on the Tesla S1070 over baseline MATLAB execution for data parallel benchmarks. Experiments also reveal that our method provides up to 10X speedup over hand written GPUmat versions of the benchmarks. Our method also provides a speedup of 5.3X on the GeForce 8800 GTS and 13.8X on the Tesla S1070 compared to compiled MATLAB code running on the CPU. Compilers (Computer Programs) MATLAB (Computer Program) MEGHA (Compiler) Parallel Processors Heterogeneous Processors MATLAB Programs - Compilation MATLAB Program Computer Science
80	A New Look at Retargetable Compilers Burke, Patrick William 12 1900 (has links) Consumers demand new and innovative personal computing devices every 2 years when their cellular phone service contracts are renewed. Yet, a 2 year development cycle for the concurrent development of both hardware and software is nearly impossible. As more components and features are added to the devices, maintaining this 2 year cycle with current tools will become commensurately harder. This dissertation delves into the feasibility of simplifying the development of such systems by employing heterogeneous systems on a chip in conjunction with a retargetable compiler such as the hybrid computer retargetable compiler (Hy-C). An example of a simple architecture description of sufficient detail for use with a retargetable compiler like Hy-C is provided. As a software engineer with 30 years of experience, I have witnessed numerous system failures. A plethora of software development paradigms and tools have been employed to prevent software errors, but none have been completely successful. Much discussion centers on software development in the military contracting market, as that is my background. The dissertation reviews those tools, as well as some existing retargetable compilers, in an attempt to determine how those errors occurred and how a system like Hy-C could assist in reducing future software errors. In the end, the potential for a simple retargetable solution like Hy-C is shown to be very simple, yet powerful enough to provide a very capable product in a very fast-growing market. retargetable compiler software errors architecture description fast-growing market Compilers (Computer programs) Software failures -- Prevention.

Search results