601 |
μCloud: a P2P cloud platform for computing service provision / muCloud
Fouodji Tasse, Ghislain, 22 August 2012
Advances in virtualization technologies have provided a large spectrum of computational approaches. Dedicated computations can be run in private environments (virtual machines) created within the same computer. Through capable APIs, this functionality is leveraged for the service we wish to implement: a computer power service (CPS). We target peer-to-peer systems for this service, to exploit the potential of aggregating computing resources. P2P networks are mostly known for their widespread use in sharing resources such as content files or real-time data. This study adds computing power to the list of shared resources by describing a suitable service composition. Taking into account the dynamic nature of the platform, CPS provision is achieved using a self-stabilizing clustering algorithm. The resulting system is based on a hierarchical P2P architecture and offers end-to-end consideration of resource provisioning and reliability. We named this system μCloud and characterize it as a self-provisioning cloud service platform. It is designed, implemented and presented in this dissertation. Finally, we assessed our work by showing that μCloud succeeds in providing user-centric services using a P2P computing unit. With this, we conclude that our system would be highly beneficial in both small and massively deployed environments.
|
602 |
A new parallel technique for the solution of sparse nonlinear equations
Cereijo Martinez, Maria, 28 July 1994
Solving nonlinear systems of equations is a central problem in numerical analysis, with enormous significance for science and engineering. A special case, sparse systems of equations, occurs frequently in various applications. Sparsity occurs in the analysis of many types of complex systems because of the local nature of the dependence or connectivity among system components.
One such system which may be modeled by a nonlinear sparse set of equations is the power system load flow analysis. This is a mathematical study performed by electrical utilities to monitor the electrical power system. The data from system components are used to create a set of nonlinear equations. These equations are then solved to find the voltage profile of the power network. With these data, control and security of the power system are achieved.
Solving problems of this type is very time consuming when the system is large. This dissertation proposes a highly parallel computer architecture for solving large sets of nonlinear sparse equations. The goal of this architecture is to reduce the processing time required to solve this type of problem. In particular, the load flow problem is analyzed and implemented on this architecture. For the FPL network, the speed is increased by a factor of about 2000.
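To make the notion of a sparse nonlinear system concrete, here is a minimal sketch in Python. It is not the dissertation's method: the abstract does not name a solver, so Newton's method is an assumption, and the toy system (each unknown coupled only to its two neighbours, with a cubic term) is invented for illustration and is not a load flow model. The function names `residual`, `solve_tridiagonal` and `newton` are hypothetical. Because each equation touches only local unknowns, the Jacobian is tridiagonal and each Newton step costs O(n) instead of O(n^3).

```python
def residual(x, b):
    """F_i(x) = 2*x_i - x_{i-1} - x_{i+1} + x_i**3 - b_i (toy system, assumed)."""
    n = len(x)
    r = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        r.append(2.0 * x[i] - left - right + x[i] ** 3 - b[i])
    return r


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system; lower[0] and upper[-1] are unused."""
    n = len(diag)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = upper[0] / diag[0]
    dp[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * cp[i - 1]
        cp[i] = upper[i] / denom
        dp[i] = (rhs[i] - lower[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def newton(b, tol=1e-10, max_iter=50):
    """Newton iteration that exploits the tridiagonal (sparse) Jacobian."""
    n = len(b)
    x = [0.0] * n
    for _ in range(max_iter):
        r = residual(x, b)
        if max(abs(v) for v in r) < tol:
            break
        diag = [2.0 + 3.0 * xi * xi for xi in x]  # dF_i/dx_i
        off = [-1.0] * n                          # dF_i/dx_{i-1} and dF_i/dx_{i+1}
        dx = solve_tridiagonal(off, diag, off, [-ri for ri in r])
        x = [xi + di for xi, di in zip(x, dx)]
    return x


if __name__ == "__main__":
    b = [1.0] * 8
    x = newton(b)
    print(max(abs(v) for v in residual(x, b)))
```

A real load flow Jacobian follows the network topology rather than a simple band, so a general sparse factorization would replace the tridiagonal solve; the structure of the iteration would stay the same.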
|
603 |
Massively parallel neural computation
Fox, Paul James, January 2013
Reverse-engineering the brain is one of the US National Academy of Engineering’s “Grand Challenges.” The structure of the brain can be examined at many different levels, spanning many disciplines from low-level biology through psychology and computer science. This thesis focusses on real-time computation of large neural networks using the Izhikevich spiking neuron model. Neural computation has been described as “embarrassingly parallel” as each neuron can be thought of as an independent system, with behaviour described by a mathematical model. However, the real challenge lies in modelling neural communication. While the connectivity of neurons has some parallels with that of electrical systems, its high fan-out results in massive data processing and communication requirements when modelling neural communication, particularly for real-time computations. It is shown that memory bandwidth is the most significant constraint to the scale of real-time neural computation, followed by communication bandwidth, which leads to a decision to implement a neural computation system on a platform based on a network of Field Programmable Gate Arrays (FPGAs), using commercial off-the-shelf components with some custom supporting infrastructure. This brings implementation challenges, particularly lack of on-chip memory, but also many advantages, particularly high-speed transceivers. An algorithm to model neural communication that makes efficient use of memory and communication resources is developed and then used to implement a neural computation system on the multi-FPGA platform. Finding suitable benchmark neural networks for a massively parallel neural computation system proves to be a challenge. A synthetic benchmark that has biologically-plausible fan-out, spike frequency and spike volume is proposed and used to evaluate the system.
It is shown to be capable of computing the activity of a network of 256k Izhikevich spiking neurons with a fan-out of 1k in real-time using a network of 4 FPGA boards. This compares favourably with previous work, with the added advantage of scalability to larger neural networks using more FPGAs. It is concluded that communication must be considered as a first-class design constraint when implementing massively parallel neural computation systems.
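For readers unfamiliar with the Izhikevich spiking neuron model named above, the following Python sketch shows the per-neuron update that makes the computation "embarrassingly parallel". It is an illustration only, not the thesis's FPGA implementation: the forward-Euler step with a 1 ms time step, the "regular spiking" parameter values, and the function names `izhikevich_step` and `simulate` are assumptions made here. Communication between neurons, which the abstract identifies as the real challenge, is deliberately left out.

```python
def izhikevich_step(v, u, i_in, a=0.02, b=0.2, c=-65.0, d=8.0, dt=1.0):
    """Advance one neuron by dt milliseconds; return (v, u, spiked).

    v is the membrane potential (mV), u the recovery variable,
    i_in the input current. Parameters a, b, c, d are the commonly
    quoted "regular spiking" values (an assumption for this sketch).
    """
    v_new = v + dt * (0.04 * v * v + 5.0 * v + 140.0 - u + i_in)
    u_new = u + dt * a * (b * v - u)
    if v_new >= 30.0:
        # Spike: reset the membrane potential and bump the recovery variable.
        return c, u_new + d, True
    return v_new, u_new, False


def simulate(steps=1000, i_in=10.0):
    """Count spikes of a single isolated neuron driven by a constant current."""
    v, u = -70.0, -14.0  # resting state of the model for zero input (u = b * v)
    spikes = 0
    for _ in range(steps):
        v, u, spiked = izhikevich_step(v, u, i_in)
        spikes += spiked
    return spikes


if __name__ == "__main__":
    print(simulate(i_in=0.0), simulate(i_in=10.0))
```

Each neuron needs only its own `v` and `u` plus the summed input current, so the state update parallelises trivially; delivering each spike to roughly a thousand targets is what drives the memory and communication bandwidth requirements the abstract describes.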
|
604 |
Improving The Performance Of Dynamic Loadbalancing Multiscalar Architectures
Gokulmuthu, N, 09 1900
No description available.
|
605 |
Evaluating the Scalability of SDF Single-chip Multiprocessor Architecture Using Automatically Parallelizing Code
Zhang, Yuhua, 12 1900
Advances in integrated circuit technology continue to provide more and more transistors on a chip. Computer architects are faced with the challenge of finding the best way to translate these resources into high performance. The challenge in the design of the next generation of CPUs (central processing units) lies not in trying to use up the silicon area, but in finding smart ways to make use of the wealth of transistors now available. In addition, the next-generation architecture should offer high throughput, scalability, modularity, and low energy consumption, rather than being suitable for only one class of applications or users, or emphasizing only a faster clock rate. A program exhibits different types of parallelism: instruction level parallelism (ILP), thread level parallelism (TLP), or data level parallelism (DLP). Likewise, architectures can be designed to exploit one or more of these types of parallelism. It is generally not possible to design architectures that can take advantage of all three types of parallelism without using very complex hardware structures and complex compiler optimizations. We present the state-of-the-art SDF (scheduled dataflow) architecture, which exploits as much TLP as the application supplies. We implement an SDF single-chip multiprocessor constructed from simpler processors and execute automatically parallelized applications on it. SDF has many desirable features, such as high throughput, scalability, and low power consumption, which meet the requirements of next-generation CPU design. Experimental results show that, compared with superscalar, VLIW (very long instruction word), and SMT (simultaneous multithreading) architectures, SDF is comparable for applications with very little parallelism and outperforms them for applications with large amounts of parallelism.
|
606 |
High Performance Architecture using Speculative Threads and Dynamic Memory Management Hardware
Li, Wentong, 12 1900
With the advances in very large scale integration (VLSI) technology, hundreds of billions of transistors can be packed into a single chip. With the increased hardware budget, how to take advantage of available hardware resources has become an important research area. Some researchers have shifted from control-flow von Neumann architectures back to dataflow architectures in order to explore scalable architectures leading to multi-core systems with several hundred processing elements. In this dissertation, I address how the performance of modern processing systems can be improved while attempting to reduce hardware complexity and energy consumption. The research described here tackles both central processing unit (CPU) performance and memory subsystem performance. More specifically, I describe the design of an innovative decoupled multithreaded architecture that can be used in multi-core processor implementations. I also address how memory management functions can be off-loaded from processing pipelines to further improve system performance and eliminate cache pollution caused by runtime management functions.
|
607 |
An Integrated Architecture for Ad Hoc Grids
Amin, Kaizar Abdul Husain, 05 1900
Extensive research has been conducted by the grid community to enable large-scale collaborations in pre-configured environments. Grid collaborations can vary in scale and motivation, resulting in a coarse classification of grids: national grid, project grid, enterprise grid, and volunteer grid. Despite the differences in scope and scale, all the traditional grids in practice share some common assumptions. They support mutually collaborative communities, adopt centralized control for membership, and assume a well-defined, non-changing collaboration. To support grid applications that do not conform to these assumptions, we propose the concept of ad hoc grids. In the context of this research, we propose a novel architecture for ad hoc grids that integrates a suite of component frameworks. Specifically, our architecture combines the community management framework, security framework, abstraction framework, quality of service framework, and reputation framework. The overarching objective of our integrated architecture is to support a variety of grid applications in a self-controlled fashion with the help of a self-organizing ad hoc community. We introduce mechanisms in our architecture that successfully isolate malicious elements from the community, inherently improving the quality of grid services and extracting deterministic quality assurances from the underlying infrastructure. We also emphasize the technology-independence of our architecture, thereby offering the requisite platform for technology interoperability. The feasibility of the proposed architecture is verified with a high-quality ad hoc grid implementation. Additionally, we have analyzed the performance and behavior of ad hoc grids with respect to several control parameters.
|
608 |
Construção e avaliação de uma solução eficiente para comunicação entre processadores SPARCv8 / Development and evaluation of an efficient solution for SPARCv8 processors communication
Abdnur, Thiago Borges, 1984-, 12 November 2012
Advisor: Rodolfo Jardim de Azevedo / Dissertation (master's) - Universidade Estadual de Campinas, Instituto de Computação
Previous issue date: 2012 / Summary: With the shift of most conventional architectures to multi-core, communication between the different processing units becomes a prominent problem, mainly with regard to data transfer between cores. Despite the enormous impact on performance, few scientific works address new solutions to the problem; the most common approach is to communicate through memory or through specific memory-mapped addresses. This dissertation defines a communication model that adds three new instructions to the SPARCv8 instruction set, allowing different cores to transfer data to each other directly, without the latency arising from the use of shared memory and locks, as is the case in the current LEON3 implementation. This communication model was evaluated with several types of synthetic applications, such as producer-consumer and pipeline. To make the FPGA prototype more realistic, a delay model for the system's main memory was also built, so that the relative performance of processor and memory is closer to reality. Basic support for the new instructions was added to the compiler for use in C code through inline asm. Overall, gains ranged from 3% up to 70 times in execution time compared with the use of shared memory and locks / Abstract: As processor designs shift towards multicore architectures, new challenges arise in increasing core-to-core communication efficiency. Despite the potentially huge performance impact, the number of papers focusing on this problem is limited. In this project, we define a communication model, adding three new instructions to the SPARCv8 instruction set, to allow different cores to communicate directly, without the shared memory and lock latencies.
We implemented the model inside the LEON3 VHDL and evaluated it using synthetic benchmarks like producer-consumer and pipeline. To make the FPGA prototype timings more realistic, we also implemented a new memory timer that keeps the processor-memory speed ratio closer to real values. We also created basic compiler support for these new instructions through intrinsics, converted to inline assembly in C code. Overall, our results show improvements ranging from 3% to up to 70 times faster / Master's / Computer Science / Master in Computer Science
|
609 |
Implementação em software de algoritmos de resumo criptográfico / Software implementation of cryptographic hash algorithms
Oliveira, Thomaz Eduardo de Figueiredo, 18 August 2018
Advisor: Julio César López Hernández / Dissertation (master's) - Universidade Estadual de Campinas, Instituto de Computação
Previous issue date: 2011 / Summary: Cryptographic hash algorithms are an important tool used in many applications for the secure and efficient processing of information. In the 2000s, serious vulnerabilities found in traditional hash functions, such as SHA-1 and MD5, led the community to rethink the development of the cryptanalysis of these algorithms and to design new strategies for their construction. As a result, NIST announced in November 2007 a public competition for the development of a new hash function standard, SHA-3, with the participation of authors from all over the world. This dissertation focuses on the software implementation aspects of some algorithms submitted to the SHA-3 competition, seeking to understand how the authors addressed the computational cost of their designs on various platforms, and to understand the new implementation paradigms introduced by the technology present in current processors. As a consequence, we proposed new algorithmic techniques for the software implementation of some algorithms, such as Luffa and Keccak, bringing them significant performance improvements / Abstract: Hash algorithms are an important cryptographic tool used in many applications for secure and efficient information processing. During the 2000s, serious vulnerabilities found in some traditional hash functions such as SHA-1 and MD5 prompted the cryptography community to review the advances in the cryptanalysis of these algorithms and their design strategies. As a result, in November 2007, NIST announced a public competition to develop a new cryptographic hash function, SHA-3, which involved competitors throughout the world.
This work focuses on the software implementation aspects of some of the algorithms submitted to SHA-3, seeking to comprehend how the authors resolved the computational cost issues on distinct platforms and to understand the new paradigms introduced by present processor technology. As a consequence, we proposed new algorithmic techniques for the software implementation of the Luffa and Keccak hash algorithms, improving their performance significantly / Master's / Theory of Computing / Master in Computer Science
|
610 |
Desafios no desenvolvimento de plataformas capazes de executar sistemas operacionais utilizando o ArchC / Challenges on development of platforms capable to run operating systems using ArchC
Cardoso, Rogerio Alves, 1982-, 27 August 2018
Advisors: Rodolfo Jardim de Azevedo, Sandro Rigo / Dissertation (master's) - Universidade Estadual de Campinas, Instituto de Computação
Previous issue date: 2015 / Summary: As the complexity of electronic systems has grown, new challenges have arisen in the design phase of these systems; design requirements are increasingly complex, directly affecting time-to-market, which becomes ever harder to meet. Traditional approaches such as RTL design have become impractical, since the need to create software in parallel with the hardware design is increasingly evident. In this context, modern methodologies such as ESL have been used successfully so that designers can solve these problems. The growing number of features that new devices implement and the increasing complexity of applications often require these devices to run an embedded operating system. This makes homogeneous hardware/software development even harder, because it demands the creation of complete virtual platforms capable of running an operating system and its applications, and developing such platforms is not a trivial task. This work presents the implementation of a complete system-level platform for the LEON architecture, using the ArchC tool. The platform can run a Linux operating system and its applications, with support for virtual memory management. It also demonstrates the difficulties and limitations of the ArchC tool in generating this type of platform / Abstract: Design challenges in electronic systems increase with their size and design requirements, leading to even more time-to-market pressure. Traditional approaches like RTL become unaffordable, due to the need for parallel development of hardware and software. In this context, modern methodologies like ESL have been successfully used to tackle this kind of problem.
With the increasing number of features and the complexity of the applications for new devices, most of these devices may need an embedded operating system. This poses a challenge to the homogeneous development of hardware and software, demanding the development of a complex virtual platform capable of running an operating system and its applications. However, developing this kind of platform is not a simple task. This work presents an ArchC system-level platform implementation based on the LEON architecture. This platform can execute a Linux operating system and user applications with virtual memory support. It also demonstrates the challenges and limitations of the ArchC tools in the development of this type of platform / Master's / Computer Science / Master in Computer Science
|