Global ETD Search

51	On-chip Pipelined Parallel Mergesort on the Intel Single-Chip Cloud Computer Avdic, Kenan January 2014 (has links) With the advent of mass-market consumer multicore processors, the growing trend in the consumer off-the-shelf general purpose processor industry has moved away from increasing clock frequency as the classical approach for achieving higher performance. This is commonly attributed to the well-known problems of power consumption and heat dissipation at high frequencies and voltages. This paradigm shift has prompted research into a relatively new field of "many-core" processors, such as the Intel Single-chip Cloud Computer. The SCC is a concept vehicle, an experimental homogeneous architecture employing 48 IA32 cores interconnected by a high-speed communication network. As similar multiprocessor systems, such as the Cell Broadband Engine, demonstrate a significantly higher aggregate bandwidth in the interconnect network than in memory, we examine the viability of a pipelined approach to sorting on the Intel SCC. By tailoring an algorithm to the architecture, we investigate whether this is also the case with the SCC and whether employing a pipelining technique alleviates the classical memory bottleneck problem or provides any performance benefits. For this purpose, we employ and combine different classic algorithms, most significantly, parallel mergesort and samplesort. intel scc many-core pipelined sorting mergesort algorithms Computer Engineering Datorteknik Computer Sciences Datavetenskap (datalogi)
52	Vývoj globálního trhu s mikročipy / The Evolution of the Global Microchip Market Srba, Lukáš Martin January 2015 (has links) This thesis aims to explain the evolution and transformation of the microchip industry. It focuses on the changes in the industry and on predicting its future state, including causes and consequences. The analysis starts with a general description of the market, continues through its subjects, and ends with the relationships between them. This serves as a source of information for the prediction in the final part of the thesis. First, the products considered in this work (namely CPUs, GPUs and APUs) are described. Following this, there is an analysis of the competitive environment that defines a structure of the market upon which further work is based. The structure has three levels: manufacturers of photolithographic machines; chip makers and their designers; and OEM and aftermarket subjects. The penultimate part defines the barriers to entry to this market, and three categories are drawn up: economic, technical and geoeconomic, which are applied to every level of the market. Thus all prerequisites for a successful prediction are satisfied. In the last part of the thesis the prognosis is made and defined, along with its assumptions and limitations. In the concluding part of this work the consequences and results are summarized.
53	Unstructured Computations on Emerging Architectures Al Farhan, Mohammed 05 May 2019 (has links) This dissertation describes detailed performance engineering and optimization of an unstructured computational aerodynamics software system with irregular memory accesses on various multi- and many-core emerging high performance computing scalable architectures, which are expected to be the building blocks of energy-austere exascale systems, and on which algorithmic- and architecture-oriented optimizations are essential for achieving worthy performance. We investigate several state-of-the-practice shared-memory optimization techniques applied to key kernels for the important problem class of unstructured meshes. We illustrate these techniques for a broad spectrum of emerging microprocessor architectures as representatives of the compute units in contemporary leading supercomputers, identifying and addressing performance challenges without compromising the floating-point numerics of the original code. While the linear algebraic kernels are bottlenecked by memory bandwidth for even modest numbers of hardware cores sharing a common address space, the edge-based loop kernels, which arise in the control volume discretization of the conservation law residuals and in the formation of the preconditioner for the Jacobian by finite-differencing the conservation law residuals, are compute-intensive and effectively exploit contemporary multi- and many-core processing hardware. We therefore employ low- and high-level algorithmic- and architecture-specific code optimizations and tuning in light of thread- and data-level parallelism, with a focus on strong thread scaling at the node-level. Our approaches are based upon novel multi-level hierarchical workload distribution mechanisms of data across different compute units (from the address space down to the registers) within every hardware core.
We analyze the demonstrated aerodynamics application on specific computing architectures to develop performance metrics and models that indicate the upper and lower bounds of performance. We present significant full application speedup relative to the baseline code, on a succession of many-core processor architectures, i.e., Intel Xeon Phi Knights Corner (5.0x) and Knights Landing (2.9x). In addition, Knights Landing outperforms Intel Xeon Skylake, at significantly lower power consumption, with nearly twofold speedup. These optimizations are expected to be of value for many other unstructured mesh partial differential equation-based scientific applications as multi- and many-core architectures evolve. Performance Optimizations Thread-level parallelism Data-level parallelism Unstructured Grids Computational Aerodynamics Intel Xeon Phi
54	Semi-centralizovaná kryptoměna založená na blockchainu a trusted computing / Semi-Centralized Cryptocurrency Based on the Blockchain and Trusted Computing Handzuš, Jakub January 2021 (has links) The aim of this thesis is to create a concept of a semi-centralized cryptocurrency that supports external interoperability. It is assumed that semi-centralized cryptocurrency is the future of cryptocurrencies in the banking sector, because even at the cost of partial centralization, the concept brings the benefits of a decentralized ledger. Given the simultaneous deployment of their own cryptocurrencies by various central authorities, such as central banks, it is necessary to establish a communication protocol for interbank transactions. The work is thus focused on extending the existing Aquareum solution with an interoperability protocol.
55	Paralelizace výpočtů pro zpracování obrazu / Paralelized image processing library Fuksa, Tomáš January 2011 (has links) This work deals with parallel computing on modern processors: multi-core CPUs and GPUs. The goal is to learn about computing on these devices, which are suited to parallelization, define their advantages and disadvantages, test their properties in examples, and select appropriate tools to implement a library for parallel image processing. The library will be used for vanishing point estimation in a path-finding mobile robot.
56	Podpora DMA pro rodinu mikrokontrolerů HCS08 / DMA Support for HCS08 Microcontrollers Family Novosád, Adrián January 2013 (has links) Embedded systems are dedicated to performing specific tasks, so design engineers can optimize them to reduce the size and cost of the product and increase its reliability and performance. However, a result of these optimizations is that some architectures may lack commonly used technologies such as direct memory access (DMA). This situation arises in the HCS08 family of microcontrollers. The main theme of this work is to describe the design of a DMA controller that can be added to the HCS08 family of microcontrollers.
57	Analýza výkonnosti procesorů IBM POWER8 / Performance Analysis of IBM POWER8 Processors Jelen, Jakub January 2016 (has links) This paper describes the IBM Power8 system in comparison to the Intel Xeon processors, widely used in today’s solutions. Performance is evaluated not only at the whole-system level but also at the level of threads, cores, and memory. Different metrics are demonstrated on typical optimized algorithms. The benchmarked Power8 processor offers extremely fast memory, with sustainable bandwidth of up to 145 GB/s between main memory and processor, which Intel is unable to match. Computation power is comparable (matrix multiplication) or worse (N-body simulation, division, more complex algorithms) in comparison with the current Intel Haswell-EP. The IBM Power8 is able to compete with Intel processors today, and it will be interesting to observe the future generation, Power9, and its performance in comparison to current and future Intel processors.
58	Exploitable Hardware Features and Vulnerabilities Enhanced Side-Channel Attacks on Intel SGX and Their Countermeasures Chen, Guoxing 29 August 2019 (has links) No description available. Computer Engineering Computer Science Intel SGX computer engineering hyper-threading speculative execution
59	Exploring Computational Sprinting in New Domains Saravanan, Indrajeet 28 August 2019 (has links) No description available. Computer Science Intel Cache Allocation Technology Service Level Objectives Browsers Computational Sprinting Dark Silicon
60	Programming the INTEL 8086 microprocessor for GRADS : a graphic real-time animation display system Haag, Roger. January 1985 (has links) No description available. Graphic methods -- Computer programs Intel 8086 (Microprocessor) Operating systems (Computers) Computer graphics Computer software -- Development
