Global ETD Search

1	Study of the Hyperscalar Multi-core Architecture
Chou, Yu-Liang
07 September 2011
Current trends in processor design have shifted toward chip multiprocessors (CMPs). CMPs are designed to exploit both instruction-level parallelism (ILP) within processors and thread-level parallelism (TLP) within and across processors. However, the conventional CMP design forces a choice between high single-thread performance and high peak throughput. This inability to adjust to varying levels of ILP and TLP makes the processor inefficient. To resolve this design dilemma, this dissertation proposed the hyperscalar concept for multi-core designs. The hyperscalar concept enables a multi-core architecture to dynamically group many scalar in-order cores into a superscalar processor that accelerates a sequential thread. This reconfigurability makes the architecture highly flexible in adapting to different types of applications: it provides high single-thread performance when TLP is low and high throughput when TLP is high. Based on the hyperscalar concept, this dissertation first proposed a hyperscalar dual-core architecture that can play three different roles: a 2-issue statically scheduled superscalar processor, a homogeneous dual-core processor, or a standalone single-core processor. An Instruction-dependency Analyzer (IA) that connects two scalar in-order cores handles the role switching and makes it possible for the two cores to work together like a 2-issue statically scheduled superscalar processor. The IA dispatches instructions with data dependencies to the same core so that the dependencies can be resolved by the existing forwarding paths in that core.
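The dispatch rule described above (dependent instructions go to the same core) can be illustrated with a minimal sketch. This is not the dissertation's IA design: the `Instr` type, the `dispatch` function, the least-loaded placement of independent instructions, and the tie-break when sources come from different cores are all hypothetical choices made for illustration, and timing is ignored entirely.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction record: one destination, several sources."""
    name: str
    dest: Optional[str]
    srcs: Tuple[str, ...]


def dispatch(instrs: List[Instr], num_cores: int = 2) -> List[int]:
    """Assign each instruction to a core, in program order.

    An instruction that reads a register written by an earlier instruction
    is sent to the core that produced that register, so the value could be
    supplied by that core's own forwarding path. Independent instructions
    go to the least-loaded core (an assumption, not from the source).
    """
    producer_core: Dict[str, int] = {}  # register -> core that last wrote it
    load = [0] * num_cores              # instructions assigned to each core
    assignment: List[int] = []

    for ins in instrs:
        producers = [producer_core[r] for r in ins.srcs if r in producer_core]
        if producers:
            # Assumed tie-break: follow the first source with a known producer.
            core = producers[0]
        else:
            core = load.index(min(load))
        assignment.append(core)
        load[core] += 1
        if ins.dest is not None:
            producer_core[ins.dest] = core
    return assignment


program = [
    Instr("add", "r1", ("r2", "r3")),
    Instr("sub", "r4", ("r1", "r5")),  # reads r1, so it follows "add"
    Instr("mul", "r6", ("r7", "r8")),  # independent of the instructions above
    Instr("or",  "r9", ("r6", "r4")),  # sources come from two different cores
]
assignment = dispatch(program)
```

In this example `sub` lands on the same core as `add`, while the independent `mul` is placed on the other core. The last instruction shows the case the abstract does not cover: its sources were produced on different cores, so the sketch simply applies its assumed tie-break.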
Simulation results show that when the proposed architecture works in a statically scheduled superscalar manner, it achieves 30.3% higher instructions per cycle (IPC) than the traditional five-stage pipelined core across 35 benchmarks from the MiBench suite. Extending a homogeneous dual-core processor to a hyperscalar dual-core processor increases area and power by only 1.8% and 1.75%, respectively, in 90nm CMOS technology. This dissertation further extended the hyperscalar dual-core architecture to a hyperscalar multi-core architecture capable of flexibly providing high throughput for uniformly parallel applications as well as high performance for more general workloads. It can dynamically unite many scalar cores into a larger out-of-order (OOO) superscalar processor to accelerate a thread. To accomplish this, the Virtual Shared Register File (VSRF) concept was proposed so that the instructions of a thread running on different cores logically see a single, uniform register file. Simulation results show that the 2-, 4-, 8-, 16-, and 32-core-united configurations of the hyperscalar multi-core architecture achieve 95%, 84%, 82%, 85%, and 90%, respectively, of the performance of monolithic 2-, 4-, 8-, 16-, and 32-issue OOO superscalar processors on the SPEC2000 benchmarks. Finally, this dissertation proposed a new technology, called multi-streaming SIMD, that allows the hyperscalar architecture to exploit data-level parallelism (DLP) efficiently. Multi-streaming SIMD enables current multimedia extensions to manipulate multiple data streams simultaneously. Simulation results show that a multi-streaming SIMD computing engine with four 4-register multimedia operation storage units improves performance by a factor of 3.3 to 5.5 over traditional MMX extensions on twelve multimedia kernels. Together, these contributions yield a promising architecture for future multi-core designs.
Keywords: SIMD; chip multiprocessors; superscalar; dynamic multi-core; reconfigurable hardware; multimedia processing; hyperscalar
2	Multimedia Processing: Real-Time Colour Grading with JIT using the MLT Framework
Kolling, Pina
January 2024
The topic of this thesis project is multimedia processing, focusing on the user-side adjustment of RGB values in video streaming using Just-In-Time (JIT) techniques and the Media Lovin’ Toolkit (MLT) framework. This is implemented in Codemill’s Accurate Player, using Web Real-Time Communication (WebRTC) as a data channel. Colour theory and RGB colour representation are discussed, and technical details on the structure and usage of the MLT framework are provided. The first part of the research question evaluates the feasibility of real-time colour adjustment. It is answered positively by providing an implementation that can address real-world use cases. A comparison of different MLT filters is included in order to select the most suitable filter for the RGB adjustment. The second part of the research question compares video colour grading results obtained with MLT filters applied on different platforms: the Accurate Player, the command-line video editor Melt, and the editing software KDEN Live. For this, frames from the different platforms were extracted and subtracted from each other to show differences in colour saturation. The results reveal that the Accurate Player plays back the original video more accurately than the Melt framework. Additionally, the results suggest that KDEN Live does not use the same Melt filter as the Accurate Player to adjust the RGB values. These significant differences between the compared frames show the complexity of colour adjustment and representation.
Keywords: Multimedia processing; RGB colour adjustment; MLT framework; MLT filter; Colour grading; Just-in-time; Codemill Accurate Player; Web Real-Time Communication; WebRTC; JIT; Melt; KDEN Live; Computer Sciences
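The frame comparison mentioned in the abstract above (subtracting extracted frames from each other) can be sketched as follows. This is only an illustration under assumptions: the thesis does not state how frames were represented or which tools were used, so the nested-list frame layout, the use of absolute per-channel differences, and the function names `frame_difference` and `mean_channel_difference` are all hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), assumed 8-bit values
Frame = List[List[Pixel]]      # rows of pixels


def frame_difference(frame_a: Frame, frame_b: Frame) -> Frame:
    """Return the absolute per-channel difference of two equally sized frames."""
    same_shape = len(frame_a) == len(frame_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(frame_a, frame_b)
    )
    if not same_shape:
        raise ValueError("frames must have the same dimensions")
    return [
        [
            (abs(pa[0] - pb[0]), abs(pa[1] - pb[1]), abs(pa[2] - pb[2]))
            for pa, pb in zip(row_a, row_b)
        ]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


def mean_channel_difference(diff: Frame) -> Tuple[float, float, float]:
    """Average the difference over all pixels, separately for R, G and B."""
    pixels = [p for row in diff for p in row]
    if not pixels:
        raise ValueError("frame is empty")
    n = len(pixels)
    return (
        sum(p[0] for p in pixels) / n,
        sum(p[1] for p in pixels) / n,
        sum(p[2] for p in pixels) / n,
    )


# Two tiny 1x2 example frames, e.g. the same frame taken from two platforms.
frame_a = [[(10, 20, 30), (0, 0, 0)]]
frame_b = [[(12, 20, 25), (0, 4, 0)]]
diff = frame_difference(frame_a, frame_b)
mean_diff = mean_channel_difference(diff)
```

A difference frame of all zeros would mean the two platforms rendered the frame identically. Non-zero channel means indicate how strongly, and in which channel, the outputs diverge.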
