Global ETD Search

71	A Preliminary Exploration of Memory Controller Policies on Smartphone Workloads Narancic, Goran 26 November 2012 This thesis explores memory performance for smartphone workloads. We design a Video Conference Workload (VCW) to model typical smartphone usage. We describe a trace-based methodology which uses a software implementation to mimic the behaviour of specialised hardware accelerators. Our methodology stores dataflow information from the original application to maintain the relationships between requests. We first study seven address mapping schemes with our VCW, using a first-ready, first-come-first-served (FR-FCFS) memory scheduler. Our results show that the best-performing scheme is up to 82% faster than the worst. The VCW is memory intensive, with up to 86.8% bandwidth utilisation under the best-performing scheme. We also test a Web Browsing workload and a set of computer vision workloads. Most are not memory intensive, with utilisation under 15%. Finally, we compare four schedulers and find that the FR-FCFS scheduler using the Write Drain mode [8] performs best, outperforming the worst scheduler by 6.3%. smartphone workloads workload characterisation memory controller main memory DRAM address mapping memory scheduler memory performance 0984 0537
73	Novel Double-Deposited-Aluminum (DDA) Process for Improving Al Void and Refresh Characteristics of DRAM Hong, Seok-Woo, Kang, Seung-Mo, Choi, In-Hyuk, Jung, Seung-Uk, Park, Dong-Sik, Kim, Kyoung-Ho, Choi, Yong-Jin, Lee, Tae-Woo, Lee, Haebum, Cho, In-Soo 22 July 2016 To resolve Al void formation caused by severe stress in dynamic random access memory (DRAM), a double-deposited-aluminum (DDA) layer process is proposed. This novel metallization process can be performed simply and effectively using an ex-situ deposition technique, with a native oxide such as Al2O3 between the upper and lower Al metal layers. Al voids could be controlled effectively by adopting DDA layers with different grain structures. With this process, we confirmed that the optimal Al barrier metal thickness for avoiding Al voids is 100 Å, which makes it possible to improve the static refresh characteristics of DRAM by 17%. info:eu-repo/classification/ddc/620 ddc:620 Dynamic RAM
74	Software based memory correction for a miniature satellite in low-Earth orbit / Mjukvarustyrd rättning av minnesfel för en miniatyrsatellit i låg omloppsbana Wikman, John, Sjöblom, Johan January 2017 The harsh radiation environment of space is known to cause bit flips in computer memory. The conventional way to combat this is through error detection and correction (EDAC) circuitry, but low-budget space missions can use software EDAC instead. One such mission is the KTH project Miniature Student Satellite (MIST), which aims to send a 3U CubeSat into low-Earth orbit. To ensure a high level of data reliability on board MIST, this thesis investigates the performance of different types of EDAC algorithms. First, a prediction of the bit flip susceptibility of DRAM memory in the planned trajectory is made. After that, data reliability models of Hamming and Reed-Solomon (RS) codes are proposed, and their respective running times on the MIST onboard computer are approximated. Finally, the performance of the different codes is discussed with regard to data reliability, memory overhead, and CPU usage. The findings of this thesis suggest that using an EDAC algorithm would greatly increase data reliability. Among the codes investigated, three good candidates are RS(28,24), RS(196,192) and RS(255,251), depending on how much memory overhead can be accepted. Software based memory correction DRAM Reed-Solomon Hamming CubeSat Low-Earth Orbit MIST Computer Sciences
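75	賽局理論下是否存在最適生產模型 / The investigation of optimal production model in DRAM industry under the Game Theory 黃俊欽, Huang, Chun Chin Unknown Date The factors that shape corporate decisions have grown more complex over time: technological evolution, financial management, human resources, and even trends in end-product development all steer the direction of a firm's decisions. Competition among peers influences each firm's decisions through the degree of information transparency. Game theory studies how intelligent, self-interested people act in strategic settings and interact with their opponents. Where financial management considers a firm's own conditions and value, game theory adds external factors, including the interaction between the industry and the firm, to the discussion of decision models. This study takes the DRAM (Dynamic Random Access Memory) industry as an example: capital expenditure for each generation runs into the tens of billions, and rapidly advancing process technology causes output to surge, which drives prices to collapse and inflicts heavy losses on manufacturers. This thesis uses game theory as a basis for discussing production scale under the optimal capital structure. By choosing development strategies that benefit the firm and thereby controlling capital expenditure, firms can keep production scale at its optimum and so maintain prices. The study takes game theory as its main axis to discuss co-opetition among firms in the industry, applies the Nash equilibrium to discuss how firms arrive at favourable decisions, and uses the collapse of DRAM prices and the pace of capacity expansion over the past five years to illustrate the prisoner's dilemma. It discusses pricing relationships in the DRAM industry from the perspective of oligopoly theory in economics, and uses an extension of A. Cournot's duopoly model as the basis for discussing equilibrium capacity allocation. In financial management, the share price is highest under the optimal capital structure. Taking the debt-to-capital ratio at that time as a reference and treating paid-in capital as shareholders' equity, the identity assets = liabilities + owners' equity gives the asset and liability scale the firm should have. Capital structure and liability management are then carried out toward that target to reach the firm's optimal capacity scale. Game theory Nash equilibrium Prisoner's dilemma Oligopoly theory Strategy DRAM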
76	Modes de défaillance induits par l'environnement radiatif naturel dans les mémoires DRAMs : étude, méthodologie de test et protection Bougerol, A. 16 May 2011 The increasing performance required of aeronautical and space systems calls for electronic components of growing complexity, whose reliability, including resistance to cosmic radiation, must be evaluated on the ground. DRAMs are widely used, but their failure modes are increasingly varied, so traditional particle accelerator tests are no longer sufficient to characterise them fully. A pulsed laser can trigger effects similar to those of ionising particles, so this tool was used alongside particle accelerators to study, on the one hand, SEUs (Single Event Upsets) in the memory arrays and, on the other hand, SEFIs (Single Event Functional Interrupts) in the peripheral circuits. These studies made it possible to explain the influence of test patterns on the measured sensitivities, to uncover the origin of the most significant SEFIs, and to validate techniques for quantifying their sensitive areas. A test methodology intended for industry was established, based on using the laser alongside particle accelerator tests in order to optimise the cost and effectiveness of characterisation. In addition, a new fault-tolerance solution is proposed, exploiting the property that DRAM cells are immune to radiation in one of their charge states. SEU Fault tolerance Particle accelerator DRAM Radiation environment Laser Test methodology Test pattern SEFI SEFLU
77	High-performance computer system architectures for embedded computing Lee, Dongwon 26 August 2011 The main objective of this thesis is to propose new methods for designing high-performance embedded computer system architectures. To achieve this goal, three major components of multi-processor embedded systems are examined, one per section: multi-core processing elements (PEs), DRAM main memory systems, and on/off-chip interconnection networks. The first section of this thesis presents architectural enhancements to graphics processing units (GPUs), one of the multi- or many-core PEs, for improving the performance of embedded applications. An embedded application is first mapped onto GPUs to explore the design space, and then architectural enhancements to existing GPUs are proposed for improving the throughput of the embedded application. The second section proposes high-performance buffer mapping methods, which exploit useful features of DRAM main memory systems, in DSP multi-processor systems. The memory wall problem becomes increasingly severe in multiprocessor environments because of communication and synchronization overheads. To alleviate the memory wall problem, this section exploits bank concurrency and page mode access of DRAM main memory systems to increase the performance of multiprocessor DSP systems. The final section presents a network-centric Turbo decoder and network-centric FFT processors. In the era of multi-processor systems, the interconnection network is another performance bottleneck. To handle heavy communication traffic, this section applies a crossbar switch, one of the indirect networks, to the parallel Turbo decoder, and applies a mesh topology to the parallel FFT processors. When designing the mesh FFT processors, a very different approach is taken to improve performance: an optical fiber is used as a new interconnection medium. Turbo decoding GPU architecture SDF graph DRAM system Embedded computer systems High performance computing Electronic data processing
78	The use of memory state knowledge to improve computer memory system organization Isen, Ciji 01 June 2011 The trends in virtualization as well as multi-core, multiprocessor environments have translated to a massive increase in the amount of main memory each individual system needs to be fitted with, so as to effectively utilize this growing compute capacity. The increasing demand on main memory implies that the main memory devices and their issues are as important a part of system design as the central processors. The primary issues of modern memory are power, energy, and scaling of capacity. Nearly a third of the system power and energy can be from the memory subsystem. At the same time, modern main memory devices are limited by technology in their future ability to scale and keep pace with modern program demands, thereby requiring exploration of alternatives to main memory storage technology. This dissertation exploits dynamic knowledge of memory state and memory data value to improve memory performance and reduce memory energy consumption. This research proposes a cross-boundary approach that communicates information about dynamic memory management state (allocated and deallocated memory) between software and the hardware memory subsystem through a combination of ISA support and hardware structures. These mechanisms help identify memory operations to regions of memory that have no impact on the correct execution of the program because they were either freshly allocated or deallocated. This inference about the impact stems from the fact that data in memory regions that have been deallocated are no longer useful to the actual program code, and data present in freshly allocated memory is also not useful to the program because the dynamic memory has not yet been defined by the program. By being cognizant of this, such memory operations are avoided, thereby saving energy and improving the usefulness of the main memory. Furthermore, when stores write zeros to memory, this research reduces the number of stores to memory by capturing them as compressed information stored alongside the memory management state information. Using the methods outlined above, this dissertation harnesses memory management state and data value information to achieve significant savings in energy consumption while extending the endurance limit of memory technologies. Computer architecture Memory power Memory management (Computer science) Memory energy Memory allocation Phase change memory DRAM Computer storage devices Computer memory systems Program semantics
79	High-performance memory system architectures using data compression Baek, Seungcheol 22 May 2014 The Chip Multi-Processor (CMP) paradigm has cemented itself as the archetypal philosophy of future microprocessor design. Rapidly diminishing technology feature sizes have enabled the integration of ever-increasing numbers of processing cores on a single chip die. This abundance of processing power has magnified the venerable processor-memory performance gap, which is known as the "memory wall". To bridge this performance gap, a high-performing memory structure is needed. An attractive solution to overcoming this processor-memory performance gap is using compression in the memory hierarchy. In this thesis, to use compression techniques more efficiently, compressed cacheline size information is studied, and size-aware cache management techniques and a hot-cacheline prediction technique for dynamic early decompression are proposed. The work proposed in this thesis also attempts to mitigate the limitations of phase change memory (PCM), such as low write performance and limited long-term endurance. One promising solution is the deployment of hybridized memory architectures that fuse dynamic random access memory (DRAM) and PCM, to combine the best attributes of each technology by using the DRAM as an off-chip cache. A dual-phase compression technique is proposed for high-performing DRAM/PCM hybrid environments and a multi-faceted wear-leveling technique is proposed for the long-term endurance of compressed PCM. This thesis also includes a new compression-based hybrid multi-level cell (MLC)/single-level cell (SLC) PCM management technique that aims to combine the performance edge of SLCs with the higher capacity of MLCs in a hybrid environment. Memory systems Cache compression Cache replacement Hybrid DRAM/PCM Data compression (Computer science) High performance computing Computer storage devices Cache memory
80	Global address spaces for efficient resource provisioning in the data center Young, Jeffrey Scott 13 January 2014 The rise of large data sets, or "Big Data", has coincided with the rise of clusters with large amounts of memory and GPU accelerators that can be used to process rapidly growing data footprints. However, the complexity and performance limitations of sharing memory and accelerators in a cluster limit the options for efficient management and allocation of resources for applications. The global address space model (GAS), and specifically hardware-supported GAS, is proposed as a means to provide a high-performance resource management platform upon which resource sharing between nodes and resource aggregation across nodes can take place. This thesis builds on the initial concept of GAS with a model that is matched to "Big Data" computing and its data transfer requirements. The proposed model, Dynamic Partitioned Global Address Spaces (DPGAS), is implemented using a commodity converged interconnect, HyperTransport over Ethernet (HToE), and a software framework, the Oncilla runtime and API. The DPGAS model and associated hardware and software components are used to investigate two application spaces, resource sharing for time-varying workloads and resource aggregation for GPU-accelerated data warehousing applications. This work demonstrates that hardware-supported GAS can be used to improve the performance and power consumption of memory-intensive applications, and that it can be used to simplify host and accelerator resource management in the data center. Networks Memory DRAM GPUs Global address spaces Data warehousing InfiniBand Ethernet FPGA Graphics processing units Big data Memory management (Computer science) Computer architecture
