Global ETD Search

1	Integrated Input/Output Interconnection and Packaging for GSI Dang, Bing 03 August 2006 In this research, a set of integrated I/O interconnection and packaging technologies is investigated. MEMS-based sea-of-leads (SoL) compliant interconnects are shown to be a promising way to eliminate the need for underfill between a Si chip and an organic packaging substrate. Wafer-level packaging with the compliant interconnects can largely reduce the impact on the fragile low-k interlevel dielectric (ILD) films. The technical feasibility of the SoL MEMS I/O interconnects is demonstrated through process integration, assembly, and reliability assessment. To achieve high power dissipation in a compact form factor, integrated thermal-fluidic I/O interconnects and CMOS-compatible microchannels are developed to enable a prototype on-chip microfluidic heat sink. In addition, highly integrated electrical and optical interconnects based on dual-mode polymer pillars are fabricated, assembled, and tested as a potential solution to the I/O bandwidth bottleneck. The resulting integrated I/O interconnection and packaging technologies are compatible with back-end-of-the-line (BEOL) wafer processing and conventional flip-chip assembly. I/O packaging GSI
2	An Investigation of I/O Strategies for MPI Workloads Attari, Sanya 19 January 2011 Different techniques can be used to improve application performance in parallel systems. Studies have shown that I/O communication delay is the main reason that I/O-intensive applications behave differently and have specific requirements for performance optimization. As a result, common strategies that are defined for, and effective on, computationally intensive applications may not improve the performance of these applications to the same degree. Moreover, the background system configuration affects the behavior of the application and its performance. The growing use of parallel multi-core systems is an important factor in increasing performance and speeding up applications. Since changing the hardware of multi-core systems is not an efficient way to satisfy the different expectations of each application, it is the application developer's responsibility to design flexible and scalable code that is compatible with different environments. On the other hand, predicting application behavior and I/O requirements for I/O-intensive applications with irregular communication patterns is a complicated and time-consuming task, which pushes the problem to runtime. To address this issue, we provide an overview of the different techniques used to solve this problem. We have studied I/O-bound parallel applications that use MPI as the communication method in order to define a general perspective for optimizing the cost-performance ratio. Our experiments cover different setups for these applications in order to define various criteria that should be considered at the design stage as well as at runtime. Moreover, targeting one popular I/O-intensive application, we discuss some possible solutions to speed it up on a multi-core system. / Master of Science mpi Performance I/O
3	Wide Range Bidirectional Mixed-Voltage-Tolerant I/O Buffer Chang, Wei-chih 25 June 2008 The thesis is composed of two topics: a fully bidirectional mixed-voltage-tolerant I/O buffer using a clamping dynamic gate bias generator and a wide range fully bidirectional mixed-voltage-tolerant I/O buffer with a calibration function. The first topic, a mixed-voltage-tolerant I/O buffer implemented in a 2P4M 0.35 µm CMOS process, comprises a low-power bias circuit with clamping transistors in a feedback loop, a power supply level detector circuit, a voltage level converter circuit, a logic switch circuit, a dynamic driving detector circuit, and a clamping dynamic gate bias generator. The proposed design can transmit and receive digital signals with voltage levels of 5/3.3/1.8 V without any gate-oxide overstress or leakage current path in different voltage interface applications. The second topic, a 0.9 V to 5.0 V (0.9/1.2/1.8/2.5/3.3/5 V) mixed-voltage-tolerant I/O buffer carried out in 2P4M 0.35 µm CMOS technology, contains a dynamic gate bias generator to provide appropriate gate voltages for the output stage composed of stacked PMOS and stacked NMOS, an I/O buffer which can transmit the signal with a higher voltage level (VDDH), a floating N-well circuit to remove the body effect at the output PMOS, and a dynamic driving detector to balance the turn-on voltages for the pull-up PMOS and pull-down NMOS in the output stage. The duty cycle of the output signal of the proposed I/O buffer can then be equalized even if the output stage power supply is biased at a low voltage. In order to adapt to wide range input voltage applications, a logic calibration circuit is added in the input buffer. I/O buffer mixed-voltage
4	Discovering the differences that make a difference: racial majority and minority responses to online diversity statements Stephens, Kelsey M. 03 1900 Indiana University-Purdue University Indianapolis (IUPUI) / The presented research examined the effect of Ely and Thomas’ (2001) three diversity perspectives (integration-and-learning, discrimination-and-fairness, and access-and-legitimacy) on perceptions of organizations as a function of their implied ideologies (i.e., multiculturalism, colorblindness, and tokenism). It was hypothesized that organizational websites that enhance multiculturalism, such as those reflecting the integration-and-learning perspective, would be perceived more favorably than websites that emphasize ideologies of colorblindness and tokenism, such as those reflecting the discrimination-and-fairness and the access-and-legitimacy diversity perspectives, respectively. Additionally, expanding on work by Plaut, Thomas, and Goren (2009), the study proposed that websites portraying the latter two perspectives would be perceived more negatively by Blacks than by Whites. In contrast, diversity perspectives that emphasize multiculturalism, such as the integration-and-learning diversity perspective, were hypothesized to be perceived more favorably regardless of racial group membership. The main dependent variables are the organizational outcomes of organizational attraction, organizational trust, P-O fit, and perceived justice. Findings suggest that racial group membership does not operate as a significant moderator of the relationship; however, the hypothesis that diversity perspectives would have varying relationships with diversity ideologies was partially supported. diversity recruitment i/o psychology
5	MPIOR: A Framework to Analyze File System Performance of MPI Applications Banerjee, Shankha 11 April 2012 MPI I/O replay (MPIOR) is an I/O performance modeling and prediction tool used to trace and replay a parallel application to determine application performance under a new I/O subsystem. The trace collector deduces synchronization inter-dependencies between nodes and the I/O demands placed by each node on the storage subsystem. It uses a novel runtime graph traversal technique to filter and log only those MPI calls that affect I/O, thus substantially reducing both the number of runs and the size of the trace file. Unlike other such tools, MPIOR collects a valid trace in a single run and does not rely on node sampling or I/O sampling. MPIOR's post-processing engine analyzes the trace files and sets up the re-player. Because trace collection has minimal overhead, MPIOR can be used during production runs rather than just as a debugging tool. The re-player mimics the behavior of the application across a variety of storage systems by mapping multiple processes to multiple threads running on a single node. We show that the average replay error for parallel applications is below 30%. / Master of Science I/O replay MPI trace
6	Evaluating I/O scheduling techniques at the forwarding layer and coordinating data server accesses / Avaliação de técnicas de escalonamento de E/S na camada de encaminhamento e coordenação de acesso aos servidores de dados Bez, Jean Luca January 2016 In High Performance Computing (HPC) environments, scientific applications rely on Parallel File Systems (PFS) to obtain Input/Output (I/O) performance, especially when handling large amounts of data. However, I/O is still a bottleneck for an increasing number of applications, due to the historical gap between processing and data access speed. To alleviate the concurrency caused by thousands of nodes accessing a significantly smaller number of PFS servers, intermediate I/O nodes are typically employed between processing nodes and the file system. Each intermediate node forwards requests from multiple clients to the parallel file system, a setup which gives this component the opportunity to perform optimizations like I/O scheduling. The objective of this dissertation is to evaluate different scheduling algorithms, at the I/O forwarding layer, that work to improve concurrent access patterns by aggregating and reordering requests to avoid patterns known to harm performance. We demonstrate that the FIFO (First In, First Out), HBRR (Handle-Based Round-Robin), TO (Time Order), SJF (Shortest Job First) and MLF (Multilevel Feedback) schedulers are only partially effective because the access pattern is not the main factor that affects performance in the I/O forwarding layer, especially for read requests. A new scheduling algorithm, TWINS, is proposed to coordinate the access of intermediate I/O nodes to the parallel file system data servers. Our approach decreases concurrency at the data servers, a factor previously shown to negatively affect performance. The proposed algorithm is able to improve read performance from shared files by up to 28% over other scheduling algorithms and by up to 50% over not forwarding I/O requests. Parallel processing Scientific computing: high performance High performance I/O Parallel file systems Parallel I/O I/O forwarding I/O scheduling Access coordination
7	A 10-bit 30-MS/s Pipeline ADC for DVB-H Receiver Systems and Mixed-Voltage Tolerant I/O Cell Design Chang, Tie-Yan 11 July 2007 The first topic of this thesis proposes a 10-bit, 30 Msample/s pipeline analog-to-digital converter (ADC) suitable for digital video broadcasting over handheld (DVB-H) systems. The ADC is based on the 1.5-bit-per-stage pipeline architecture. The proposed design is implemented in 0.18 µm CMOS technology. The input range is 2 V peak-to-peak differential signals, and the post-layout simulation result shows that the spurious-free dynamic range (SFDR) is 57.85 dBc with a full-scale sinusoidal input at 700 kHz. The maximum power consumption is 37 mW given a 3.3 V power supply. The core area is 0.27 mm². The second topic proposes a fully mixed-voltage-tolerant I/O cell implemented using a typical CMOS 2P4M 0.35 µm process. Unlike traditional mixed-voltage-tolerant I/O cells, the proposed design can transmit and receive digital signals with voltage levels of 5/3.3/1.8 V. By using stacked PMOS and stacked NMOS at the output stage and a voltage level converter providing appropriate control voltages for the gates of the stacked PMOS, gate-oxide overstress and hot-carrier degradation are avoided. Moreover, gate-tracking and floating N-well circuits are used to remove the undesirable leakage current paths. The maximum transmitting speed of the proposed I/O cell is 103/120/84 Mbps for I/O cell supply voltages of 5/3.3/1.8 V, respectively, given a load of 20 pF. Mixed-Voltage Pipeline ADC I/O Cell
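The 1.5-bit-per-stage architecture named in this abstract can be illustrated with a small behavioral model. The sketch below is not the thesis's circuit. It assumes the textbook form of the stage (comparator thresholds at ±Vref/4, a digit in {-1, 0, +1}, and a residue of 2·Vin − d·Vref), and the names `stage`, `convert`, `VREF`, and the stage count are hypothetical choices made for illustration.

```python
VREF = 1.0  # assumed reference; a 2 V peak-to-peak differential input spans [-1, +1] V


def stage(v, vref=VREF):
    """One assumed 1.5-bit stage: return (digit, residue) for input voltage v."""
    if v > vref / 4:
        d = 1
    elif v < -vref / 4:
        d = -1
    else:
        d = 0
    # The residue is amplified by 2 and shifted back into [-vref, +vref].
    return d, 2 * v - d * vref


def convert(v, n_stages=10, vref=VREF):
    """Pass v through n_stages stages; return (digits, reconstructed estimate)."""
    digits = []
    for _ in range(n_stages):
        d, v = stage(v, vref)
        digits.append(d)
    # Stage i (counting from 1) carries a weight of vref / 2**i.
    estimate = sum(d * vref / 2 ** (i + 1) for i, d in enumerate(digits))
    return digits, estimate


if __name__ == "__main__":
    for vin in (-0.9, -0.3, 0.0, 0.26, 0.7):
        digits, est = convert(vin)
        print(f"{vin:+.3f} V -> {digits} -> {est:+.5f} V")
```

In this model the reconstruction error after N stages is bounded by Vref/2^N for inputs within ±Vref, and the extra 0 digit is what lets such a stage tolerate comparator offsets.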
8	Reliable low latency I/O in torus-based interconnection networks Azeez, Babatunde 25 April 2007 In today's high performance computing environment, I/O remains the main bottleneck in achieving the optimal performance expected of the ever-improving processor and memory technologies. Interconnection networks therefore combine processing units, system I/O and high speed switch network fabric into a new paradigm of I/O-based networks. This paradigm decouples the system into computational and I/O interconnections, each allowing "any-to-any" communication among processors and I/O devices, unlike the shared model in bus architectures. The computational interconnection, a network of processing units (compute-nodes), is used for inter-processor communication in carrying out computation tasks, while the I/O interconnection manages the transfer of I/O requests between the compute-nodes and the I/O or storage media through dedicated I/O processing units (I/O-nodes). Considering the special functions performed by the I/O-nodes, their placement and reliability become important issues in improving the overall performance of the interconnection system. This thesis focuses on the design and topological placement of I/O-nodes in torus-based interconnection networks, with the aim of reducing I/O communication latency between compute-nodes and I/O-nodes even in the presence of faulty I/O-nodes. We propose an efficient and scalable relaxed quasi-perfect placement scheme using Lee distance error correction codes such that compute-nodes are at distance t, or at most distance t+1, from an I/O-node for a given t. This scheme provides a better alternative than quasi-perfect placement when perfect placement cannot be found for a particular torus. Furthermore, when I/O-nodes fail, the placement scheme is also used to determine alternative I/O-nodes for rerouting I/O traffic from affected compute-nodes with minimal slowdown.
In order to guarantee the quality of service required of inter-processor communication, a scheduling algorithm was developed at the router level to prioritize message forwarding, distinguishing inter-process messages from I/O messages and giving the former higher priority. Our simulation results show that relaxed quasi-perfect placement outperforms quasi-perfect placement and the conventional I/O placement (where I/O-nodes are concentrated at the base of the torus interconnection) with little degradation in inter-process communication performance. The fault-tolerant redirection scheme also causes only a minimal slowdown, especially when the number of faulty I/O-nodes is less than half of the initially available I/O-nodes. I/O interconnection networks torus parallel computers
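The distance-t placement criterion in this abstract can be checked numerically. The sketch below is an illustration, not the thesis's placement scheme. It computes the Lee distance on a k-ary torus and, as an assumed example, uses a well-known perfect placement for t = 1 on a 5×5 torus (I/O-nodes where x + 2y ≡ 0 mod 5). All function names and the example torus are hypothetical.

```python
from itertools import product


def lee_distance(a, b, k):
    """Lee distance between nodes a and b of a k-ary torus (wraparound per dimension)."""
    return sum(min(abs(x - y), k - abs(x - y)) for x, y in zip(a, b))


def max_distance_to_io(io_nodes, k, dims=2):
    """Largest Lee distance from any node of the torus to its nearest I/O-node."""
    return max(
        min(lee_distance(node, io, k) for io in io_nodes)
        for node in product(range(k), repeat=dims)
    )


def alternatives(node, io_nodes, faulty, k):
    """Healthy I/O-nodes ordered by Lee distance, as rerouting candidates."""
    healthy = [io for io in io_nodes if io not in faulty]
    return sorted(healthy, key=lambda io: lee_distance(node, io, k))


if __name__ == "__main__":
    k = 5
    # Assumed example placement: one I/O-node per Lee sphere of radius 1.
    io_nodes = [(x, y) for x, y in product(range(k), repeat=2) if (x + 2 * y) % k == 0]
    print("I/O-nodes:", io_nodes)
    print("max distance to an I/O-node:", max_distance_to_io(io_nodes, k))
    print("reroute (0, 1) when (0, 0) fails:", alternatives((0, 1), io_nodes, {(0, 0)}, k)[0])
```

With this example placement every node is at most one hop from an I/O-node, and a compute-node whose nearest I/O-node fails is rerouted to one at distance t+1.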
9	Design and Implementation of A Low-cost Video Decoder with Low-power SRAM and Digital I/O Cell Lee, Ching-Li 10 January 2008 Video decoders play a very important role in TV receivers. This is especially true for NTSC-based TVs. The design and implementation of a video decoder with a two-line delay comb filter are presented. Moreover, the work includes a low-power SRAM (static random access memory) in the comb filter for storing scanning line data and low-power small-area I/O cells for transmitting digital data. A digital phase lock loop (PLL) in the proposed video decoder uses a digital control oscillator based on a ROM-less 4θ-based direct digital frequency synthesizer (DDFS) to resolve the false locking problem. Two 20-tap transposed FIR (finite-duration impulse response) filters are used to implement the low pass filters (LPF) in the chrominance demodulator. In addition, the unnecessary decimals of the LPF coefficients are truncated to reduce hardware cost. The proposed SRAM takes advantage of a negative word-line voltage controlling the access transistors of the memory cell to reduce the leakage current in standby mode. A memory bank partition scheme and a clock gating scheme are also used to save more power. Finally, a concept entirely different from current I/O designs is proposed. The novel I/O cell uses a reduced output voltage swing as well as transistors with different threshold voltages so that the area and power consumption of the overall chip can be drastically reduced. video decoder digital I/O cell SRAMs
10	Application of active inductors in high-speed I/O circuits Lee, Yen-Sung Michael 11 1900 This thesis explores the use of active inductors as a compact alternative to the bulky passive spiral structures in high-speed I/O circuits. A newly proposed PMOS-based topology is introduced and used in active-inductor terminations. The first prototype design, fabricated in a 90-nm CMOS process, consists of an output driver using active-inductor terminations to provide channel equalization and output impedance matching. Measurement results show that enabling the active inductors in the termination, compared with disabling them, increases the vertical eye opening at the receiver side by a factor of two and reduces the peak-to-peak jitter by 30% for a transmitted 10 Gb/s (2³¹-1) pseudo-random binary sequence pattern over a 6-inch FR4 channel. Output impedance matching with S₂₂ less than -10 dB over a bandwidth of 20 GHz is achieved. The pair of active-inductor terminations occupies 17×25 µm² and has a low overhead power consumption of 0.8 mW. In the second prototype design, a 4-stage output buffer with active-inductor loads is designed and implemented in a 65-nm CMOS process. Simulation results verify that when operating at 31.25 Gb/s, the output eye of the active-inductor load buffer compares favorably with that of the passive-inductor load buffer. For a similar eye height and 78% less timing jitter, the active-inductor load design’s speed (31.25 Gb/s) is 25% faster than that of the passive-resistor load design (25 Gb/s). The active-inductor load output buffer achieves performance comparable to other reported designs using passive inductors in terms of speed, power, and output swing. Its total area is 135×30 µm² (including three differential active inductors), which is comparable to the size of a single passive spiral inductor having an inductance of 0.5 to 1 nH. Active inductor High-speed i/o
