21

Polymorphic multiple-processor networks

Rana, Deepak 01 January 1991
Many existing multiple-processor architectures are designed to efficiently exploit parallelism in a specific narrow range, where the extremes are fine-grained data parallelism and coarse-grained control parallelism. Most real-world problems comprise multiple tasks that vary in their range of parallelism. Future multiple-processor architectures must be "flexible" in supporting multiple forms of parallelism to tackle these problems. This thesis addresses issues related to communication in "flexible" multiple-processor systems. Intermediate-level vision is chosen as the application domain for demonstrating multi-modal parallelism. The specific problem addressed is to determine the communication requirements of the intermediate-level processors of the Image Understanding Architecture, explore the design space of potential solutions, develop a network design that meets the requirements, demonstrate the feasibility of constructing the design, and show both analytically and empirically that the design meets the requirements. The major contributions of this thesis are: (1) Crossbars and other dense networks are viable design alternatives even for large parallel processors; (2) Central control is viable for reasonably large network sizes, contrary to conventional wisdom; (3) It is shown that by using a special search memory to implement part of the Clos and Benes network routing algorithm in hardware, it is feasible to reconfigure these networks quickly enough for use in fine-grained, data-dependent communication; (4) The feasibility of constructing easily reconfigurable communication networks for "flexible" multiple-processor systems is shown. These networks can quickly reconfigure their topologies to best suit a particular algorithm, can be controlled efficiently (in SIMD as well as MIMD mode), and can route messages efficiently (especially with low overhead in SIMD mode); (5) During the course of this investigation it was discovered that flexible communication and shared-memory support are much more critical for supporting intermediate-level vision than providing a variety of fixed communication patterns. This observation may also have implications for general-purpose parallel processing; and (6) It was also discovered that supporting a symbolic token database at the intermediate level is a more fundamental requirement than supporting particular algorithms.
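To make the routing problem of contribution (3) concrete, the combinatorial core of Benes-network routing can be sketched in a few lines. The following is a generic software illustration of the classic "looping" partition step (the thesis accelerates part of this routing with a hardware search memory; the function name below is illustrative, not that implementation):

```python
# Sketch of the classic "looping" step used when routing a Benes network:
# split a permutation between the upper and lower middle subnetworks so
# that the two inputs of each 2x2 input switch (and the two outputs of
# each 2x2 output switch) take different subnetworks. Software
# illustration only; the thesis implements part of this in hardware.

def benes_partition(perm):
    """perm[i] = output reached by input i; len(perm) must be even.
    Returns a list where side[i] is 0 (upper) or 1 (lower)."""
    n = len(perm)
    inv = [0] * n
    for i, o in enumerate(perm):
        inv[o] = i
    side = [None] * n
    for start in range(n):
        if side[start] is not None:
            continue
        i, s = start, 0
        while side[i] is None:
            side[i] = s                  # route input i via subnetwork s
            partner_out = perm[i] ^ 1    # other output on the same switch
            j = inv[partner_out]         # input feeding that output
            if side[j] is not None:
                break                    # loop has closed consistently
            side[j] = 1 - s              # must use the other subnetwork
            i = j ^ 1                    # other input on j's input switch
    return side

if __name__ == "__main__":
    print(benes_partition([3, 0, 2, 1, 7, 4, 6, 5]))
    # [0, 1, 1, 0, 0, 1, 1, 0]: partners always differ
```

Applying this step recursively to the two sub-permutations yields the full switch settings; the speed of exactly this kind of repartitioning is what determines whether the network can be reconfigured at a fine, data-dependent grain.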
22

Spatial and temporal grouping in the interpretation of image motion

Sawhney, Harpreet S 01 January 1992
Interpretation of the three-dimensional (3D) world from two-dimensional (2D) images is a primary goal of vision. Image motion is an important source of 3D information. The 3D motion (relative to a camera) and the 3D structure of the environment manifest themselves as image motion. The primary focus of this thesis is the reliable derivation of scene structure through an interpretation of motion in monocular images over extended time intervals. An approach is presented for the integration of spatial models of the 3D world with temporal models of 3D motion in order to robustly derive 3D structure and perform tracking and segmentation. Two specific problems are addressed to illustrate this approach. First, a model of a class of 3D objects is combined with smooth 3D motion to track, identify, and reconstruct shallow structures in the scene. Second, a specific model of 3D motion, along with general spatial constraints, is employed for 3D reconstruction of motion trajectories. Both parts rely fundamentally upon the quantitative modeling of the common fate of a structure in motion. In many man-made environments, obstacles in the path of a mobile robot can be characterized as shallow, i.e., they have relatively small extent in depth compared to their distance from the camera. Shallowness can be quantified as affine describability. This is embedded in a tracking system to discriminate between shallow and non-shallow structures based on their affine trackability. The temporal evolution of a structure, derived using affine constraints, is used to verify its identity as a shallow structure and for its 3D reconstruction. Spatio-temporal analysis and integration is further demonstrated through a two-stage technique for the reconstruction of the 3D structure and motion of a scene undergoing rotational motion with respect to the camera. First, a spatio-temporal grouping of discrete point correspondences into a set of conic trajectories in the image plane is obtained by exploiting their common motion. This leads to a description that is more reliable than independently fitting each trajectory. The second stage uses a new closed-form solution for computing the 3D trajectories from the computed image trajectories under perspective projection.
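As an illustration of the first stage (describing a tracked point's image path as a conic), a least-squares conic fit can be sketched as follows. This is a generic total-least-squares fit, not the thesis's common-motion grouping, and the helper name is illustrative:

```python
# Minimal sketch: fit a conic a*x^2 + b*x*y + c*y^2 + d*x + e*y + f = 0
# to tracked image points by taking the null-space direction of the
# design matrix (total least squares via SVD). Generic illustration only;
# the thesis additionally groups trajectories by their common 3D motion.
import numpy as np

def fit_conic(points):
    """points: (N, 2) array of image coordinates, N >= 5.
    Returns conic coefficients (a, b, c, d, e, f) with unit norm."""
    x, y = points[:, 0], points[:, 1]
    A = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
    _, _, vt = np.linalg.svd(A)
    return vt[-1]          # right singular vector of smallest singular value

if __name__ == "__main__":
    t = np.linspace(0.0, 2.0 * np.pi, 50)
    pts = np.column_stack([3.0 * np.cos(t) + 0.5, 2.0 * np.sin(t) - 1.0])
    print(fit_conic(pts))  # recovers the ellipse's coefficients up to scale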
23

Connection management in very-high-speed interconnection systems

Kienzle, Martin G 01 January 1992
The requirement to connect high-performance components of supercomputer systems has caused an increasing demand for connection-oriented, high-speed peer-to-peer communication. The importance of this type of communication has led to the advent of the HIPPI standard, designed to support interconnection systems for this environment. While much research has been directed towards the physical layer of these systems, not much effort has been expended to address system control issues. This thesis explores the connection management, service disciplines, configuration, and performance of very-high-speed interconnection systems. We propose several connection management policies that represent different trade-offs of cost, efficiency, and system performance. The centralized connection management policy assumes knowledge of the connection state, requires a complex implementation, but gives the best performance. At the other end of the spectrum, the distributed connection management policy assumes no knowledge of the connection state and is very simple to implement. However, its performance is lower, and its best performance can be achieved only at the cost of significant overhead to the node systems. For the implementation of the centralized policy, we introduce a connection management algorithm that gives preferential access to important connections according to a non-preemptive priority discipline, that treats connections within the same priority class equitably, and that achieves these objectives at high performance and low node overhead. We discuss the implementation of this connection management algorithm in two concrete system contexts. To support the evaluation of these policies and of the priority algorithm, we develop analytic performance models that capture the salient features of the interconnection systems, their configurations, their connection management policies, and their service disciplines. Using these models, we compare the performance and the node overhead of the connection management policies and interconnection system configurations, and we demonstrate the impact of the choice of service discipline.
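The priority discipline described here (non-preemptive priorities between classes, equitable FIFO treatment within a class) can be sketched with a simple queue structure. This is an illustration of the discipline itself, not the thesis's concrete switch implementation; class and field names are hypothetical:

```python
# Sketch of a non-preemptive priority discipline for connection requests:
# higher-priority classes are served first, requests within a class are
# served FIFO (equitably), and a connection in service is never preempted.
import heapq
import itertools

class ConnectionManager:
    def __init__(self):
        self._queue = []                 # (priority, arrival_seq, request)
        self._seq = itertools.count()    # FIFO tie-break within a class
        self.busy = False

    def request(self, priority, request):
        # Lower numbers = more important class.
        heapq.heappush(self._queue, (priority, next(self._seq), request))

    def next_connection(self):
        # Called when the switch becomes free; never preempts.
        if self.busy or not self._queue:
            return None
        self.busy = True
        return heapq.heappop(self._queue)[2]

    def release(self):
        self.busy = False

if __name__ == "__main__":
    cm = ConnectionManager()
    cm.request(2, "disk-to-node")
    cm.request(1, "supercomputer-to-frame-buffer")
    print(cm.next_connection())   # higher class served first
```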
24

Design, modeling, and evaluation of high-performance I/O subsystems

Chen, Shenze 01 January 1992
In today's computer systems, the disk I/O subsystem is often identified as a major bottleneck to system performance. Removing this bottleneck has proven to be a challenging research problem. The research reported in this dissertation is motivated by the design and performance issues of disk I/O subsystems for a variety of today's computing systems. The goal is to design and evaluate high-performance I/O subsystems for computing systems which support various applications. Our contributions are three-fold: for different application areas, we (1) propose new architectures and scheduling policies for the disk I/O subsystems; (2) develop analytic models and simulators for these disk subsystems; and (3) study and compare the performance of these architectures and scheduling policies. First, we study a mirrored disk, which can be found in various fault-tolerant systems, where each data item is duplicated. While the primary purpose of disk mirroring is to provide fault tolerance, we are interested in how to achieve performance gains by taking advantage of the two data copies. In particular, we propose and examine several policies which differ according to the manner in which read requests are scheduled to one of the two copies. Analytic models are developed to model the behavior of these policies, and the best policies are proposed. In addition, the modeling techniques developed in this study are of independent interest and can be used or extended in other system studies. Second, we investigate existing, and propose new, disk array architectures for high-performance computing systems. Depending on the applications, we propose scheduling policies suitable for these architectures. In particular, we study three variations of RAID 1: mirrored declustering, chained declustering, and group-rotate declustering. We compare the performance of RAID 5 versus Parity Striping. Finally, we examine the performance/cost trade-off of RAID 1 and RAID 5. The performance studies of the various disk array architectures, coupled with the proposed scheduling policies, are based on our analytic models and simulators. Third, we examine disk I/O subsystems for real-time systems, where each I/O is expected to complete before a deadline. The goal is to minimize the fraction of requests that miss their deadlines. In doing so, we first propose two new real-time disk scheduling algorithms and compare them with other known real-time and conventional algorithms in an integrated real-time transaction system model. We then extend this study to a mirrored disk and disk arrays by combining the real-time algorithms with the architectures and policies studied before. Last, we study disk I/O in a non-removal real-time system environment.
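One natural family of read-scheduling policies for a mirrored pair routes each read to whichever copy can serve it fastest. A minimal sketch of a shortest-seek variant (the policy and names are illustrative, not the dissertation's exact set of policies):

```python
# Sketch of one read-scheduling policy for a mirrored disk pair: send each
# read to the copy whose head is currently closest to the target cylinder
# (a shortest-positioning heuristic); writes must go to both copies.
# Illustrative policy only, not the dissertation's exact policies.

class MirroredDisk:
    def __init__(self):
        self.heads = [0, 0]                # current cylinder of each copy

    def read(self, cylinder):
        seeks = [abs(h - cylinder) for h in self.heads]
        copy = seeks.index(min(seeks))     # closer head wins the read
        self.heads[copy] = cylinder
        return copy, seeks[copy]           # serving copy, seek distance

    def write(self, cylinder):
        self.heads = [cylinder, cylinder]  # both copies must be updated

if __name__ == "__main__":
    d = MirroredDisk()
    d.heads = [100, 800]
    print(d.read(700))                     # (1, 100): copy 1 is closer
```

The performance gain comes from the reads alone: because either copy can serve a read, the expected seek distance seen by reads shrinks, which is exactly the effect the analytic models quantify.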
25

An investigation of techniques for partial scan and parity testable BIST design

Park, Sung Ju 01 January 1992
One method of reducing the difficulty of test generation for sequential circuits is the use of full scan design. To overcome the large extra hardware overhead attendant on full scan design, the concept of partial scan design has emerged. With the sequential circuit modeled as a directed graph, much effort has been expended on removing the subset of arcs or vertices representing flip-flops (FFs). First we describe an efficient algorithm for finding a minimum feedback arc set that breaks all cycles in directed graphs. A sequential ordering technique based on depth-first search and an efficient cut algorithm are discussed. We also introduce an efficient algorithm to find a Minimum Feedback Vertex Set (MFVS) in directed graphs. This algorithm is based on the removal of essential cycles, in which no proper subset of vertices constitutes a cycle. Then cycles whose length is greater than K are removed, under the observation that the complexity of test generation for sequential circuits is often caused by lengthy cycles (synchronization sequences). A new structure called a totally combinationalized structure is developed to simplify the problems of test generation and fault simulation for sequential circuits to those for combinational circuits. This structure requires fewer scan FFs than full scan design and totally combinationalizes the sequential problem. In Built-In Self-Test, the FFs in the sequential circuit serve as dedicated Test Pattern Generators and Test Response Compressors. Most of the benchmark circuits are known to be parity-even. However, it is the parity-odd circuits that are likely to detect most of the faults using a parity-bit checker test response compressor. After investigating parity testable faults, a novel technique is described which imposes linear constraints among primary inputs, changes most of the primary outputs to parity-odd, and also compacts the test signals. It is shown that high fault coverage can be obtained by combining both a MISR and the parity-bit checker.
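The cycle-breaking idea is easy to illustrate: in any depth-first search of a directed graph, every cycle contains at least one back edge, so the back edges form a (not necessarily minimum) feedback arc set. A sketch of that simple baseline, distinct from the thesis's minimum-set algorithms:

```python
# Sketch: collect DFS back edges as a feedback arc set. Every directed
# cycle contains at least one back edge, so removing these arcs breaks
# all cycles. Simple heuristic baseline only -- not the thesis's minimum
# feedback arc/vertex set algorithms.

def feedback_arcs(graph):
    """graph: dict mapping vertex -> list of successor vertices."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in graph}
    arcs = []

    def dfs(u):
        color[u] = GRAY
        for v in graph.get(u, []):
            if color.get(v, WHITE) == GRAY:
                arcs.append((u, v))        # back edge: closes a cycle
            elif color.get(v, WHITE) == WHITE:
                dfs(v)
        color[u] = BLACK

    for v in list(graph):
        if color[v] == WHITE:
            dfs(v)
    return arcs

if __name__ == "__main__":
    g = {0: [1], 1: [2], 2: [0, 3], 3: [3]}  # cycle 0-1-2, self-loop at 3
    print(feedback_arcs(g))                  # [(2, 0), (3, 3)]
```

In the partial-scan setting, each arc (or vertex) chosen this way corresponds to a flip-flop that must be scanned, which is why minimizing the set, rather than using this baseline, matters.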
26

Issues in large ultra-reliable system design

Suri, Neeraj 01 January 1992
A primary design basis of ultra-reliable systems is the system synchronization framework. This provides for time synchronization within the system, and also forms the basis of the distributed agreement protocols, which direct the fault-tolerance, redundancy management, scheduling, and error-handling functions within the system. Traditional system designs have focused on developing the theory and techniques for fully connected systems. These basic principles do not directly extrapolate to large, non-fully connected designs. Also, the designs have ranged between two extremes: those considering all system faults to be benign, and models where all system faults are considered malicious--for system operations, algorithm design, and reliability assessment purposes. This dissertation considers the synchronization problem in such systems with a two-fold objective. First, it addresses the problem of fault classification based on fault manifestations and develops the theory for convergence-based synchronization functions. Next, a large cluster-based, non-fully connected architectural model is presented. The system model is shown to be physically realizable without the dominating graph complexities associated with a similarly sized fully connected approach. For this model, and for general networks, an initial synchronization algorithm and a unique variation of the steady-state synchronization algorithm applicable to non-fully connected systems are presented. An important design consideration is the accurate assessment of the system design. A novel reliability assessment approach is presented for the architecture models, in conjunction with the fault classification model, to obtain a precise and realistic fault-coverage reliability model. [Footnote 1: This work was funded in part under ONR N00014-91-C-0014, AFOSR 88-0205, and grants from Allied-Signal.]
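Convergence-based synchronization functions of the kind developed here take each node's view of all clocks, discard extreme readings, and adjust toward a middle value. A minimal sketch of one classic convergence function (the fault-tolerant midpoint of Welch and Lynch, shown to illustrate the style of function, not this dissertation's algorithm):

```python
# Sketch of a convergence function for clock synchronization: drop the f
# largest and f smallest clock readings (which may come from faulty
# nodes), then adjust to the midpoint of the survivors. This is the
# classic fault-tolerant midpoint, shown only as an illustration of the
# convergence-function style the dissertation builds on.

def fault_tolerant_midpoint(readings, f):
    """readings: clock values observed for every node; f: faults tolerated.
    Requires len(readings) > 2*f (classically, n >= 3f + 1 nodes overall)."""
    if len(readings) <= 2 * f:
        raise ValueError("need more than 2f readings")
    survivors = sorted(readings)[f:len(readings) - f]
    return (survivors[0] + survivors[-1]) / 2.0

if __name__ == "__main__":
    # One malicious reading (999.0) cannot drag the correction far.
    print(fault_tolerant_midpoint([10.0, 10.2, 9.9, 999.0], f=1))  # 10.1
```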
27

Scheduling algorithms for mobile packet radio networks

Ahn, Hyeyeon 01 January 1994
This dissertation presents new scheduling algorithms for sharing the common radio channel in mobile packet radio networks. The idea of sharing a common, multiple-access channel among many users has been used in the past as the basis of many communication systems. However, the unique characteristics of packet radio networks preclude the straightforward application of existing protocols that are tuned to other types of networks. For single-hop packet radio networks with arbitrary multiple-reception capacity, the first nontrivial scheduling algorithm is developed, with guaranteed, near-optimal efficiency obtained without requiring simultaneous transmission/reception capability. We also propose a robust scheduling protocol for multihop networks, which is unique in providing a topology-transparent solution to scheduled access in mobile radio networks with guaranteed packet delivery.
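Topology-transparent scheduling of the kind summarized here is often illustrated with polynomials over a finite field, in the style of Chlamtac and Faragó: each node's transmission slots are determined by its own polynomial, independent of who its current neighbors are. A hedged sketch of that construction (not necessarily this dissertation's protocol):

```python
# Sketch of a topology-transparent TDMA schedule in the Chlamtac-Farago
# style: node i gets a degree-k polynomial f_i over GF(p); in subframe s
# (of p subframes, each with p slots) it transmits in slot f_i(s) mod p.
# Two distinct degree-k polynomials agree on at most k points, so any two
# nodes collide in at most k of the p subframes -- guaranteeing each node
# a collision-free slot regardless of the (changing) topology when p is
# chosen large enough relative to k and the maximum node degree.

def transmit_slots(coeffs, p):
    """coeffs: polynomial coefficients over GF(p), lowest degree first.
    Returns, for each subframe s in 0..p-1, the slot used in it."""
    def poly(s):
        return sum(c * pow(s, j, p) for j, c in enumerate(coeffs)) % p
    return [poly(s) for s in range(p)]

if __name__ == "__main__":
    p = 5                              # prime frame parameter
    a = transmit_slots([1, 2], p)      # node A: f(s) = 1 + 2s
    b = transmit_slots([3, 2], p)      # node B: f(s) = 3 + 2s
    collisions = sum(x == y for x, y in zip(a, b))
    print(a, b, collisions)            # degree-1 polys: at most 1 collision
```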
28

The evaluation of massively parallel array architectures

Herbordt, Martin Christopher 01 January 1994
Although massively parallel arrays have been proposed since the 1950s and built since the 1960s, they have undergone very few systematic studies, and these have covered only a small fraction of the design space. The major problems limiting previous studies are: the computational cost of detailed and accurate simulations; the programming cost of creating a test suite that compiles to the various target architectures and runs on them with comparable efficiency; and the diversity of the architectural design space, especially communication networks. These issues are addressed in the construction of ENPASSANT, an evaluation environment for massively parallel array architectures that obtains performance measures of candidate designs with respect to real program executions. We address the computational cost problem with a novel approach to trace-based simulation. Code is run on an abstract virtual machine to generate a coarse-grained trace, which is then refined through a series of transformations (a process we call trace compilation) wherein greater resolution is obtained with respect to the details of the target architecture. We have found this technique to be one to two orders of magnitude faster than detailed simulation, while still retaining much of the accuracy of the model. Furthermore, abstract machine traces must be regenerated for only a small fraction of the possible architectural parameter combinations. Using virtual machine emulation and trace compilation also addresses program portability by allowing the user to code in a single language with a single compiler, regardless of the target architecture. Fairness and programmability are obtained with architecture-dependent application libraries for a small set of critical functions. The diverse design space is covered by using parameterized models of the architectural components, which direct ENPASSANT in the evaluation of target machines on the basis of user specifications. ENPASSANT has already generated significant results, including the effects of varying the number of dimensions in k-ary n-cubes, trade-offs in register and cache design, and the usefulness of certain ALU features. Some surprising results are that bidirectional links provide a large advantage for k-ary n-cubes (where n = 2) in an essential application, and that smaller rather than larger cache block sizes are favored for most of the applications studied.
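The trace-compilation idea — refining a coarse abstract-machine trace into target-specific costs through parameterized transformations — can be caricatured in a few lines. A toy sketch with invented event names and cost parameters (purely illustrative; ENPASSANT's actual transformation pipeline is far richer):

```python
# Toy sketch of trace compilation: a coarse abstract-machine trace is
# refined into a cycle estimate for a particular target using that
# machine's parameters. Event names and cost parameters are invented for
# illustration; ENPASSANT's real transformations are far more detailed.

COARSE_TRACE = [
    ("alu_op", 1000),       # (abstract event, count)
    ("router_send", 200),
    ("mem_access", 400),
]

def refine(trace, params):
    """One refinement pass: expand abstract events into target-specific
    cycle counts using the candidate machine's parameters."""
    cycles = 0.0
    for event, count in trace:
        if event == "router_send":
            # Resolve network detail: hops depend on the chosen topology.
            cycles += count * params["hop_latency"] * params["avg_hops"]
        elif event == "mem_access":
            # Resolve memory detail: split by a modeled cache hit rate.
            hit = params["hit_rate"]
            cycles += count * (hit * params["cache_cycles"]
                               + (1 - hit) * params["dram_cycles"])
        else:
            cycles += count * params["alu_cycles"]
    return cycles

if __name__ == "__main__":
    mesh_2d = dict(alu_cycles=1, hop_latency=2, avg_hops=8,
                   hit_rate=0.9, cache_cycles=2, dram_cycles=20)
    print(refine(COARSE_TRACE, mesh_2d))   # estimate for one candidate
```

The point of the technique is that the same coarse trace can be "compiled" against many parameter combinations, which is why traces rarely need regenerating as the design space is swept.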
29

Space and time scheduling in multicomputers

Das Sharma, Debendra 01 January 1995
Multicomputers are expensive resources that must be shared among multiple users to achieve the desired levels of throughput, utilization, and price-performance ratio. A multi-user environment may be provided by space partitioning, by time-sharing, or by a combination of both. This dissertation presents fast and efficient techniques to improve the performance of multi-user multicomputers using both space partitioning and 'time-sharing on space partitions' approaches. The techniques are specifically targeted at mesh and hypercube multicomputers, the two popular topologies for commercial multicomputers. Space partitioning executes independent tasks on independent partitions. It comprises two components: the processor allocator and the job sequencer. This dissertation presents fast and efficient strategies for processor allocation in meshes and hypercubes. Simulation results demonstrate that the proposed strategies outperform all existing methods in terms of response times while incurring the least time and space overheads. The strategies proposed for job sequencing are independent of the topology and significantly improve the turn-around times of jobs. In addition, they are shown to efficiently utilize the proposed processor allocation strategies, resulting in improved performance and low space and time overheads. Time-sharing on space partitions is a promising approach for improving user response times while providing interactive service. A subcube-level time-sharing strategy is proposed for hypercube multicomputers. The proposed strategy tries to allocate overlapping subcubes to incoming tasks, with each processor time-shared among the processes running on it. The strategy was implemented on a 64-node nCUBE 2E system running real applications and is shown to outperform both the pure space-partitioning and pure time-sharing approaches.
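Processor allocation in a mesh is easiest to see through the simplest baseline: a first-fit scan of a busy bitmap for a free submesh. A sketch of that baseline (the dissertation's strategies are faster and more sophisticated; this is the reference point they improve on):

```python
# Sketch of baseline first-fit submesh allocation in a W x H mesh: scan a
# busy bitmap for the first free a x b submesh and mark it busy. This is
# the simple baseline against which faster allocators are compared; it is
# not the dissertation's algorithm.

class MeshAllocator:
    def __init__(self, width, height):
        self.w, self.h = width, height
        self.busy = [[False] * width for _ in range(height)]

    def _free(self, x, y, a, b):
        return all(not self.busy[y + j][x + i]
                   for j in range(b) for i in range(a))

    def allocate(self, a, b):
        """Find a free a-wide, b-tall submesh; return its corner or None."""
        for y in range(self.h - b + 1):
            for x in range(self.w - a + 1):
                if self._free(x, y, a, b):
                    for j in range(b):
                        for i in range(a):
                            self.busy[y + j][x + i] = True
                    return (x, y)
        return None          # request blocked by fragmentation

    def release(self, corner, a, b):
        x, y = corner
        for j in range(b):
            for i in range(a):
                self.busy[y + j][x + i] = False

if __name__ == "__main__":
    m = MeshAllocator(8, 8)
    print(m.allocate(4, 4), m.allocate(4, 4), m.allocate(8, 2))
    # (0, 0) (4, 0) (0, 4)
```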
30

Retrieval and annotation of music using latent semantic models

Levy, Mark January 2012
This thesis investigates the use of latent semantic models for annotation and retrieval from collections of musical audio tracks. In particular, latent semantic analysis (LSA) and aspect models (probabilistic latent semantic analysis, pLSA) are used to index words in descriptions of music drawn from hundreds of thousands of social tags. A new discrete audio feature representation is introduced to encode musical characteristics of automatically identified regions of interest within each track, using a vocabulary of audio muswords. Finally, a joint aspect model is developed that can learn from both tagged and untagged tracks by indexing both conventional words and muswords. This model is used as the basis of a music search system that supports query by example and by keyword, and of a simple probabilistic machine annotation system. The models are evaluated by their performance in a variety of realistic retrieval and annotation tasks, motivated by applications including playlist generation, internet radio streaming, music recommendation and catalogue search.
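The LSA indexing step can be sketched as a truncated SVD of a tag-by-track count matrix, with keyword queries folded into the reduced latent space. A generic illustration with invented tags, counts, and dimensions (the thesis's aspect models and musword features go well beyond this):

```python
# Sketch of the LSA step: factor a (tag x track) count matrix with a
# truncated SVD, then rank tracks against a keyword query by similarity
# in the reduced latent space. All data here is invented for illustration.
import numpy as np

tags = ["rock", "guitar", "electronic", "ambient"]   # rows: social tags
X = np.array([[5, 4, 0, 0],                          # columns: tracks
              [4, 5, 1, 0],
              [0, 1, 5, 4],
              [0, 0, 4, 5]], dtype=float)

k = 2                                      # number of latent dimensions
U, s, Vt = np.linalg.svd(X, full_matrices=False)
track_vecs = (np.diag(s[:k]) @ Vt[:k]).T   # each track in latent space

def search(query_tags):
    """Rank tracks for a keyword query by cosine similarity in LSA space."""
    q = np.array([1.0 if t in query_tags else 0.0 for t in tags])
    q_vec = U[:, :k].T @ q                 # fold the query into the space
    sims = track_vecs @ q_vec / (
        np.linalg.norm(track_vecs, axis=1) * np.linalg.norm(q_vec) + 1e-12)
    return np.argsort(-sims)               # best-matching tracks first

print(search({"ambient"}))                 # electronic/ambient tracks first
```

The latent space is what lets a query on one tag retrieve tracks labelled only with co-occurring tags, which is the behaviour the retrieval experiments in the thesis evaluate.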
