161. A quantitative performance evaluation of SCI memory hierarchies. Hexsel, Roberto A. January 1994.
The Scalable Coherent Interface (SCI) is an IEEE standard that defines a hardware platform for scalable shared-memory multiprocessors. SCI consists of three parts. The first is a set of physical interfaces that defines board sizes, wiring and network clock rates. The second is a communication protocol based on unidirectional point-to-point links. The third defines a cache coherence protocol based on a full directory that is distributed amongst the cache and memory modules. The cache controllers keep track of the copies of a given datum by maintaining them in a doubly linked list. SCI can scale up to 65520 nodes. This dissertation contains a quantitative performance evaluation of an SCI-connected multiprocessor that assesses both the communication and cache coherence subsystems. The simulator is driven by reference streams generated as a by-product of the execution of "real" programs. The workload consists of three programs from the SPLASH suite and three parallel loops. The simplest topology supported by SCI is the ring. It was found that, for the hardware and software simulated, the largest efficient ring size is between eight and sixteen nodes, and that the raw network bandwidth seen by processing elements is limited to about 80 Mbytes/s. This is because the network saturates when link traffic reaches 600-700 Mbytes/s. These levels of link traffic only occur for two poorly designed programs. The other four programs generate low traffic, and their execution speed is limited by neither the interconnect nor the cache coherence protocol. An analytical model of the multiprocessor is used to assess the cost of some frequently occurring cache coherence protocol operations.
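As a rough illustration of the sharing-list mechanism mentioned in the abstract, the sketch below maintains the copies of one cache line as a doubly linked list of node identifiers. It is a simplification in the spirit of SCI, not the standard's protocol; all class and method names are hypothetical.

```python
# Illustrative sketch only: a simplified distributed sharing list, in the
# spirit of SCI coherence, where each cached copy of a line is a node in a
# doubly linked list. Names are hypothetical, not taken from the standard.

class SharingList:
    """Doubly linked list of node IDs holding copies of one cache line."""

    def __init__(self):
        self.head = None      # most recently attached sharer
        self.prev = {}        # node_id -> predecessor (towards head)
        self.next = {}        # node_id -> successor (towards tail)

    def attach(self, node_id):
        """A new cache acquires a copy: prepend it to the list."""
        self.prev[node_id] = None
        self.next[node_id] = self.head
        if self.head is not None:
            self.prev[self.head] = node_id
        self.head = node_id

    def detach(self, node_id):
        """A cache evicts its copy: unlink it, patching both neighbours."""
        p, n = self.prev.pop(node_id), self.next.pop(node_id)
        if p is not None:
            self.next[p] = n
        else:
            self.head = n
        if n is not None:
            self.prev[n] = p

    def invalidate_sharers(self):
        """On a write by the head node, purge every other copy."""
        purged = []
        node = self.next.get(self.head)
        while node is not None:
            nxt = self.next[node]
            del self.prev[node], self.next[node]
            purged.append(node)
            node = nxt
        if self.head is not None:
            self.next[self.head] = None
        return purged
```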

162. VLSI architectures for public key cryptology. Tomlinson, Allan. January 1991.
This thesis addresses the efficient implementation of public key cryptosystems. Unlike conventional systems, public key cryptosystems allow secure exchange of information between two parties without prior exchange of secret keys. In addition, many public key cryptosystems may be used to provide digital signatures for the authentication of documents. The underlying mathematics of most of these systems, however, is more complex than that found in conventional systems, resulting in relatively poor performance of public key cryptosystems in terms of encryption rates. To improve the bandwidth of the encryption algorithms, processors specifically designed to implement public key cryptosystems are needed. The research presented in this thesis has identified modular multiplication of large integers as a bottleneck in virtually all public key algorithms and proposes a novel approach to this operation suitable for hardware implementation. A modular multiplier architecture based on this technique has been proposed and forms the basis of a cascadable modular arithmetic processor capable of dealing with user-defined word lengths. The device has been fabricated, and results of tests on the finished chip suggest that the RSA encryption algorithm with a 512-bit modulus will achieve a throughput of 30 Kbits/s.
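To make the bottleneck concrete, the sketch below shows a textbook bit-serial interleaved modular multiplication and the square-and-multiply exponentiation that RSA builds on it. This is a standard method given for illustration only; it is not the novel hardware technique proposed in the thesis.

```python
# Illustrative sketch only: classic interleaved modular multiplication, the
# kind of operation identified as the bottleneck in public key algorithms.

def interleaved_mod_mul(a: int, b: int, m: int, width: int) -> int:
    """Compute (a * b) mod m, with a, b < m, by scanning the bits of a from
    MSB to LSB and keeping the partial product reduced below m each step."""
    result = 0
    for i in reversed(range(width)):
        result <<= 1                 # shift partial product
        if (a >> i) & 1:
            result += b              # conditionally add the multiplicand
        if result >= m:              # at most two subtractions restore < m
            result -= m
        if result >= m:
            result -= m
    return result

def mod_exp(base: int, exponent: int, m: int, width: int) -> int:
    """Square-and-multiply exponentiation built on the modular multiplier."""
    acc = 1
    for i in reversed(range(exponent.bit_length())):
        acc = interleaved_mod_mul(acc, acc, m, width)          # square
        if (exponent >> i) & 1:
            acc = interleaved_mod_mul(acc, base, m, width)     # multiply
    return acc

# e.g. with a 512-bit modulus as in the abstract: mod_exp(msg, e, n, 512)
```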

163. A compositional approach to performance modelling. Hillston, Jane. January 1994.
Performance modelling is concerned with the capture and analysis of the dynamic behaviour of computer and communication systems. The size and complexity of many modern systems result in large, complex models. A compositional approach decomposes the system into subsystems that are smaller and more easily modelled. In this thesis a novel compositional approach to performance modelling is presented. This approach is based on a suitably enhanced process algebra, PEPA (Performance Evaluation Process Algebra). The compositional nature of the language provides benefits for model solution as well as model construction. An operational semantics is provided for PEPA, and its use to generate an underlying Markov process for any PEPA model is explained and demonstrated. Model simplification and state space aggregation have been proposed as means to tackle the problems of large performance models. These techniques are presented in terms of notions of equivalence between modelling entities. A framework is developed for analysing such notions of equivalence, and it is explained how the bisimulation relations developed for process algebras fit within the framework. Four different equivalence relations for PEPA, two structural and two based on bisimulation, are developed and considered within this framework. For each equivalence the implications for the underlying Markov process are studied and its potential use as the basis of a model simplification technique is assessed. Three of these equivalences are shown to be congruences, and all are complementary to the compositional nature of the models considered. As well as their intrinsic interest from a process algebra perspective, each of these notions of equivalence is also demonstrated to be useful in a performance modelling context. The strong structural equivalence, isomorphism, generates equational laws which form the basis of model transformation techniques. This is weakened to define weak isomorphism.
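The step from an operational semantics to a Markov process can be illustrated with a small sketch: given a labelled transition system with exponential rates (the kind of structure the PEPA semantics produces), build the generator matrix of the underlying continuous-time Markov chain and solve for its equilibrium distribution. The inputs here are assumed to be given; nothing below parses actual PEPA syntax.

```python
# Illustrative sketch only: deriving the CTMC generator matrix Q from a
# rate-labelled transition system and solving for the steady state.

import numpy as np

def generator_matrix(states, transitions):
    """states: list of hashable state descriptors.
    transitions: iterable of (source, target, rate) triples with rate > 0.
    Returns the infinitesimal generator Q (each row sums to zero)."""
    index = {s: i for i, s in enumerate(states)}
    q = np.zeros((len(states), len(states)))
    for src, dst, rate in transitions:
        q[index[src], index[dst]] += rate       # aggregate parallel moves
    np.fill_diagonal(q, 0.0)                    # self-loops are irrelevant
    np.fill_diagonal(q, -q.sum(axis=1))         # exit rate on the diagonal
    return q

def steady_state(q):
    """Solve pi Q = 0 with sum(pi) = 1 for the equilibrium distribution."""
    n = q.shape[0]
    a = np.vstack([q.T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(a, b, rcond=None)
    return pi
```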

164. Logic-based machine learning using a bounded hypothesis space: the lattice structure, refinement operators and a genetic algorithm approach. Tamaddoni Nezhad, Alireza. January 2013.
The rich representation inherited from computational logic makes logic-based machine learning a competent method for application domains involving relational background knowledge and structured data. There is, however, a trade-off between the expressive power of the representation and the computational costs. Inductive Logic Programming (ILP) systems employ different kinds of biases and heuristics to cope with the complexity of the search, which otherwise is intractable. Searching a hypothesis space bounded below by a bottom clause is the basis of several state-of-the-art ILP systems (e.g. Progol and Aleph). However, the structure of the search space and the properties of the refinement operators for these systems have not been previously characterised. The contributions of this thesis can be summarised as follows: (i) characterising the properties, structure and morphisms of the bounded subsumption lattice, (ii) analysis of bounded refinement operators and stochastic refinement and (iii) implementation and empirical evaluation of stochastic search algorithms, in particular a Genetic Algorithm (GA) approach, for bounded subsumption. In this thesis we introduce the concept of bounded subsumption and study the lattice and cover structure of bounded subsumption. We show the morphisms between the lattice of bounded subsumption, an atomic lattice and the lattice of partitions. We also show that ideal refinement operators exist for bounded subsumption and that, by contrast with general subsumption, efficient least and minimal generalisation operators can be designed for bounded subsumption. In this thesis we also show how refinement operators can be adapted for a stochastic search and give an analysis of refinement operators within the framework of stochastic refinement search. We also discuss genetic search for learning first-order clauses and describe a framework for genetic and stochastic refinement search for bounded subsumption. Finally, ILP algorithms and implementations which are based on this framework are described and evaluated.
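One way to picture a genetic search over a hypothesis space bounded by a bottom clause is to encode each candidate clause as a bit string selecting a subset of the bottom clause's body literals. The sketch below uses that encoding with generic operators; the fitness function, selection scheme and operators are placeholders, not the algorithms developed in the thesis.

```python
# Illustrative sketch only: a GA over bit-string encodings of clauses bounded
# below by a bottom clause. Fitness and operators are generic placeholders.

import random

def evolve(num_literals, fitness, pop_size=50, generations=100, p_mut=0.02):
    """Genetic search over subsets of the bottom clause's body literals.
    fitness: maps a tuple of selected literal indices to a numeric score."""
    def score(chrom):
        return fitness(tuple(i for i, bit in enumerate(chrom) if bit))

    pop = [[random.randint(0, 1) for _ in range(num_literals)]
           for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=score, reverse=True)
        parents = ranked[:pop_size // 2]               # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, num_literals)    # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (random.random() < p_mut)   # bit-flip mutation
                     for bit in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=score)
```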

165. Learning in mobile context-aware applications. Smith, Jeremiah. January 2015.
This thesis explores and proposes solutions to the challenges in deploying context-aware systems that make decisions or take actions based on the predictions of a machine learner over long periods of time. In particular, this work focuses on mobile context-aware applications, which are intrinsically personal, requiring a specific solution for each individual that takes into account user preferences and changes in user behaviour as time passes. While there is an abundance of research on mobile context-aware applications which employ machine learning, most does not address the three core challenges required for deployment over indefinite periods of time, namely (1) user-friendly and longitudinal collection and labelling of data, (2) measuring a user's experienced performance and (3) adaptation to changes in a user's behaviour, also known as concept drift. This thesis addresses these challenges by introducing (1) an infer-and-confirm data collection strategy, which passively collects data and infers data labels from the user's natural response to target events, (2) a weighted accuracy measure Aw as the objective function for the underlying machine learners in mobile context-aware applications and (3) two training instance selection algorithms, Training Grid and Training Clusters, which only forget data points in areas of the data space where newer evidence is available, moving away from traditional time-window-based techniques. We also propose a new way of measuring concept drift, indicating which type of concept drift adaptation strategy is likely to be beneficial for any given dataset. This thesis also shows the extent to which the requirements posed by the use of machine learning in deployable mobile context-aware applications influence its overall design, by evaluating a mobile context-aware application prototype called RingLearn, which was developed to mitigate disruptive incoming calls. Finally, we benchmark our training instance selection algorithms over 8 data corpora, including the RingLearn corpus collected over 16 weeks and the Device Analyzer corpus, which logs several years of smartphone usage for a large set of users. Results show that our algorithms perform at least as well as state-of-the-art solutions and often significantly better, with performance deltas ranging from -0.2% to +11.3% compared to the best existing solutions over our experiments.
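The abstract does not give the definition of the weighted accuracy measure Aw, so the sketch below shows only one plausible form: an average of per-class accuracies weighted by how much each class of decision matters to the user's experience. The weights and class labels are assumptions for illustration, not the thesis's formulation.

```python
# Illustrative sketch only: a class-weighted accuracy as one plausible form
# of an "experienced performance" objective. Weights are assumed inputs.

def weighted_accuracy(y_true, y_pred, class_weights):
    """Average of per-class accuracies, weighted by assigned importance.
    class_weights: dict mapping class label -> non-negative weight."""
    correct = {c: 0 for c in class_weights}
    total = {c: 0 for c in class_weights}
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    num = sum(w * correct[c] / total[c]
              for c, w in class_weights.items() if total[c] > 0)
    den = sum(w for c, w in class_weights.items() if total[c] > 0)
    return num / den if den > 0 else 0.0

# e.g. penalise missed "silence the ringer" decisions more heavily
# (hypothetical labels): weighted_accuracy(t, p, {"silence": 3.0, "ring": 1.0})
```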

166. Dynamic data placement and discovery in wide-area networks. Ball, Nicholas. January 2013.
The workloads of online services and applications such as social networks, sensor data platforms and web search engines have become increasingly global and dynamic, posing new challenges in providing users with low latency access to data. To achieve this, these services typically leverage a multi-site wide-area networked infrastructure. Data access latency in such an infrastructure depends on the network paths between users and data, which are determined by the data placement and discovery strategies. Current strategies are static: they offer low latencies upon deployment but perform worse under a dynamic workload. We propose dynamic data placement and discovery strategies for wide-area networked infrastructures which adapt to the data access workload. We achieve this with data activity correlation (DAC), an application-agnostic approach for determining the correlations between data items based on access pattern similarities. By dynamically clustering data according to DAC, network traffic within clusters is kept local. We utilise DAC as a key component in reducing access latencies for two application scenarios, emphasising different aspects of the problem. The first scenario assumes the fixed placement of data at sites, and thus focusses on data discovery. This is the case for a global sensor discovery platform, which aims to provide low latency discovery of sensor metadata. We present a self-organising hierarchical infrastructure consisting of multiple DAC clusters, maintained with an online and distributed split-and-merge algorithm. This reduces the number of sites visited, and thus latency, during discovery for a variety of workloads. The second scenario focusses on data placement. This is the case for global online services that leverage a multi-data centre deployment to provide users with low latency access to data. We present a geo-dynamic partitioning middleware, which maintains DAC clusters with an online elastic partition algorithm. It supports the geo-aware placement of partitions across data centres according to the workload. This provides globally distributed users with low latency access to data for both static and dynamic workloads.
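The idea of data activity correlation can be sketched as computing pairwise correlations between per-item access histories and grouping strongly correlated items. The thresholded, greedy grouping below is a simplification for illustration, not the online split-and-merge or elastic partition algorithms described in the abstract.

```python
# Illustrative sketch only: pairwise access-pattern correlation (DAC-style)
# over per-item access counts, followed by a naive greedy grouping.

from collections import defaultdict
import numpy as np

def access_matrix(log, items, num_windows):
    """log: iterable of (item, time_window) access events.
    Returns an items x windows matrix of access counts."""
    index = {item: i for i, item in enumerate(items)}
    m = np.zeros((len(items), num_windows))
    for item, window in log:
        m[index[item], window] += 1
    return m

def dac_clusters(m, items, threshold=0.7):
    """Group items whose access-pattern correlation exceeds the threshold."""
    corr = np.corrcoef(m)                 # Pearson correlation between rows
    cluster_of = {}
    clusters = defaultdict(set)
    for i, item in enumerate(items):
        placed = False
        for j in range(i):
            if corr[i, j] >= threshold:   # join the first correlated item
                cid = cluster_of[items[j]]
                clusters[cid].add(item)
                cluster_of[item] = cid
                placed = True
                break
        if not placed:                    # otherwise start a new cluster
            cluster_of[item] = i
            clusters[i].add(item)
    return list(clusters.values())
```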

167. Design interfaces for high-level synthesis: library modelling, netlist generation and visualisation. Johal, S. S. January 1993.
In the fast-growing field of high-level synthesis, very little attention has been paid to the areas where core synthesis tools must interact with their immediate environment. Library modelling, netlist generation and design visualisation are the three interfaces that have been neglected in favour of advances in core synthesis tools. This thesis addresses this problem by looking at these primary interfaces and developing the ideas and tools needed to provide significant improvements over the interfaces used by existing systems. Most of the results of this work have been embodied in the development of the SAGE high-level synthesis system, whose most significant difference from existing high-level synthesis systems is that the electronic design engineer is able to direct the process of synthesis to a very fine degree of granularity. The main vehicle that has helped achieve this is the visibility of design information through graphical representations with which a designer is able to interact directly. This is in stark contrast to the purely automatic approaches of many synthesis systems, whose only support for steering towards the desired solution tends to be restarting a synthesis session from scratch. As well as the interfaces themselves, support tools in the form of sound software building blocks, combined with software frameworks around which solid interfaces can be built, are equally important. Without them, the interfaces would be concepts without proof in reality. Consequently, an equally important problem that this thesis addresses is the development of the necessary tools to ensure this can happen.

168. MOSFET characterisation and its application to process control and VLSI circuit design. Gribben, Anthony. January 1988.
As the silicon fabrication industry has rapidly expanded, competition has led to smaller-geometry circuits in order to maximise profit and obtain optimum performance. Device operation has to be characterised more rigorously because the tolerances on device operation are reduced and designers are constantly endeavouring to push the limits of the technology. In order to characterise MOSFETs, parameters for the SPICE level 3 model can be extracted. Although SPICE has been around for several years, commercial programs which extract parameters using numerical optimisation have only recently become available. A program, PARAMEX, has been developed to physically extract parameters which accurately simulate device operation. A thorough analysis of parameters for different-geometry devices has been carried out, and recommendations for simulating devices of different sizes are provided. Of particular interest to designers is the definition of a 'worst case' parameter set, and by extracting parameters from numerous sites on a single wafer, a method for determining a 'worst case' set is proposed. Ideally, if SPICE parameters are to be central to the design of integrated circuits, it would be useful to link them to specific steps in fabrication. Parameters from wafers fabricated using different processes were correlated with the process steps which had been varied, and the effects on both first- and second-order parameters are described. The subthreshold region is of increasing importance in small-geometry circuits. As fabrication processes have evolved, more implants have been made in the channel region with only limited regard to the effect on the subthreshold currents. By thoroughly analysing the subthreshold currents in transistors manufactured with different channel profiles, conclusions about the effect of channel implants on subthreshold operation and the consequences for simple circuits are set out.
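To show the flavour of parameter extraction, the sketch below fits two first-order parameters (threshold voltage and gain factor) of a simple square-law MOSFET model to measured saturation-region I-V points by least squares. The SPICE level 3 model and PARAMEX involve far more parameters and physically based extraction; this is an assumed, simplified stand-in.

```python
# Illustrative sketch only: least-squares extraction of VT0 and beta from a
# square-law saturation model, not the PARAMEX or SPICE level 3 procedure.

import numpy as np
from scipy.optimize import curve_fit

def id_saturation(vgs, beta, vt0):
    """Square-law saturation drain current: Id = (beta / 2) * (Vgs - VT0)^2."""
    overdrive = np.maximum(vgs - vt0, 0.0)
    return 0.5 * beta * overdrive ** 2

def extract_parameters(vgs_measured, id_measured):
    """Fit beta [A/V^2] and VT0 [V] to measured (Vgs, Id) points."""
    p0 = [1e-4, 0.7]                      # rough initial guesses
    popt, _ = curve_fit(id_saturation, vgs_measured, id_measured, p0=p0)
    beta, vt0 = popt
    return beta, vt0

# e.g. vgs = np.array([1.0, 1.5, 2.0, 2.5, 3.0]); ids = measured currents
# beta, vt0 = extract_parameters(vgs, ids)
```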

169. The design of systems for telecommunications between small and large computers. John, Robin B. January 1973.
No description available.

170. Demand-driven, concurrent discrete event simulation. Smart, Colin. January 2001.
The simulation of complex systems can consume vast amounts of computing power. In common with other disciplines faced with complex systems, simulationists have approached the management of complexity from two angles: sub-system evaluation and hierarchical evaluation. Sub-system evaluation attempts to determine the global behaviour by determining the local behaviour and then joining these behaviours together. Hierarchical simulation tries to reduce the detail in the system in areas which are less critical to the model. Demand-driven evaluation provides a coherent approach to the problem of simulating large systems at different levels of abstraction, at a cost comparable to data-driven evaluation. A model for both data-driven and demand-driven evaluation is described which captures the total communication and computation load for each node in the system. The runtime dynamics of each system are investigated, with particular emphasis on the relation between the costs of generating and transmitting an event. The ability of demand-driven evaluation to simulate a system at a number of different levels of abstraction during a single run is also investigated, and results are presented which show the efficacy of such a hierarchical system.
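The contrast between data-driven and demand-driven evaluation can be sketched with a toy node graph in which a node's output is computed lazily, only when an observer demands its value at a given simulated time, and memoised so repeated demands cost nothing. This is a minimal illustration under those assumptions, not the concurrent simulator described in the thesis.

```python
# Illustrative sketch only: demand-driven (lazy, memoised) evaluation of a
# simulation node graph, as opposed to pushing every event data-driven style.

class Node:
    def __init__(self, name, inputs, behaviour):
        """behaviour: function(time, list_of_input_values) -> value."""
        self.name = name
        self.inputs = inputs          # upstream Node objects
        self.behaviour = behaviour
        self._cache = {}              # simulated time -> computed value

    def demand(self, time):
        """Pull values from upstream nodes only when they are needed."""
        if time not in self._cache:
            upstream = [node.demand(time) for node in self.inputs]
            self._cache[time] = self.behaviour(time, upstream)
        return self._cache[time]

# e.g. a source driving an adder, evaluated only where an observer looks:
# src  = Node("src", [], lambda t, _: t % 4)
# bias = Node("bias", [], lambda t, _: 1)
# add  = Node("add", [src, bias], lambda t, ins: sum(ins))
# add.demand(10)   # evaluates src and bias at t=10 only, then caches the result
```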