  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

The classical simulation of noisy quantum computers : a polyhedral approach

Ratanje, Nikhil January 2017 (has links)
In this thesis we explored the consequences of considering generalised, non-quantum notions of entanglement in the classical simulation of noisy quantum computers where the available measurements are restricted. The noise rates at which such computers become classically simulable serve as upper bounds on fault tolerance thresholds. The measurement restrictions come about either through imperfection, and/or by design to some limited set. By considering sets of operators that return positive measurement outcome probabilities for the restricted measurements, one can construct new single particle state spaces containing quantum and non-quantum operators. These state spaces can then be used with a modified version of Harrow and Nielsen’s classical simulation algorithm to efficiently simulate noisy quantum computers that are incapable of generating generalised entanglement with respect to the new state spaces. Through this approach we developed alternative methods of classical simulation, strongly connected to the study of non-local correlations: we constructed noisy quantum computers that are capable of performing non-Clifford operations and can generate some forms of multiparty quantum entanglement, but that are classical in the sense that they can be efficiently classically simulated and cannot generate non-local statistics. We focused on magic state quantum computers (which are limited to only Pauli measurements), with ideal local gates but noisy control-Pauli Z gates, and calculated the noise needed to ensure the control-Z gates became incapable of generating generalised entanglement for a variety of noise models and state space choices, with the aim of finding an optimal single particle state space requiring the least noise to remove the generalised entanglement. The state spaces were required to always return valid measurement probabilities; this meant they also had to have octahedral symmetry, to ensure local gates did not take states outside the state space.
While we were able to find the optimal choice for highly imperfect measurements, we were unable to find the optimal choice in all cases. Our best candidate state space required less joint depolarising noise, approximately 56%, in comparison to the approximately 67% required if the algorithm used quantum notions of separability. This suggests that generalised entanglement may offer more insight than quantum entanglement when discussing the power of Clifford operation based quantum computers.
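The approximately 67% quantum-separability benchmark quoted above can be reproduced with a short numerical check. The sketch below is illustrative only (it is not the thesis code, and the joint depolarising noise model acting on a CZ-generated maximally entangled state is an assumption of the example): it applies the Peres-Horodecki partial transpose criterion, which is exact for two qubits, and scans for the noise rate at which the state becomes separable.

```python
import numpy as np

# Build the maximally entangled state produced by CZ acting on |+>|+>.
plus = np.array([1.0, 1.0]) / np.sqrt(2)
psi = np.kron(plus, plus)
CZ = np.diag([1.0, 1.0, 1.0, -1.0])
phi = CZ @ psi
rho_pure = np.outer(phi, phi)

def partial_transpose(rho):
    """Transpose the second qubit of a 4x4 two-qubit density matrix."""
    r = rho.reshape(2, 2, 2, 2)
    return r.transpose(0, 3, 2, 1).reshape(4, 4)

def min_pt_eig(p):
    """Smallest partial-transpose eigenvalue under joint depolarising noise p."""
    rho = (1 - p) * rho_pure + p * np.eye(4) / 4
    return np.linalg.eigvalsh(partial_transpose(rho)).min()

# Scan for the smallest noise rate at which the state becomes PPT,
# i.e. separable (the PPT criterion is necessary and sufficient for 2 qubits).
ps = np.linspace(0.0, 1.0, 10001)
threshold = next(p for p in ps if min_pt_eig(p) >= -1e-12)
print(f"separability threshold ~ {threshold:.3f}")
```

The scan lands at approximately 0.667, matching the roughly 67% quantum-separability figure against which the approximately 56% generalised-entanglement result compares favourably.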
22

Using estimation of distribution algorithms to detect concurrent faults

Staunton, Jan January 2012 (has links)
With processors tending toward more processing cores instead of higher clock speeds, developers are increasingly forced to grapple with concurrent paradigms to maximally exploit new CPU designs. Embracing concurrent paradigms entails the potential risk of encountering concurrent software faults. Concurrent faults result from unforeseen timings of interactions between concurrent components, as opposed to traditional software faults that arise from functional failures. As a result, concurrent faults have a higher probability of surviving the software development process, potentially causing a catastrophic failure of high cost. As the complexity of software and hardware systems increases, they become increasingly difficult to test. One measure of complexity is the number of potential execution paths a system can follow, with higher complexity attributed to a greater number of paths. In concurrent software, the number of execution paths in a system typically increases exponentially as the number of concurrent components increases. Testing complex concurrent software is therefore difficult, with state-of-the-art static and dynamic analysis techniques often yielding false positives or exhausting their resources. This problem is only likely to be exacerbated given the trends highlighted above. Stochastic metaheuristic search techniques can often triumph where deterministic or analytical techniques fail. Methods such as Genetic Algorithms and Ant Colony Optimisation have shown great strength on hard problems, including testing concurrent software. Metaheuristic techniques often trade a perfect solution for good-enough solutions, and merely accurately detecting a concurrent fault is better than allowing a fault to survive to a production system. Whilst metaheuristic techniques have had some success in this domain, the state of the art still struggles for a high success rate in some circumstances.
There are a few metaheuristic search techniques that have yet to be tried in this area, and this thesis presents a study on one such technique. This thesis presents a literature review detailing the state of the art in detecting concurrent faults in software and hardware systems. Following a review of metaheuristic techniques applied to finding concurrent faults, I set out a hypothesis asserting that a particular subclass of metaheuristic techniques, Estimation of Distribution Algorithms (EDAs), is effective in detecting and debugging concurrent faults. To investigate the hypothesis, I first make an algorithmic proposal based on a particular EDA to search the state space of concurrent systems. I then demonstrate through experimentation the ability of the algorithm to detect faults and to optimise the quality of faults found in systems derived from industrial scenarios. I also outline methods of using features unique to EDAs to scale to large systems. Finally, I complete the thesis with a conclusion examining the hypothesis with respect to the evidence collected from empirical work, highlighting the novel aspects of the thesis and outlining future paths of research.
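As a rough illustration of the idea, and not the algorithm proposed in the thesis, the sketch below uses a UMDA-style Estimation of Distribution Algorithm to search thread interleavings of a toy two-thread program for a lost-update race; the schedule encoding, fitness model, and all names are invented for the example.

```python
import random

def run_schedule(bits):
    """Interleave two 2-step threads (load, then store of a non-atomic
    increment); return the final counter: 2 if correct, 1 if the race hit."""
    counter, local, step = 0, [0, 0], [0, 0]
    for b in bits:
        t = b if step[b] < 2 else 1 - b      # fall back if thread b is done
        if step[t] == 0:
            local[t] = counter               # load
        else:
            counter = local[t] + 1           # store (lost update possible)
        step[t] += 1
    return counter

def umda(generations=20, pop_size=30, elite=5, seed=1):
    """UMDA: sample schedules from per-position marginals, re-estimate the
    marginals from the fittest samples, repeat until a fault is revealed."""
    rng = random.Random(seed)
    probs = [0.5] * 4                        # marginal P(bit = 1) per position
    for _ in range(generations):
        pop = [[int(rng.random() < p) for p in probs] for _ in range(pop_size)]
        pop.sort(key=run_schedule)           # lower counter = closer to fault
        if run_schedule(pop[0]) == 1:
            return pop[0]                    # fault-revealing interleaving
        best = pop[:elite]
        probs = [sum(b[i] for b in best) / elite for i in range(4)]
    return None

witness = umda()
print("fault-revealing schedule:", witness)
```

The key EDA ingredient is that the search distribution (here one marginal probability per schedule position) is re-estimated from the fittest samples each generation, rather than recombining individuals as a genetic algorithm would.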
23

Complexity analysis and semantics for quantum computation

Kashefi, Elham January 2003 (has links)
No description available.
24

Design and implementation of an array language for computational science on a heterogeneous multicore architecture

Keir, Paul January 2012 (has links)
The packing of multiple processor cores onto a single chip has become a mainstream solution to fundamental physical issues relating to the microscopic scales employed in the manufacture of semiconductor components. Multicore architectures provide lower clock speeds per core, while aggregate floating-point capability continues to increase. Heterogeneous multicore chips, such as the Cell Broadband Engine (CBE) and modern graphics chips, also address the related issue of an increasing mismatch between high processor speeds and huge latency to main memory. Such chips tackle this memory wall by the provision of addressable caches, increased bandwidth to main memory, and fast thread context switching. An associated cost is often reduced functionality of the individual accelerator cores, and increased complexity involved in their programming. This dissertation investigates the application of a programming language supporting the first-class use of arrays, and capable of automatically parallelising array expressions, to the heterogeneous multicore domain of the CBE, as found in the Sony PlayStation 3 (PS3). The language is a pre-existing and well-documented proper subset of Fortran, known as the ‘F’ programming language. A bespoke compiler, referred to as E, is developed to support this aim, and written in the Haskell programming language. The output of the compiler is in an extended C++ dialect known as Offload C++, which targets the PS3. A significant feature of this language is its use of multiple, statically typed, address spaces. By focusing on generic, polymorphic interfaces for both the generated and hand-constructed code, a number of interesting design patterns relating to memory locality are introduced. A suite of medium-sized (100-700 lines), real-world benchmark programs is used to evaluate the performance, correctness, and scalability of the compiler technology. Absolute speedup values, well in excess of one, are observed for all of the programs.
The work ultimately demonstrates that an array language can significantly reduce the effort expended to utilise a parallel heterogeneous multicore architecture, while retaining high performance. A substantial related advantage in using standard ‘F’ is that any Fortran compiler can create debuggable and competitively performing serial programs.
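By way of illustration only (this is not code from the E compiler, and the chunking scheme is an assumption of the example), a whole-array expression such as F's `c = a*b + 2.0` can be evaluated in data-parallel chunks along the lines the dissertation describes:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def parallel_apply(expr, arrays, workers=4):
    """Split equally shaped arrays into chunks and evaluate expr per chunk,
    the way an array language maps an elementwise expression over cores."""
    n = arrays[0].shape[0]
    bounds = np.linspace(0, n, workers + 1, dtype=int)
    def work(i):
        sl = slice(bounds[i], bounds[i + 1])
        return expr(*(arr[sl] for arr in arrays))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves chunk order, so concatenation is safe
        return np.concatenate(list(pool.map(work, range(workers))))

a = np.arange(8.0)
b = np.ones(8)
c = parallel_apply(lambda x, y: x * y + 2.0, [a, b])
print(c)  # same result as the serial expression a*b + 2.0
```

In the heterogeneous setting the chunks would be dispatched to accelerator cores rather than threads, but the decomposition of a first-class array expression is the same.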
25

Efficient design-space exploration of custom instruction-set extensions

Zuluaga, Marcela January 2010 (has links)
Customization of processors with instruction set extensions (ISEs) is a technique that improves performance through parallelization with a reasonable area overhead, in exchange for additional design effort. This thesis presents a collection of novel techniques that reduce the design effort and cost of generating ISEs by advancing automation and reconfigurability. In addition, these techniques maximize the performance gained as a function of the additional committed resources. Including ISEs in a processor design implies development at many levels. Most prior work on ISEs solves separate stages of the design: identification, selection, and implementation. However, the interactions between these stages also hold important design trade-offs. In particular, this thesis addresses the lack of interaction between the hardware implementation stage and the two previous stages. Interaction with the implementation stage has been mostly limited to accurately measuring the area and timing requirements of the implementation of each ISE candidate as a separate hardware module. However, the need to independently generate a hardware datapath for each ISE limits the flexibility of the design and the performance gains. Hence, resource sharing is essential in order to create a customized unit with multi-function capabilities. Previously proposed resource-sharing techniques aggressively share resources amongst the ISEs, thus minimizing the area of the solution at any cost. However, it is shown that aggressively sharing resources leads to large ISE datapath latency. Thus, this thesis presents an original heuristic that can be parameterized in order to control the degree of resource sharing amongst a given set of ISEs, thereby permitting the exploration of the existing implementation trade-offs between instruction latency and area savings. In addition, this thesis introduces an innovative predictive model that is able to quickly expose the optimal trade-offs of this design space.
Compared to an exhaustive exploration of the design space, the predictive model is shown to reduce by two orders of magnitude the number of executions of the resource-sharing algorithm that are required in order to find the optimal trade-offs. This thesis presents a technique that is the first to combine the design spaces of ISE selection and resource sharing in ISE datapath synthesis, in order to offer the designer solutions that achieve maximum speedup and maximum resource utilization using the available area. Optimal trade-offs in the design space are found by guiding the selection process to favour ISE combinations that are likely to share resources with low speedup losses. Experimental results show that this combined approach unveils new trade-offs between speedup and area that are not identified by previous selection techniques; speedups of up to 238% over previous selection techniques were obtained. Finally, multi-cycle ISEs can be pipelined in order to increase their throughput. However, it is shown that traditional ISE identification techniques do not allow this optimization due to control flow overhead. In order to obtain the benefits of overlapping loop executions, this thesis proposes to carefully insert loop control flow statements into the ISEs, thus allowing the ISE to control the iterations of the loop. The proposed ISEs broaden the scope of instruction-level parallelism and obtain higher speedups compared to traditional ISEs, primarily through pipelining, the exploitation of spatial parallelism, and the reduction of the overhead of control flow statements and branches. A detailed case study of a real application shows that the proposed method achieves 91% higher speedups than the state of the art, with an area overhead of less than 8% in hardware implementation.
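The latency-versus-area trade-off governed by a sharing parameter can be sketched with a toy model; the ISE operator counts, unit areas, and mux delay penalty below are all invented for illustration and bear no relation to the thesis's actual heuristic or results.

```python
# Two hypothetical ISE datapaths described by their functional-unit counts.
ISE_A = {"mul": 2, "add": 3}
ISE_B = {"mul": 1, "add": 4}
AREA = {"mul": 10.0, "add": 2.0}   # arbitrary area units per functional unit

def merged_datapath(theta):
    """Return (area, latency) of the merged datapath at sharing degree theta.
    theta = 0: no sharing (fast, large); theta = 1: maximal sharing
    (small, but every shared unit needs input muxes that add delay)."""
    shared = {op: round(theta * min(ISE_A[op], ISE_B[op])) for op in AREA}
    area = sum(AREA[op] * (ISE_A[op] + ISE_B[op] - shared[op]) for op in AREA)
    latency = 1.0 + 0.2 * sum(shared.values())   # invented mux penalty
    return area, latency

for theta in (0.0, 0.5, 1.0):
    print(theta, merged_datapath(theta))
```

Sweeping the parameter traces out the trade-off curve; the thesis's predictive model addresses exactly the cost of such sweeps by locating the optimal points with far fewer runs of the sharing algorithm.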
26

Parallel discrete event simulation on the SpiNNaker engine

Bai, Chuan January 2013 (has links)
The SpiNNaker engine is a multiprocessor system, designed with a scalable interconnection system to perform real-time neural network simulation. The scalable nature of the SpiNNaker system gives it the potential to provide high computation power, making it suitable for solving certain large scale systems, such as neural networks. In addition, biological neural systems are intrinsically non-deterministic, and a number of design axioms of SpiNNaker make it ideally suited to the simulation of systems with such properties. Interesting though they are, the non-deterministic attributes of SpiNNaker-based simulation are not the focus of this thesis. The high computational power available, coupled with the extremely low inter-chip communication cost, makes SpiNNaker an attractive platform for other application areas in addition to its principal goal. One such problem is parallel discrete event simulation (PDES), which is the focus of this work. Discrete event simulation is a simple yet powerful algorithmic technique. Parallel discrete event simulation, on the other hand, is much more complicated, owing to the increase in complexity arising from the need to keep simulation data synchronized in a distributed environment. This property of PDES makes it a suitable candidate for evaluating generic simulation capability. Based on this insight, this thesis carries out the evaluation of the generic simulation capability of the SpiNNaker platform using a specially built framework, running on a conventional parallel processing cluster, to model the actual SpiNNaker system. In addition, a novel load balancing technique was also introduced and evaluated in this project.
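For readers unfamiliar with the technique, the sequential kernel that PDES distributes can be sketched in a few lines; the class and event names below are illustrative and not taken from the SpiNNaker framework.

```python
import heapq

class Simulator:
    """Minimal sequential discrete event simulation core: a clock and a
    timestamp-ordered event queue."""
    def __init__(self):
        self.clock = 0.0
        self.queue = []     # min-heap of (timestamp, seq, action)
        self.seq = 0        # tie-breaker for events at equal timestamps

    def schedule(self, delay, action):
        heapq.heappush(self.queue, (self.clock + delay, self.seq, action))
        self.seq += 1

    def run(self):
        while self.queue:
            self.clock, _, action = heapq.heappop(self.queue)
            action(self)    # an event handler may schedule further events

log = []
sim = Simulator()
def ping(s):
    log.append(("ping", s.clock))
    if s.clock < 3:
        s.schedule(1.0, ping)   # reschedule one time unit later
sim.schedule(0.5, ping)
sim.run()
print(log)  # events fire at t = 0.5, 1.5, 2.5, 3.5
```

The difficulty PDES addresses is precisely that, once this event queue is partitioned across processors, events must still appear to execute in global timestamp order despite communication delays.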
27

Dynamic scheduling in multicore processors

Rosas Ham, Demian January 2012 (has links)
The advent of multi-core processors, particularly with projections that numbers of cores will continue to increase, has focused attention on parallel programming. It is widely recognized that current programming techniques, including those that are used for scientific parallel programming, will not allow the easy formulation of general purpose applications. An area which is receiving interest is the use of programming styles which do not have side-effects. Previous work on parallel functional programming demonstrated the potential of this to permit the easy exploitation of parallelism. This thesis investigates a dynamic load balancing system for shared memory Chip Multiprocessors. This system is based on a parallel computing model called SLAM (Spreading Load with Active Messages), which makes use of functional language evaluation techniques. A novel hardware/software mechanism for exploiting fine grain parallelism is presented. This mechanism comprises a runtime system which performs dynamic scheduling and synchronization automatically when executing parallel applications. Additionally the interface for using this mechanism is provided in the form of an API. The proposed system is evaluated using cycle-level models and multithreaded applications running in a full system simulation environment.
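A schematic sketch of the load-spreading idea (invented for this summary, not the SLAM runtime itself): each round, an overloaded worker hands a surplus task to an idle peer, playing the role of an active message carrying work rather than a central scheduler.

```python
from collections import deque

class Worker:
    def __init__(self, wid):
        self.wid = wid
        self.tasks = deque()
        self.done = []

def step(workers):
    """One scheduling round: spread surplus tasks to idle peers, then let
    every worker execute one task."""
    busy = [w for w in workers if len(w.tasks) > 1]
    idle = [w for w in workers if not w.tasks]
    for src, dst in zip(busy, idle):            # "message" carrying a task
        dst.tasks.append(src.tasks.pop())
    for w in workers:
        if w.tasks:
            w.done.append(w.tasks.popleft()())  # run one task to completion

workers = [Worker(i) for i in range(4)]
# Start with all the load on a single worker.
workers[0].tasks.extend((lambda i=i: i * i) for i in range(8))
rounds = 0
while any(w.tasks for w in workers):
    step(workers)
    rounds += 1
results = sorted(r for w in workers for r in w.done)
print(rounds, results)  # fewer rounds than the 8 a single worker would need
```

Even this crude one-task-per-round spreading finishes sooner than a single worker would, which is the essence of dynamic load balancing for fine-grained parallelism.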
28

Leakage power minimisation techniques for embedded processors

Mistry, Jatin N. January 2013 (has links)
Leakage power is a growing concern in modern technology nodes. In some current and emerging applications, speed performance is uncritical, but many of these applications rely on untethered power, making energy a primary constraint. Leakage power minimisation is therefore key to maximising energy efficiency for these applications. This thesis proposes two new leakage power minimisation techniques to improve the energy efficiency of embedded processors. The first technique, called sub-clock power gating, can be used to reduce leakage power during the active mode. The technique capitalises on the observation that, in low performance applications, there can be large combinational idle time within the clock period, and it power gates the logic during this idle time. Sub-clock power gating is the first study into the application of power gating within the clock period, and simulation results on post-layout netlists using a 90nm technology library show 3.5x, 2x and 1.3x improvements in energy efficiency for three test cases (a 16-bit multiplier, an ARM Cortex-M0 and an Event Processor) at a given performance point. To reduce the energy cost associated with moving between the sleep and active modes of operation, a second technique called symmetric virtual rail clamping is proposed. Rather than shutting down completely during sleep mode, the proposed technique uses a pair of NMOS and PMOS transistors at the head and foot of the power gated logic to lower the supply voltage by 2Vth. This reduces the energy needed to recharge the supply rails and eliminates the signal glitching energy cost during wake-up. Experimental results from a 65nm test chip show that applying symmetric virtual rail clamping in sub-clock power gating improves energy efficiency, extending its applicable clock frequency range by 400x. The physical layout of power gating requires dedicated techniques, and this thesis proposes dRail, a new physical layout technique for power gating.
Unlike the traditional voltage area approach, dRail allows both power gated and non-power gated cells to be placed together in the physical layout to reduce area and routing overheads. Results from a post layout netlist of an ARM Cortex-M0 with sub-clock power gating shows standard cell area and signal routing are improved by 3% and 19% respectively. Sub-clock power gating, symmetric virtual rail clamping and dRail are incorporated into power gating design flows and are compatible with commercial EDA tools and gate libraries.
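The wake-up saving from symmetric virtual rail clamping can be seen with back-of-envelope arithmetic; the capacitance and threshold voltage below are invented numbers, not measurements from the 65nm test chip, and the linear recharge-energy model is a simplification.

```python
# Energy drawn from the supply to restore the virtual rails on wake-up is
# roughly E = C * Vdd * dV, where dV is how far the rails drooped in sleep.
C = 100e-12   # assumed virtual-rail capacitance: 100 pF
VDD = 1.0     # assumed supply voltage (V)
VTH = 0.3     # assumed transistor threshold voltage (V)

e_full_shutdown = C * VDD * VDD         # rails collapse all the way to 0 V
e_clamped = C * VDD * (2 * VTH)         # rails droop by only 2*Vth in total
savings = 1 - e_clamped / e_full_shutdown
print(f"wake-up recharge energy saved: {savings:.0%}")
```

With these illustrative numbers the clamped scheme saves 40% of the wake-up recharge energy, which is why limiting rail droop to 2Vth lowers the break-even sleep time and widens the frequency range over which sub-clock power gating pays off.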
29

A model to reduce the divide between South African secondary institutional skills and knowledge, and the entrance requirements for an information technology diploma course

Baxter, Roger January 2008 (has links)
M. Tech. (Information and communication technology, Faculty of Applied and computer sciences), Vaal University of Technology / Historically, access to information technology (IT) in South African educational institutions has been socially stratified. As a result, many new learners seeking to enter South African tertiary institutions fail to meet the requirements of their preferred course and institution. In 2003, the Department of Information and Communications Technology at the Vaal University of Technology (VUT), in conjunction with the National Institute for Information Technology (NIIT), an internationally recognised IT organisation, introduced a short course named the Information Technology Boot Camp (ITBC). This course is now known as the Introduction to Information Technology course (Intro-to-IT). The course is targeted at learners who want to study the IT diploma at the VUT but who, as a result of their Matriculation marks, do not meet the VUT's entrance requirements. The aim of the course is to prepare and qualify these learners for possible acceptance into the IT diploma at the VUT. Although the Intro-to-IT course has impacted positively on the VUT, research has found that learners progressing from the Intro-to-IT course into the IT diploma course experience difficulties in solving programming problems in a logical way. As a result, the failure rate in Development Software I, a first-semester programming subject, is relatively high. The model described in this study encompasses alterations (implemented and still to be implemented) to the syllabus and content of the Intro-to-IT course, changes to the learning methods and time frames for subjects, and the measurement of these changes in comparison to previous results. The model also includes a software program, which will assess the Intro-to-IT applicants, store results and provide analytical data on all learners' marks and results for the Intro-to-IT short course at the VUT.
This model is designed to provide the necessary skills, knowledge and basic logic required to allow successful Intro-to-IT learners the opportunity of success when they enter the VUT's IT diploma stream.
30

Development of a library and simulation environment for quantum computations in the Python language

Μαυρίδη, Πετρούλα 20 September 2010 (has links)
This diploma dissertation presents the development of a function library for quantum computation and the implementation of a graphical environment that uses this library. Initially, two areas were studied: first, the basic concepts governing quantum computers, and second, the use and capabilities of the Python language. After gaining an understanding of these two fields, the quantum library was written and the graphical environment that uses it was created. A further graphical interface was also created to present the graphs showing the results of the quantum computation. The program is written in Python.
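A minimal sketch of the kind of functions such a quantum computation library exposes (illustrative only, not the thesis library): state vectors, gate application, and measurement probabilities, shown here preparing a Bell state.

```python
import numpy as np

H = np.array([[1.0, 1.0],
              [1.0, -1.0]]) / np.sqrt(2)     # Hadamard gate
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)  # control on the first qubit

def apply(gate, state):
    """Apply a gate (unitary matrix) to a state vector."""
    return gate @ state

def probabilities(state):
    """Measurement probabilities in the computational basis."""
    return np.abs(state) ** 2

# Prepare a Bell state: H on the first qubit of |00>, then CNOT.
state = np.zeros(4)
state[0] = 1.0                                # |00>
state = apply(np.kron(H, np.eye(2)), state)   # H on qubit 0
state = apply(CNOT, state)
print(probabilities(state))   # ~[0.5, 0, 0, 0.5]
```

A graphical front end such as the one described above would plot exactly these probability vectors as the output of a circuit.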
