11 |
Realization methods for asynchronous sequential circuits. Magó, Gyula Antal. January 1970 (has links)
No description available.
12 |
The evaluation and optimisation of a basic speech recogniser. Moore, Roger K. January 1975 (has links)
No description available.
13 |
Asynchronous techniques for new generation variation-tolerant FPGA. Low, Hock Soon. January 2015 (has links)
This thesis presents a practical scenario for asynchronous logic implementation that would benefit modern Field-Programmable Gate Array (FPGA) technology in improving reliability. A method based on Asynchronously-Assisted Logic (AAL) blocks is proposed here in order to provide the right degree of variation tolerance, preserve as much of the traditional FPGA structure as possible, and make use of asynchrony only when necessary or beneficial for functionality. The newly proposed AAL introduces extra underlying hard-blocks that support asynchronous interaction only when needed and at minimum overhead. This has the potential to avoid the obstacles to the progress of asynchronous designs, particularly in terms of area and power overheads. The proposed approach provides a solution that is complementary to existing variation tolerance techniques such as the late-binding technique, but improves the reliability of the system as well as reducing the design’s margin headroom when implemented on programmable logic devices (PLDs) or FPGAs. The proposed method suggests the deployment of configurable AAL blocks to reinforce only the variation-critical paths (VCPs) with the help of variation maps, rather than re-mapping and re-routing. Layout-level results show that this method’s worst-case increase in the CLB’s overall size is only 6.3%. The proposed strategy retains the structure of the global interconnect resources that occupy the lion’s share of the modern FPGA’s soft fabric, and yet permits the dual-rail completion-detection (DR-CD) protocol without the need to globally double the interconnect resources. Simulation results for global and interconnect voltage variations demonstrate the robustness of the method.
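The dual-rail completion-detection protocol mentioned above can be illustrated with a minimal sketch (not code from the thesis; the function names are illustrative): each data bit travels on a true/false wire pair, and a stage is known to have completed only when every pair carries exactly one asserted rail.

```python
# Minimal illustration of dual-rail completion detection (DR-CD).
# Each logical bit is encoded on two rails (t, f):
#   (0, 0) -> spacer / data not yet valid
#   (1, 0) -> logic 1,  (0, 1) -> logic 0,  (1, 1) -> illegal

def bit_complete(t: int, f: int) -> bool:
    """A dual-rail pair is complete when exactly one rail is asserted."""
    return (t ^ f) == 1

def word_complete(word) -> bool:
    """A word has completed when every dual-rail pair carries valid data."""
    return all(bit_complete(t, f) for t, f in word)

# Example: a 4-bit result arriving rail by rail.
in_flight = [(1, 0), (0, 1), (0, 0), (1, 0)]   # third bit is still a spacer
done      = [(1, 0), (0, 1), (0, 1), (1, 0)]
assert not word_complete(in_flight)
assert word_complete(done)
```

Detecting completion this way removes the dependence on worst-case timing margins, at the cost of the doubled wiring that the thesis avoids duplicating across the global interconnect.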
14 |
Plasma physics computations on emerging hardware architectures. Chorley, Joanne Clare. January 2016 (has links)
This thesis explores the potential of emerging hardware architectures to increase the impact of high performance computing in fusion plasma physics research. For next generation tokamaks like ITER, realistic simulations and data-processing tasks will become significantly more demanding of computational resources than current facilities. It is therefore essential to investigate how emerging hardware such as the graphics processing unit (GPU) and field-programmable gate array (FPGA) can provide the required computing power for large data-processing tasks and large scale simulations in plasma physics specific computations. The use of emerging technology is investigated in three areas relevant to nuclear fusion: (i) a GPU is used to process the large amount of raw data produced by the synthetic aperture microwave imaging (SAMI) plasma diagnostic, (ii) the use of a GPU to accelerate the solution of the Bateman equations which model the evolution of nuclide number densities when subjected to neutron irradiation in tokamaks, and (iii) an FPGA-based dataflow engine is applied to compute massive matrix multiplications, a feature of many computational problems in fusion and more generally in scientific computing. The GPU data processing code for SAMI provides a 60x acceleration over the previous IDL-based code, enabling inter-shot analysis in future campaigns and the data-mining (and therefore analysis) of stored raw data from previous MAST campaigns. The feasibility of porting the whole Bateman solver to a GPU system is demonstrated and verified against the industry standard FISPACT code. Finally a dataflow approach to matrix multiplication is shown to provide a substantial acceleration compared to CPU-based approaches and, whilst not performing as well as a GPU for this particular problem, is shown to be much more energy efficient. Emerging hardware technologies will no doubt continue to provide a positive contribution in terms of performance to many areas of fusion research and several exciting new developments are on the horizon with tighter integration of GPUs and FPGAs with their host central processor units. This should not only improve performance and reduce data transfer bottlenecks, but also allow more user-friendly programming tools to be developed. All of this has implications for ITER and beyond where emerging hardware technologies will no doubt provide the key to delivering the computing power required to handle the large amounts of data and more realistic simulations demanded by these complex systems.
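For context on item (ii), the Bateman equations used in inventory codes such as FISPACT have the general form below (written generically here, not transcribed from the thesis):

```latex
\frac{\mathrm{d}N_i}{\mathrm{d}t}
  = -\bigl(\lambda_i + \sigma_i \phi\bigr)\, N_i
  + \sum_{j \neq i} \bigl(b_{j \to i}\,\lambda_j + \sigma_{j \to i}\,\phi\bigr)\, N_j ,
```

where N_i is the number density of nuclide i, \lambda_i its decay constant, \phi the neutron flux, \sigma_i the total cross-section for reactions removing nuclide i, and b_{j \to i}, \sigma_{j \to i} the branching ratios and cross-sections for channels producing i from j. The large, stiff, coupled system formed by such equations is what makes GPU acceleration attractive and motivates verification against FISPACT.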
15 |
Lightweight physical unclonable functions circuit design and analysis. Gu, Chongyan. January 2016 (has links)
With the increasing emergence of mobile electronic devices over the last two decades, they are now ubiquitous and can be found in our homes, our cars, our workplaces etc., and have the potential to revolutionise how we interact with the world today. This has led to a high demand for cryptographic devices that can provide authentication to protect user privacy and data security; however, conventional cryptographic approaches suffer from a number of shortcomings. Also, today’s mobile devices are low-cost, low-power, embedded devices that are restricted in both memory and computing power. Hence, conventional cryptographic approaches are typically unsuitable as they incur significant timing, energy and area overhead. Physical unclonable functions (PUFs) are a novel security primitive which utilise the inherent variations that occur during manufacturing processing in order to generate a unique intrinsic identifier for a device. This gives them an advantage over current state-of-the-art alternatives. No special manufacturing processes are required to integrate a PUF into a design, lowering the overall cost of the IC, and everything can be kept on-chip, enabling the PUF to be utilised as a hardware root of trust for all security or identity related operations on the device. This enables a multitude of higher-level operations based on secure key storage and chip authentication. However, the design and implementation of PUF digital circuits is challenging, particularly for Field Programmable Gate Array (FPGA) devices. Since the circuits depend upon process variations, even small changes in environmental conditions, such as voltage or temperature, or an unbalanced design that introduces skew, will affect their performance. In this thesis, a number of novel lightweight PUF techniques are proposed and experimentally validated. Furthermore, previously reported PUF techniques are evaluated and compared with the proposed designs in terms of efficiency and a range of performance metrics.
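As a toy illustration of the PUF principle described above (a sketch only, not any of the designs proposed in the thesis), the snippet below models a ring-oscillator-style PUF: per-device manufacturing variation is represented by fixed random frequencies, and each response bit comes from comparing a challenge-selected pair of oscillators, so the same challenge gives a repeatable, device-unique response.

```python
import random

# Toy ring-oscillator-style PUF model (illustrative only).
class ToyROPuf:
    def __init__(self, n_oscillators: int = 64, seed: int = 0):
        rng = random.Random(seed)             # the seed stands in for silicon process variation
        self.freq = [1.0 + rng.gauss(0.0, 0.01) for _ in range(n_oscillators)]

    def response_bit(self, i: int, j: int) -> int:
        """Compare the two oscillators selected by the challenge pair (i, j)."""
        return 1 if self.freq[i] > self.freq[j] else 0

    def response(self, challenge):
        return [self.response_bit(i, j) for i, j in challenge]

challenge = [(0, 1), (2, 3), (4, 5), (6, 7)]
device_a = ToyROPuf(seed=1)                   # different "devices" carry different variation
device_b = ToyROPuf(seed=2)
print(device_a.response(challenge))           # repeatable for device_a
print(device_b.response(challenge))           # generally differs from device_a
```

Real designs must also cope with the environmental sensitivity noted above (voltage, temperature, routing skew), which is what makes lightweight, FPGA-friendly PUF circuits non-trivial to build.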
16 |
Analyzing non-collocated synchronous shared visual workspace-mediated interaction and effects on conversational grounding: a study on collaborative intelligence analysis. Laurence, Sean Xavier. January 2016 (has links)
A shared visual workspace and video, in addition to voice, are the two functionalities or technologies on which this thesis focuses. What is clarified in this work is how these influence remote collaboration and conversational grounding in particular, where grounding refers to the pro-active process of seeking, creating and maintaining the shared meanings needed for conversational partners to communicate effectively. Additionally, this thesis clarifies how to support non-collocated synchronous mediated collaboration around intelligence analytic tasks, away from the traditional tasks involving the identification or manipulation of physical objects which previous studies appear to favour. This research is guided by three primary research questions: —RQ1) How can we expose aspects of conversational grounding in mediated communication involving different combinations of video (showing a remote participant’s head and shoulders, and hands and work area) and a fully shared visual workspace in addition to voice? —RQ2) In relation to the negotiated process of grounding, how can we explain what is happening when parties are collaborating on an intelligence task using a fully shared visual workspace? —RQ3) How can we design better fully shared visual workspace systems to support remote collaborative intelligence analysis tasks? Study 1, reported in Chapter 5, is an exploratory study which also serves as groundwork for Study 2. Its findings led to the formulation of more focused hypotheses later investigated in Study 2. Further, the most significant contribution of Study 1 was the coding schema constructed for analysing the negotiation of common ground. Chapters 6, 7 and 8 make up Study 2. A human-participant experiment was conducted using a 2 x 2 factorial between-subjects design with two-person teams and four media manipulations, namely: video, no video, shared visual workspace and no shared visual workspace. Conversational grounding effort is operationalised as the number of repair episodes per minute (that is, the repair rate). Results indicate that teams using a shared visual workspace have a lower repair rate than teams with no access to a shared visual workspace; this result is statistically significant. Teams using video also had a lower repair rate than teams not using video, but this result was not statistically significant, which is consistent with prior research finding that video showing a person’s face and shoulders is not terribly important in collaborative contexts. Results of another investigation demonstrate that, regardless of the media condition, teams generally have a lower repair rate over time as the task progresses; this trend was statistically significant. Additionally, assessments of a questionnaire item measuring improvements in mutual agreement and shared understanding over time showed a statistically significant difference between the shared visual workspace group and the no shared visual workspace group, as did participants’ ratings of the effectiveness of the medium for information sharing. The qualitative thematic analysis in Chapter 7 helps explain these statistical results and more. A conceptual process model of conversational grounding in shared visual workspace-mediated interaction is presented in Chapter 8; the model also summarises the research findings. The discussion there offers useful implications and guidelines for moving beyond current theories and models of the negotiation of common ground.
Equally, practical recommendations for the design of shared visual workspaces are also discussed there. Chapters 9 and 10 review the research questions and consider how the research presented addresses them, followed by a discussion of the contributions of the thesis, future work and the conclusion. Overall, this thesis delivers the following contributions: —1) It advances existing knowledge silos and studies on media effects on conversational grounding; one of the ways it achieves this is by delivering a conceptual model framework for understanding conversational grounding processes in real-time remote collaborative intelligence analysis. —2) It delivers a new coding schema for the analysis of the negotiation of conversational grounding in remote work. —3) It offers four data-driven recommendations for the good practical design of shared visual workspace groupware that better supports more natural communicative nuances.
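As a small illustration of the grounding-effort measure used above (a sketch with made-up numbers, not the study’s analysis code), repair rate is simply the count of repair episodes divided by task time in minutes, aggregated per team and compared across conditions:

```python
from statistics import mean

# Illustrative only: repair rate = repair episodes per minute of task time.
def repair_rate(repair_episodes: int, task_minutes: float) -> float:
    return repair_episodes / task_minutes

# Hypothetical team-level data for two of the four media conditions.
shared_ws    = [repair_rate(n, t) for n, t in [(12, 30.0), (9, 28.5), (14, 31.0)]]
no_shared_ws = [repair_rate(n, t) for n, t in [(21, 30.0), (18, 29.0), (25, 32.0)]]

print(f"shared workspace:    mean repair rate = {mean(shared_ws):.2f}/min")
print(f"no shared workspace: mean repair rate = {mean(no_shared_ws):.2f}/min")
# The study then tests whether such differences are statistically significant.
```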
17 |
Improving memory access performance for irregular algorithms in heterogeneous CPU/FPGA systems. Bean, Andrew. January 2016 (has links)
Many algorithms and applications in scientific computing exhibit irregular access patterns as consecutive accesses are dependent on the structure of the data being processed and as such cannot be known a priori. This manifests itself as a lack of temporal and spatial locality meaning these applications often perform poorly in traditional processor cache hierarchies. This thesis demonstrates that heterogeneous architectures containing Field Programmable Gate Arrays (FPGAs) alongside traditional processors can improve memory access throughput by 2-3x by using the FPGA to insert data directly into the processor cache, eliminating costly cache misses. When fetching data to be processed directly on the FPGA, scatter-gather Direct Memory Access (DMA) provides the best performance but its storage format is inefficient for these classes of applications. The presented optimised storage and generation of these descriptors on-demand leads to a 16x reduction in on-chip Block RAM usage and a 2/3 reduction in data transfer time. Traditional scatter-gather DMA requires a statically defined list of access instructions and is managed by a host processor. The system presented in this thesis expands the DMA operation to allow data-driven memory requests in response to processed data and brings all control on-chip allowing autonomous operation. This dramatically increases system flexibility and provides a further 11% performance improvement. Graph applications and algorithms for traversing and searching graph data are used throughout this thesis as a motivating example for the optimisations presented, though they should be equally applicable to a wide range of irregular applications within scientific computing.
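To make the descriptor-storage argument concrete, the sketch below contrasts a conventional scatter-gather descriptor with a compacted on-chip form that is expanded on demand. The field layouts are hypothetical, chosen only to illustrate the idea behind the reported Block RAM saving; they are not the formats used in the thesis or by any particular DMA engine.

```python
import struct

# Hypothetical descriptor layouts, for illustration only.
# Conventional scatter-gather descriptor: next pointer, source, destination, length.
FULL_DESC = struct.Struct("<QQQI")      # 28 bytes per transfer
# Compacted on-chip form: 32-bit offset from a base address plus a 16-bit length;
# the full descriptor is generated on demand just before the transfer is issued.
COMPACT_DESC = struct.Struct("<IH")     # 6 bytes per transfer

def expand(base_addr: int, dest_addr: int, compact: bytes) -> bytes:
    """Generate a full descriptor on demand from its compact form."""
    offset, length = COMPACT_DESC.unpack(compact)
    next_ptr = 0                        # filled in by the engine when descriptors are chained
    return FULL_DESC.pack(next_ptr, base_addr + offset, dest_addr, length)

compact = COMPACT_DESC.pack(0x1000, 512)
full = expand(0x8000_0000, 0x9000_0000, compact)
print(f"compact: {COMPACT_DESC.size} bytes, full: {FULL_DESC.size} bytes")
```

Holding only the compact form on-chip and generating full descriptors just in time is the kind of trade-off that allows data-driven, autonomous request generation without a statically defined descriptor list.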
18 |
Real-time motion capture for analysis and presentation within virtual environments. Brownridge, Adam Mark. January 2014 (has links)
This thesis describes motion capture methods with an application to real-time recording of extreme human movement. A wireless gyroscopic-sensor-based system is used to record and evaluate misalignments in the ankle position of ballet dancers in a performance environment. Anatomic alignment has been shown to contribute to dance-related injuries, and the results of this work show that subtle variations in joint rotation can be clearly measured. The workflow has been developed to extract performance analysis data for fault detection in order to assist augmented feedback methods for the prevention of injury and improved performance. Infra-red depth sensing technology, commonly used in garment design, has been used to produce a representation of a scanned human subject, and a workflow has been established to utilise this character avatar for animation using motion capture data. The process of presenting a visually acceptable representation of an overall performance, in addition to the numerical evaluation of specific joint orientation, provides a significant contribution to knowledge.
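The joint-orientation measurement described above can be sketched with generic quaternion arithmetic (not the thesis’s actual processing pipeline): given the orientations of two adjacent body segments as unit quaternions, the angle of their relative rotation quantifies misalignment at the joint between them.

```python
import math

# Generic sketch: angle of the relative rotation between two segment
# orientations, each given as a unit quaternion (w, x, y, z).
def quat_conjugate(q):
    w, x, y, z = q
    return (w, -x, -y, -z)

def quat_multiply(a, b):
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw*bw - ax*bx - ay*by - az*bz,
            aw*bx + ax*bw + ay*bz - az*by,
            aw*by - ax*bz + ay*bw + az*bx,
            aw*bz + ax*by - ay*bx + az*bw)

def relative_angle_deg(q_shank, q_foot):
    """Rotation angle of the foot segment relative to the shank, in degrees."""
    w = quat_multiply(quat_conjugate(q_shank), q_foot)[0]
    return math.degrees(2.0 * math.acos(max(-1.0, min(1.0, abs(w)))))

# Example: foot rotated 10 degrees about one axis relative to the shank.
half = math.radians(10.0) / 2.0
q_shank = (1.0, 0.0, 0.0, 0.0)
q_foot = (math.cos(half), math.sin(half), 0.0, 0.0)
print(round(relative_angle_deg(q_shank, q_foot), 2))   # ~10.0
```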
19 |
The instruction systolic array (ISA) and simulation of parallel algorithms. Muslih, Ossama K. January 1989 (has links)
Systolic arrays have proved to be well suited to Very Large Scale Integration (VLSI) technology since they consist of a regular network of simple processing cells, use only local communication between the processing cells, and exploit a maximal degree of parallelism. However, systolic arrays have one main disadvantage compared with other parallel computer architectures: they are special-purpose architectures only capable of executing one algorithm; for example, a systolic array designed for sorting cannot be used to perform matrix multiplication. Several approaches have been made to make systolic arrays more flexible, in order to be able to handle different problems on a single systolic array. In this thesis an alternative concept to a VLSI architecture, the Soft-Systolic Simulation System (SSSS), is introduced and developed as a working model of a virtual machine with the power to simulate hard systolic arrays and more general forms of concurrency such as the SIMD and MIMD models of computation. The virtual machine includes a processing element consisting of a soft-systolic processor implemented in the virtual machine language. The processing element considered here is a very general element which allows the choice of a wide range of arithmetic and logical operators and the simulation of a wide class of algorithms; in principle, extra processing cells can be added to form a library, and this library can be tailored to individual needs. The virtual machine chosen for this implementation is the Instruction Systolic Array (ISA). The ISA has a number of interesting features: firstly, it has been used to simulate all SIMD algorithms and many MIMD algorithms by a simple program transformation technique; further, the ISA can also simulate the so-called wavefront processor algorithms, as well as many hard systolic algorithms. The ISA removes the need for the broadcasting of data which is a feature of SIMD algorithms (limiting the size of the machine and its cycle time) and also presents a fairly simple communication structure for MIMD algorithms. The model of systolic computation developed from the VLSI approach to systolic arrays is such that the processing surface is fixed, as are the processing elements or cells, by virtue of their being embedded in the processing surface. The VLSI approach therefore freezes instructions and hardware relative to the movement of data, while the virtual machine and soft-systolic programming retain the VLSI array-design features such as regularity, simplicity and local communication, but allow the movement of instructions with respect to data. Data can be frozen into the structure with instructions moving systolically; alternatively, both the data and the instructions can move systolically around the virtual processors (which are deemed fixed relative to the underlying architecture). The ISA is implemented in OCCAM programs whose execution and output implicitly confirm the correctness of the design. The soft-systolic preparation comprises the usual operating system facilities for the creation and modification of files during the development of new programs and ISA processor elements. We allow any concurrent high-level language to be used to model the soft-systolic program. Consequently the Replicating Instruction Systolic Array Language (RISAL) was devised to provide a very primitive program environment for the ISA, but one adequate for testing. RISAL accepts instructions in an assembler-like form, but is fairly permissive about the format of statements, subject of course to syntax. The RISAL compiler is adapted to transform the soft-systolic program description (RISAL) into a form suitable for the virtual machine (simulating the algorithm) to run. Finally we conclude that the principles mentioned here can form the basis for a soft-systolic simulator using an orthogonally connected mesh of processors. The wide range of algorithms which the ISA can simulate makes it suitable for a virtual simulating grid.
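The core ISA execution model referred to above can be sketched in a few lines (an illustration of the general instruction systolic array idea, not the SSSS or RISAL implementation): instruction rows enter at the top of the mesh and move down one row per step, selector columns enter at the left and move right one column per step, and a cell executes its current instruction only when its current selector bit is set.

```python
# Minimal sketch of the Instruction Systolic Array (ISA) execution model.
def run_isa(instr_rows, sel_cols, n):
    cells = [[0] * n for _ in range(n)]        # one accumulator per cell
    for t in range(len(instr_rows) + len(sel_cols) + 2 * n):
        for i in range(n):
            for j in range(n):
                ki, kj = t - i, t - j          # instruction row / selector column in cell (i, j) at time t
                if 0 <= ki < len(instr_rows) and 0 <= kj < len(sel_cols):
                    if sel_cols[kj][i] == 1 and instr_rows[ki][j] == "inc":
                        cells[i][j] += 1
    return cells

# A single 'inc' instruction row and a single all-ones selector column meet
# only along the diagonal, so only the diagonal cells execute:
n = 3
print(run_isa([["inc"] * n], [[1] * n], n))    # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

Covering the whole array requires feeding suitably replicated and skewed instruction and selector streams, which is the kind of description a soft-systolic program has to provide.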
20 |
Maximising microprocessor reliability through game theory and heuristics. Docherty, James. January 2014 (has links)
Embedded systems are becoming ever more pervasive in our society, with most routine daily tasks now involving their use in some form and the market predicted to be worth USD 220 billion, a rise of 300%, by 2018. Consumers expect more functionality with each design iteration, but for no detriment in perceived performance. These devices can range from simple low-cost chips to expensive and complex systems and are a major cost driver in the equipment design phase. For more than 35 years, designers have kept pace with Moore's Law, but as device size approaches the atomic limit, layouts are becoming so complicated that current scheduling techniques are also reaching their limit, meaning that more resource must be reserved to manage and deliver reliable operation. With the advent of many-core systems and further sources of unpredictability such as changeable power supplies and energy harvesting, this reservation of capability may become so large that systems will not be operating at their peak efficiency. These complex systems can be controlled through many techniques, with jobs scheduled either offline prior to execution beginning or online at each time or event change. Increased processing power and a growing variety of job types mean that current online scheduling methods that employ exhaustive search techniques will not be suitable to define schedules for such enigmatic task lists, and that new techniques using statistic-based methods must be investigated to preserve Quality of Service. A new paradigm of scheduling through complex heuristics is one way to administer these next levels of processor effectively and allow the use of simpler devices in complex systems, thus reducing unit cost while retaining reliability, a key goal identified by the International Technology Roadmap for Semiconductors for Embedded Systems in Critical Environments. These changes would be beneficial in terms of cost reduction and system flexibility within the next generation of devices. This thesis investigates the use of heuristics and statistical methods in the operation of real-time systems, with the feasibility of Game Theory and Statistical Process Control for the successful supervision of high-load and critical jobs investigated. Heuristics are identified as an effective method of controlling complex real-time issues, with two-person non-cooperative games delivering Nash-optimal solutions where these exist. The simplified algorithms for creating and solving Game Theory events allow for its use within small embedded RISC devices and an increase in reliability for systems operating at the apex of their limits. Within this thesis, heuristic and game-theoretic algorithms for a variety of real-time scenarios are postulated, investigated, refined and tested against existing schedule types, initially through MATLAB simulation and subsequently on an ARM Cortex-M3 architecture functioning as a simplified automotive Electronic Control Unit.
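As a minimal illustration of the two-person non-cooperative game machinery mentioned above (a generic sketch with made-up payoffs, not the thesis's scheduling algorithm), the snippet below enumerates pure-strategy Nash equilibria of a small bimatrix game by checking that neither player can gain by deviating unilaterally; in a scheduling context the strategies could, for example, represent which processor accepts a heavy job.

```python
# Generic sketch: pure-strategy Nash equilibria of a two-player bimatrix game.
def pure_nash_equilibria(payoff_a, payoff_b):
    """Return all (row, col) profiles where neither player gains by deviating alone."""
    rows, cols = len(payoff_a), len(payoff_a[0])
    equilibria = []
    for r in range(rows):
        for c in range(cols):
            best_row = all(payoff_a[r][c] >= payoff_a[r2][c] for r2 in range(rows))
            best_col = all(payoff_b[r][c] >= payoff_b[r][c2] for c2 in range(cols))
            if best_row and best_col:
                equilibria.append((r, c))
    return equilibria

# Hypothetical payoffs: two processors deciding whether to accept a heavy task
# (strategy 0 = accept, 1 = decline), trading throughput against overload risk.
payoff_p1 = [[2, 4],
             [5, 1]]
payoff_p2 = [[2, 5],
             [4, 1]]
print(pure_nash_equilibria(payoff_p1, payoff_p2))   # [(0, 1), (1, 0)]
```

The check above is exhaustive but cheap for small strategy sets, which is in keeping with the abstract's point that simplified game solutions can run on small embedded RISC devices.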