431 |
Multi-core design and resource allocation: from big core to ultra-tiny core
Kwok, Tai-on, Tyrone., 郭泰安. January 2008
published_or_final_version / Electrical and Electronic Engineering / Doctoral / Doctor of Philosophy
|
432 |
Design and performance evaluation of parallel architectures for image segmentation processing
Surma, David Ray, 1963- January 1989
The design of parallel architectures to perform image segmentation processing is given. In addition, the various designs are evaluated as to their performance, and a discussion of an optimal design is given. In this thesis, a set of eight segmentation algorithms has been provided as a starting point. Four of these algorithms will be evaluated and partitioned using two techniques. From this study of partitioning and considering the data flow through the total system, architectures utilizing parallel techniques will be derived. Timing analysis using pen-and-paper techniques will be given on the architectures using three current technologies. Next, NETWORK II.5 simulations will be run to provide performance measures. Finally, the various architectures will be evaluated, as will the applicability of NETWORK II.5 as a simulation language.
|
433 |
Optimising a fluid plasma turbulence simulation on modern high performance computers
Edwards, Thomas David January 2010
Nuclear fusion offers the potential of almost limitless energy from sea water and lithium without the dangers of carbon emissions or long-term radioactive waste. At the forefront of fusion technology are the tokamaks, toroidal magnetic confinement devices that contain miniature stars on Earth. Nuclei can only fuse by overcoming the strong electrostatic forces between them, which requires high temperatures and pressures. The temperatures in a tokamak are so great that the Deuterium-Tritium fusion fuel forms a plasma which must be kept hot and under pressure to maintain the fusion reaction. Turbulence in the plasma causes disruption by transporting mass and energy away from this core, reducing the efficiency of the reaction. Understanding and controlling the mechanisms of plasma turbulence is key to building a fusion reactor capable of producing sustained output. The extreme temperatures make detailed empirical observations difficult to acquire, so numerical simulations are used as an additional method of investigation. One numerical model used to study turbulence and diffusion is CENTORI, a direct two-fluid magneto-hydrodynamic simulation of a tokamak plasma developed by the Culham Centre for Fusion Energy (CCFE, formerly UKAEA:Fusion). It simulates the entire tokamak plasma with realistic geometry, evolving bulk plasma quantities like pressure, density and temperature through millions of timesteps. This requires CENTORI to run in parallel on a Massively Parallel Processing (MPP) supercomputer to produce results in an acceptable time. Any improvement in CENTORI’s performance increases the rate and/or total number of results that can be obtained from access to supercomputer resources. This thesis presents the substantial effort to optimise CENTORI on the current generation of academic supercomputers.
It investigates and reviews the properties of contemporary computer architectures then proposes, implements and executes a benchmark suite of CENTORI’s fundamental kernels. The suite is used to compare the performance of three competing memory layouts of the primary vector data structure using a selection of compilers on a variety of computer architectures. The results show that no memory layout is optimal on all platforms, so a flexible optimisation strategy was adopted to pursue “portable” optimisation, i.e. optimisations that can easily be added, adapted or removed from future platforms depending on their performance. This required designing an interface to functions and datatypes that separate CENTORI’s fundamental algorithms from repetitive, low-level implementation details. This approach offered multiple benefits including: the clearer representation of CENTORI’s core equations as mathematical expressions in Fortran source code allows rapid prototyping and development of new features; the reduction in the total data volume by a factor of three reduces the amount of data transferred over the memory bus to almost a third; and the reduction in the number of intense floating point kernels reduces the effort of optimising the application on new platforms. The project proceeds to rewrite CENTORI using the new Application Programming Interface (API) and evaluates two optimised implementations. The first is a traditional library implementation that uses hand-optimised subroutines to implement the library functions. The second uses a dynamic optimisation engine to perform automatic stripmining to improve the performance of the memory hierarchy. The automatic stripmining implementation uses lazy evaluation to delay calculations until absolutely necessary, allowing it to identify temporary data structures and minimise them for optimal cache use.
This novel technique is combined with highly optimised implementations of the kernel operations and optimised parallel communication routines to produce a significant improvement in CENTORI’s performance. The maximum measured speed up of the optimised versions over the original code was 3.4 times on 128 processors on HPCx, 2.8 times on 1024 processors on HECToR and 2.3 times on 256 processors on HPC-FF.
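As a rough illustration of the stripmining idea mentioned in this abstract, the sketch below evaluates the same expression twice: once with a full-length temporary, and once strip by strip so that the temporary never exceeds a fixed strip length. It is a minimal Python sketch under stated assumptions, not CENTORI's Fortran code or its optimisation engine; the function names, the expression a*b + c and the strip length are all hypothetical, and the lazy-evaluation machinery is not modelled.

```python
# Hypothetical illustration only: names and sizes are not taken from CENTORI.

def fused_full_temporary(a, b, c):
    """Compute a*b + c using a temporary as long as the whole vector."""
    tmp = [x * y for x, y in zip(a, b)]          # full-length temporary
    return [t + z for t, z in zip(tmp, c)]


def fused_stripmined(a, b, c, strip=4):
    """Compute the same result strip by strip, so the temporary never
    holds more than `strip` elements at a time."""
    n = len(a)
    out = [0.0] * n
    for start in range(0, n, strip):
        stop = min(start + strip, n)
        tmp = [a[i] * b[i] for i in range(start, stop)]   # strip-sized temporary
        for offset, t in enumerate(tmp):
            out[start + offset] = t + c[start + offset]
    return out
```

Both functions perform the same arithmetic in the same order per element, so their results match; the difference is only in how much temporary storage is live at once, which is what matters for cache use.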
|
434 |
Guidance and navigation software architecture design for the Autonomous Multi-Agent Physically Interacting Spacecraft (AMPHIS) test bed
Eikenberry, Blake D. 12 1900
The Autonomous Multi-Agent Physically Interacting Spacecraft (AMPHIS) test bed examines the problem of multiple spacecraft interacting at close proximity. This thesis contributes to this on-going research by addressing the development of the software architecture for the AMPHIS spacecraft simulator robots and the implementation of a Light Detection and Ranging (LIDAR) unit to be used for state estimation and navigation of the prototype robot. The software modules developed include: user input for simple user tasking; user output for data analysis and animation; external data links for sensors and actuators; and guidance, navigation and control (GNC). The software was developed in the SIMULINK/MATLAB environment as a consistent library to serve as a stand-alone simulator, as actual hardware control on the robot prototype, or as any combination of the two. In particular, the software enables hardware-in-the-loop testing to be conducted for any portion of the system with reliable simulation of all other portions of the system. The modularity of this solution facilitates fast proof-of-concept validation for the GNC algorithms. Two sample guidance and control algorithms were developed and are demonstrated here: a Direct Calculus of Variation method, and an artificial potential function guidance method. State estimation methods are discussed, including state estimation from hardware sensors, pose estimation strategies from various vision sensors, and the implementation of a LIDAR unit for state estimation. Finally, the relative motion of the AMPHIS test bed is compared to the relative motion on orbit, including how to simulate the on-orbit behavior using Hill's equations.
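For reference, the abstract does not state which form of Hill's equations it uses. The commonly cited linearized form (also known as the Clohessy-Wiltshire equations) for unforced relative motion about a target in a circular orbit is given below. The axis convention is an assumption here: x radial, y along-track, z cross-track, with n the mean motion of the target orbit.

```latex
\begin{aligned}
\ddot{x} - 2n\,\dot{y} - 3n^{2}x &= 0,\\
\ddot{y} + 2n\,\dot{x} &= 0,\\
\ddot{z} + n^{2}z &= 0.
\end{aligned}
```

The in-plane motion (x, y) is coupled through the terms proportional to n, while the cross-track motion (z) is an independent harmonic oscillation at the orbital rate.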
|
435 |
An implementation of remote application support in a multilevel environment
Egan, Melissa K. 03 1900
There is a growing need for high-assurance architectures that support mandatory confidentiality and integrity policies. One such architecture currently under development is the Monterey Security Architecture (MYSEA), a distributed multilevel secure (MLS) computing environment that integrates untrusted commercial off-the-shelf components with specialized high-assurance elements. To ensure that information is purged from untrusted client PCs between sessions at different security levels, MYSEA clients are diskless. Therefore, it is desirable for thin MYSEA clients to be able to remotely execute server-resident applications, which may in turn request access to data residing elsewhere on the MLS Local Area Network (LAN). This functionality must be implemented in such a way that the access control policies of the multilevel environment are maintained. Working from a detailed design for remote application support, this thesis involved the implementation and testing of the remote application support functionality. Beyond the implementation of remote application support itself, this thesis involved the porting of a Trivial File Transfer Protocol (TFTP) client and the development of a simple web client as proof-of-concept remote applications, as well as the creation of a Common Gateway Interface (CGI) mechanism for invoking those remote applications from a client web browser. This research is relevant to the DoD Global Information Grid's vision of assured information sharing.
|
436 |
A New N-way Reconfigurable Data Cache Architecture for Embedded Systems
Bani, Ruchi Rastogi 12 1900
Performance and power consumption are among the most important issues when designing embedded systems. Several studies have shown that cache memory consumes about 50% of the total power in these systems. Thus, the architecture of the cache governs both performance and power usage of embedded systems. A new N-way reconfigurable data cache is proposed especially for embedded systems. This thesis explores the issues and design considerations involved in designing a reconfigurable cache. The proposed reconfigurable data cache architecture can be configured as direct-mapped, two-way, or four-way set associative using a mode selector. The module has been designed and simulated in Xilinx ISE 9.1i and ModelSim SE 6.3e using the Verilog hardware description language.
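To make the effect of the mode selector concrete, the sketch below shows how an address could be split into tag, set index and byte offset as associativity changes with total capacity held fixed. It is a minimal Python model, not the thesis's Verilog design; the 1 KiB capacity, the 16-byte line and the function name are hypothetical values chosen for illustration.

```python
def split_address(addr, cache_bytes=1024, line_bytes=16, ways=1):
    """Split an address into (tag, set_index, byte_offset) for a cache of
    fixed capacity configured as 1-, 2- or 4-way set associative.
    Sizes are assumed to be powers of two (hypothetical defaults)."""
    if ways not in (1, 2, 4):
        raise ValueError("mode must select 1, 2 or 4 ways")
    num_sets = cache_bytes // (line_bytes * ways)   # fewer sets as ways grow
    byte_offset = addr % line_bytes
    set_index = (addr // line_bytes) % num_sets
    tag = addr // (line_bytes * num_sets)
    return tag, set_index, byte_offset


for ways in (1, 2, 4):
    print(ways, split_address(0x1234, ways=ways))
```

With these assumed sizes, doubling the number of ways halves the number of sets, so one bit moves from the index field to the tag field each time: address 0x1234 maps to set 35 when direct-mapped and to set 3 in both the two-way and four-way modes.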
|
437 |
Integrity Verification of Applications on RADIUM Architecture
Tarigopula, Mohan Krishna 08 1900
Trusted Computing capability has become ubiquitous these days, and it is being widely deployed into consumer devices as well as enterprise platforms. As the number of threats is increasing at an exponential rate, it is becoming a daunting task to secure the systems against them. In this context, the software integrity measurement at runtime with the support of trusted platforms can be a better security strategy. Trusted Computing devices like TPM secure the evidence of a breach or an attack. These devices remain tamper proof if the hardware platform is physically secured. This type of trusted security is crucial for forensic analysis in the aftermath of a breach. The advantages of trusted platforms can be further leveraged if they can be used wisely. RADIUM (Race-free on-demand Integrity Measurement Architecture) is one such architecture, which is built on the strength of TPM. RADIUM provides an asynchronous root of trust to overcome the TOC condition of DRTM. Even though the underlying architecture is trusted, attacks can still compromise applications during runtime by exploiting their vulnerabilities. I propose an application-level integrity measurement solution that fits into RADIUM, to expand the trusted computing capability to the application layer. This is based on the concept of program invariants that can be used to learn the correct behavior of an application. I used Daikon, a tool to obtain dynamic likely invariants, and developed a method of observing these properties at runtime to verify the integrity. The integrity measurement component was implemented as a Python module on top of Volatility, a virtual machine introspection tool. My approach is a first step towards integrity attestation, using hypervisor-based introspection on RADIUM and a proof of concept of application-level measurement capability.
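The idea of learning likely invariants from known-good runs and then checking them at runtime can be sketched as follows. This is a simplified Python illustration under stated assumptions: it handles only numeric range invariants, the trace format and function names are hypothetical, and it uses neither Daikon's actual output nor Volatility's API.

```python
def learn_range_invariants(traces):
    """Infer per-variable (min, max) bounds from observed runs.
    `traces` is a list of dicts mapping variable name -> numeric value."""
    bounds = {}
    for sample in traces:
        for name, value in sample.items():
            lo, hi = bounds.get(name, (value, value))
            bounds[name] = (min(lo, value), max(hi, value))
    return bounds


def violated_invariants(bounds, observed):
    """Return the names of variables whose observed value falls outside
    the learned bounds; an empty list means no violation was seen."""
    return sorted(
        name
        for name, value in observed.items()
        if name in bounds and not (bounds[name][0] <= value <= bounds[name][1])
    )
```

In such a scheme, the bounds would be learned offline from trusted executions, and a non-empty result from the check at measurement time would be treated as evidence that the application's state has deviated from its learned behavior.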
|
438 |
Framework for requirements-driven system design automation
Unknown Date
In this thesis, a framework for improving model-driven system design productivity with Requirements-Driven Design Automation (RDDA) is presented. The key to the proposed approach is to close the semantic gap between requirements, components and architecture by using compatible semantic models for describing product requirements and component capabilities, including constraints. An ontology-based representation language is designed that spans requirements for the application domain, the software design domain and the component domain. Design automation is supported for architecture development by machine-based mapping of desired product/subsystem features and capabilities to library components and by synthesis and maintenance of Systems Modeling Language (SysML) design structure diagrams. The RDDA framework uses standards-based semantic web technologies and can be integrated with existing modeling tools. Requirements specification is a major component of the system development cycle. Mistakes and omissions in requirements documents lead to ambiguous or wrong interpretation by engineers, causing errors that trickle down in design and implementation with consequences on the overall development cost. We describe a methodology for requirements specification that aims to alleviate the above issues and that produces models for functional requirements that can be automatically validated for completeness and consistency. The RDDA framework uses an ontology-based language for semantic description of functional product requirements, SysML structure diagrams, component constraints, and Quality of Service. The front-end method for requirements specification is the SysML editor in Rhapsody. A requirements model in Web Ontology Language (OWL) is converted from SysML to Extensible Markup Language Metadata Interchange (XMI) representation. / The specification is validated for completeness and consistency with a rule-based system implemented in Prolog.
With our methodology, omissions and several types of consistency errors present in the requirements specification are detected early on, before the design stage. Component selection and design automation have the potential to play a major role in reducing the system development time and cost caused by the rapid change in technology advances and the large solution search space. In our work, we start from a structured representation of requirements and components using SysML, and based on a specific set of rules written in Prolog, we partially automate the process of architecture design. / by Mihai Fonoage. / Thesis (Ph.D.)--Florida Atlantic University, 2010. / Includes bibliography. / Electronic reproduction. Boca Raton, Fla., 2010. Mode of access: World Wide Web.
|
439 |
An efficient and scalable core allocation strategy for multicore systems
Unknown Date
Multiple threads can run concurrently on multiple cores in a multicore system and improve the performance/power ratio. However, effective core allocation in multicore and manycore systems is very challenging. In this thesis, we propose an effective and scalable core allocation strategy for multicore systems to achieve optimal core utilization by reducing both internal and external fragmentation. Our proposed strategy helps spread the servicing cores evenly across the chip to facilitate better heat dissipation. We introduce a multi-stage power management scheme to reduce the total power consumption by managing the power states of the cores. We simulate three multicore systems, with 16, 32, and 64 cores, respectively, using a synthetic workload. Experimental results show that our proposed strategy performs better than Square-shaped, Rectangle-shaped, L-Shaped, and Hybrid (contiguous and non-contiguous) schemes in multicore systems in terms of fragmentation and completion time. Our strategy also provides better heat dissipation than these schemes. / by Manira S. Rani. / Thesis (M.S.C.S.)--Florida Atlantic University, 2011. / Includes bibliography. / Electronic reproduction. Boca Raton, Fla., 2011. Mode of access: World Wide Web.
|
440 |
An integrated component selection framework for system level design
Unknown Date
Increasing system design complexity is reducing overall system design productivity by raising the cost and time of product development. One key to overcoming these challenges is exploiting Component Based Engineering practices. However, it is a challenge to select an optimum component from a component library that will satisfy all system functional and non-functional requirements, due to varying performance parameters and quality of service requirements. In this thesis we propose an integrated framework for component selection. The framework is a two-phase approach that includes a system modeling and analysis phase and a component selection phase. Three component selection algorithms have been implemented for selecting components for a Network on Chip architecture. Two algorithms are based on a standard greedy method, with one being enhanced to produce more intelligent behavior. The third algorithm is based on simulated annealing. Further, a prototype was developed to evaluate the proposed framework and compare the performance of all the algorithms. / by Chad Calvert. / Thesis (M.S.C.S.)--Florida Atlantic University, 2009. / Includes bibliography. / Electronic reproduction. Boca Raton, Fla., 2009. Mode of access: World Wide Web.
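The contrast between a greedy method and simulated annealing for component selection can be sketched on a toy problem. The Python code below is a minimal sketch under stated assumptions, not the thesis's algorithms: the component library layout (a list of `(cost, latency)` candidates per slot), the shared latency budget, the penalty weight and the linear cooling schedule are all hypothetical.

```python
import math
import random


def objective(choice, library, latency_budget, penalty=10.0):
    """Total cost plus a penalty for exceeding the shared latency budget."""
    cost = sum(library[s][i][0] for s, i in enumerate(choice))
    latency = sum(library[s][i][1] for s, i in enumerate(choice))
    return cost + penalty * max(0.0, latency - latency_budget)


def greedy_select(library):
    """Per slot, pick the cheapest candidate, ignoring the shared budget."""
    return [min(range(len(c)), key=lambda i: c[i][0]) for c in library]


def anneal_select(library, latency_budget, steps=2000, t0=5.0, seed=0):
    """Simulated annealing started from the greedy choice; returns the
    best selection seen and its objective value."""
    rng = random.Random(seed)
    current = greedy_select(library)
    cur_val = objective(current, library, latency_budget)
    best, best_val = list(current), cur_val
    for step in range(steps):
        temp = t0 * (1.0 - step / steps) + 1e-9   # linear cooling
        slot = rng.randrange(len(library))
        trial = list(current)
        trial[slot] = rng.randrange(len(library[slot]))
        val = objective(trial, library, latency_budget)
        # Always accept improvements; accept worse moves with
        # probability exp(-(increase) / temperature).
        if val <= cur_val or rng.random() < math.exp((cur_val - val) / temp):
            current, cur_val = trial, val
            if cur_val < best_val:
                best, best_val = list(current), cur_val
    return best, best_val
```

The greedy choice optimises each slot in isolation, so it can violate a constraint that couples the slots, such as the latency budget assumed here. Because the annealer starts from the greedy choice and keeps the best selection seen, its result is never worse than greedy under this objective.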
|