1

Dataflow-processing element for a cognitive sensor platform

McDermott, Mark William, active 2014. 26 June 2014.
Cognitive sensor platforms are the next step in the evolution of intelligent sensor platforms. These platforms can reason about both their external environment and internal conditions and modify their processing behavior and configuration in a continuing effort to optimize their operational life and functional utility. Cognitive capabilities are necessary for unattended sensor systems because it is generally not feasible to routinely replace the battery or the sensor(s). This platform provides a chassis that can be used to build embedded sensor systems from composable elements. The composable elements adhere to a synchronous dataflow (SDF) protocol and communicate with one another over channels. The SDF protocol makes it easy to compose heterogeneous systems of multiple processing elements, sensor elements, debug elements and communications elements. The processing engine for this platform is a Dataflow-Processing Element (DPE) that receives, processes and dispatches SDF data tokens. The DPE is specifically designed to support the processing of SDF tokens using microcoded actors, where programs are assembled by instantiating actors in a graphical modeling tool and verifying that the SDF protocol is adhered to.
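The SDF execution model the abstract relies on is simple enough to show concretely. The following Python sketch is illustrative only; the actor, rates, and channel names are assumptions for demonstration, not the DPE's microcoded implementation. It shows the SDF firing rule: an actor fires only when each of its input channels holds its fixed per-firing token count, then consumes and produces fixed numbers of tokens.

```python
from collections import deque

class Channel:
    """FIFO channel carrying SDF tokens between composable elements."""
    def __init__(self):
        self.fifo = deque()
    def put(self, token):
        self.fifo.append(token)
    def get(self):
        return self.fifo.popleft()
    def __len__(self):
        return len(self.fifo)

class Actor:
    """An SDF actor with fixed per-firing token rates (illustrative,
    not the DPE's microcoded actor format)."""
    def __init__(self, name, inputs, outputs, in_rate, out_rate, fn):
        self.name = name
        self.inputs, self.outputs = inputs, outputs
        self.in_rate, self.out_rate = in_rate, out_rate
        self.fn = fn
    def can_fire(self):
        # SDF firing rule: every input channel holds at least in_rate tokens.
        return all(len(ch) >= self.in_rate for ch in self.inputs)
    def fire(self):
        args = [[ch.get() for _ in range(self.in_rate)] for ch in self.inputs]
        results = self.fn(*args)
        for ch in self.outputs:
            for tok in results[:self.out_rate]:
                ch.put(tok)

# Example: a smoothing actor that consumes two sensor samples per firing
# and produces one averaged token.
src_to_avg, avg_to_sink = Channel(), Channel()
for sample in (4.0, 8.0, 6.0, 2.0):        # pretend sensor readings
    src_to_avg.put(sample)

avg = Actor("avg2", [src_to_avg], [avg_to_sink],
            in_rate=2, out_rate=1, fn=lambda xs: [sum(xs) / len(xs)])

while avg.can_fire():
    avg.fire()
print(list(avg_to_sink.fifo))              # -> [6.0, 4.0]
```

Because consumption and production rates are fixed and known statically, a modeling tool can check buffer bounds and schedule feasibility before the program ever runs, which is part of what makes SDF attractive for composing heterogeneous elements.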
2

Efficient execution of sequential applications on multicore systems

Robatmili, Behnam. 19 September 2011.
Conventional CMOS scaling has been the engine of the technology revolution in most application domains. This trend has changed: in each technology generation, transistor densities continue to increase, while limits on threshold-voltage scaling mean that per-transistor energy consumption decreases much more slowly than in the past. These power-scaling issues restrict the ability of designs to operate across different power and performance regimes. Consequently, future systems must employ architectures that optimize every thread in the program across different power and performance regimes, rather than architectures that simply utilize more transistors. One solution is composable or dynamic multicore architectures, which can span a wide range of energy/performance operating points by enabling multiple simple cores to compose into a larger and more powerful core. Explicit Data Graph Execution (EDGE) architectures represent a highly scalable class of composable processors that exploit predicated dataflow block execution and distributed microarchitectures. However, prior EDGE architectures suffer from several energy and performance bottlenecks, including expensive intra-block operand communication due to fine-grain instruction distribution among cores, compiler-generated fanout trees built for high-fanout operand delivery, poor next-block prediction accuracy, and low speculation rates due to predicates and expensive refills after pipeline flushes.

To design an energy-efficient and flexible dynamic multicore, this dissertation employs a systematic methodology that detects inefficiencies and then designs and evaluates solutions that maximize power and performance efficiency across different power and performance regimes. The innovations and optimization techniques include:
(a) Deep Block Mapping extracts more coarse-grained parallelism and reduces cross-core operand network traffic by mapping each block of instructions into the instruction queue of one core instead of distributing blocks across all composed cores, as done in previous EDGE designs.
(b) Iterative Path Predictor (IPP) reduces branch and predication overheads by unifying multi-exit block target prediction and predicate path prediction while improving the accuracy of each.
(c) Register Bypassing reduces cross-core register communication delays by forwarding register values predicted to be critical directly from producing to consuming cores.
(d) Block Reissue reduces pipeline flush penalties by reissuing instructions from previously executed instances of blocks while they are still in the instruction queue.
(e) Exposed Operand Broadcasts (EOBs) reduce wide-fanout instruction overheads by extending the ISA with architecturally exposed low-overhead broadcasts, combined with dataflow, for efficient operand delivery to both high- and low-fanout consumers.

These components form the basis for a third-generation EDGE microarchitecture called T3. T3 improves energy efficiency by about 2x and performance by 47% compared to previous EDGE architectures. T3 also operates in a highly power-efficient manner across a wide spectrum of energy and performance operating points (low-power to high-performance), extending the domain of power/performance trade-offs beyond what dynamic voltage and frequency scaling offers on state-of-the-art conventional processors. This high level of flexibility and power efficiency makes T3 an attractive candidate for future systems that must operate on a wide range of workloads under varying power and performance constraints.
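Of the techniques listed, Deep Block Mapping lends itself to a small illustration. The toy Python model below is not the T3 microarchitecture; the instruction graph, placement functions, and counts are invented for illustration. It contrasts fine-grain distribution of a block's instructions across composed cores with mapping the whole block into one core's instruction queue, and counts how many producer-to-consumer operand hand-offs must cross cores in each case.

```python
def cross_core_traffic(edges, placement):
    """Count producer->consumer operand hand-offs that cross core boundaries.
    edges: (producer, consumer) instruction pairs; placement: instr -> core."""
    return sum(1 for p, c in edges if placement[p] != placement[c])

def fine_grain_placement(num_instructions, num_cores):
    # Prior EDGE designs: spread a block's instructions round-robin over all cores.
    return {i: i % num_cores for i in range(num_instructions)}

def deep_block_placement(num_instructions, num_cores, block_id):
    # Deep Block Mapping: the whole block lands in one core's instruction queue.
    return {i: block_id % num_cores for i in range(num_instructions)}

# A small chain of dependent instructions within one block (hypothetical).
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
n, cores = 6, 4
print(cross_core_traffic(edges, fine_grain_placement(n, cores)))      # 5 hops
print(cross_core_traffic(edges, deep_block_placement(n, cores, 0)))   # 0 hops
```

With whole-block mapping, intra-block operands stay local and only inter-block values need the cross-core network, which is the traffic reduction the abstract describes.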
3

A Learning Automata Approach for Input-rate Control in Composable Conveyor Systems

Cheerala, Chandana. 13 May 2011.
No description available.
4

Scalability and Composability Techniques for Network Simulation

Xu, Donghua. 13 January 2006.
Simulation has become an important way to observe and understand various networking phenomena under various conditions. As the demand to simulate larger and more complex networks increases, the limited computing capacity of a single workstation and the limited simulation capability of a single network simulator have become apparent obstacles to simulationists. In this research we develop techniques that can scale a simulation to address the limited capacity of a single workstation, as well as techniques that can compose a simulation from different simulator components to address the limited capability of a single network simulator.

We scale a simulation with two different approaches. 1) We reduce the resource requirement of a simulation substantially, so that larger simulations can fit into a single workstation. In this thesis, we develop three techniques (Negative Forwarding Table, Multicast Routing Object Aggregation and NIx-Vector Unicast Routing) to aggregate and compress the large amount of superfluous or redundant routing state in large multicast simulations. 2) The other approach is to partition a simulation model in a way that makes the best use of the resources of the available computer cluster, and to distribute the simulation onto the different processors of the cluster to obtain the best parallel simulation performance. We develop a novel empirical methodology called BencHMAP (Benchmark-Based Hardware and Model Aware Partitioning) that runs small sets of benchmark simulations to derive the right formulas for calculating the weights used to partition the simulation on a given computer cluster.

To address the problem of the limited capability of a network simulator, we develop techniques for building complex network simulations by composing them from independent components. With different existing simulators good at different protocol layers and scenarios, we can make each simulator execute the layers where it excels, using a simulation backplane as the interface between different simulators. In this thesis we demonstrate that these techniques enable us not only to scale up simulations by orders of magnitude with good performance, but also to compose complex simulations with high fidelity.
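The BencHMAP idea of deriving partitioning weights from benchmark runs can be sketched at a high level. The Python snippet below is a hedged stand-in: the weight formula, coefficients, node profiles, and greedy balancer are assumptions for illustration, not the dissertation's actual method. Weights estimated from benchmark measurements drive a simple load-balancing assignment of simulated nodes to cluster processors.

```python
def node_weight(pkt_rate, flows, cpu_cost_per_pkt, mem_cost_per_flow):
    # Coefficients would come from small benchmark simulations run on the
    # target cluster (hypothetical formula).
    return pkt_rate * cpu_cost_per_pkt + flows * mem_cost_per_flow

def greedy_partition(weights, num_procs):
    """Assign nodes (heaviest first) to the currently lightest processor."""
    load = [0.0] * num_procs
    assignment = {}
    for node, w in sorted(weights.items(), key=lambda kv: -kv[1]):
        proc = min(range(num_procs), key=lambda p: load[p])
        assignment[node] = proc
        load[proc] += w
    return assignment, load

# Hypothetical per-node traffic profiles measured from a benchmark run:
# node -> (packets per second, active flows).
profiles = {"r1": (900, 40), "r2": (300, 10), "h1": (120, 4), "h2": (80, 2)}
weights = {n: node_weight(rate, flows, cpu_cost_per_pkt=1.0, mem_cost_per_flow=5.0)
           for n, (rate, flows) in profiles.items()}

assignment, load = greedy_partition(weights, num_procs=2)
print(assignment)   # e.g. {'r1': 0, 'r2': 1, 'h1': 1, 'h2': 1}
print(load)         # balanced weighted load per processor
```

A real partitioner would also account for the cost of links cut between processors, since cross-processor events dominate synchronization overhead in parallel simulation.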
