  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Dataflow-processing element for a cognitive sensor platform

McDermott, Mark William, active 2014 26 June 2014 (has links)
Cognitive sensor platforms are the next step in the evolution of intelligent sensor platforms. These platforms have the capability to reason about both their external environment and internal conditions and to modify their processing behavior and configuration in a continuing effort to optimize their operational life and functional utility. The addition of cognitive capabilities is necessary for unattended sensor systems as it is generally not feasible to routinely replace the battery or the sensor(s). This platform provides a chassis that can be used to compose embedded sensor systems from composable elements. The composable elements adhere to a synchronous data flow (SDF) protocol to communicate between the elements using channels. The SDF protocol provides the capability to easily compose heterogeneous systems of multiple processing elements, sensor elements, debug elements and communications elements. The processing engine for this platform is a Dataflow-Processing Element (DPE) that receives, processes and dispatches SDF data tokens. The DPE is specifically designed to support the processing of SDF tokens using microcoded actors where programs are assembled by instantiating actors in a graphical modeling tool and verifying that the SDF protocol is adhered to.
2

Hardware Synthesis of Synchronous Data Flow Models

Koecher, Matthew R. 06 April 2004 (has links) (PDF)
Synchronous Dataflow (SDF) graphs are a convenient way to represent many signal processing and dataflow operations. Nodes within SDF graphs represent computation while arcs represent dependencies between nodes. Using a graph representation, SDF graphs formally specify a dataflow algorithm without any assumptions on the final implementation. This allows an SDF model to be synthesized into a variety of implementation techniques including both software and hardware. This thesis presents a technique for generating an abstract hardware representation from SDF models. The techniques presented here operate on SDF models defined structurally within the Ptolemy modeling environment. The behavior of the nodes within Ptolemy SDF models is specified in software and can be simple, such as a single arithmetic operation, or arbitrarily complex. This thesis presents a technique for extracting the behavior of a limited class of SDF nodes defined in software and generating a structural description of the SDF model based on primitive arithmetic and logical operations. This synthesized graph can be used for subsequent hardware synthesis transformations.
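The graph view described in this abstract (nodes as computation, arcs as dependencies with fixed token rates) can be illustrated with the standard SDF balance equations. The following is a minimal sketch, not code from the thesis or from Ptolemy: the function name `repetition_vector` and the edge tuple layout `(src, dst, produced, consumed)` are assumptions made for this example, and the graph is assumed to be connected.

```python
from fractions import Fraction
from math import lcm  # Python 3.9+


def repetition_vector(actors, edges):
    """Solve the SDF balance equations q[src] * prod == q[dst] * cons.

    edges is a list of (src, dst, produced_per_firing, consumed_per_firing).
    Returns the smallest positive integer solution as a dict, or None if
    the rates are inconsistent. Assumes a connected graph (illustrative
    helper, not part of any real library).
    """
    neighbours = {a: [] for a in actors}
    for src, dst, prod, cons in edges:
        neighbours[src].append((dst, Fraction(prod, cons)))
        neighbours[dst].append((src, Fraction(cons, prod)))

    # Propagate relative firing rates from an arbitrary starting actor.
    rate = {actors[0]: Fraction(1)}
    stack = [actors[0]]
    while stack:
        a = stack.pop()
        for b, ratio in neighbours[a]:
            if b not in rate:
                rate[b] = rate[a] * ratio
                stack.append(b)

    # Every arc must balance, including arcs not used during propagation.
    for src, dst, prod, cons in edges:
        if rate[src] * prod != rate[dst] * cons:
            return None

    # Scale the rational rates to the smallest integer vector.
    scale = lcm(*(r.denominator for r in rate.values()))
    return {a: int(r * scale) for a, r in rate.items()}


if __name__ == "__main__":
    chain = [("A", "B", 2, 3), ("B", "C", 1, 2)]
    print(repetition_vector(["A", "B", "C"], chain))
```

A graph with no such vector cannot be scheduled with bounded buffers, which is why this check usually comes before any synthesis step.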
3

Evaluation de l'affectation des tâches sur une architecture à mémoire distribuée pour des modèles flot de données / Efficient evaluation of mappings of dataflow applications onto distributed memory architectures

Lesparre, Youen 02 March 2017 (has links)
With the increasing use of smartphones, connected objects and automated vehicles, embedded systems have become ubiquitous in our living environment. These systems are often highly constrained in terms of power consumption and size. They are increasingly implemented with many-core processor arrays, which allow rapid design that meets stringent real-time constraints while operating at relatively low frequency and with reduced power consumption. Running an application on a processor array requires dispatching its tasks onto the processors so as to meet capacity and performance constraints. This mapping problem is known to be NP-complete. The contributions of this thesis are threefold. First, we extend important notions from the Cyclo-Static Dataflow Graph to the Phased Computation Graph model, together with two equivalent sufficient conditions of liveness. Second, we present a random dataflow graph generator able to generate Synchronous Dataflow Graphs, Cyclo-Static Dataflow Graphs and Phased Computation Graphs. The generator can produce live dataflow graphs of up to 10,000 tasks in less than 30 seconds. It is compared with SDF3 and PREESM. Third and most important, we propose a new method for evaluating a mapping using the Synchronous Dataflow Graph and Cyclo-Static Dataflow Graph models. The method efficiently evaluates the memory footprint of the communications of a dataflow graph mapped onto a distributed-memory architecture. The evaluation comes in two versions: the first guarantees a live mapping, while the second also accounts for a throughput constraint. The evaluation method is tested on dataflow graphs generated by Turbine and on real-life applications.
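To make the notions of liveness and communication memory concrete, here is a small sketch that checks liveness of an SDF graph by simulating one full iteration and records the peak number of tokens seen on each arc. This is plain symbolic execution, named here as a stand-in technique: it is not the evaluation method proposed in the thesis, which relies on sufficient conditions rather than simulation. The function name `is_live` and the edge layout `(src, dst, produced, consumed, initial_tokens)` are assumptions for illustration.

```python
def is_live(repetitions, edges):
    """Simulate one iteration of an SDF graph.

    repetitions maps each actor to its firing count per iteration.
    edges is a list of (src, dst, produced, consumed, initial_tokens).
    Returns (live, peak), where peak[i] is the largest token count
    observed on edge i under this greedy schedule. The peak is an
    upper bound for this schedule only, not a minimal buffer size.
    """
    tokens = [e[4] for e in edges]
    peak = list(tokens)
    remaining = dict(repetitions)

    progress = True
    while progress and any(remaining.values()):
        progress = False
        for actor in remaining:
            if remaining[actor] == 0:
                continue
            inputs = [i for i, e in enumerate(edges) if e[1] == actor]
            # An actor may fire only if every input arc holds enough tokens.
            if all(tokens[i] >= edges[i][3] for i in inputs):
                for i in inputs:
                    tokens[i] -= edges[i][3]
                for i, e in enumerate(edges):
                    if e[0] == actor:
                        tokens[i] += e[2]
                        peak[i] = max(peak[i], tokens[i])
                remaining[actor] -= 1
                progress = True

    # Live if every actor completed its firings without getting stuck.
    return not any(remaining.values()), peak


if __name__ == "__main__":
    cycle = [("A", "B", 1, 1, 0), ("B", "A", 1, 1, 1)]
    print(is_live({"A": 1, "B": 1}, cycle))
```

Removing the initial token from the back arc in the example makes the cycle deadlock, which is the situation a liveness guarantee rules out.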
4

Energy and Design Cost Efficiency for Streaming Applications on Systems-on-Chip

Zhu, Jun January 2009 (has links)
With the increasing capacity of today's integrated circuits, a number of heterogeneous system-on-chip (SoC) architectures in embedded systems have been proposed. In order to achieve energy and design cost efficient streaming applications on these systems, new design space exploration frameworks and performance analysis approaches are required. This thesis considers three state-of-the-art SoC architectures, i.e., the multi-processor SoCs (MPSoCs) with network-on-chip (NoC) communication, the hybrid CPU/FPGA architectures, and the run-time reconfigurable (RTR) FPGAs. The main topic of the author's research is to model and capture the application scheduling, architecture customization, and buffer dimensioning problems, according to the real-time requirement. Since these problems are NP-complete, heuristic algorithms and a constraint programming solver are used to compute a solution. For NoC communication based MPSoCs, an approach to optimize real-time streaming applications with customized processor voltage-frequency levels and memory sizes is presented. A multi-clocked synchronous model of computation (MoC) framework is proposed for heterogeneous timing analysis and energy estimation. Using heuristic search (i.e., greedy and tabu search), the experiments show an energy reduction (up to 21%) without any loss in application throughput compared with an ad-hoc approach. On hybrid CPU/FPGA architectures, the buffer minimization scheduling of real-time streaming applications is addressed. Based on event models, the problem has been formalized declaratively as constraint-based scheduling, and solved by the public domain constraint solver Gecode. Compared with the traditional PAPS method, the proposed method needs significantly smaller buffers (2.4% of PAPS in the best case), while high throughput guarantees can still be achieved. Furthermore, a novel compile-time analysis approach based on iterative timing phases is proposed for run-time reconfigurations in adaptive real-time streaming applications on RTR FPGAs. Finally, the reconfiguration analysis and design trade-off analysis capabilities of the proposed framework have been exemplified with experiments on both example and industrial applications. / Andres
5

Multi-objective trade-off exploration for Cyclo-Static and Synchronous Dataflow graphs

Sinha, Ashmita 30 October 2012 (has links)
Many digital signal processing and real-time streaming systems are modeled using dataflow graphs, such as Synchronous Dataflow (SDF) and Cyclo-Static Dataflow (CSDF) graphs, that allow static analysis and optimization techniques. However, mapping such descriptions into tightly constrained real-time implementations requires optimization of resource sharing, buffering and scheduling across a multi-dimensional latency-throughput-area objective space. This requires techniques that can find the Pareto-optimal set of implementations for the designer to choose from. In this work, we address the problem of multi-objective mapping and scheduling of SDF and CSDF graphs onto heterogeneous multi-processor platforms. Building on previous work, this thesis extends existing two-stage hybrid heuristics that combine an evolutionary algorithm with an integer linear programming (ILP) model to jointly optimize throughput, area and latency for SDF graphs. The primary contributions of this work include: (1) extension of the ILP model to support CSDF graphs with additional buffer size optimizations; (2) a further optimization in the ILP-based scheduling model to achieve a runtime speedup of almost a factor of 10 compared to the existing SDF graph formulation; (3) a list scheduling heuristic that replaces the ILP model in the hybrid heuristic to generate Pareto-optimal solutions at significantly decreased runtime while keeping the solutions within an acceptable gap of 10% of their ILP counterparts. The list scheduling heuristic presented in this work is based on existing modulo scheduling approaches for software pipelining in the compiler domain, but has been extended by introducing a new concept of mobility-based rescheduling before resorting to backtracking. This work proves that if mobility-based rescheduling is performed, the number of required backtracking steps, and hence the overall complexity and runtime, is reduced.
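The idea of a Pareto-optimal set over a latency-throughput-area space can be shown with a brute-force filter. This is only a sketch of the definition of Pareto dominance, not the evolutionary/ILP heuristic or the list scheduler described in the abstract. The names `dominates` and `pareto_front` are illustrative, and each design point is assumed to be a tuple of objectives that are all minimized (throughput is represented by its inverse, the iteration period).

```python
def dominates(a, b):
    """True if point a is no worse than b in every objective and
    strictly better in at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep the points that no other point dominates.

    Quadratic in the number of points, which is fine for a sketch.
    """
    return [p for p in points
            if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical design points: (latency, area, iteration period).
    candidates = [(10, 4, 5), (12, 3, 5), (10, 4, 6), (15, 5, 7)]
    for point in pareto_front(candidates):
        print(point)
```

The filter leaves the designer with only the implementations that represent a genuine trade-off; any point it drops is beaten or matched on every objective by some other candidate.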
7

Increasing Design Productivity for FPGAs Through IP Reuse and Meta-Data Encapsulation

Arnesen, Adam T. 17 March 2011 (has links) (PDF)
As Moore's law continues to progress, it is becoming increasingly difficult for hardware designers to fully utilize the increasing number of transistors available on semiconductor devices, including FPGAs. This design productivity gap must be addressed to allow designs to take full advantage of the increased logic density that results from rising transistor density. The reuse of previously developed and verified intellectual property (IP) is one approach that has been claimed to narrow the design productivity gap. Reuse, however, has proved difficult to realize in practice because of the complexity of IP and the reluctance of designers to reuse IP that they do not understand. This thesis proposes to narrow the design productivity gap for FPGAs by simplifying the reuse problem: IP is encapsulated with extra machine-readable information, or meta-data. This meta-data simplifies reuse by providing a language-independent format for composing complex systems, providing a parameter representation system, defining high-level data types for FPGA IP, and allowing arbitrary IP to be described as actors in the homogeneous synchronous dataflow model of computation. This work implements meta-data in XML and presents two XML schemas that enable reuse. A new XML schema known as CHREC XML is presented, as well as extensions that enable IP-XACT to be used to describe FPGA dataflow IP. Two tools developed in this work are also presented that leverage meta-data to simplify reuse of arbitrary IP. These tools simplify structural composition of IP, allow designers to manipulate parameters, check and validate high-level data types, and automatically synthesize control circuitry for dataflow designs. Productivity improvements are also demonstrated by reusing IP to quickly compose software radio receivers.
8

Design space exploration for co-mapping of periodic and streaming applications in a shared platform / Validering av designlösningar för utforskning av rymden för samkartläggning av periodiska och strömmande applikationer i en delad plattform

Yuhan, Zhang January 2023 (has links)
As embedded systems advance, the complexity and multifaceted requirements of products have increased significantly. A trend in this domain is the selection of different types of application models, with multiprocessors as the platform. However, existing design space exploration techniques are limited: they often handle only one particular model, and combining diverse application models may cause compatibility issues. Additionally, embedded system design inherently involves multiple objectives. Beyond the essential functionalities, other metrics always need to be considered, such as power consumption, resource utilization, cost and safety. Considering these diverse metrics results in a vast design space, so effective design space exploration also plays a crucial role. This thesis addresses these challenges by proposing a co-mapping approach for two distinct models: the periodically activated tasks model for real-time applications and the synchronous dataflow model for digital signal processing. Our primary goal is to co-map these two kinds of models onto a multi-core platform and explore trade-offs between the solutions. We choose the number of used resources and the throughput of the synchronous dataflow model as our performance metrics for assessment. We adopt a combination method in which periodic tasks are given precedence to ensure their deadlines are met. The remaining processor resources are then allocated to the synchronous dataflow model. Both the execution of periodic tasks and the synchronous dataflow model are managed by a scheduler, which prevents resource contention and optimizes the utilization of available processor resources. To achieve a balance between different metrics, we implement Pareto optimization as a guiding principle in our approach. This thesis uses the IDeSyDe tool, an extension of the ForSyDe group's current design space exploration tool, following the Design Space Identification methodology. Implementation is based on Scala and Python, running on the Java virtual machine. The experiment results affirm the successful mapping and scheduling of the periodically activated tasks model and the synchronous dataflow model onto the shared multi-processor platform. We find the Pareto-optimal solutions with IDeSyDe, aiming to maximize the throughput of synchronous dataflow while concurrently minimizing resource consumption. This thesis offers insight into the application of different models on a shared platform, particularly for developers interested in utilizing IDeSyDe. However, due to time constraints, our test case may not fully cover the potential scalability of our method. Additional tests could better demonstrate the effectiveness of our approach. For further reference, the code can be checked in the GitHub repository at*.
9

The Global Interconnection Scheme of Silago : RTL Design and Verification / Den globala sammankopplingsväven av Silago : RTL Design och Verifiering

Lou, Tong January 2023 (has links)
The Silago concept introduces a hardware-centric platform that is based on coarse-grained reconfigurable fabrics and networks on chips (NoCs). With the intra-region and inter-region NoC, the Silago platform can form resource clusters to host various applications. The conventional global interconnection is implemented with a two-level NoC, which potentially results in heavyweight hardware and unpredictable behavior. To optimize the global inter-region data transfer, we propose a mathematical model that clarifies the scheduling mechanism, and present a software-defined interconnection solution that exploits awareness of the access pattern. The solution requires an executor, which is expected to be a programmable lightweight transmitter. Considering that existing instruction set architectures (ISAs) lack direct support for single-cycle loop instructions, we propose a self-defined instruction set, which reduces the program size and enhances the schedulability. Based on the instruction set, we implemented the transmitter at the register transfer level (RTL) of abstraction. We also established a constrained-random, stimulus-based verification environment. The design is verified by regression testing and synthesized. The results show that the design is functionally correct and synthesizable. Overall, the programmable transmitter helps to enable a composable interconnect scheme to connect hard IPs.
