Global ETD Search

1	Dataflow-processing element for a cognitive sensor platform
McDermott, Mark William, active 2014. 26 June 2014
Cognitive sensor platforms are the next step in the evolution of intelligent sensor platforms. These platforms have the capability to reason about both their external environment and internal conditions and to modify their processing behavior and configuration in a continuing effort to optimize their operational life and functional utility. The addition of cognitive capabilities is necessary for unattended sensor systems as it is generally not feasible to routinely replace the battery or the sensor(s). This platform provides a chassis that can be used to compose embedded sensor systems from composable elements. The composable elements adhere to a synchronous data flow (SDF) protocol to communicate between the elements using channels. The SDF protocol provides the capability to easily compose heterogeneous systems of multiple processing elements, sensor elements, debug elements and communications elements. The processing engine for this platform is a Dataflow-Processing Element (DPE) that receives, processes and dispatches SDF data tokens. The DPE is specifically designed to support the processing of SDF tokens using microcoded actors where programs are assembled by instantiating actors in a graphical modeling tool and verifying that the SDF protocol is adhered to.
Keywords: Cognitive sensor platform; Synchronous dataflow; Composable systems; Dataflow processor
2	Hardware Synthesis of Synchronous Data Flow Models
Koecher, Matthew R. 06 April 2004
Synchronous Dataflow (SDF) graphs are a convenient way to represent many signal processing and dataflow operations. Nodes within SDF graphs represent computation while arcs represent dependencies between nodes. Using a graph representation, SDF graphs formally specify a dataflow algorithm without any assumptions on the final implementation. This allows an SDF model to be synthesized into a variety of implementation techniques including both software and hardware. This thesis presents a technique for generating an abstract hardware representation from SDF models. The techniques presented here operate on SDF models defined structurally within the Ptolemy modeling environment. The behavior of the nodes within Ptolemy SDF models is specified in software and can be simple, such as a single arithmetic operation, or arbitrarily complex. This thesis presents a technique for extracting the behavior of a limited class of SDF nodes defined in software and generating a structural description of the SDF model based on primitive arithmetic and logical operations. This synthesized graph can be used for subsequent hardware synthesis transformations.
Keywords: hardware synthesis; Ptolemy; Java; synchronous dataflow; SDF; abstract circuit graph; ACG; Electrical and Computer Engineering
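Several abstracts in this list rely on the static analyzability of SDF graphs. As a minimal sketch of what that means (not taken from any of the listed theses), the Python below solves the standard SDF balance equations: for every arc, the tokens produced per iteration must equal the tokens consumed. The function name `repetition_vector` and the edge tuple layout `(src, dst, produced, consumed)` are assumptions made for this illustration only.

```python
from fractions import Fraction
from math import lcm  # multi-argument lcm requires Python 3.9+


def repetition_vector(actors, edges):
    """Return the smallest positive firing counts that balance every arc.

    actors: list of actor names (hypothetical layout for this sketch).
    edges:  list of (src, dst, produced, consumed) tuples, where `produced`
            tokens are written per firing of src and `consumed` tokens are
            read per firing of dst.
    """
    # Fix the first actor's rate at 1 and propagate relative rates along arcs.
    rate = {actors[0]: Fraction(1)}
    changed = True
    while changed:
        changed = False
        for src, dst, prod, cons in edges:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * prod / cons
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * cons / prod
                changed = True
    if len(rate) != len(actors):
        raise ValueError("graph is not connected")
    # Every arc must satisfy rate[src] * produced == rate[dst] * consumed.
    for src, dst, prod, cons in edges:
        if rate[src] * prod != rate[dst] * cons:
            raise ValueError("inconsistent rates: no periodic schedule exists")
    # Scale the rational rates to the smallest integer solution.
    scale = lcm(*(r.denominator for r in rate.values()))
    return {a: int(rate[a] * scale) for a in actors}


if __name__ == "__main__":
    # A produces 2 tokens per firing, B consumes 3; B produces 1, C consumes 2.
    print(repetition_vector(["A", "B", "C"], [("A", "B", 2, 3), ("B", "C", 1, 2)]))
```

For the three-actor chain in the example, the result is 3 firings of A, 2 of B, and 1 of C per iteration. A graph whose arcs cannot all be balanced raises an error, which is the kind of check a tool can perform before any hardware or software is generated.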
3	Evaluation de l'affectation des tâches sur une architecture à mémoire distribuée pour des modèles flot de données / Efficient evaluation of mappings of dataflow applications onto distributed memory architectures
Lesparre, Youen. 02 March 2017
With the increasing use of smartphones, connected objects and automated vehicles, embedded systems have become ubiquitous in our living environment. These systems are often highly constrained in terms of power consumption and size. They are increasingly implemented with many-core processor arrays, which allow rapid design to meet stringent real-time constraints while operating at relatively low frequency and with reduced power consumption. Running an application on a processor array requires dispatching its tasks onto the processors in order to meet capacity and performance constraints. This mapping problem is known to be NP-complete. The contributions of this thesis are threefold. First, we extend important notions from the Cyclo-Static Dataflow Graph model to the Phased Computation Graph model, along with two equivalent sufficient conditions for liveness. Second, we present a random dataflow graph generator able to generate Synchronous Dataflow Graphs, Cyclo-Static Dataflow Graphs and Phased Computation Graphs. The generator can produce live dataflow graphs of up to 10,000 tasks in less than 30 seconds. It is compared with SDF3 and PREESM. Third and most important, we propose a new method for evaluating a mapping using the Synchronous Dataflow Graph and Cyclo-Static Dataflow Graph models. The method efficiently evaluates the memory footprint of the communications of a dataflow graph mapped onto a distributed memory architecture. The evaluation comes in two versions: the first guarantees a live mapping, while the second also accounts for a throughput constraint. The evaluation method is tested on dataflow graphs generated by Turbine and on real-life applications.
Keywords: Synchronous Dataflow Graph; Cyclo-Static Dataflow Graph; Phased Computation Graph; Random generator; Mapping evaluation; Memory footprint; 005.7
4	Energy and Design Cost Efficiency for Streaming Applications on Systems-on-Chip
Zhu, Jun. January 2009
With the increasing capacity of today's integrated circuits, a number of heterogeneous system-on-chip (SoC) architectures in embedded systems have been proposed. In order to achieve energy and design cost efficient streaming applications on these systems, new design space exploration frameworks and performance analysis approaches are required. This thesis considers three state-of-the-art SoC architectures, i.e., the multi-processor SoCs (MPSoCs) with network-on-chip (NoC) communication, the hybrid CPU/FPGA architectures, and the run-time reconfigurable (RTR) FPGAs. The main topic of the author's research is to model and capture the application scheduling, architecture customization, and buffer dimensioning problems, according to the real-time requirement. Since these problems are NP-complete, heuristic algorithms and a constraint programming solver are used to compute a solution. For NoC communication based MPSoCs, an approach to optimize the real-time streaming applications with customized processor voltage-frequency levels and memory sizes is presented. A multi-clocked synchronous model of computation (MoC) framework is proposed for heterogeneous timing analysis and energy estimation. Using heuristic search (i.e., greedy and tabu search), the experiments show an energy reduction (up to 21%) without any loss in application throughput compared with an ad-hoc approach. On hybrid CPU/FPGA architectures, the buffer minimization scheduling of real-time streaming applications is addressed. Based on event models, the problem has been formalized declaratively as constraint-based scheduling and solved by the public domain constraint solver Gecode. Compared with the traditional PAPS method, the proposed method needs significantly smaller buffers (2.4% of PAPS in the best case), while high throughput guarantees can still be achieved. Furthermore, a novel compile-time analysis approach based on iterative timing phases is proposed for run-time reconfigurations in adaptive real-time streaming applications on RTR FPGAs. Finally, the reconfiguration analysis and design trade-off analysis capabilities of the proposed framework have been exemplified with experiments on both example and industrial applications. / Andres
Keywords: Streaming applications; Systems-on-chip; Synchronous dataflow; energy efficiency; buffer minimization; performance analysis; Informatics, Computer and Systems Sciences
5	Multi-objective trade-off exploration for Cyclo-Static and Synchronous Dataflow graphs
Sinha, Ashmita. 30 October 2012
Many digital signal processing and real-time streaming systems are modeled using dataflow graphs, such as Synchronous Dataflow (SDF) and Cyclo-static Dataflow (CSDF) graphs that allow static analysis and optimization techniques. However, mapping of such descriptions into tightly constrained real-time implementations requires optimization of resource sharing, buffering and scheduling across a multi-dimensional latency-throughput-area objective space. This requires techniques that can find the Pareto-optimal set of implementations for the designer to choose from. In this work, we address the problem of multi-objective mapping and scheduling of SDF and CSDF graphs onto heterogeneous multi-processor platforms. Building on previous work, this thesis extends existing two-stage hybrid heuristics that combine an evolutionary algorithm with an integer linear programming (ILP) model to jointly optimize throughput, area and latency for SDF graphs. The primary contributions of this work include: (1) extension of the ILP model to support CSDFGs with additional buffer size optimizations; (2) a further optimization in the ILP-based scheduling model to achieve a runtime speedup of almost a factor of 10 compared to the existing SDFG formulation; (3) a list scheduling heuristic that replaces the ILP model in the hybrid heuristic to generate Pareto-optimal solutions at significantly decreased runtime while maintaining near-optimality of the solutions within an acceptable gap of 10% when compared to its ILP counterparts. The list scheduling heuristic presented in this work is based on existing modulo scheduling approaches for software pipelining in the compiler domain, but has been extended by introducing a new concept of mobility-based rescheduling before resorting to backtracking. This work proves that performing mobility-based rescheduling reduces the number of required backtracking steps and hence the overall complexity and runtime.
Keywords: Synchronous Dataflow graph (SDF); Cyclo-Static Dataflow graph (CSDF); Integer Linear Programming (ILP); List scheduling; Mapping; Throughput; Latency; Buffer; Area
6	Energy and Design Cost Efficiency for Streaming Applications on Systems-on-Chip
Zhu, Jun. January 2009
With the increasing capacity of today's integrated circuits, a number of heterogeneous system-on-chip (SoC) architectures in embedded systems have been proposed. In order to achieve energy and design cost efficient streaming applications on these systems, new design space exploration frameworks and performance analysis approaches are required. This thesis considers three state-of-the-art SoC architectures, i.e., the multi-processor SoCs (MPSoCs) with network-on-chip (NoC) communication, the hybrid CPU/FPGA architectures, and the run-time reconfigurable (RTR) FPGAs. The main topic of the author's research is to model and capture the application scheduling, architecture customization, and buffer dimensioning problems, according to the real-time requirement. Since these problems are NP-complete, heuristic algorithms and a constraint programming solver are used to compute a solution. For NoC communication based MPSoCs, an approach to optimize the real-time streaming applications with customized processor voltage-frequency levels and memory sizes is presented. A multi-clocked synchronous model of computation (MoC) framework is proposed for heterogeneous timing analysis and energy estimation. Using heuristic search (i.e., greedy and tabu search), the experiments show an energy reduction (up to 21%) without any loss in application throughput compared with an ad-hoc approach. On hybrid CPU/FPGA architectures, the buffer minimization scheduling of real-time streaming applications is addressed. Based on event models, the problem has been formalized declaratively as constraint-based scheduling and solved by the public domain constraint solver Gecode. Compared with the traditional PAPS method, the proposed method needs significantly smaller buffers (2.4% of PAPS in the best case), while high throughput guarantees can still be achieved. Furthermore, a novel compile-time analysis approach based on iterative timing phases is proposed for run-time reconfigurations in adaptive real-time streaming applications on RTR FPGAs. Finally, the reconfiguration analysis and design trade-off analysis capabilities of the proposed framework have been exemplified with experiments on both example and industrial applications. / Andres
Keywords: Streaming applications; Systems-on-chip; Synchronous dataflow; energy efficiency; buffer minimization; performance analysis; Computer and Information Sciences
7	Increasing Design Productivity for FPGAs Through IP Reuse and Meta-Data Encapsulation
Arnesen, Adam T. 17 March 2011
As Moore's law continues to progress, it is becoming increasingly difficult for hardware designers to fully utilize the increasing number of transistors available in semiconductor devices, including FPGAs. This design productivity gap must be addressed to allow designs to take full advantage of the increased logic density that results from rising transistor density. The reuse of previously developed and verified intellectual property (IP) is one approach that has been claimed to narrow the design productivity gap. Reuse, however, has proved difficult to realize in practice because of the complexity of IP and the reluctance of designers to reuse IP that they do not understand. This thesis proposes to narrow the design productivity gap for FPGAs by simplifying the reuse problem by encapsulating IP with extra machine-readable information, or meta-data. This meta-data simplifies reuse by providing a language-independent format for composing complex systems, providing a parameter representation system, defining high-level data types for FPGA IP, and allowing arbitrary IP to be described as actors in the homogeneous synchronous dataflow model of computation. This work implements meta-data in XML and presents two XML schemas that enable reuse. A new XML schema known as CHREC XML is presented, as well as extensions that enable IP-XACT to be used to describe FPGA dataflow IP. Two tools developed in this work are also presented that leverage meta-data to simplify reuse of arbitrary IP. These tools simplify structural composition of IP, allow designers to manipulate parameters, check and validate high-level data types, and automatically synthesize control circuitry for dataflow designs. Productivity improvements are also demonstrated by reusing IP to quickly compose software radio receivers.
Keywords: meta-data; FPGA; intellectual property; reuse; interface synthesis; IP-XACT; synchronous dataflow; architectural synthesis; Electrical and Computer Engineering
8	Design space exploration for co-mapping of periodic and streaming applications in a shared platform / Validering av designlösningar för utforskning av rymden för samkartläggning av periodiska och strömmande applikationer i en delad plattform
Yuhan, Zhang. January 2023
As embedded systems advance, the complexity and multifaceted requirements of products have increased significantly. A trend in this domain is the selection of different types of application models and multiprocessors as the platform. However, design space exploration techniques are often limited to one particular model, and combining diverse application models may cause compatibility issues. Additionally, embedded system design inherently involves multiple objectives. Beyond the essential functionalities, other metrics always need to be considered, such as power consumption, resource utilization, cost and safety. The consideration of these diverse metrics results in a vast design space, so effective design space exploration also plays a crucial role. This thesis addresses these challenges by proposing a co-mapping approach for two distinct models: the periodically activated tasks model for real-time applications and the synchronous dataflow model for digital signal processing. Our primary goal is to co-map these two kinds of models onto a multi-core platform and explore trade-offs between the solutions. We choose the number of used resources and the throughput of the synchronous dataflow model as our performance metrics for assessment. We adopt a combination method in which periodic tasks are given precedence to ensure their deadlines are met. The remaining processor resources are then allocated to the synchronous dataflow model. Both the execution of periodic tasks and the synchronous dataflow model are managed by a scheduler, which prevents resource contention and optimizes the utilization of available processor resources. To achieve a balance between different metrics, we implement Pareto optimization as a guiding principle in our approach. This thesis uses the IDeSyDe tool, an extension of the ForSyDe group's current design space exploration tool, following the Design Space Identification methodology. Implementation is based on Scala and Python, running on the Java virtual machine. The experiment results affirm the successful mapping and scheduling of the periodically activated tasks model and the synchronous dataflow model onto the shared multi-processor platform. We find the Pareto-optimal solutions using IDeSyDe, aiming to maximize the throughput of synchronous dataflow while concurrently minimizing resource consumption. This thesis offers insight into the application of different models on a shared platform, particularly for developers interested in utilizing IDeSyDe. However, due to time constraints, our test case may not fully cover the potential scalability of our method. Additional tests could better demonstrate the effectiveness of our approach. For further reference, the code can be checked in the GitHub repository at.
Keywords: Design Space Exploration; Periodically activated tasks; Synchronous dataflow; IDeSyDe; Electrical Engineering and Electronics
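Entries 5 and 8 both describe searching for Pareto-optimal design points. As a rough illustration only (this is not code from IDeSyDe or from either thesis), the Python sketch below filters a set of candidate mappings down to its Pareto front, assuming each candidate is a hypothetical `(cores_used, throughput)` pair where fewer cores and higher throughput are both better.

```python
def dominates(a, b):
    """True if design a is at least as good as b on both metrics and differs from b.

    Each design is a (cores_used, throughput) pair: fewer cores is better,
    higher throughput is better. The pair layout is an assumption of this sketch.
    """
    return a[0] <= b[0] and a[1] >= b[1] and a != b


def pareto_front(designs):
    """Keep only the designs that no other design dominates."""
    return [d for d in designs if not any(dominates(other, d) for other in designs)]


if __name__ == "__main__":
    candidates = [(1, 0.5), (2, 0.9), (2, 0.7), (3, 0.9), (4, 1.2)]
    print(pareto_front(candidates))
```

In the example, `(2, 0.7)` and `(3, 0.9)` are dropped because `(2, 0.9)` matches or beats each of them on both metrics. The remaining points are the trade-offs a designer would choose between. The quadratic pairwise comparison is chosen for clarity and is only suitable for small candidate sets.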
9	The Global Interconnection Scheme of Silago : RTL Design and Verification / Den globala sammankopplingsväven av Silago : RTL Design och Verifiering
Lou, Tong. January 2023
The Silago concept introduces a hardware-centric platform that is based on coarse-grained reconfigurable fabrics and networks on chip (NoCs). With intra-region and inter-region NoCs, the Silago platform can form resource clusters to host various applications. The conventional global interconnection is implemented with a two-level NoC, which potentially results in heavyweight hardware and unpredictable behavior. To optimize the global inter-region data transfer, we propose a mathematical model that clarifies the scheduling mechanism and present a software-defined interconnection solution that exploits awareness of the access pattern. The solution requires an executor, which is expected to be a programmable lightweight transmitter. Considering that existing instruction set architectures (ISAs) lack direct support for single-cycle loop instructions, we propose a self-defined instruction set, which reduces the program size and enhances schedulability. Based on the instruction set, we implemented the transmitter at the register transfer level (RTL). We also established a constrained-random, stimulus-based verification environment. The design is verified by regression testing and synthesized. The results show that the design is functionally correct and synthesizable. Overall, the programmable transmitter helps to enable a composable interconnect scheme to connect hard IPs.
Keywords: Silago; global interconnection; synchronous dataflow; network on chip; Computer Sciences; Computer Engineering; Electrical Engineering and Electronics

Search results