Global ETD Search

141	ESCALONAMENTO DE TAREFAS E FLUXOS DE COMUNICAÇÃO PARA SISTEMAS SEMI-PARTICIONADOS EM ARQUITETURAS NOC / SEMI-PARTITIONED SCHEDULING OF TASKS AND COMMUNICATION FLOWS ON NOC ARCHTECTURES Bonilha, Iaê Santos 24 March 2014 (has links) Despiste the fact that many scheduling models teoretically capable of high system resource utilization were proposed with the development of the real-time system, the industry still uses the first scheduling model proposed for multi-processor real-time systems, the partitioned scheduling model. This scheduling model can guarantee scheduling of task sets up to around 69% processor utilization, which falls pale in comparison to recent scheduling models that can guarantee scheduling up to 97% processor utilization. The motive behind the utilization of the partitioned scheduling as industrial model is the amount of studies made on this model and the development of scheduling analysis capable of providing temporal guarantees for this model on a real system environment. Recent scheduling models, like semi-partitioned scheduling, offer the possibility of a higher system resource utilization, it still lack studies and scheduling analysis capable of provide temporal guarantees under a real environment. The current scheduling analysis for most of the more recent models take advantage of a series of abstractions, failing to provide guarantees under real circumstances. This papers primary objective is to produce a new scheduling analysis for semi-partitioned scheduling, capable of achieving temporal guarantees taking some of the previously abstracted factors, like task communication and the impact f task migration on its communications flows, approximating the scheduling model to real environmental conditions. With the development of such analysis preliminary studies were made on heuristic task mapping algorithms for semipartitioned systems. / Com a popularização de sistemas multi-processador, surgiu uma série de propostas de modelos de escalonamento, na área de sistemas de tempo real que, teoricamente, são capazes de obter um alto aproveitamento dos recursos do sistema. Entretanto, o modelo de escalonamento mais adotado continua sendo um dos primeiros modelos de escalonamento propostos na área, o modelo de escalonamento particionado. O modelo de escalonamento particionado só pode garantir o escalonamento de conjuntos com até cerca de 69% de utilização de processador, sendo limitado se comparado com garantias de escalonamento de até 97% de utilização de modelos mais recentes. O motivo pelo qual o escalonamento particionado continua sendo utilizado é a grande concentração de estudos a respeito do modelo e o desenvolvimento de análises de escalonamento capazes de garantir o escalonamento do modelo em condições reais do sistema. Modelos mais recentes, como o escalonamento semi-particionado, apresentam uma possibilidade de um maior aproveitamento do sistema, porém, ainda possuem estudos limitados e não dispõe de análises de escalonamento capazes de prover garantias temporais para o sistema em condições reais, devido à presença de diversas abstrações no modelo. Neste sentido, este trabalho foca em arquiteturas Network-on-Chip que apresentam comunicação explícita, abstraída nos trabalhos encontrados na literatura. Este trabalho tem como objetivo primário o desenvolvimento de uma análise de escalonamento capaz de prover garantias temporais para o modelo de escalonamento semi-particionado levando em consideração fatores previamente abstraídos, como a necessidade de comunicação entre tarefas e o impacto da migração das tarefas nos seus fluxos de comunicação, aproximando o modelo da realidade. O desenvolvimento de tal análise possibilita o estudo preliminar de algoritmos heurísticos de mapeamento de tarefas, capazes de mapear conjuntos de tarefas levando em consideração migrações de tarefas e comunicação entre tarefas em um modelo de escalonamento semi-particionado. Sistemas de tempo real Escalonamento semi-particionado Sistemas multiprocessados Análise de escalonamento Migração de tarefas Alocação de recursos Real-time systems Semi-partioned scheduling Multiprocessor systems Scheduling analysis Task migration Resource allocation
142	High level design and control of adaptive multiprocessor system-on-chips / Conception et contrôle de haut niveau pour les systèmes sur puce multiprocesseurs adaptatifs An, Xin 16 October 2013 (has links) La conception de systèmes embarqués modernes est de plus en plus complexe, car plus de fonctionnalités sont intégrées dans ces systèmes. En même temps, afin de répondre aux exigences de calcul tout en conservant une consommation d'énergie de faible niveau, MPSoCs sont apparus comme les principales solutions pour tels systèmes embarqués. En outre, les systèmes embarqués sont de plus en plus adaptatifs, comme l’adaptabilité peut apporter un certain nombre d'avantages, tels que la flexibilité du logiciel et l'efficacité énergétique. Cette thèse vise la conception sécuritaire de ces MPSoCs adaptatifs. Tout d'abord, chaque configuration de système doit être analysée en ce qui concerne ses propriétés fonctionnelles et non fonctionnelles. Nous présentons un cadre abstraite de conception et d’analyse qui permet des décisions d’implémentation plus rapide et plus rentable. Ce cadre est conçu comme un support de raisonnement intermédiaire pour les environnements de co-conception de logiciel / matériel au niveau de système. Il peut élaguer l'espace de conception à sa plus grande portée, et identifier les candidats de solutions de conception de manière rapide et efficace. Dans ce cadre, nous utilisons un codage basé sur l’horloge abstrait pour modéliser les comportements du système. Différents scénarios d'applications de mapping et de planification sur MPSoCs sont analysés via les traces d'horloge qui représentent les simulations du système. Les propriétés d'intérêt sont l’exactitude du comportement fonctionnel, la performance temporelle et la consommation d'énergie. Deuxièmement, la gestion de la reconfiguration de MPSoCs adaptatifs doit être abordée. Nous sommes particulièrement intéressés par les MPSoCs implémentés sur des architectures reconfigurables de hardware (ex. FPGA tissus) qui offrent une bonne flexibilité et une efficacité de calcul pour les MPSoCs adaptatifs. Nous proposons un cadre général de conception basésur la technique de la synthèse de contrôleurs discrets (SCD) pour résoudre ce problème. L’avantage principal de cette technique est qu'elle permet une synthèse d'un contrôleur automatique vis-à-vis d’une spécification donnée des objectifs de contrôle. Dans ce cadre, le comportement de reconfiguration du système est modélisé en termes d'automates synchrones en parallèle. Le problème de calcul de la gestion reconfiguration vis-à-vis de multiples objectifs concernant, par exemple, les usages des ressources, la performance et la consommation d’énergie est codé comme un problème de SCD . Le langage de programmation BZR existant et l’outil Sigali sont employés pour effectuer SCD et générer un contrôleur qui satisfait aux exigences du système. Finalement, nous étudions deux façons différentes de combiner les deux cadres de conception proposées pour MPSoCs adaptatifs. Tout d'abord, ils sont combinés pour construire un flot de conception complet pour MPSoCs adaptatifs. Deuxièmement, ils sont combinés pour présenter la façon dont le gestionnaire d'exécution conçu dans le second cadre peut être intégré dans le premier cadre de sorte que les simulations de haut niveau peuvent être effectuées pour évaluer le gestionnaire d'exécution. / The design of modern embedded systems is getting more and more complex, as more func- tionality is integrated into these systems. At the same time, in order to meet the compu- tational requirements while keeping a low level power consumption, MPSoCs have emerged as the main solutions for such embedded systems. Furthermore, embedded systems are be- coming more and more adaptive, as the adaptivity can bring a number of benefits, such as software flexibility and energy efficiency. This thesis targets the safe design of such adaptive MPSoCs. First, each system configuration must be analyzed concerning its functional and non- functional properties. We present an abstract design and analysis framework, which allows for faster and cost-effective implementation decisions. This framework is intended as an intermediate reasoning support for system level software/hardware co-design environments. It can prune the design space at its largest, and identify candidate design solutions in a fast and efficient way. In the framework, we use an abstract clock-based encoding to model system behaviors. Different mapping and scheduling scenarios of applications on MPSoCs are analyzed via clock traces representing system simulations. Among properties of interest are functional behavioral correctness, temporal performance and energy consumption. Second, the reconfiguration management of adaptive MPSoCs must be addressed. We are specially interested in MPSoCs implemented on reconfigurable hardware architectures (i.e., FPGA fabrics), which provide a good flexibility and computational efficiency for adap- tive MPSoCs. We propose a general design framework based on the discrete controller syn- thesis (DCS) technique to address this issue. The main advantage of this technique is that it allows the automatic controller synthesis w.r.t. a given specification of control objectives. In the framework, the system reconfiguration behavior is modeled in terms of synchronous parallel automata. The reconfiguration management computation problem w.r.t. multiple objectives regarding e.g., resource usages, performance and power consumption is encoded as a DCS problem. The existing BZR programming language and Sigali tool are employed to perform DCS and generate a controller that satisfies the system requirements. Finally, we investigate two different ways of combining the two proposed design frame- works for adaptive MPSoCs. Firstly, they are combined to construct a complete design flow for adaptive MPSoCs. Secondly, they are combined to present how the designed run-time manager by the second framework can be integrated into the first framework so that high level simulations can be performed to assess the run-time manager. Approche synchrone Synthèse de contrôleurs discrets Exploration de l'espace de conception Conception et validation formelle Adaptive Multiprocessor System-on-Chips Synchronous approach Discrete controller syntheis Design space exploration Formal design and validation 004
143	Polymorphic ASIC : For Video Decoding Adarsha Rao, S J January 2013 (has links) (PDF) Video applications are becoming ubiquitous in recent times due to an explosion in the number of devices with video capture and display capabilities. Traditionally, video applications are implemented on a variety of devices with each device targeting a speciﬁc application. However, the advances in technology have created a need to support multiple applications from a single device like a smart phone or tablet. Such convergence of applications necessitates support for interoperability among various applications, scalable performance meet the requirements of different applications and a high degree of reconﬁgurability to accommodate rapid evolution in applications features. In addition, low power consumption requirement is also very stringent for many video applications. The conventional custom hardware implementations of video applications deliver high performance at low power consumption while the recent MPSoC implementations enable high degree of interoperability and are useful to support application evolution. In this thesis, we combine the best features of custom hardware and MPSoC approaches to design a Polymorphic ASIC. A Polymorphic ASIC is an integrated circuit designed to meet the requirements of several applications belonging to a particular domain. A polymorphic ASIC consists of a fabric of computation, storage and communication resources, using which applications are composed dynamically. Although different video applications differ widely in the internal de-tails of operation, at the heart of almost every video application is a video codec (encoder and decoder). The requirements of scalability, high performance and low power consumption are very stringent for video decoding. Therefore this thesis focuses mainly on the architectural design of a Polymorphic ASIC for video decoding. We present an uniﬁed software and hardware architecture (USHA) for Polymorphic ASIC. USHA is a tiled architecture which uses loosely coupled processor and hardware tiles that are software programmable and hardware reconﬁgurable respectively. The distinctive feature of Polymorphic ASIC is the static partitioning of the application and dynamic mapping of ap-plication processes onto the computational tiles. Depending on the application scenarios, a process may be mapped onto one of the hardware or processor tiles. Polymorphic ASIC incor-porates a network–on–chip (NoC) to achieve ﬂexible communication across different tiles. Formulation of a programming framework for Polymorphic ASIC requires an implementation model that captures the structure of video decoder applications as well as the properties of the Polymorphic ASIC architecture. We derive an implementation model based on a combination of parametric polyhedral process networks, stream based functions and windowed dataﬂow models of computation. The implementation model leads to a process network oriented compilation ﬂow that achieves realization agnostic application partitioning and enables seamless migration across uniprocessor, multi–processor, semi hardware and full hardware conﬁgurations of a video decoder. The thesis also presents an application QoS aware scheduler that selects a decoder conﬁguration that best meets the application performance requirements, thereby enabling dynamic performance scaling. The memory hierarchy of Polymorphic ASIC makes use of an application speciﬁc cache. Through a combined analysis of miss rate and external memory bandwidth, we show that the degradation in decoder performance due to memory stall cycles depends on the properties of the video being decoded as well as the behavior of the external memory interface. Based on this observation, we present the design of a reconﬁgurable 2–D cache architecture which can adjust its parameters in accordance with the characteristics of the video stream being decoded. We validate the Polymorphic ASIC using a proof–of–concept implementation on an FPGA. The performance of H.264 decoder on Polymorphic ASIC is evaluated for uniprocessor, multi processor, hardware accelerated and full hardware conﬁgurations. The scaling in performance delivered by these conﬁgurations shows that the Polymorphic ASIC enables the application to achieve super linear speedups [1]. The experimental results show that different implementations of a H.264 video decoder on the Polymorphic ASIC can deliver performance comparable to a wide spectrum of devices ranging from embedded processor like ARM 9 to MPSoCs like IBM Cell. We also present the energy consumption of various conﬁgurations of video decoders on Polymorphic ASIC and an application to conﬁguration mapping aimed at minimizing the overall energy consumption of a Polymorphic ASIC. Video Decoding Polymorphic ASIC Architecture Multiprocessor System-On-Chip (MPSoC) H.264 Decoder Video Encoding H.264 Decoders H.264 Decoding Computer Science
144	Polymorphic ASIC : For Video Decoding Adarsha Rao, S J January 2013 (has links) (PDF) Video applications are becoming ubiquitous in recent times due to an explosion in the number of devices with video capture and display capabilities. Traditionally, video applications are implemented on a variety of devices with each device targeting a speciﬁc application. However, the advances in technology have created a need to support multiple applications from a single device like a smart phone or tablet. Such convergence of applications necessitates support for interoperability among various applications, scalable performance meet the requirements of different applications and a high degree of reconﬁgurability to accommodate rapid evolution in applications features. In addition, low power consumption requirement is also very stringent for many video applications. The conventional custom hardware implementations of video applications deliver high performance at low power consumption while the recent MPSoC implementations enable high degree of interoperability and are useful to support application evolution. In this thesis, we combine the best features of custom hardware and MPSoC approaches to design a Polymorphic ASIC. A Polymorphic ASIC is an integrated circuit designed to meet the requirements of several applications belonging to a particular domain. A polymorphic ASIC consists of a fabric of computation, storage and communication resources, using which applications are composed dynamically. Although different video applications differ widely in the internal de-tails of operation, at the heart of almost every video application is a video codec (encoder and decoder). The requirements of scalability, high performance and low power consumption are very stringent for video decoding. Therefore this thesis focuses mainly on the architectural design of a Polymorphic ASIC for video decoding. We present an uniﬁed software and hardware architecture (USHA) for Polymorphic ASIC. USHA is a tiled architecture which uses loosely coupled processor and hardware tiles that are software programmable and hardware reconﬁgurable respectively. The distinctive feature of Polymorphic ASIC is the static partitioning of the application and dynamic mapping of ap-plication processes onto the computational tiles. Depending on the application scenarios, a process may be mapped onto one of the hardware or processor tiles. Polymorphic ASIC incor-porates a network–on–chip (NoC) to achieve ﬂexible communication across different tiles. Formulation of a programming framework for Polymorphic ASIC requires an implementation model that captures the structure of video decoder applications as well as the properties of the Polymorphic ASIC architecture. We derive an implementation model based on a combination of parametric polyhedral process networks, stream based functions and windowed dataﬂow models of computation. The implementation model leads to a process network oriented compilation ﬂow that achieves realization agnostic application partitioning and enables seamless migration across uniprocessor, multi–processor, semi hardware and full hardware conﬁgurations of a video decoder. The thesis also presents an application QoS aware scheduler that selects a decoder conﬁguration that best meets the application performance requirements, thereby enabling dynamic performance scaling. The memory hierarchy of Polymorphic ASIC makes use of an application speciﬁc cache. Through a combined analysis of miss rate and external memory bandwidth, we show that the degradation in decoder performance due to memory stall cycles depends on the properties of the video being decoded as well as the behavior of the external memory interface. Based on this observation, we present the design of a reconﬁgurable 2–D cache architecture which can adjust its parameters in accordance with the characteristics of the video stream being decoded. We validate the Polymorphic ASIC using a proof–of–concept implementation on an FPGA. The performance of H.264 decoder on Polymorphic ASIC is evaluated for uniprocessor, multi processor, hardware accelerated and full hardware conﬁgurations. The scaling in performance delivered by these conﬁgurations shows that the Polymorphic ASIC enables the application to achieve super linear speedups [1]. The experimental results show that different implementations of a H.264 video decoder on the Polymorphic ASIC can deliver performance comparable to a wide spectrum of devices ranging from embedded processor like ARM 9 to MPSoCs like IBM Cell. We also present the energy consumption of various conﬁgurations of video decoders on Polymorphic ASIC and an application to conﬁguration mapping aimed at minimizing the overall energy consumption of a Polymorphic ASIC. Video Decoding Polymorphic ASIC Architecture Multiprocessor System-On-Chip (MPSoC) H.264 Decoder Video Encoding H.264 Decoders H.264 Decoding Computer Science
145	Efficient optimal multiprocessor scheduling algorithms for real-time systems Nelissen, Geoffrey 08 January 2013 (has links) Real-time systems are composed of a set of tasks that must respect some deadlines. We find them in applications as diversified as the telecommunications, medical devices, cars, planes, satellites, military applications, etc. Missing deadlines in a real-time system may cause various results such as a diminution of the quality of service provided by the system, the complete stop of the application or even the death of people. Being able to prove the correct operation of such systems is therefore primordial. This is the goal of the real-time scheduling theory.<p><p>These last years, we have witnessed a paradigm shift in the computing platform architectures. Uniprocessor platforms have given place to multiprocessor architectures. While the real-time scheduling theory can be considered as being mature for uniprocessor systems, it is still an evolving research field for multiprocessor architectures. One of the main difficulties with multiprocessor platforms, is to provide an optimal scheduling algorithm (i.e. scheduling algorithm that constructs a schedule respecting all the task deadlines for any task set for which a solution exists). Although optimal multiprocessor real-time scheduling algorithms exist, they usually cause an excessive number of task preemptions and migrations during the schedule. These preemptions and migrations cause overheads that must be added to the task execution times. Therefore, task sets that would have been schedulable if preemptions and migrations had no cost, become unschedulable in practice. An efficient scheduling algorithm is therefore an algorithm that either minimize the number of preemptions and migrations, or reduce their cost.<p><p>In this dissertation, we expose the following results:<p>- We show that reducing the "fairness" in the schedule, advantageously impacts the number of preemptions and migrations. Hence, all the scheduling algorithms that will be proposed in this thesis, tend to reduce or even suppress the fairness in the computed schedule.<p><p>- We propose three new online scheduling algorithms. One of them --- namely, BF2 --- is optimal for the scheduling of sporadic tasks in discrete-time environments, and reduces the number of task preemptions and migrations in comparison with the state-of-the-art in discrete-time systems. The second one is optimal for the scheduling of periodic tasks in a continuous-time environment. Because this second algorithm is based on a semi-partitioned scheme, it should favorably impact the preemption overheads. The third algorithm --- named U-EDF --- is optimal for the scheduling of sporadic and dynamic task sets in a continuous-time environment. It is the first real-time scheduling algorithm which is not based on the notion of "fairness" and nevertheless remains optimal for the scheduling of sporadic (and dynamic) systems. This important result was achieved by extending the uniprocessor algorithm EDF to the multiprocessor scheduling problem. <p><p>- Because the coding techniques are also evolving as the degree of parallelism increases in computing platforms, we provide solutions enabling the scheduling of parallel tasks with the currently existing scheduling algorithms, which were initially designed for the scheduling of sequential independent tasks. / Doctorat en Sciences de l'ingénieur / info:eu-repo/semantics/nonPublished Electricité Real-time data processing Multiprocessors Temps réel Multiprocesseurs Parallélisme (Informatique) Algorithm optimal real-time multiprocessor scheduling fairness U-EDF BF2 parallel tasks temps réel ordonnancement tâches parallèles
146	Energy-aware real-time scheduling in embedded multiprocessor systems / Ordonnancement temps réel dans les systèmes embarqués multiprocesseurs contraints par l'énergie Nélis, Vincent 18 October 2010 (has links) Nowadays, computer systems are everywhere. From simple portable devices such as watches and MP3 players to large stationary installations that control nuclear power plants, computer systems are now present in all aspects of our modern and every-day life. In about only 70 years, they have completely perturbed our way of life and they reached a so high degree of sophistication that they will be soon capable of driving our cars and cleaning our houses without any human intervention. As computer systems gain in responsibilities, it becomes essential that they provide both safety and reliability. Indeed, a failure in systems such as the anti-lock braking system (ABS) in cars could threaten human lives and generate catastrophic and irreversible consequences. Hence, for many years, researchers have addressed these emerging problems of system safety and reliability which come along with this fulgurant evolution. <p><p>This thesis provides a general overview of embedded real-time computer systems, i.e. a particular kind of computer system whose number grows daily. We provide the reader with some preliminary knowledge and a good understanding of the concepts that underlie this emerging technology. We focus especially on the theoretical problems related to the real-time issue and briefly summarizes the main solutions, together with their advantages and drawbacks. This brings the reader through all the conceptual layers constituting a computer system, from the software level---the logical part---that specifies both the system behavior and requirements to the hardware level---the physical part---that actually performs the expected treatments and reacts to the environment. In the meanwhile, we introduce the theoretical models that allow researchers for theoretical analyses which ensure that all the system requirements are fulfilled. Finally, we address the energy consumption problem in embedded systems. We describe the various factors of power dissipation in modern technologies and we introduce different solutions to reduce this consumption./Cette thèse se focalise sur un type de systèmes informatiques bien précis appelés “systèmes embarqués temps réel”. Un système est dit “embarqué” lorsqu’il est développé afin de servir un but bien précis. Un téléphone portable est un parfait exemple de système embarqué étant donné que toutes ses fonctionnalités sont rigoureusement définies avant même sa conception. Au contraire, un ordinateur personnel n’est généralement pas considéré comme un système embarqué, les concepteurs ne sachant pas à l’avance à quelles fins il sera utilisé. Une grande partie de ces systèmes embarqués ont des contraintes temporelles très fortes, ce qui les distingue encore plus des ordinateurs grand public. A titre d’exemple, lorsqu’un conducteur de voiture freine brusquement, l’ordinateur de bord déclenche l’application ABS et il est primordial que cette application soit traitée endéans une courte échéance. Autrement dit, cette fonctionnalité ABS doit être traitée prioritairement par rapport aux autres fonctionnalités du véhicule. Ce type de système embarqué est alors dit “temps réel”, dû à ces notions de temps et de priorités entre les applications. La problèmatique posée par les systèmes temps réel est la suivante. Comment déterminer, à tout moment, un ordre d’exécution des différentes fonctionnalités de telle sorte qu’elles soient toutes exécutées entièrement endéans leur échéance ?De plus, avec l’apparition récente des systèmes multiprocesseurs, cette problématique s’est fortement complexifiée, vu que le système doit à présent déterminer quelle fonctionnalité s’exécute à quel moment sur quel processeur afin que toutes les contraintes temporelles soient respectées. Pour finir, ces systèmes embarqués temp réel multiprocesseurs se sont rapidement retrouvés confrontés à un problème de consommation d’énergie. Leur demande en terme de performance (et donc en terme d’énergie) à évolué beaucoup plus rapidement que la capacité des batteries qui les alimentent. Ce problème est actuellement rencontré par de nombreux systèmes, tels que les téléphones portables par exemple. L’objectif de cette thèse est de parcourir les différents composants de tels système embarqués et de proposer des solutions afin de réduire leur consommation d’énergie. / Doctorat en Sciences / info:eu-repo/semantics/nonPublished Informatique générale Sciences exactes et naturelles Embedded computer systems Computer scheduling Real-time programming Systèmes enfouis (Informatique) Ordonnancement (Informatique) Programmation en temps réel ordonnancement temps réel energy-aware scheduling ordonnancement multiprocesseur systèmes embarqués embedded systems real-time scheduling multiprocessor scheduling
147	Power Efficient Last Level Cache for Chip Multiprocessors Mandke, Aparna January 2013 (has links) (PDF) The number of processor cores and on-chip cache size has been increasing on chip multiprocessors (CMPs). As a result, leakage power dissipated in the on-chip cache has become very significant. We explore various techniques to switch-off the over-allocated cache so as to reduce leakage power consumed by it. A large cache offers non-uniform access latency to different cores present on a CMP and such a cache is called “Non-Uniform Cache Architecture (NUCA)”. Past studies have explored techniques to reduce leakage power for uniform access latency caches and with a single application executing on a uniprocessor. Our ideas of power optimized caches are applicable to any memory technology and architecture for which the difference of leakage power in the on-state and off-state of on-chip cache bank is significant. Switching off the last level shared cache on a CMP is a challenging problem due to concurrently executing threads/processes and large dispersed NUCA cache. Hence, to determine cache requirement on a CMP, first we propose a new highly accurate method to estimate working set size of an application, which we call “tagged working set size estimation (TWSS)” method. This method has a negligible hardware storage overhead of 0.1% of the cache size. The use of TWSS is demonstrated by adaptively adjusting cache associativity. Our ideas of adaptable associative cache is scalable with respect to the number of cores present on a CMP. It uses information available locally in a tile on a tiled CMP and thus avoids network access unlike other commonly used heuristics such as average memory access latency and cache miss ratio. Our implementation gives 25% and 19% higher EDP savings than that obtained with average memory access latency and cache miss ratio heuristics on a static NUCA platform (SNUCA), respectively. Cache misses increase with reduced cache associativity. Hence, we also propose to map some of the L2 slices onto the rest L2 slices and switch-off mapped L2 slices. The L2 slice includes all L2 banks in a tile. We call this technique the “remap policy”. Some applications execute with lesser number of threads than available cores during their execution. In such applications L2 slices which are farther to those threads are switched-off and mapped on-to L2 slices which are located nearer to those threads. By using nearer L2 slices with the help of remapped technology, some applications show improved execution time apart from reduction in leakage power consumption in NUCA caches. To estimate the maximum possible gains that can be obtained using the remap policy, we statically determine the near-optimal remap configuration using the genetic algorithms. We formulate this problem as a energy-delay product minimization problem. Our dynamic remap policy implementation gives energy-delay savings within an average of 5% than that obtained with the near-optimal remap configuration. Energy-delay product can also be minimized by improving execution time, which depends mainly on the static and dynamic NUCA access policies (DNUCA). The suitability of cache access policy depends on data sharing properties of a multi-threaded application. Hence, we propose three indices to quantify data sharing properties of an application and use them to predict a more suitable cache access policy among SNUCA and DNUCA for an application. Processor Architecture Chip Multiprocessor Cache Memory Cache Genetic Algorithms Leakage Power Optimization Working Set Size Optimization Near Optimal Remap Configuration Thread Contention Predictors On-Chip Cache Cache Architecture NUCA SNUCA Non-Uniform Cache Architecture Computer Science
148	Návrh systému řízení inteligentního domu / The design of the inteligent house control system Kutílek, Pavel January 2008 (has links) The work is oriented on design and realization of the complex intelligent house control system with central visualization panel. System contains industrial buses connection.
149	Réconcilier performance et prédictibilité sur un many-coeur en utilisant des techniques d'ordonnancement hors-ligne / Reconciling performance and predictability on a noc-based mpsoc using off-line scheduling techniques Fakhfakh, Manel 27 June 2014 (has links) Les réseaux-sur-puces (NoCs) utilisés dans les architectures multiprocesseurs-sur-puces posent des défis importants aux approches d'ordonnancement temps réel en ligne (dynamique) et hors-ligne (statique). Un NoC contient un grand nombre de points de contention potentiels, a une capacité de bufferisation limitée et le contrôle réseau fonctionne à l'échelle de petits paquets de données. Par conséquent, l'allocation efficace de ressources nécessite l'utilisation des algorithmes da faible complexité sur des modèles de matériel avec un niveau de détail sans précédent dans l'ordonnancement temps réel. Nous considérons dans cette thèse une approche d'ordonnancement statique sur des architectures massivement parallèles (Massively parallel processor arrays ou MPPAs) caractérisées par un grand nombre (quelques centaines) de c¿urs de calculs. Nous identifions les mécanismes matériels facilitant l'analyse temporelle et l'allocation efficace de ressources dans les MPPAs existants. Nous déterminons que le NoC devrait permettre l'ordonnancement hors-ligne de communications, d'une manière synchronisée avec l'ordonnancement de calculs sur les processeurs. Au niveau logiciel, nous proposons une nouvelle méthode d'allocation et d'ordonnancement capable de synthétiser des ordonnancements globaux de calculs et de communications couvrants toutes les ressources d'exécution, de communication et de la mémoire d'un MPPA. Afin de permettre une utilisation efficace de ressources du matériel, notre méthode prend en compte les spécificités architecturales d'un MPPA et implémente des techniques d'ordonnancement avancées comme la préemption pré-calculée de transmissions de données. Nous avons évalué n / On-chip networks (NoCs) used in multiprocessor systems-on-chips (MPSoCs) pose significant challenges to both on-line (dynamic) and off-line (static) real-time scheduling approaches. They have large numbers of potential contention points, have limited internal buffering capabilities, and network control operates at the scale of small data packets. Therefore, efficient resource allocation requires scalable algorithms working on hardware models with a level of detail that is unprecedented in real-time scheduling. We consider in this thesis a static scheduling approach, and we target massively parallel processor arrays (MPPAs), which are MPSoCs with large numbers (hundreds) of processing cores. We first identify and compare the hardware mechanisms supporting precise timing analysis and efficient resource allocation in existing MPPA platforms. We determine that the NoC should ideally provide the means of enforcing a global communications schedule that is computed off-line (before execution) and which is synchronized with the scheduling of computations on processors. On the software side, we propose a novel allocation and scheduling method capable of synthesizing such global computation and communication schedules covering all the execution, communication, and memory resources in an MPPA. To allow an efficient use of the hardware resources, our method takes into account the specificities of MPPA hardware and implements advanced scheduling techniques such as pre-computed preemption of data transmissions. We evaluate our technique by mapping two signal processing applications, for which we obtain good latency, throughput, and resource use figures. Multiprocesseurs-sur-puce (many-coeur) Réseau-sur-puce (NoC) Ordonnancement hors-ligne Ordonnancement temps réel Chip-multiprocessor (Many-core) On-chip network (NoC) Off-line scheduling Real-time scheduling 004.2
150	A SIMD Approach To Large-scale Real-time System Air Traffic Control Using Associative Processor and Consequences For Parallel Computing Yuan, Man 01 October 2012 (has links) No description available. Computer Science Air Traffic Control (ATC) SIMD MIMD Real-Time Systems Associative Processor (AP) Conflict Detection and Resolution (CDR) ClearSpeed CSX600 Multicore Processor OpenMP Federal Aviation Administration (FAA) Multiprocessor NP-complete Predictable

Search results