141

A transparent and energy aware reconfigurable multiprocessor platform for efficient ILP and TLP exploitation

Rutzig, Mateus Beck January 2012 (has links)
As the number of embedded applications increases, the current strategy of several companies is to launch a new platform within short periods, to execute the application set more efficiently and with low energy consumption. However, each new platform deployment requires a new tool chain, with additional libraries, debuggers and compilers. This strategy implies high hardware redesign costs, breaks binary compatibility and imposes a high overhead on the software development process. Therefore, focusing on area savings, low energy consumption, binary compatibility maintenance and, mainly, software productivity improvement, we propose the exploitation of Custom Reconfigurable Arrays for Multiprocessor System (CReAMS). CReAMS is composed of multiple adaptive reconfigurable systems that efficiently exploit Instruction- and Thread-Level Parallelism (ILP and TLP) at the hardware level, in a totally transparent fashion. Conceived as a homogeneous organization, CReAMS shows a 37% reduction in energy-delay product (EDP) compared to an ordinary multiprocessing platform occupying the same chip area. When processors with different ILP exploitation capabilities are coupled on a single die, conceiving CReAMS as a heterogeneous organization, performance improvements of up to 57% and energy savings of up to 36% are shown in comparison with the homogeneous platform. In addition, the efficiency of the adaptability provided by CReAMS is demonstrated in a comparison with a multiprocessing system composed of 4-issue out-of-order SparcV8 processors: under a power budget scenario, performance improvements of 28% are shown.
142

[en] AN ABSTRACTION FOR PARALLEL PROGRAMMING: SUPPORT FOR DEVELOPING APPLICATIONS / [pt] ABSTRAÇÃO PARA PROGRAMAÇÃO PARALELA: SUPORTE PARA O DESENVOLVIMENTO DE APLICAÇÕES

PAULO ROGERIO DA MOTTA JUNIOR 09 March 2012 (has links)
[en] The evolution of the field of programming has traditionally traded performance for more powerful abstractions that simplify the programmer's work. It is possible to observe the effects of this evolution in the parallel programming area. Typically, parallel programming focuses on high performance based on the procedural paradigm to achieve the highest possible throughput, but determining the point at which one should trade performance for more powerful abstractions remains an open problem. With the advent of new system-level tools and libraries that deliver greater performance without the programmer's intervention, the belief that the application programmer should optimize communication code starts to be challenged. As the growing demand for large-scale parallel solutions becomes noticeable, problems like code complexity, design and modeling power, maintainability, faster development, greater reliability and reuse are expected to take part in the decision of which approach to use. In the present work, we investigate the cost of using higher-level abstraction layers that could provide many benefits for the parallel application developer. We argue that the use of interpreted languages may aid the abstraction of the processor architecture, providing an opportunity to optimize the virtual machines without affecting the user's application code.
143

Conception faible consommation d'un système de détection de chute / Low power architecture for fall detection system

Nguyen, Thi Khanh Hong 18 November 2015 (has links)
Nowadays, fall detection is a major challenge in the public health care domain, especially for the elderly living alone and for rehabilitation patients in hospitals. This thesis presents the exploration of a camera-based fall detection system from both an algorithmic and an architectural point of view. The system includes four modules (object segmentation, filtering, feature extraction and recognition) and, beyond detecting falls, identifies their type in order to define an alert level. Firstly, different algorithms for the fall detection system are proposed and their efficiency compared: Background Subtraction with Neural Network, Background Subtraction with Template Matching (BGS/TM), Background Subtraction with Hidden Markov Model, and Gaussian Mixture Model. The selected BGS/TM, with 91.67% recall, 100% precision and 95.65% accuracy, is implemented on the ZYNQ platform. A DUT-HBU database, classified by action (fall, non-fall) and by camera direction (front, side, oblique), is used to evaluate the efficiency of the system. Secondly, in order to explore low-cost architectures for this system, new power-consumption and execution-time models for the processor core and the FPGA are defined according to the different architecture and application configurations. The error rates of the proposed models do not exceed 3.5%. The models are then extended to hardware/software architectures to explore low-cost solutions through a suitable Design Space Exploration methodology. Two parallelization techniques, based on intra-task and inter-task static scheduling, are applied; the accuracy of the system reaches 98.3% with an energy per frame of 29.5 mJ/f.
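The abstract above reports recall, precision and accuracy for the selected BGS/TM detector. For reference, these standard metrics are computed from a binary confusion matrix as in the minimal C++ sketch below; the counts used here are illustrative values chosen to be consistent with the reported percentages, not figures taken from the thesis.

```cpp
#include <iostream>

// Standard binary-classification metrics. The counts in main() are
// hypothetical: they merely reproduce the reported 91.67% / 100% / 95.65%.
struct Confusion {
    double tp, fp, fn, tn;  // true/false positives, false/true negatives
    double recall()    const { return tp / (tp + fn); }
    double precision() const { return tp / (tp + fp); }
    double accuracy()  const { return (tp + tn) / (tp + fp + fn + tn); }
};

int main() {
    Confusion c{22, 0, 2, 22};
    std::cout << "recall "     << c.recall()
              << " precision " << c.precision()
              << " accuracy "  << c.accuracy() << "\n";  // 0.9167 1 0.9565
}
```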
144

Um mecanismo de busca especulativa de múltiplos fluxos de instruções / A multistreamed speculative instruction fetch mechanism

Santos, Rafael Ramos dos January 1997 (has links)
This work presents a new model to fetch instructions along multiple streams in superscalar pipelines. The performance evaluation of a superscalar architecture including this feature is also presented, in order to validate the model and to compare its performance with that of a real superscalar architecture. The proposed technique intends to eliminate the instruction fetch latency introduced by branch instructions in superscalar pipelines. The performance delivered by a superscalar architecture that incorporates dynamic instruction scheduling, branch prediction and speculative execution falls well short of the theoretical maximum, which should be at least proportional to the number of functional units. Related works have shown that constant stream breaks, caused by disruptions in the sequential flow of control, reduce the number of instructions in the instruction queue. The proposed technique allows instructions to be fetched from different logical streams as soon as a branch instruction is detected during the fetch stage. Since the scheduler needs a large instruction window to schedule efficiently, the window should hold as many instructions as possible; the improvement provided by the proposed scheme is to keep the window filled with instructions, avoiding interruptions when branches occur. Some considerations about the implementation of this model are presented at the end, as well as suggestions for future work.
145

Android Application Context Aware I/O Scheduler

January 2014 (has links)
Android has been the dominant platform on which most mobile development is done. By the end of the second quarter of 2014, Android had captured 84.7 percent of the worldwide mobile phone market. Android internally uses a modified Linux kernel as part of its stack. The I/O scheduler is the part of the Linux kernel responsible for scheduling data requests to the internal and external memory devices attached to the mobile system. The use of solid state drives in Android tablets has also risen, owing to their speed of operation and mechanical stability. The I/O schedulers in the present Linux kernel are not well suited to solid state drives, in particular to exploiting the inherent parallelism they offer. Android provides information to the Linux kernel about which processes run in the foreground and in the background. The kernel uses this information for process scheduling and memory management, but no such information is used for I/O scheduling. Research shows that resource management can be done better if the operating system is aware of the characteristics of the requester. Thus, there is a need for a better I/O scheduler that schedules I/O operations based on the requesting application and also exploits the parallelism of solid state drives. The scheduler proposed in this research does that. It contains two algorithms working in unison, one focusing on the solid state drive and the other on application awareness. The Android application context aware scheduler increases the responsiveness of time-sensitive applications and also increases throughput by scheduling requests to the solid state drive in parallel. The proposed scheduler is tested using standard benchmarks and real-time scenarios; the results show that it outperforms Android's existing default Completely Fair Queuing scheduler. / Dissertation/Thesis / Masters Thesis Computer Science 2014
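The scheduler described above lives in the Linux block layer and is written against kernel interfaces; the sketch below is only a user-space C++ illustration of the ordering policy the abstract describes (requests from foreground applications served before background ones). The IoRequest type and its fields are invented for this illustration and do not correspond to actual kernel structures.

```cpp
#include <cstdint>
#include <iostream>
#include <queue>
#include <vector>

// Illustrative request descriptor; real Linux I/O schedulers operate on
// struct request inside the block layer, not on a user-space type like this.
struct IoRequest {
    std::uint64_t sector;
    bool          foreground;  // hint analogous to Android's foreground/background state
};

struct ContextAwareCompare {
    bool operator()(const IoRequest& a, const IoRequest& b) const {
        if (a.foreground != b.foreground)
            return !a.foreground;   // foreground requests dispatched first
        return a.sector > b.sector; // otherwise keep rough ascending sector order
    }
};

int main() {
    std::priority_queue<IoRequest, std::vector<IoRequest>, ContextAwareCompare> pending;
    pending.push({100, false});
    pending.push({10, true});
    pending.push({50, true});
    while (!pending.empty()) {
        IoRequest r = pending.top(); pending.pop();
        std::cout << (r.foreground ? "FG " : "BG ") << r.sector << "\n";  // FG 10, FG 50, BG 100
    }
}
```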
146

[en] CONCURRENT PROGRAMMING IN LUA: REVISITING THE LUAPROC LIBRARY / [pt] PROGRAMAÇÃO CONCORRENTE EM LUA: REVISITANDO A BIBLIOTECA LUAPROC

LIANDER MILLAN FERNANDEZ 09 June 2017 (has links)
[en] In recent years, the tendency to keep increasing single-microprocessor performance, as the answer to the growing demand for computational resources from applications and systems, has slowed significantly. This has led to increased interest in employing multiprocessing environments. Although many models and libraries have been developed to support concurrent programming, ensuring that several execution flows access shared resources in a controlled way remains a complex task. The Luaproc library, which provides support for concurrency in Lua, has shown promise in terms of performance and use cases. In this thesis, we study the Luaproc library and add new functionality to it, in order to make it more user-friendly and to extend its use to new scenarios. First, we introduce the motivations for our extensions to Luaproc, discussing alternative ways of dealing with the existing limitations. Then, we present the requirements, implementation characteristics and limitations associated with each of the mechanisms implemented as solutions to these limitations. Finally, we employ the new functionality to implement some concurrent applications, in order to evaluate performance and to test the proper functioning of such mechanisms.
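Luaproc is a Lua library built around isolated Lua processes that communicate by exchanging messages over channels. As a rough illustration of that message-passing model (and not of Luaproc's actual API or implementation), here is a minimal C++ channel shared by two threads:

```cpp
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>

// A tiny blocking channel: the kind of primitive a message-passing
// concurrency library builds on. Workers share no state other than this.
template <typename T>
class Channel {
    std::queue<T> queue_;
    std::mutex mutex_;
    std::condition_variable cv_;
public:
    void send(T value) {
        { std::lock_guard<std::mutex> lock(mutex_); queue_.push(std::move(value)); }
        cv_.notify_one();
    }
    T receive() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [&] { return !queue_.empty(); });
        T value = std::move(queue_.front());
        queue_.pop();
        return value;
    }
};

int main() {
    Channel<std::string> ch;
    std::thread worker([&] { ch.send("result from worker"); });
    std::cout << ch.receive() << "\n";  // blocks until the worker sends
    worker.join();
}
```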
147

Les automates cellulaires en tant que modèle de complexités parallèles / Cellular automata as a model of parallel complexities

Meunier, Pierre-Etienne 26 October 2012 (has links)
The intended goal of this manuscript is to build bridges between two definitions of complexity. One of them, algorithmic complexity, is well known to any computer scientist as the difficulty of performing some task, such as sorting or optimizing the outcome of some system. The other one, etymologically closer to the word "complexity", is about what happens when many parts of a system interact together, just as cells do in a living body, producers and consumers in non-planned economies, or mathematicians exchanging ideas to prove theorems. On the algorithmic side, the main objects we use are two models of computation: communication protocols and circuits. Communication protocols are found everywhere in our world; they are the basic building block of almost any human collaboration and achievement, and the definition of communication we use reflects exactly this idea of collaboration. Our other model, circuits, are combinations of logical gates connected by wires carrying binary values. They are ubiquitous in everyday life: they are how computers compute and how cell phones make calls. Yet the most basic questions about them remain wide open: how to build the most efficient circuit computing a given function, and how to prove that some function does not have a circuit of a given size. For all but the most basic computations, the question of whether they can be computed by a very small circuit is still open. On the other hand, our main object of study, cellular automata, is a prototype of the second definition of complexity. What a cellular automaton "does" is exactly this: simple agents evolve through interaction with a small neighborhood. The theory of cellular automata is related to other fields of mathematics, such as dynamical systems, symbolic dynamics, and topology. Several uses of cellular automata have been suggested, ranging from their application as a model of biological or physical phenomena to their more general study in the theory of computation.
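To make the notion of "simple agents evolving through interaction with a small neighborhood" concrete, here is a minimal sketch of an elementary (one-dimensional, two-state, radius-1) cellular automaton; the rule number, grid size and boundary condition are arbitrary choices for illustration, not taken from the thesis.

```cpp
#include <iostream>
#include <vector>

// One step of an elementary cellular automaton: each cell's next state
// depends only on itself and its two neighbours (cyclic boundary).
std::vector<int> step(const std::vector<int>& cells, unsigned rule) {
    const std::size_t n = cells.size();
    std::vector<int> next(n);
    for (std::size_t i = 0; i < n; ++i) {
        int l = cells[(i + n - 1) % n], c = cells[i], r = cells[(i + 1) % n];
        int pattern = (l << 2) | (c << 1) | r;  // neighbourhood encoded as 0..7
        next[i] = (rule >> pattern) & 1;        // look up the rule table bit
    }
    return next;
}

int main() {
    std::vector<int> cells(32, 0);
    cells[16] = 1;                              // single live cell in the middle
    for (int t = 0; t < 16; ++t) {
        for (int c : cells) std::cout << (c ? '#' : '.');
        std::cout << "\n";
        cells = step(cells, 110);               // rule 110, chosen only as an example
    }
}
```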
148

Extração de informações de desempenho em GPUs NVIDIA / Performance Information Extraction on NVIDIA GPUs

Paulo Carlos Ferreira dos Santos 15 March 2013 (has links)
The recent growth in the use of Graphics Processing Units (GPUs) in performance-oriented scientific applications generated the need to optimize the programs that run on them. A suitable tool for this task is a performance model, which in turn benefits from the existence of a performance information extraction tool for GPUs. This work covers the creation of a microbenchmark generator for PTX instructions that also gathers information about the GPU hardware characteristics. The microbenchmark results were validated through a simplified model that yielded errors between 6.11% and 16.32% on five test kernels. The sources of imprecision in the microbenchmark results are also identified. We used the tool to analyze the performance profile of the instructions and to identify groups with similar behavior. We also evaluated how GPU pipeline performance depends on the executed instruction sequence and verified the compiler optimization for this case. We conclude that the use of microbenchmarks with PTX instructions is feasible and proved effective for building models and for detailed analysis of instruction behavior.
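The thesis builds its microbenchmarks from PTX instructions executed on the GPU; as a hedged illustration of the general dependent-chain technique such microbenchmarks rely on (each operation consumes the previous result, so elapsed time divided by the iteration count approximates one operation's latency), here is a CPU-side C++ analogy. The operation and iteration count are arbitrary and nothing here is taken from the thesis.

```cpp
#include <chrono>
#include <cstdint>
#include <iostream>

// Dependent-chain latency microbenchmark (CPU analogy of the PTX technique):
// every iteration depends on the previous result, so the loop cannot overlap
// operations and total_time / N approximates the latency of one operation.
int main() {
    const std::uint64_t N = 100000000ULL;
    volatile std::uint64_t seed = 3;  // volatile read defeats constant folding
    std::uint64_t x = seed;
    auto t0 = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < N; ++i)
        x = x * 6364136223846793005ULL + 1442695040888963407ULL;  // dependent multiply-add
    auto t1 = std::chrono::steady_clock::now();
    double ns = std::chrono::duration<double, std::nano>(t1 - t0).count();
    std::cout << "approx. " << ns / N << " ns per dependent operation"
              << " (checksum " << x << ")\n";
}
```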
149

Análise da escalabilidade de aplicações em computadores multicore / Scalability analysis of applications on multicore computers

Silva, Samuel Reghim 14 June 2013 (has links)
Multicore processors allow applications to exploit thread-level parallelism in order to reduce execution time. The sharing of the memory subsystem and the gap between processor speed and memory access speed, however, may limit scalability because threads compete for these resources. Automatically determining the number of threads that ensures efficient execution of an application, although widely desired, is a non-trivial problem. This work evaluates the factors limiting the scalability of OpenMP parallel applications that are related to contention for shared resources in multicore processors, with the goal of identifying the application characteristics that limit scalability. It was found that memory accesses are the main limitation on performance gains from parallelism. Granularity, the ratio of memory accesses to computation, proved to be an important indicator of parallel performance. Granularity estimates can be obtained from the application source code. Different data access patterns, however, point to the need to combine the granularity estimate with information about data access locality in order to correctly determine the scalability of applications.
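The granularity argument above (the ratio of memory accesses to computation) can be seen directly in a small OpenMP experiment: a streaming loop that does almost no arithmetic per element stops scaling once the memory subsystem saturates, while an arithmetic-heavy loop scales with the core count. The sketch below is illustrative only; array sizes and iteration counts are arbitrary and it is not code from the thesis.

```cpp
#include <omp.h>
#include <cmath>
#include <cstdint>
#include <iostream>
#include <vector>

int main() {
    const std::int64_t n = 1 << 26;
    std::vector<double> a(n, 1.0), b(n, 2.0);

    // Memory-bound (fine granularity): two memory accesses per multiply, so
    // adding threads mainly adds contention for the shared memory subsystem.
    double t0 = omp_get_wtime();
    #pragma omp parallel for
    for (std::int64_t i = 0; i < n; ++i) a[i] = 3.0 * b[i];
    double mem_bound = omp_get_wtime() - t0;

    // Compute-bound (coarse granularity): many arithmetic operations per
    // element touched, so it scales much closer to the number of cores.
    t0 = omp_get_wtime();
    #pragma omp parallel for
    for (std::int64_t i = 0; i < n; ++i) {
        double x = b[i];
        for (int k = 0; k < 100; ++k) x = std::sin(x) + std::cos(x);
        a[i] = x;
    }
    double compute_bound = omp_get_wtime() - t0;

    std::cout << "memory-bound: " << mem_bound << " s, compute-bound: "
              << compute_bound << " s (threads: " << omp_get_max_threads() << ")\n";
}
```

Built with -fopenmp and run under different OMP_NUM_THREADS values, the memory-bound loop typically stops improving after a few threads while the compute-bound loop keeps scaling, which is the behavior the abstract attributes to granularity.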
150

Advances in SystemC/TLM virtual platforms : configuration, communication and parallelism / Contribution à l'amélioration des plateformes virtuelles SystemC/TLM : configuration, communication et parallélisme

Delbergue, Guillaume 18 December 2017 (has links)
The market for the Internet of Things (IoT) is on the rise and is predicted to continue growing at a sustained pace in the coming years. Connected objects are composed of dedicated electronic components, processors and software. The design of such systems is an industrial challenge, reinforced by market competition and by time to market, which directly impact the success of a system. The current design process starts with the development of a specification. Initially, the team in charge of hardware development begins to design the system; the application part is then developed by software engineers. Once the first hardware prototype is available, the software team can integrate their part and try to validate the functionality. This step may reveal defects in the software but also in the hardware architecture. Unfortunately, the discovery of these errors occurs far too late in the design process and can impact the marketing of the system and potentially its success. In order to ensure as early as possible that the hardware and software designs will work together, methodologies based on the SystemC / Transaction Level Modeling (TLM) standard have been widely adopted. They involve the modelling and simulation of the proposed hardware architectures. During the initial phases of a product's design, they enable the software and hardware teams to share a virtual version of the (future) system, more commonly referred to as a virtual platform. It facilitates early software development, test and validation; reduces material cost by limiting the number of prototypes; and saves time and money by reducing risks. However, connected objects incorporate more and more hardware and software features, and as requirements have evolved, the SystemC / TLM simulation standard no longer meets all expectations, in particular for simulating systems composed of many functions, disparate communication protocols, and complex models that consume much simulation time. Some work has already been carried out on these subjects, but as the number of components increases, interoperability of models and tools becomes increasingly difficult to handle, and most of the research has resulted in solutions that are not interoperable and cannot reuse existing models. To solve these problems, this thesis first proposes a solution for configuring SystemC / TLM models, which is now part of the Configuration, Control and Inspection (CCI) standard. In a second step, the modeling of communication protocols at high levels of abstraction (TLM Loosely Timed (LT) and Approximately Timed (AT)) is studied, focusing on non-bus protocols, and an evolution of the standard to improve support, interoperability and reuse is proposed. In a third step, a change to the SystemC standard, and more precisely to the behavior of the simulation kernel, is studied to support asynchronous events, which open the way to parallelization and distribution of models across different threads and machines. In a fourth step, a solution to integrate Central Processing Unit (CPU) models from QuickEMUlator (QEMU), a system emulator / virtualizer, is studied. Finally, all these contributions are combined in the modeling of a set of objects connected to a gateway.
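For readers unfamiliar with SystemC, the discrete-event kernel and the loosely timed modeling style that this work extends look roughly like the minimal sketch below (a single module advancing simulated time with wait); it shows only baseline SystemC usage and none of the thesis's CCI, asynchronous-event or QEMU contributions.

```cpp
#include <systemc>
#include <iostream>
using namespace sc_core;

// Minimal SystemC module: one thread process that advances simulated time.
SC_MODULE(Ticker) {
    SC_CTOR(Ticker) { SC_THREAD(run); }
    void run() {
        for (int i = 0; i < 3; ++i) {
            wait(10, SC_NS);  // loosely timed style: yield and advance simulated time
            std::cout << sc_time_stamp() << ": tick " << i << std::endl;
        }
        sc_stop();            // end the simulation
    }
};

int sc_main(int argc, char* argv[]) {
    Ticker t("ticker");       // module instance, elaborated before simulation starts
    sc_start();               // run the discrete-event simulation kernel
    return 0;
}
```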
