1 |
On the Performance of Fast Context Switch for MinixARM
Lin, Cheng-chi 14 January 2009
Methods for improving cache performance are numerous and increasingly advanced. We focus on cache and TLB utilization. To reduce the cost of context switches, we use the address-space switching hardware of the ARM S3C2410 processor to implement a fast address-space switching mechanism. Fast Context Switch helps improve cache and TLB utilization and performance.
Fast Context Switch is a method for improving cache performance. Its key feature is that no cache or TLB flush is needed on a process context switch. To implement it, we map different processes to different address spaces according to their process ID. When a context switch occurs, we simply change the working address space, without flushing the cache or TLB.
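The per-process relocation described above can be sketched as follows. This is a minimal illustration, assuming the scheme used by ARM's Fast Context Switch Extension on ARM9-class cores, where a virtual address below 32 MiB is relocated by a 7-bit process ID into one of 128 slots of 32 MiB each. The abstract does not spell out these details, and the names `fcse_relocate`, `context_switch`, `SLOT_BITS`, and `NUM_SLOTS` are hypothetical.

```python
SLOT_BITS = 25                 # 2**25 bytes = 32 MiB per process slot (assumed)
SLOT_SIZE = 1 << SLOT_BITS
NUM_SLOTS = 128                # 7-bit process ID (assumed)


def fcse_relocate(virtual_address: int, pid: int) -> int:
    """Return the modified virtual address seen by the cache and TLB."""
    if not 0 <= pid < NUM_SLOTS:
        raise ValueError("pid must fit in 7 bits")
    if not 0 <= virtual_address < (1 << 32):
        raise ValueError("address must fit in 32 bits")
    if virtual_address < SLOT_SIZE:
        # Addresses in the low 32 MiB are moved into the slot owned by pid.
        return virtual_address | (pid << SLOT_BITS)
    # Addresses above the first slot pass through unchanged.
    return virtual_address


def context_switch(pid_register: dict, next_pid: int) -> None:
    """Switching processes only rewrites the PID register; nothing is flushed."""
    pid_register["pid"] = next_pid
```

Because two processes that use the same virtual address end up with different relocated addresses, their cache and TLB entries cannot be confused, which is why the flush can be skipped in this model.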
This thesis emphasizes measuring the performance improvement in the cache and TLB. We use MinixARM, a highly dependable microkernel architecture in which processes communicate by message passing. Relying on the microkernel makes it easier to understand and analyze system performance and the additional cost of the cache scheme. We provide more complete benchmark-based performance tests; fast context switch improves system performance by up to about 65%.
|
2 |
Infrastructure portable pour un système hétérogène reconfigurable dans un environnement de cloud-FPGA / Portable infrastructure for heterogeneous reconfigurable devices in a cloud-FPGA environment
Wicaksana, Arief 02 October 2018
Field-Programmable Gate Arrays (FPGAs) have been gaining popularity as hardware accelerators in heterogeneous architectures thanks to their high performance and low energy consumption. This trend is supported by the recent integration of FPGA devices into cloud services and data centers. The potential of reconfigurable architectures can be exploited further by treating FPGAs as virtualizable resources and giving them multitasking capability. Preempting a hardware task on an FPGA in order to context switch it has been a research topic for many years. Previous work mainly proposed strategies for extracting the context of a running task from the FPGA so that it can be resumed later. The communication during this process, by contrast, has received little attention.
In this work, we study the communication management of a hardware task while it is being context switched. This management is necessary to ensure consistent communication for a task with context-switch capability in a reconfigurable system. Otherwise, a hardware context switch can only be allowed under restrictive constraints, which may lead to a considerable performance penalty: a task can be context switched only after its communication flows have finished and the input/output data have been consumed. Furthermore, certain techniques require a homogeneous platform for a hardware context switch to take place.
We present a mechanism that preserves communication consistency during a hardware context switch in a reconfigurable architecture. The input/output communication data are managed together with the task context to ensure their integrity. The overall management of the hardware task context and communication data follows a dedicated protocol developed for heterogeneous reconfigurable architectures. This protocol thus allows a hardware context switch to take place while the task still has ongoing communication flows on Reconfigurable System-on-Chips (RSoCs). Our experiments show that the overhead of managing the communication data becomes negligible, since our mechanism provides the high responsiveness needed for preemptive scheduling in addition to consistent communication. Finally, applications of the proposed solution are presented in a task migration prototype and in a hypervisor-based system.
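The idea of saving in-flight communication data together with the task context can be illustrated with a toy software model. This is only a sketch under stated assumptions: the thesis's protocol is not described in the abstract, and the classes `TaskSnapshot` and `HardwareTask`, the accumulator behaviour, and the FIFO layout are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TaskSnapshot:
    """Everything needed to resume a task: its state plus in-flight data."""
    registers: dict
    pending_inputs: list
    pending_outputs: list


@dataclass
class HardwareTask:
    """Toy model of a preemptible accumulator task (hypothetical)."""
    registers: dict = field(default_factory=lambda: {"acc": 0})
    inputs: deque = field(default_factory=deque)
    outputs: deque = field(default_factory=deque)

    def step(self) -> None:
        # Consume one input word and emit the running sum.
        if self.inputs:
            self.registers["acc"] += self.inputs.popleft()
            self.outputs.append(self.registers["acc"])

    def preempt(self) -> TaskSnapshot:
        # Capture unconsumed inputs and undelivered outputs with the context,
        # so no data is lost even though communication has not finished.
        snap = TaskSnapshot(dict(self.registers), list(self.inputs), list(self.outputs))
        self.inputs.clear()
        self.outputs.clear()
        return snap

    @classmethod
    def resume(cls, snap: TaskSnapshot) -> "HardwareTask":
        # Rebuild the task, possibly on a different device, from the snapshot.
        return cls(dict(snap.registers), deque(snap.pending_inputs), deque(snap.pending_outputs))
```

In this model a task can be preempted halfway through its input stream and, once resumed, produces the same output sequence it would have produced without the interruption.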
|
3 |
The Named-State Register File
Nuth, Peter R. 01 August 1993
This thesis introduces the Named-State Register File, a fine-grain, fully-associative register file. The NSF allows fast context switching between concurrent threads as well as efficient sequential program performance. The NSF holds more live data than conventional register files, and requires less spill and reload traffic to switch between contexts. This thesis demonstrates an implementation of the Named-State Register File and estimates the access time and chip area required for different organizations. Architectural simulations of large sequential and parallel applications show that the NSF can reduce execution time by 9% to 17% compared to alternative register files.
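A fine-grain, fully associative register file of the kind described above can be modelled in a few lines. The sketch below assumes that each line holds a single register tagged with a (context, register) name and that the least recently used line is spilled when the file is full; the replacement policy, the class name `NamedStateRegisterFile`, and the counters are illustrative assumptions, not details taken from the thesis.

```python
from collections import OrderedDict


class NamedStateRegisterFile:
    """Toy model: registers are looked up by (context_id, register) name."""

    def __init__(self, num_lines: int):
        self.num_lines = num_lines
        self.lines = OrderedDict()   # (context_id, reg) -> value, in LRU order
        self.backing = {}            # registers spilled to memory
        self.spills = 0
        self.reloads = 0

    def _make_room(self) -> None:
        # Spill a single least recently used register, not a whole context.
        if len(self.lines) >= self.num_lines:
            victim, value = self.lines.popitem(last=False)
            self.backing[victim] = value
            self.spills += 1

    def write(self, context_id: int, reg: int, value: int) -> None:
        key = (context_id, reg)
        if key not in self.lines:
            self._make_room()
        self.backing.pop(key, None)  # any spilled copy is now stale
        self.lines[key] = value
        self.lines.move_to_end(key)

    def read(self, context_id: int, reg: int) -> int:
        key = (context_id, reg)
        if key not in self.lines:
            # Miss: reload only the register that is needed.
            self._make_room()
            self.lines[key] = self.backing.pop(key)
            self.reloads += 1
        self.lines.move_to_end(key)
        return self.lines[key]
```

Switching between contexts in this model is just a matter of using a different `context_id`; registers from several contexts stay resident side by side, and traffic to memory occurs only when the file actually overflows.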
|
4 |
Dynamisk grafik med WebGL och Canvas : Atlas och context-switch / Dynamic graphics using WebGL and Canvas : Atlas and context-switching
Frick, Erik January 2015
Graphical applications on the web have become increasingly common since the World Wide Web emerged at the end of the 1980s. At first these were eye-catching interactive elements such as advertising banners, logos, and menu buttons. Today, in 2015, browsers have advanced so far that interactive graphics no longer require third-party software, as was previously the case. Graphics functions and libraries are now built into the browser instead. The technologies covered in this report are Canvas and WebGL, both of which are used to present interactive graphics on the web. WebGL is a graphics library based on the well-known graphics library OpenGL, but designed for the web. Its graphics are hardware-accelerated, just as in OpenGL, which means the technology can produce relatively powerful graphics for a web application. To a trained web developer, WebGL can feel like a more difficult world than Canvas, which lies closer to a web developer's area of expertise. Canvas is also more widely available across browsers than WebGL. This work reports how the two technologies compare in rendering speed when combined with an image technique called Atlas. Put simply, the Atlas technique is when a single image object acts as an atlas containing several images that could otherwise have been separate image objects. This thesis compares all of these cases in an experiment to determine how rendering performance compares between Canvas and WebGL, with and without the Atlas technique.
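The atlas idea can be made concrete with a small helper that locates one image inside the atlas. The sketch assumes the sub-images are packed in a uniform grid, which the abstract does not state; the function name `atlas_rect` and its parameters are hypothetical. The returned source rectangle is the kind of value that would be passed to a draw call, for example the source-rectangle arguments of Canvas `drawImage`.

```python
def atlas_rect(index: int, columns: int, tile_w: int, tile_h: int) -> tuple:
    """Return (sx, sy, width, height) of sub-image `index` in a grid atlas."""
    if index < 0 or columns <= 0:
        raise ValueError("index must be >= 0 and columns must be > 0")
    # Sub-images are numbered left to right, top to bottom.
    row, col = divmod(index, columns)
    return (col * tile_w, row * tile_h, tile_w, tile_h)
```

With a layout like this, every sprite is drawn from the same image object, so the renderer does not have to switch between separate images from one draw to the next.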
|
5 |
An Empirical Study of the Effects of Context-Switch, Object Distance, and Focus Depth on Human Performance in Augmented Reality
Gupta, Divya 21 June 2004
Augmented reality provides its user with additional information not available through the natural real-world environment. This additional information displayed to the user potentially poses a risk of perceptual and cognitive load and vision-based difficulties. The presence of real-world objects together with virtual augmenting information requires the user to repeatedly switch eye focus between the two in order to extract information from both environments. Switching eye focus may result in additional time on user tasks and lower task accuracy. Thus, one of the goals of this research was to understand the impact of switching eye focus between real-world and virtual information on user task performance.
Secondly, focus depth, which is an important parameter and a depth cue, may affect the user's view of the augmented world. If focus depth is not adjusted properly, it may result in vision-based difficulties and reduce speed, accuracy, and comfort while using an augmented reality display. Thus, the second goal of this thesis was to study the effect of focus depth on task performance in augmented reality systems.
In augmented reality environments, real-world and virtual information are found at different distances from the user. To focus at different depths, the user's eye needs to accommodate and converge, which may strain the eye and degrade performance on tasks. However, no research in augmented reality has explored this issue. Hence, the third goal of this thesis was to determine if distance of virtual information from the user impacts task performance.
To accomplish these goals, a 3x3x3 within-subjects design was used. The experimental task for the study required the user to repeatedly switch eye focus between the virtual text and real-world text. A monocular see-through head-mounted display was used for this research.
Results of this study revealed that switching between real-world and virtual information in augmented reality is extremely difficult when information is displayed at optical infinity. Virtual information displayed at optical infinity may be unsuitable for tasks of the nature used in this research. There was no impact of focus depth on user task performance and hence it is preliminarily recommended that manufacturers of head-mounted displays may only need to make fixed focus depth displays; this clearly merits additional intensive research. Further, user task performance was better when focus depth, virtual information, and real-world information were all at the same distance from the user as compared to conditions when they were mismatched. Based on this result we recommend presenting virtual information at the same distance as real-world information of interest. / Master of Science
|
6 |
Accessing an FPGA-based Hardware Accelerator in a Paravirtualized Environment
Wang, Wei January 2013
In this thesis we present pvFPGA, the first system design solution for virtualizing an FPGA-based hardware accelerator on the x86 platform. The accelerator design on the FPGA can be used for accelerating various applications, regardless of the application computation latencies. Our design adopts the Xen virtual machine monitor (VMM) to build a paravirtualized environment, and a Xilinx Virtex-6 as an FPGA accelerator. The accelerator communicates with the x86 server via PCI Express (PCIe). In comparison to current GPU virtualization solutions, which primarily intercept and redirect API calls to the hosted or privileged domain's user space, pvFPGA virtualizes an FPGA accelerator directly at the lower device driver layer. This gives rise to higher efficiency and lower overhead. In pvFPGA, each unprivileged domain allocates a shared data pool for both user-kernel and inter-domain data transfer. In addition, we propose the coprovisor, a new component that enables multiple domains to simultaneously access an FPGA accelerator. The experimental results show that 1) pvFPGA achieves close-to-zero overhead compared to accessing the FPGA accelerator without the VMM layer, 2) the FPGA accelerator is successfully shared by multiple domains, 3) different maximum data transfer bandwidths can be assigned to different domains by regulating the size of the shared data pool at split driver loading time, and 4) request turnaround time is improved through DMA (Direct Memory Access) context switches implemented by the coprovisor.
|
7 |
Dedicated Hardware Context-Switch Services for Real-Time Multiprocessor Systems
Allard, Yannick 07 November 2017
Computers are widely present in our daily life and are used in critical applications such as cars, planes, and pacemakers. These real-time systems are nowadays based on processors of increasing complexity, which include specific hardware services designed to reduce task preemption and migration overheads. However, in some cases these services can add unpredictable overheads when the system has to switch from one task to another.
This document surveys existing solutions used in commonly available processors to ease preemption and migration, highlighting their strengths and weaknesses. A new hardware service (HwCS) is proposed to speed up task switching at the L1 cache level, reduce context switch overheads, and improve system predictability. Although this thesis focuses on the L1 cache, the concept can also be applied to other memory levels and to any context-dependent block.
The solution presented is based on stacking several identical cache memories at the L1 level. Each layer can save and restore its complete state to and from main memory independently. One layer can be used by the active task running on the processor while other layers are restored or saved concurrently. The active task can remain in execution until the preempting task is ready in another layer after restoration from main memory. The context switch between tasks can then be performed in a very short time by switching to the other layer, which is now ready to run the preempting task. Furthermore, the task is resumed with exactly the L1 cache state that was saved at its previous preemption. The previous task's state can be sent back to main memory for future use. This mechanism can minimise the time required for migrations and preemptions, thereby lowering overheads and limiting the cache misses that preemptions cause and that are usually accounted for in cache-related migration and preemption delays. Isolation between tasks is also provided, since each task executes from a dedicated layer.
Both uniprocessor and multiprocessor designs are presented, along with the implications for real-time theory of using this hardware service. An implementation of the system is characterized, and the results show improvements in the maximum and average execution time of a set of various tasks. When the same size is used for the baseline cache and the HwCS layers, 94% of the tasks have a better execution time (up to 67%) and 80% have a better Worst Case Execution Time (WCET). 80% of the tasks are more predictable and the remaining 20% still have a better execution time. When the baseline cache size is split among the HwCS layers, measurements show that 75% of the tasks have a better execution time (up to 67%), leading to 50% of the tasks having a better WCET. Only 6% of the tasks suffer from worse execution time and worse predictability, while 75% of the tasks remain more predictable when using the HwCS compared to the baseline cache. / Doctorate in Engineering Sciences and Technology / info:eu-repo/semantics/nonPublished
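The stacked-layer mechanism described in this abstract can be approximated by a toy software model. It is a sketch under simplifying assumptions: cache capacity, timing, and the actual hardware interface are ignored, and the class `LayeredL1Cache` with its methods `start`, `access`, `preload`, and `switch_to` is hypothetical rather than the thesis's design.

```python
class LayeredL1Cache:
    """Toy model: each layer holds the full L1 state of one task."""

    def __init__(self, num_layers: int = 2):
        if num_layers < 2:
            raise ValueError("need at least two layers")
        self.layers = [{"task": None, "lines": set()} for _ in range(num_layers)]
        self.active = 0
        self.saved_states = {}   # task name -> cache lines saved in main memory

    def start(self, task: str) -> None:
        """Begin running a task with a cold cache in the active layer."""
        self.layers[self.active] = {"task": task, "lines": set()}

    def access(self, line: int) -> bool:
        """Access a cache line from the active task; return True on a hit."""
        lines = self.layers[self.active]["lines"]
        hit = line in lines
        lines.add(line)
        return hit

    def preload(self, task: str) -> int:
        """Restore a task into an idle layer while the active task keeps running."""
        idle = next(i for i in range(len(self.layers)) if i != self.active)
        previous = self.layers[idle]
        if previous["task"] is not None:
            # Save the idle layer's old contents to main memory first.
            self.saved_states[previous["task"]] = set(previous["lines"])
        self.layers[idle] = {"task": task, "lines": set(self.saved_states.get(task, ()))}
        return idle

    def switch_to(self, task: str) -> None:
        """The context switch itself only selects the layer that is already ready."""
        for i, layer in enumerate(self.layers):
            if layer["task"] == task:
                self.active = i
                return
        raise RuntimeError("task must be preloaded before switching")
```

In this model a task that is switched back in finds the cache lines it had touched before being preempted, and another task's accesses never evict them, which mirrors the warm-restart and isolation properties claimed in the abstract.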
|
8 |
Hardware Support for a Configurable Architecture for Real-Time Embedded Systems on a Programmable Chip
Isaacson, Spencer W. 12 July 2007
Current FPGA technology has advanced to the point that useful embedded SoPCs can now be designed. The Real Time Processor (RTP) project at Brigham Young University leverages the advances in FPGA technology with a system architecture that is customizable to specific applications. A simple real-time processor has been designed to provide support for a hardware-assisted real-time operating system providing fast context switches. As part of the hardware RTOS, the following have been implemented in hardware: scheduler, register banks, mutex, semaphore, queue, interrupts, event, and others. A novel circuit called the Task-Resource Matrix has been created to allow fast inter/intra processor communication and synchronization.
|
9 |
Firmware pro robotické vozítko / Firmware for the Robotic Vehicle
Otava, Lukáš January 2013
This thesis focuses on firmware for a robotic vehicle based on the ARM Cortex-M3 architecture and running a real-time operating system (RTOS). The theoretical part describes available embedded RTOS solutions and the concrete hardware implementation of the robotic vehicle. It also compares three selected RTOSes, including measurements. The result of the thesis is a base firmware composed of program modules that control the hardware parts. A sample PC application and firmware application that extend the base firmware are also provided. The sample application can communicate with the robotic vehicle, control wheel motion, and measure process data.
|