• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 8
  • 1
  • 1
  • 1
  • Tagged with
  • 14
  • 6
  • 5
  • 4
  • 3
  • 3
  • 3
  • 3
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Nonblocking Memory Refresh

Nguyen, Kate Vy Hoang 08 August 2018 (has links)
Since its inception half a century ago, DRAM has required dynamic/active refresh operations that block read requests and decrease performance. We propose refreshing DRAM in the background without stalling read accesses to refreshing memory blocks, similar to the static/background refresh in SRAM. Our proposed Nonblocking Refresh works by refreshing a portion of the data in a memory block at a time and uses redundant data, such as Reed-Solomon codes, in the block to compute the block's refreshing/unreadable data to satisfy read requests. For proof of concept, we apply Nonblocking Refresh to server memory systems, where every memory block already contains redundant data to provide hardware failure protection. In this context, Nonblocking Refresh can utilize server memory system's existing per-block redundant data in the common-case when there are no hardware faults to correct, without requiring any dedicated redundant data of its own. Our evaluations show that on average across five server memory systems with different redundancy and failure protection strengths, Nonblocking Refresh improves performance by 16.2% and 30.3% for 16gb and 32gb DRAM chips, respectively. / Master of Science / Main memory is an essential component of computers, which stores data being actively used. The dominant type of computer main memory is Dynamic Random Access Memory (DRAM). DRAM is divided into thousands of memory cells. Each cell stores a single bit of data as a charge on a capacitor. Charges may leak over time, causing the data stored to be lost. To maintain the data stored in memory, DRAM must periodically restore charges held by memory cells through an operation known as memory refresh. Refresh operations decrease system performance because they stall read requests to refreshing memory blocks. A memory block refers to the unit of data transferred per memory request. Conventional memory systems refresh all the data within the block at a time, therefore the entire memory block is inaccessible while it is being refreshed. Our proposed Nonblocking Refresh reduces the amount of data in a memory block which is inaccessible due to refresh by refreshing only a portion the memory block at a time. To satisfy read requests, the block’s refreshing/inaccessible data is computed using redundant data. Nonblocking Refresh improves DRAM performance by refreshing DRAM in the background without stalling read accesses to refreshing memory blocks. For proof of concept, we apply Nonblocking Refresh to server memory systems, where every memory block already contains redundant data to provide hardware failure protection. In this context, Nonblocking Refresh can utilize server memory system’s existing redundant data to improve performance, without adding additional redundancy overhead. Our evaluations show that on average across five server memory systems with different redundancy and failure protection strengths, Nonblocking Refresh improves performance by 16%-30%.
2

Developing Dependable IoT Systems: Safety Perspective

Abdulhamid, Alhassan, Kabir, Sohag, Ghafir, Ibrahim, Lei, Ci 05 September 2023 (has links)
Yes / The rapid proliferation of internet-connected devices in public and private spaces offers humanity numerous conveniences, including many safety benefits. However, unlocking the full potential of the Internet of Things (IoT) would require the assurance that IoT devices and applications do not pose any safety hazards to the stakeholders. While numerous efforts have been made to address security-related challenges in the IoT environment, safety issues have yet to receive similar attention. The safety attribute of IoT systems has been one of the system’s vital non-functional properties and a remarkable attribute of its dependability. IoT systems are susceptible to safety breaches due to a variety of factors, such as hardware failures, misconfigurations, conflicting interactions of devices, human error, and deliberate attacks. Maintaining safety requirements is challenging due to the complexity, autonomy, and heterogeneity of the IoT environment. This article explores safety challenges across the IoT architecture and some application domains and highlights the importance of safety attributes, requirements, and mechanisms in IoT design. By analysing these issues, we can protect people from hazards that could negatively impact their health, safety, and the environment. / The full text will be available at the end of the publisher's embargo: 11th Feb 2025
3

A CONTROLLER AREA NETWORK LAYER FOR RECONFIGURABLE EMBEDDED SYSTEMS

Jeganathan, Nithyananda Siva 01 January 2007 (has links)
Dependable and Fault-tolerant computing is actively being pursued as a research area since the 1980s in various fields involving development of safety-critical applications. The ability of the system to provide reliable functional service as per its design is a key paradigm in dependable computing. For providing reliable service in fault-tolerant systems, dynamic reconfiguration has to be supported to enable recovery from errors (induced by faults) or graceful degradation in case of service failures. Reconfigurable Distributed applications provided a platform to develop fault-tolerant systems and these reconfigurable architectures requires an embedded network that is inherently fault-tolerant and capable of handling movement of tasks between nodes/processors within the system during dynamic reconfiguration. The embedded network should provide mechanisms for deterministic message transfer under faulty environments and support fault detection/isolation mechanisms within the network framework. This thesis describes the design, implementation and validation of an embedded networking layer using Controller Area Network (CAN) to support reconfigurable embedded systems.
4

Dependable Cyber-Physical Systems

Kim, Junsung 01 May 2014 (has links)
CPS (Cyber-Physical Systems) enable a new class of applications that perceive their surroundings using raw data from sensors, monitor the timing of dynamic processes, and control the physical environment. Since failures and misbehaviors in application domains such as cars, medical devices, nuclear power plants, etc., may cause significant damage to life and/or property, CPS need to be safe and dependable. A conventional way of improving dependability is to use redundant hardware to replicate the whole (sub)system. Although hardware replication has been widely deployed in conventional mission-critical systems, it is cost-prohibitive to many emerging CPS application domains. Hardware replication also leads to limited system flexibility. This dissertation studies the problem of making CPS affordably dependable and develops a system-level framework that manages critical CPS resources including processors, networks, and sensors. Our framework called SAFER (System-level Architecture for Failure Evasion in Real-time applications) incorporates configurable software mechanisms and policies to tolerate failures of critical CPS resources while meeting their timing constraints. It supports adaptive graceful degradation, the effective use of different sensor modalities, and the fault-tolerant schemes of hot standby, cold standby, and re-execution. SAFER reliably and efficiently allocates tasks and their backups to CPU and sensor resources while satisfying network traffic constraints. It also fuses and (re)configures sensor data used by tasks to recover from system failures. The SAFER framework aims to guarantee the timeliness of different types of tasks that fall into one of four categories: (1) tasks with periodic arrivals, (2) tasks with continually varying periods, (3) tasks with parallel threads, and (4) tasks with self-suspensions. We offer the schedulability analyses and runtime support for such tasks with and without resource failures. Finally, the functionality of the proposed system is evaluated on a self-driving car using SAFER. We conclude that the proposed framework analytically satisfies timing constraints and predictably operates systems with and without resource failures, hence making CPS dependable and timely.
5

A Dependable Computing Application

Gungor, Ugur 01 April 2005 (has links) (PDF)
ABSTRACT A DEPENDABLE COMPUTING APPLICATION G&uuml / ng&ouml / r, Ugur M.S., Department of Electric and Electronics Engineering Supervisor : Prof. Dr. Hasan Cengiz G&uuml / ran April 2005, 129 pages This thesis focuses on fault tolerance which is kind of dependable computing implementation. It deals with the advantages of fault tolerance techniques on Single Event Upsets (SEU) occurred in a Field Programmable Gate Array (FPGA). Two fault tolerant methods are applied to floating point multiplier. Most common SEU mitigation method is Triple Modular Redundancy (TMR). So, two fault tolerance methods, which use TMR, are tested. There are three printed circuit boards (PCBs) and one user interface software in the setup. By user interface software running on a computer, user can inject fault or faults to the selected part of the system, which uses TMR with voting circuit or TMRVC TMR with voting and correction circuits on floating point multiplier. After inserting fault or faults, user can watch results of the fault injection test by user interface software. One of these printed circuit boards is called as a Test Pattern Generator. It is responsible for communication between the Fault Tolerant Systems and the user interface software running on a computer. Fault Tolerant Systems is second PCB in the setup. It is used to implement fault tolerant methods on fifteen bits floating point multiplier in the FPGA. First one of these methods is TMR with voter circuit (TMRV) and second one is TMR with voter and correction circuits (TMRVC). Last PCB in the setup is Display PCB. This PCB displays fault tolerant test result and floating point multiplication result. All the functions on Test Pattern Generator and Fault Tolerant Systems are implemented through the use of a Field Programmable Gate Array (FPGA), which is programmed using the Very High Speed IC Description Language (VHDL). Implementation results of the used methods in FPGA are evaluated to observe the performance of applied methods for tolerating SEU.
6

Towards Optimization of Software V&V Activities in the Space Industry [Two Industrial Case Studies] / Mot Optimering av Software V & V Aktiviteter i rymdindustrins [Två Industriella Fallstudier]

Ahmad, Ehsan, Raza, Bilal January 2009 (has links)
Developing software for high-dependable space applications and systems is a formidable task. With new political and market pressures on the space industry to deliver more software at a lower cost, optimization of their methods and standards need to be investigated. The industry has to follow standards that strictly sets quality goals and prescribes engineering processes and methods to fulfill them. The overall goal of this study is to evaluate if current use of ECSS standards is cost efficient and if there are ways to make the process leaner while still maintaining the quality and to analyze if their V&V activities can be optimized. This paper presents results from two industrial case studies of companies in the European space industry that are following ECSS standards and have various V&V activities. The case studies reported here focused on how the ECSS standards were used by the companies and how that affected their processes and how their V&V activities can be optimized. / Utveckling av programvara för hög funktionssäkra rymden applikationer och system är en formidabel uppgift. Med nya politiska och marknadsmässiga trycket på rymdindustrin att leverera mer mjukvara till en lägre kostnad, optimering av deras metoder och standarder måste utredas. Industrin har att följa standarder som absolut uppsättningar kvalitetsmål och föreskriver tekniska processer och metoder för att uppfylla dem. Det övergripande målet för denna studie är att utvärdera om den nuvarande användningen av ECSS standarder är kostnaden effektivt och om det finns sätt att göra processen smidigare och samtidigt bibehålla kvaliteten och för att analysera om V & V verksamhet kan optimeras. Detta dokument presenterar resultat från två industriella fallstudier av företag inom den europeiska rymdindustrin som är Följande ECSS krav och ha olika V & V verksamhet. Fallstudierna redovisas här fokuserat på hur ECSS standarder som används av företag och hur detta påverkat deras processer och hur deras V & V verksamhet kan optimeras.
7

ETFIDS: Efficient Transient Fault Injection and Detection System

Tian, Ninghan January 2018 (has links)
No description available.
8

Architecture-Based Verification of Dependable Embedded Systems

Johnsen, Andreas January 2013 (has links)
Quality assurance of dependable embedded systems is becoming increasingly difficult, as developers are required to build more complex systems on tighter budgets. As systems become more complex, system architects must make increasingly complex architecture design decisions. The process of making the architecture design decisions of an intended system is the very first, and the most significant, step of ensuring that the developed system will meet its requirements, including requirements on its ability to tolerate faults. Since the decisions play a key role in the design of a dependable embedded system, they have a comprehensive effect on the development process and the largest impact on the developed system. Any faulty architecture design decision will, consequently, propagate throughout the development process, and is likely to lead to a system not meeting the requirements, an unacceptable level of dependability and costly corrections. Architecture design decisions are in turn critical with respect to quality and dependability of a system, and the cost of the development process. It is therefore crucial to prevent faulty architecture design decisions and, as early as practicable, detect and remove faulty decisions that have not successfully been prevented. The use of Architecture Description Languages (ADLs) helps developers to cope with the increasing complexity by formal and standardized means of communication and understanding. Furthermore, the availability of a formal description enables automated and formal analysis of the architecture design. The contribution of this licentiate thesis is an architecture quality assurance framework for safety-critical, performance-critical and mission-critical embedded systems specified by the Architecture Analysis and Design Language (AADL). The framework is developed through the adaption of formal methods, in particular traditional model checking and model-based testing techniques, to AADL, by defining formal verification criteria for AADL, and a formal AADL-semantics. Model checking of AADL models provides evidence of the completeness, consistency and correctness of the model, and allows for automated avoidance of faulty architecture design decisions, costly corrections and threats to quality and dependability. In addition, the framework can automatically generate test suites from AADL models to test a developed system with respect to the architecture design decisions. A successful test suite execution provides evidence that the architecture design has been implemented correctly. Methods for selective regression verification are included in the framework to cost-efficiently re-verify a modified architecture design, such as after a correction of a faulty design decision. / Kvalitetssäkring av tillförlitliga inbyggda system är en ständigt växande utmaning då utvecklare av sådana system är tvungna att bygga allt mer komplexa system inom allt mer begränsade budgetar. Då komplexiteten av systemen ökar måste systemarkitekter göra allt mera komplicerade beslut om systemens arkitekturdesign. Processen att besluta arkitekturdesignen av ett tilltänkt system är det allra första, och det mest signifikanta, steget att försäkra att det utvecklade systemet kommer uppnå dess krav, inklusive krav på dess möjlighet att tolerera defekter. Då dessa designbeslut dessutom har en nyckelroll i designen av ett tillförlitligt inbyggt system har de en omfattande effekt på utvecklingsprocessen samt den största påverkan på det utvecklade systemet. På grund av detta kommer ett felaktigt beslut om arkitekturdesignen propagera igenom hela utvecklingsprocessen och sannolikt resultera i ett system som inte uppnår kraven, får en oacceptabel tillförlitlighetsnivå, och kostsamma korrigeringar. De är därmed kritiska med hänsyn till kvaliteten och tillförlitligheten av ett inbyggt system, och kostnaden av utvecklingsprocessen. Således är det kritiskt att förhindra felaktiga beslut om arkitekturdesign och, så tidigt som möjligt, detektera och avlägsna felaktiga beslut som inte har lyckats att förhindras. Användningen av språk för arkitekturbeskrivning hjälper utvecklare att hantera den ökande komplexiteten genom standardiserade kommunikationsmedel och förståelsemedel. Dessutom möjliggör en formell beskrivning automatiserad och formell analys av arkitekturdesignen. Bidraget av denna licentiatavhandling är ett formellt kvalitetssäkringsramverk för säkerhetskritiska, prestandakritiska och uppdragskritiska inbyggda system specificerade i arkitekturbeskrivningsspråket ”Architecture Analysis and Design Language” (AADL). Ramverket är utvecklat genom adaptionen av formella metoder, i synnerhet traditionella modellkontrolltekniker och modellbaserad testningstekniker, till AADL, med hjälp av att definiera formella verifikationskriterier för AADL och en formell AADL-semantik. Modellkontroll av AADL-modeller analyserar modellens fullständighet, konsistens och korrekthet och möjliggör automatisk undvikande av felaktiga arkitekturdesignbeslut, kostsamma korrigeringar och hot mot kvalitet och tillförlitlighet. Därutöver kan ramverket automatiskt generera testsviter från AADL-modeller för att testa ett utvecklat system mot den bestämda arkitekturdesignen. En lyckad testsvitexekvering garanterar att arkitekturdesignen är korrekt implementerad. Metoder för selektiv regressionsverifiering är inkluderade i ramverket för att på ett kostnadseffektivt tillvägagångssätt verifiera en, tidigare verifierad, arkitekturdesign som har blivit modifierad, såsom efter en korrigering av ett felaktigt designbeslut.
9

Développement incrémental de spécifications d'architectures en UML intégrant des procédures de vérification / Incremental development of UML architectural specification based on behavioural verification.

Phan, Thanh-Liêm 17 December 2013 (has links)
Le langage UML est devenu un standard de fait, y compris pour le développement de systèmes critiques. Néanmoins, les outils actuels apportent peu d'aide pour exploiter et vérifier les modèles proposés, surtout en cours de développement. Cette thèse se concentre sur l'aide à la construction d'architectures en UML durant les phases d'analyse et de conception de systèmes réactifs. Elle vise à développer un cadre théorique et pragmatique pour mettre en œuvre une approche incrémentale. Ce cadre fournit un outil permettant de vérifier les architectures durant leur modélisation. Les architectures sont modélisées par des diagrammes UML de structures composites alors que les composants primitifs sont présentés par une combinaison de diagrammes de machines d'états et de diagramme d'activités. Ce travail offre les moyens de vérifier d'une part si une nouvelle architecture est un raffinement, une extension ou un incrément de celle définie durant les étapes précédentes, et d'autre part si un composant est compatible avec un environnement ou s'il est substituable par un autre. L'analyse des architectures impose de leur donner une sémantique formelle. Concernant les composants primitifs, nous leur associons une sémantique en LTS (Labelled Transition Systems) ce qui nous a conduit à définir une procédure de transformation automatique de machines d'états et diagrammes d'activités en LTS. Concernant les composants composites, nous leur associons un LTS en transformant un diagramme de structure composite en une spécification Exp.Open, puis en générant la fusion des LTS grâce à la boîte à outils CADP. Dans un second temps, nous avons mis en œuvre des techniques de vérifications de relations de conformité de LTS que sont les préordres de raffinement, d'extension, et d'incrément. Nous avons également défini et implanté une relation de compatibilité et de substituabilité. L'ensemble de ces techniques de construction incrémentale se positionne selon deux axes. L'axe vertical représente le niveau d'abstraction. L'évolution d'une architecture peut se faire sur cet axe dans deux sens : i) par des techniques de raffinement dans le sens descendant et ii) par des techniques d'abstraction dans le sens ascendant. L'axe horizontal représente le niveau de couverture des exigences. L'évolution d'une architecture peut se faire sur cet axe selon deux sens : i) par des techniques d'extension et ii) par des techniques de restriction. Ces travaux ont été réalisés de façon théorique et pratique : ils ont donné lieu au développement d'un outil dédié à la construction incrémentale de modèles UML, appelé IDCM (Incremental Development of Conforming Models), regroupant la transformation de modèles et la mise en œuvre de l'ensemble des relations incrémentales. Ceci a été validé sur diverses études de cas. / UML is becoming a de facto standard, including for development of dependable systems. However, current tools offer little help to take benefit of proposed models and to verify them, especially during development phases. This thesis focuses on supporting construction of UML architectures of reactive systems. It aims at developing a theoretic and pragmatic framework to implement the incremental approach. The framework provides tools to verify the coherenceof architectures during the modelling phase. Architectures are modelled by UML diagram of composite structures while primitive components are represented by a combination of state machine diagram and activity diagram. This work provides a means to verify in one hand if a new architecture is a refinement, an extension or an increment of those defined in the previous steps, and in another hand, if a component is compatible with an environment or if it is substitutableby another.In order to analyse UML architectures, we must give them a formal semantics. We associated primitive components with LTS (Labelled Transition Systems) which led us to define a procedure for automatic transformation of state machines and activities diagrams into LTS. We associated composite components with LTS by transforming a diagram of composite structure into Exp.Open specification, then by generating LTS fusion with the toolbox CADP. We have implemented verification techniques of conformance relations on LTS such as the preorders: refinement, extension, and increment. We also defined and implemented compatibility relation and substitutability relation. All these incremental construction techniques are positioned along two axes. The vertical axis represents the level of bstraction.The development of an architecture following this axis in two directions: i) refinement techniques in the downward direction and ii) abstraction techniques in the upward direction. The horizontal axis represents the coverage level of requirements. The development of architectures can be realized following this axis in two directions : i) extension direction and ii) restrictiondirection.This work has been carried out in theory and practice : it has led to the development of a dedicated tool for incremental construction of UML models, called IDCM (Incremental Development of Conforming Models), grouping the transformation of models and the implementation of a set of incremental relations. This has been validated on various case studies.
10

Detección concurrente de errores en el flujo de ejecución de un procesador

Rodríguez Ballester, Francisco 02 May 2016 (has links)
[EN] Incorporating error detection mechanisms is a key element in the design of fault tolerant systems. For many of those systems the detection of an error (whether temporary or permanent) triggers a bunch of actions or activation of elements pursuing any of these objectives: continuation of the system operation despite the error, system recovery, system stop into a safe state, etc. Objectives ultimately intended to improve the characteristics of reliability, security, and availability, among others, of the system in question. One of these error detection elements is a watchdog processor; it is responsible to monitor the system processor and check that no errors occur during the program execution. The main drawback of the existing proposals in this regard and that prevents a more widespread use of them is the loss of performance and the increased memory consumption suffered by the monitored system. In this PhD a new technique to embed signatures is proposed. The technique is called ISIS - Interleaved Signature Instruction Stream - and it embeds the watchdog signatures interspersed with the original program instructions in the memory. With this technique it is a separate element of the system processor (a watchdog processor as such) who carries out the operations to detect errors. Although signatures are mixed with program instructions, and unlike previous proposals, the main system processor is not involved neither in the recovery of these signatures from memory nor in the corresponding calculations, reducing the performance loss. A novel technique is also proposed that enables the watchdog processor verification of the structural integrity of the monitored program checking the jump addresses used. This jump address processing technique comes to largely solve the problem of verifying a jump to a new program area when there are multiple possible valid destinations of the jump. This problem did not have an adequate solution so far, and although the proposal made here can not solve every possible jump scenario it enables the inclusion of a large number of them into the set verifiable jumps. The theoretical ISIS proposal and its error detection mechanisms are complemented by the contribution of a complete system (processor, watchdog processor, cache memory, etc.) based on ISIS which incorporates the detection mechanisms proposed here. This system has been called HORUS, and is developed in the synthesizable subset of the VHDL language, so it is possible not only to simulate the behavior of the system at the occurrence of a fault and analyze its evolution from it but it is also possible to program a programmable logic device like an FPGA for its inclusion in a real system. To program the HORUS system in this PhD a modified version of the gcc compiler has been developed which includes the generation of signatures for the watchdog processor as an integral part of the process to create the executable program (compilation, assembly, and link) from a source code written in the C language. Finally, another work developed in this PhD is the development of FIASCO (Fault Injection Aid Software Components), a set of scripts using the Tcl/Tk language that allow the injection of a fault during the simulation of HORUS in order to study its behavior and its ability to detect subsequent errors. With FIASCO it is possible to perform hundreds or thousands of simulations in a distributed system environment to reduce the time required to collect the data from large-scale injection campaigns. Results show that a system using the techniques proposed here is able to detect errors during the execution of a program with a minimum loss of performance, and that the penalty in memory consumption when using a watchdog processor is similar to previous proposals. / [ES] La incorporación de mecanismos de detección de errores es un elemento fundamental en el diseño de sistemas tolerantes a fallos en los que, en muchos casos, la detección de un error (ya sea transitorio o permanente) es el punto de partida que desencadena toda una serie de acciones o activación de elementos que persiguen alguno de estos objetivos: la continuación de las operaciones del sistema a pesar del error, la recuperación del mismo, la parada de sus operaciones llevando al sistema a un estado seguro, etc. Objetivos, en definitiva, que pretenden la mejora de las características de fiabilidad, seguridad y disponibilidad, entre otros, del sistema en cuestión. Uno de estos elementos de detección de errores es un procesador de guardia; su trabajo consiste en monitorizar al procesador del sistema y comprobar que no se producen errores durante la ejecución del programa. El principal inconveniente de las propuestas existentes a este respecto y que impiden una mayor difusión de su uso es la pérdida de prestaciones y el aumento de consumo de memoria que sufre el sistema monitorizado. En este trabajo se propone una nueva técnica de empotrado de firmas (ISIS -Interleaved Signature Instruction Stream) intercaladas dentro del espacio de la memoria del programa. Con ella un elemento separado del procesador del sistema realiza las operaciones encaminadas a detectar los errores. A pesar de que las firmas se encuentran mezcladas con las instrucciones del programa que está ejecutando, y a diferencia de las propuestas previas, el procesador principal del sistema no se involucra ni en la recuperación de las firmas ni en las operaciones de cálculo correspondientes, lo que reduce la pérdida de prestaciones. También se propone una novedosa técnica para que el procesador de guardia pueda verificar la integridad estructural del programa que monitoriza comprobando las direcciones de salto empleadas. Esta técnica de procesado de las direcciones de salto viene a resolver en gran medida el problema de la comprobación de un salto a una nueva zona del programa cuando existen múltiples posibles destinos válidos. Este problema no tenía una solución adecuada hasta el momento, y aunque la propuesta que aquí se hace no consigue resolver todos los posibles escenarios de salto sí permite incorporar un buen números de ellos al conjunto de saltos verificables. ISIS y sus mecanismos de detección de errores se complementan con la aportación de un sistema completo (procesador, procesador de guardia, memoria caché, etc.) basado en ISIS denominado HORUS. Está desarrollado en lenguaje VHDL sintetizable, de manera que es posible tanto simular el comportamiento del sistema ante la aparición de un fallo y analizar su evolución a partir de éste como programar un dispositivo lógico programable tipo FPGA para su inclusión en un sistema real. Para programar el sistema HORUS se ha desarrollado una versión modificada del compilador gcc que incluye la generación de las firmas de referencia para el procesador de guardia como parte del proceso de creación del programa ejecutable a partir de código fuente escrito en lenguaje C. Finalmente, otro trabajo desarrollado en esta tesis es el desarrollo de FIASCO (Fault Injection Aid Software COmponents), un conjunto de scripts en lenguaje Tcl/Tk que permiten la inyección de un fallo durante la simulación de HORUS con el objetivo de estudiar su comportamiento y su capacidad para detectar los errores subsiguientes. Con FIASCO es posible lanzar cientos o miles de simulaciones en un entorno distribuido para reducir el tiempo necesario para obtener los datos de campañas de inyección a gran escala. Los resultados demuestran que un sistema que utilice las técnicas que aquí se proponen es capaz de detectar errores durante la ejecución del programa con una mínima pérdida de prestaciones, y que la penalización en el consumo de memoria al usar un procesador de guardia es similar a la de las propu / [CAT] La incorporació de mecanismes de detecció d'errors és un element fonamental en el disseny de sistemes tolerants a fallades. En aquests sistemes la detecció d'un error, tant transitori com permanent, sovint significa l'inici d'una sèrie d'accions o activació d'elements per assolir algun del objectius següents: mantenir les operacions del sistema malgrat l'error, la recuperació del sistema, aturar les operacions situant el sistema en un estat segur, etc. Aquests objectius pretenen, fonamentalment, millorar les característiques de fiabilitat, seguretat i disponibilitat del sistema. El processador de guarda és un dels elements emprats per a la detecció d'errors. El seu treball consisteix en monitoritzar el processador del sistema i comprovar que no es produeixen error durant l'execució de les instruccions. Els principals inconvenients de l'ús del processadors de guarda és la pèrdua de prestacions i l'increment de les necessitats de memòria del sistema que monitoritza, per la qual cossa la seva utilització no està molt generalitzada. En aquest treball es proposa una nova tècnica de encastat de signatures (ISIS - Interleaved Signature Instruction Stream) intercalant-les en l'espai de memòria del programa. D'aquesta manera és possible que un element extern al processador realitze les operacions dirigides a detectar els errors, i al mateix temps permet que el processador execute el programa original sense tenir que processar les signatures, encara que aquestes es troben barrejades amb les instruccions del programa que s'està executant. També es proposa en aquest treball una nova tècnica que permet al processador de guarda verificar la integritat estructural del programa en execució. Aquesta verificació permet resoldre el problema de com comprovar que, al executar el processador un salt a una nova zona del programa, el salt es realitza a una de les possibles destinacions que són vàlides. Fins el moment no hi havia una solució adequada per a aquest problema i encara que la tècnica presentada no resol tots el cassos possibles, sí afegeix un bon nombre de salts al conjunt de salts verificables. Les tècniques presentades es reforcen amb l'aportació d'un sistema complet (processador, processador de guarda, memòria cache, etc.) basat en ISIS i que incorpora els mecanismes de detecció que es proposen en aquest treball. A aquest sistema se li ha donat el nom de HORUS, i està desenvolupat en llenguatge VHDL sintetitzable, la qual cosa permet no tan sols simular el seu comportament davant la aparició d'un error i analitzar la seva evolució, sinó també programar-lo en un dispositiu FPGA per incloure'l en un sistema real. Per poder programar el sistema HORUS s'ha desenvolupat una versió modificada del compilador gcc. Aquesta versió del compilador inclou la generació de les signatures de referència per al processador de guarda com part del procés de creació del programa executable (compilació, assemblat i enllaçat) des del codi font en llenguatge C. Finalment en aquesta tesis s'ha desenvolupat un altre treball anomenat FIASCO (Fault Injection Aid Software COmponents), un conjunt d'scripts en llenguatge Tcl/Tk que permeten injectar fallades durant la simulació del funcionament d'HORUS per estudiar la seua capacitat de detectar els errors i el seu comportament posterior. Amb FIASCO és possible llançar centenars o milers de simulacions en entorns distribuïts per reduir el temps necessari per obtenir les dades d'una campanya d'injecció de fallades de grans proporcions. Els resultats obtinguts demostren que un sistema que utilitza les tècniques descrites és capaç de detectar errors durant l'execució del programa amb una pèrdua mínima de prestacions, i amb un requeriments de memòria similars als de les propostes anteriors. / Rodríguez Ballester, F. (2016). Detección concurrente de errores en el flujo de ejecución de un procesador [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/63254 / TESIS

Page generated in 0.0564 seconds