21

Improving System Reliability for Cyber-Physical Systems

Wu, Leon L. January 2015 (has links)
Cyber-physical systems (CPS) are systems featuring a tight combination of, and coordination between, the system's computational and physical elements. Cyber-physical systems range from critical infrastructure, such as power grids and transportation systems, to health and biomedical devices. System reliability, i.e., the ability of a system to perform its intended function under a given set of environmental and operational conditions for a given period of time, is a fundamental requirement of cyber-physical systems. An unreliable system often leads to disruption of service, financial cost and even loss of human life. An important and prevalent type of cyber-physical system meets the following criteria: it processes large amounts of data; employs software as a system component; runs online continuously; and keeps an operator in the loop because of the human judgment and accountability required for safety-critical systems. This thesis aims to improve system reliability for this type of cyber-physical system. To that end, I present a system evaluation approach entitled automated online evaluation (AOE): a data-centric runtime monitoring and reliability evaluation approach that works in parallel with the cyber-physical system, conducts automated evaluation continuously along the workflow of the system using computational intelligence and self-tuning techniques, and provides operator-in-the-loop feedback on reliability improvement. For example, abnormal input and output data at or between the multiple stages of the system can be detected and flagged through data quality analysis, and alerts can be sent to the operator-in-the-loop. The operator can then act on the alerts and make changes to the system in order to minimize downtime and increase reliability.
One technique used by the approach is data quality analysis using computational intelligence, which evaluates data quality in an automated and efficient way in order to ensure that the running system performs reliably as expected. Another technique is self-tuning, which automatically self-manages and self-configures the evaluation system so that it adapts to changes in the system and feedback from the operator. To implement the proposed approach, I further present a system architecture called autonomic reliability improvement system (ARIS). This thesis investigates three hypotheses. First, I claim that automated online evaluation empowered by data quality analysis using computational intelligence can effectively improve system reliability for cyber-physical systems in the domain of interest indicated above. To test this hypothesis, a prototype system is developed and deployed in various cyber-physical systems, and reliability metrics are used to measure the improvement in system reliability quantitatively. Second, I claim that self-tuning can effectively self-manage and self-configure the evaluation system, based on changes in the system and feedback from the operator-in-the-loop, to improve system reliability. Third, I claim that the approach is efficient: it should not have a large impact on overall system performance and should introduce only minimal extra overhead to the cyber-physical system. Performance metrics are used to measure the efficiency and added overhead quantitatively. Additionally, in order to conduct efficient and cost-effective automated online evaluation for data-intensive CPS, which process large volumes of data and devote much of their processing time to I/O and data manipulation, this thesis presents COBRA, a cloud-based reliability assurance framework.
COBRA provides automated multi-stage runtime reliability evaluation along the CPS workflow, using data relocation services, a cloud data store, data quality analysis, and process scheduling with self-tuning to achieve scalability, elasticity and efficiency. Finally, in order to provide a generic way to compare and benchmark system reliability for CPS, and to extend the approach described above, this thesis presents FARE, a reliability benchmark framework that employs a CPS reliability model together with a set of methods and metrics for evaluation environment selection, failure analysis, and reliability estimation. The main contributions of this thesis include validation of the above hypotheses and empirical studies of the ARIS automated online evaluation system, the COBRA cloud-based reliability assurance framework for data-intensive CPS, and the FARE framework for benchmarking the reliability of cyber-physical systems. This work has advanced the state of the art in CPS reliability research, expanded the body of knowledge in this field, and provided useful studies for further research.
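As an illustrative sketch of the data-quality analysis step described above (the abstract does not specify the detection technique; the z-score rule, the 2-sigma threshold, and the stage name are assumptions for this example), abnormal stage data can be flagged and surfaced to the operator-in-the-loop:

```python
# Hypothetical sketch of AOE-style data-quality alerting; the z-score rule
# and the 2-sigma default threshold are assumptions, not the thesis's method.
from statistics import mean, stdev

def quality_alerts(stage_name, values, z_threshold=2.0):
    """Return (index, value) pairs whose z-score exceeds the threshold."""
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []
    return [(i, v) for i, v in enumerate(values)
            if abs(v - mu) / sigma > z_threshold]

# One abnormal reading among otherwise stable stage outputs:
readings = [10.1, 9.8, 10.3, 10.0, 55.0, 9.9, 10.2]
for i, v in quality_alerts("ingest-stage", readings):
    print(f"ALERT ingest-stage: reading {i} = {v} looks anomalous")
```

In a full pipeline such alerts would feed the operator console rather than stdout, and the threshold would itself be adjusted by the self-tuning component.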
22

Detecting and tolerating faults in distributed systems

Ogale, Vinit Arun, 1979- 05 October 2012 (has links)
This dissertation presents techniques for detecting and tolerating faults in distributed systems. Detecting faults in distributed or parallel systems is often very difficult. We look at the problem of determining whether a property or assertion was true in a computation. We formally define a logic called BTL that can be used to define such properties. Our logic takes temporal properties into consideration, as these are often necessary for expressing conditions like safety violations and deadlocks. We introduce the idea of a basis of a computation with respect to a property. A basis is a compact and exact representation of the states of the computation where the property was true. We exploit the lattice structure of the computation and the structure of different types of properties to avoid brute-force approaches. We show that it is possible to efficiently detect all properties that can be expressed using nested negations, disjunctions, conjunctions and the temporal operators possibly and always. Our algorithm is polynomial in the number of processes and events in the system, though exponential in the size of the property. After faults are detected, it is necessary to act on them and, whenever possible, continue operation with minimal impact. This dissertation also deals with designing systems that can recover from faults. We look at techniques for tolerating faults in data and in the state of the program. In particular, we look at the problem where multiple servers hold different data and program state, all of which need to be backed up to tolerate failures. Most current approaches to this problem involve some form of replication, while other approaches based on erasure coding have high computational and communication overheads. We introduce the idea of fusible data structures for backing up data.
This approach relies on the inherent structure of the data to determine techniques for combining multiple such structures on different servers into a single backup data structure. We show that most commonly used data structures, such as arrays, lists, stacks and queues, are fusible, and we present algorithms for fusing them. This approach requires less space than replication without increasing the time complexity of any update. In case of failures, data from the backup and the other non-failed servers is required for recovery. To maintain program state in case of failures, we assume that programs can be represented by deterministic finite state machines. Though this approach may not yet be practical for large programs, it is very useful for small concurrent programs like sensor networks or finite state machines in hardware designs. We present the theory of fusion of state machines: given a set of such machines, we present a polynomial-time algorithm to compute another set of machines that can tolerate the required number of faults in the system.
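The fusion idea can be illustrated with the simplest fusible structure: equal-length integer arrays combined componentwise by XOR. This is only a sketch of the general approach, not the dissertation's actual fusion operators, but it shows the key property: one extra backup structure, rather than full replication, tolerates one server failure.

```python
# Illustrative XOR fusion of equal-length integer arrays (an assumption for
# the example; the dissertation's fusible-structure operators are more general).
from functools import reduce

def fuse(arrays):
    """Combine the arrays of all servers into one backup array."""
    return [reduce(lambda a, b: a ^ b, column) for column in zip(*arrays)]

def recover(fused, surviving):
    """Rebuild the failed server's array from the backup and the survivors."""
    return fuse(surviving + [fused])

server_a, server_b, server_c = [3, 7, 1], [4, 4, 9], [8, 0, 5]
backup = fuse([server_a, server_b, server_c])

# Server B fails; its data is recovered from the backup and the others.
assert recover(backup, [server_a, server_c]) == server_b
```

Note that recovery needs the data of all non-failed servers, as the abstract states, whereas an update on any one server only requires updating the single backup.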
23

Επικαλυπτόμενες ροές επιτυχιών και εφαρμογές / Overlapping runs of successes and applications

Σπέη, Μαρία 05 June 2015 (has links)
Θεωρούμε μία ακολουθία από n ανεξάρτητες τυχαίες μεταβλητές Bernoulli, X1,X2,...,Xn (n>0) διατεταγμένες σε γραμμή. Τα δυνατά αποτελέσματα είναι δύο και χαρακτηρίζονται ως επιτυχία (S ή 1) ή αποτυχία (F ή 0). Ροή επιτυχιών είναι μία ακολουθία συνεχόμενων επιτυχιών (S) των οποίων προηγούνται και έπονται αποτυχίες (F) ή τίποτα. Μήκος μιας ροής επιτυχιών είναι ο αριθμός των επιτυχιών που περιλαμβάνονται στη ροή. Η μελέτη τυχαίων μεταβλητών που σχετίζονται με ροές είναι ιδιαίτερα αποτελεσματική σε πολλά επιστημονικά πεδία. Συγκεκριμένα, η μελέτη του αριθμού των ροών επιτυχιών σύμφωνα με διάφορα σχήματα απαρίθμησης αποτελεί ένα ενδιαφέρον θέμα ήδη από την εποχή του De Moivre (1756). Το 1940, ορίστηκε η βάση για τη δημιουργία ελέγχων υποθέσεων από τους Wald και Wolfowitz (1940) και τον Wolfowitz (1943). Επίσης, οι ροές χρησιμοποιήθηκαν και στον ποιοτικό έλεγχο από τους Mosteller (1941) και Wolfowitz (1943). Στις μέρες μας πέρα από τη Στατιστική, εφαρμόζεται και σε άλλες επιστημονικές περιοχές όπως η βιολογία (ακολουθίες DNA), η οικολογία, η ψυχολογία, η αστρονομία και η αξιοπιστία μηχανικών συστημάτων. Η παρούσα εργασία επικεντρώνεται στην μελέτη τυχαίων μεταβλητών, που μετρούν ροές επιτυχιών μήκους k. Αρχικά, αναλύονται οι τυχαίες μεταβλητές Nn,k και Mn,k, που παριστάνουν τον αριθμό των μη επικαλυπτόμενων ροών επιτυχιών μήκους k σύμφωνα με τον Feller (1968) και τον αριθμό των επικαλυπτόμενων ροών επιτυχιών μήκους k σύμφωνα με τον Ling (1988), αντίστοιχα. Επίσης, μελετάται η ασυμπτωτική τους συμπεριφορά και προσδιορίζεται η κατανομή τους μέσω συνδυαστικών μεθόδων, αναδρομικών σχημάτων, αθροισμάτων πολυωνυμικών και διωνυμικών συντελεστών καθώς και μέσω της μεθόδου εμβάπτισης τυχαίας μεταβλητής σε Μαρκοβιανή αλυσίδα. Δίνονται εκφράσεις για τη μέση τιμή, τη διασπορά και τη ροπογεννήτρια της τυχαίας μεταβλητής Mn,k. Επιπλέον, αναλύεται μια νέα κατηγορία αρνητικής διωνυμικής κατανομής τάξης k. 
Στη συνέχεια, δίνεται έμφαση στη μελέτη της τυχαίας μεταβλητής Nn,k,l, η οποία παριστάνει τον αριθμό των l-επικαλυπτόμενων ροών επιτυχιών μήκους k σε n ανεξάρτητες δοκιμές Bernoulli και γίνεται μία αναφορά στις γενικευμένες διωνυμικές κατανομές τάξης k. Παρουσιάζονται εκφράσεις για τη μέση τιμή και τη πιθανογεννήτρια συνάρτηση της τυχαίας μεταβλητής Nn,k,l και προσδιορίζεται η κατανομή της αναδρομικά, συνδυαστικά και μέσω της μεθόδου εμβάπτισης τυχαίας μεταβλητής σε Μαρκοβιανή αλυσίδα. Επίσης, μελετάται η τυχαία μεταβλητή Nn,k,l σε ακολουθία που προκύπτει από το σχήμα δειγματοληψίας Polya-Eggenberger. Τέλος, γίνεται σύνδεση της αξιοπιστίας m-συνεχόμενων-k-από-τα-n συστημάτων αποτυχίας με τις κατανομές των τυχαίων μεταβλητών Nn,k, Mn,k και Nn,k,l και παρουσιάζονται εκφράσεις για τον υπολογισμό της αξιοπιστίας αυτών των συστημάτων. / Consider a sequence X1,X2,...,Xn (n>0) of binary trials with outcomes arranged on a line. There are two possible outcomes, either a success (S or 1) or a failure (F or 0). A success run is a sequence of consecutive successes preceded and followed by failures (F) or by nothing. The number of successes in a success run is referred to as its length. The concept of runs has been used in various areas. In the early 1940s it was used in the area of hypothesis testing (run test) by Wald and Wolfowitz (1940) and Wolfowitz (1943) and in the area of statistical quality control by Mosteller (1941) and Wolfowitz (1943). Recently, it has been successfully used in many other areas, such as reliability of engineering systems, quality control, DNA sequencing, psychology, ecology and radar astronomy. Different enumerative schemes have been employed while discussing the number of success runs.
The study of the random variables Nn,k and Mn,k, representing the number of non-overlapping runs of k consecutive successes in the sense of Feller's (1968) counting and the number of overlapping runs of k consecutive successes in the sense of Ling's (1988) counting, respectively, is central to this work. The asymptotic behavior of these random variables is also discussed. The methods that have been used to obtain the distributions of Nn,k and Mn,k are presented, i.e. combinatorial analysis, recursive schemes and the Markov chain imbedding technique. The mean, the variance and the moment generating function of Mn,k are given. In addition, a new class of negative binomial distribution of order k is analyzed. This work focuses on the study of the random variable Nn,k,l, which represents the number of l-overlapping success runs of length k in n Bernoulli trials. Our study gives an overview of results referring to the distribution of the random variable Nn,k,l defined on sequences of Bernoulli trials (independent and identically distributed) and Markov trials. Formulae for the mean value and the probability generating function of Nn,k,l are also presented. The distribution of Nn,k,l is determined recursively, combinatorially and via the Markov chain imbedding technique. Moreover, the random variable Nn,k,l is studied for sequences with outcomes from a Polya-Eggenberger sampling scheme. The distributions of Nn,k, Mn,k and Nn,k,l are used to study m-consecutive-k-out-of-n:F systems, i.e. systems that fail if and only if at least m sequences of k consecutive components fail. Several results concerning the reliability of such systems are also presented.
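The two classical counting schemes above can be made concrete in a few lines. This is an illustrative sketch only (the thesis derives distributions, not code; Python is used for concreteness): Feller's non-overlapping count restarts after each completed run of k, while Ling's overlapping count includes every window of k consecutive successes.

```python
# Sketch of the two classical enumeration schemes for success runs of
# length k in a 0/1 sequence: Feller's N_{n,k} and Ling's M_{n,k}.
def feller_count(seq, k):
    count = run = 0
    for x in seq:
        run = run + 1 if x == 1 else 0
        if run == k:
            count += 1
            run = 0          # restart: counted runs may not overlap
    return count

def ling_count(seq, k):
    count = run = 0
    for x in seq:
        run = run + 1 if x == 1 else 0
        if run >= k:
            count += 1       # every window of k consecutive successes counts
    return count

seq = [1, 1, 1, 1, 0, 1, 1]
print(feller_count(seq, 2))  # -> 3
print(ling_count(seq, 2))    # -> 4
```

The l-overlapping count Nn,k,l studied in the thesis interpolates between these two extremes: consecutive counted runs may share up to l trials, so l = 0 recovers Feller's scheme and l = k - 1 recovers Ling's.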
24

Adaptable stateful application server replication

Wu, Huaigu, 1975- January 2008 (has links)
In recent years, multi-tier architectures have become the standard computing environment for web and enterprise applications. The application server tier is often the heart of the system, embedding the business logic. Adaptability, in particular the capability to adjust to the load submitted to the system and to handle the failure of individual components, is of utmost importance in order to provide 24/7 access and high performance. Replication is a common means to achieve these reliability and scalability requirements. With replication, the application server tier consists of several server replicas; thus, if one replica fails, others can take over. Furthermore, the load can be distributed across the available replicas. Although many replication solutions have been proposed so far, most of them have been developed either for fault-tolerance or for scalability. Furthermore, only a few have considered that the application server tier is only one tier in a multi-tier architecture, that this tier maintains state, and that execution in this environment can follow complex patterns. Thus, existing solutions often do not provide correctness beyond some basic application scenarios. / In this thesis we tackle the issue of replication of the application server tier from the ground up and develop a unified solution that provides both fault-tolerance and scalability. We first describe a set of execution patterns that capture how requests are typically executed in multi-tier architectures. They consider the flow of execution across the client tier, application server tier, and database tier. In particular, the execution patterns describe how requests are associated with transactions, the fundamental execution units at the application server and database tiers. With these execution patterns in mind, we provide a formal definition of what it means to provide a correct execution across all tiers, even when failures occur and the application server tier is replicated.
Informally, a replicated system is correct if it behaves exactly as a non-replicated system that never fails. From there, we propose a set of replication algorithms for fault-tolerance that provide correctness for the execution patterns we have identified. The main principle is to let a primary application server (AS) replica execute all client requests, and to propagate any state changes performed by a transaction to backup replicas at transaction commit time. The challenges arise because requests can be associated with transactions in different ways. We then extend our fault-tolerance solution into a unified solution that provides both fault-tolerance and load-balancing. In this extended solution, each application server replica is able to execute client requests as a primary while at the same time serving as a backup for other replicas. The framework provides a transparent, truly distributed and lightweight load distribution mechanism that takes advantage of the fault-tolerance infrastructure. Our replication tool is implemented as a plug-in for the JBoss application server, and its performance is carefully evaluated against JBoss's own replication solutions. The evaluation shows that our protocols perform very well and compare favorably with existing solutions.
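A minimal sketch of the primary-backup principle described above: the primary executes a transaction's requests, buffers the resulting state changes, and ships them to the backups at commit time. All class and method names here are invented for illustration and are not JBoss APIs.

```python
# Hypothetical sketch of primary-backup state propagation at commit time;
# names are assumptions, not the thesis's or JBoss's actual interfaces.
class Replica:
    def __init__(self):
        self.state = {}

    def apply(self, changes):
        self.state.update(changes)

class Primary(Replica):
    def __init__(self, backups):
        super().__init__()
        self.backups = backups

    def execute_transaction(self, requests):
        changes = {}
        for key, value in requests:   # business logic, buffered per transaction
            changes[key] = value
        self.apply(changes)           # commit locally...
        for b in self.backups:        # ...then propagate at commit time
            b.apply(changes)

backup = Replica()
primary = Primary([backup])
primary.execute_transaction([("cart:42", ["book"]), ("session:42", "active")])
assert backup.state == primary.state  # a backup can take over on failure
```

Propagating whole-transaction change sets, rather than individual requests, is what keeps backups consistent when a failover happens mid-workload.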
25

Performance analysis of cellular networks.

Rajaratnam, Myuran. January 2000 (has links)
Performance analysis in cellular networks is the determination of customer-orientated grade-of-service parameters, such as call blocking and dropping probabilities, using the methods of stochastic theory. This stochastic analysis is built on certain assumptions regarding the arrival and service processes of user-offered calls in a network. In the past, cellular networks were analysed using the classical assumptions, Poisson call arrivals and negative exponential channel holding times, borrowed from earlier fixed-network analysis. However, cellular networks are markedly different from fixed networks in that they afford the user a unique opportunity: the ability to communicate while on the move. User mobility and various other cellular network characteristics, such as customer billing, cell layout and hand-off mechanisms, generally invalidate the use of Poisson arrivals and negative exponential holding times. Recent measurements on live networks substantiate this view. Consequently, over the past few years, there has been a noticeable shift towards using more generalised arrival and service distributions in the performance analysis of cellular networks. However, two shortcomings of the resulting models are that they suffer from state-space explosion and/or they represent hand-off traffic as a state-dependent mean arrival rate (thus ignoring the higher moments of the hand-off arrival process). This thesis's contribution to cellular network analysis is a moment-based approach that avoids a full state-space description but ensures that the hand-off arrival process is modelled beyond the first moment. The thesis considers a performance analysis model based on Poisson new-call arrivals, generalised hand-off call arrivals and a variety of channel holding times.
The thesis shows that the performance analysis of a cellular network may be loosely decomposed into three parts: a generic cell traffic characterising model, a generic cell traffic blocking model and a quality-of-service evaluation model. The cell traffic characterising model is employed to determine the mean and variance of the hand-off traffic offered by a cell to its neighbour. The cell traffic blocking model is used to determine the blocking experienced by the various traffic streams offered to each cell. The quality-of-service evaluation part is essentially a fixed-point iteration of the cell traffic characterising and cell traffic blocking parts to determine customer-orientated grade-of-service parameters such as blocking and dropping probabilities. The thesis also presents detailed mathematical models for user mobility modelling. Finally, the thesis provides extensive results to validate the proposed analysis and to illustrate its accuracy when compared to existing methods. / Thesis (Ph.D.)-University of Natal, Durban, 2000.
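For intuition about the blocking-model component, the classical Erlang-B recursion computes the call blocking probability of a single cell under Poisson arrivals. The thesis generalises well beyond this (non-Poisson hand-off arrivals, higher moments), but this is the kind of per-cell blocking computation its fixed-point iteration builds on. Illustrative sketch only, with invented example numbers:

```python
# Classical Erlang-B recursion for Poisson traffic: a simplified stand-in
# for the thesis's generalised cell traffic blocking model.
def erlang_b(offered_load, channels):
    """Blocking probability of `offered_load` Erlangs on `channels` channels."""
    b = 1.0
    for n in range(1, channels + 1):
        b = offered_load * b / (n + offered_load * b)
    return b

# A cell with 10 channels offered 5 Erlangs blocks about 1.8% of calls:
print(round(erlang_b(5.0, 10), 4))  # -> 0.0184
```

In the decomposition above, each cell's blocking result feeds the characterising model of its neighbours, and the iteration repeats until the offered hand-off traffic converges.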
26

Low-cost and efficient architectural support for correctness and performance debugging

Venkataramani, Guru Prasadh V. January 2009 (has links)
Thesis (Ph.D)--Computing, Georgia Institute of Technology, 2010. / Committee Chair: Prvulovic, Milos; Committee Member: Hughes, Christopher J.; Committee Member: Kim, Hyesoon; Committee Member: Lee, Hsien-Hsin S.; Committee Member: Loh, Gabriel H. Part of the SMARTech Electronic Thesis and Dissertation Collection.
27

Integration Paradigms for Ensemble-based Smart Cyber-Physical Systems

Matěna, Vladimír January 2018 (has links)
Smart Cyber-Physical Systems (sCPS) are complex systems performing smart coordination that often require decentralized and network-resilient operation. New developments in the fields of robotic systems, Industry 4.0 and autonomous vehicular systems bring challenges that can be tackled with the deployment of ensemble-based sCPS, but require further refinement in terms of network resilience and data propagation. This thesis maps the use cases of sCPS in the aforementioned domains, discusses requirements on the ensemble-based architecture in terms of network properties, and proposes recommendations and technical means that help design network-aware ensemble-based sCPS. The proposed solutions are evaluated by simulating the target systems with state-of-the-art realistic network and vehicular simulators.
28

Adaptable stateful application server replication

Wu, Huaigu, 1975- January 2008 (has links)
No description available.
29

Investigation into a high reliability micro-grid for a nuclear facility emergency power supply

Lekhema, Gerard Ratoka January 2017 (has links)
A research report submitted to the Faculty of Engineering, University of the Witwatersrand, Johannesburg, in partial fulfilment of the requirements for the degree of Master of Science in Engineering, Johannesburg, August 2017 / The objective of this research work is to investigate the use of a high-reliability micro-grid to supply emergency electrical power to a nuclear facility following a loss-of-offsite-power (LOOP) accident. Most nuclear facilities around the world utilize diesel generators and battery banks as emergency power to back up the grid power supply. This power supply configuration represents the concept of the micro-grid system. The research work proposes improving the reliability of the emergency power supply by introducing diverse energy sources and energy storage systems. The energy sources and storage systems investigated include renewable energy sources, a decay heat recovery system and large-scale energy storage systems. The results presented include information on suitable energy sources and energy storage systems, the establishment of a reliable architectural layout, and an evaluation of the micro-grid system in terms of capacity adequacy and reliability.
30

Otimização multiobjetivo de projetos de redes de distribuição de água / Multiobjective optimization of water distribution network projects

Formiga, Klebber Teodomiro Martins 09 June 2005 (has links)
O dimensionamento otimizado de sistemas de distribuição de águas tem originado centenas de trabalhos científicos nas últimas quatro décadas. Vários pesquisadores têm buscado encontrar uma metodologia capaz de dimensionar essas redes considerando diversos aspectos e incertezas características desse tipo de projeto. No entanto, os resultados da maioria das metodologias desenvolvidas não podem ser aplicados na prática. O objetivo deste trabalho é elaborar uma metodologia de dimensionamento de redes de distribuição de água considerando um enfoque multiobjetivo. A metodologia desenvolvida considera três aspectos referentes ao projeto desses sistemas: custo; confiabilidade e perdas por vazamentos. Para tanto, empregou-se um método de otimização multiobjetivo baseado em algoritmos genéticos para a geração do conjunto de soluções não-dominadas e um método multicriterial para escolha da alternativa final. Para representar os objetivos do problema, foram testadas nove funções: custo, vazamentos, entropia, resiliência, tolerância à falha, expansibilidade, efeito do envelhecimento e resilientropia, sendo que sete destas são específicas para a representação da confiabilidade. Para se avaliar as alternativas geradas foi desenvolvido um modelo de análise hidráulica que fosse capaz de trabalhar com vazamentos e com demandas dependente da pressão. Os métodos escolhidos foram o Híbrido de Nielsen e o Gradiente. Das funções testadas, a resilientropia, proposta originalmente neste trabalho, foi a que melhor se ajustou ao conceito formal de confiabilidade, representado pela função tolerância. Os resultados encontrados pela metodologia mostraram-se promissores, uma vez que esta foi capaz de encontrar redes eficientes ao final das simulações. / The topic "Optimized design of water distribution systems" has generated hundreds of scientific publications in the last four decades.
Several researchers have searched for a methodology that would take into account the variety of aspects and uncertainties inherent to the design of such networks. However, the results of most methodologies developed are not practical. The objective of this work is to develop a methodology for water distribution system design with a multiobjective focus. The methodology developed addresses three aspects of the design of such systems: cost, reliability and losses by leaking. A multiobjective optimization method based on genetic algorithms was employed to generate the set of non-dominated solutions, and a multi-criteria method was used to choose the final alternative. Nine functions representing the objectives of the problem were tested: cost, leakages, entropy, resilience, failure tolerance, expansibility, aging effect and resilientropy, seven of which are specific to representing reliability. In order to evaluate the generated alternatives, a hydraulic analysis model that could handle leakages and pressure-dependent demands was developed. The methods chosen were Nielsen's Hybrid and the Gradient method. Of all the tested functions, resilientropy, originally proposed in this work, proved to be the one best adjusted to the formal concept of reliability, represented by the tolerance function. The results obtained by this methodology are promising, as it produced efficient distribution networks at the end of the simulations performed.
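The non-dominated (Pareto) filtering step at the core of such multiobjective methods can be sketched briefly. This is an illustration of the general concept, not the thesis's algorithm, and the candidate designs and their (cost, reliability) values are invented for the example:

```python
# Sketch of Pareto filtering over candidate network designs, minimising
# cost and maximising a reliability score; data values are invented.
def dominates(a, b):
    """a dominates b if it is no worse in both objectives and strictly
    better in at least one (cost minimised, reliability maximised)."""
    return (a[0] <= b[0] and a[1] >= b[1]) and (a[0] < b[0] or a[1] > b[1])

def non_dominated(designs):
    return [d for d in designs
            if not any(dominates(o, d) for o in designs if o is not d)]

designs = [(100, 0.90), (120, 0.95), (110, 0.85), (150, 0.99)]  # (cost, reliability)
print(non_dominated(designs))  # -> [(100, 0.9), (120, 0.95), (150, 0.99)]
```

A genetic algorithm would evolve the candidate set and apply this filter each generation; the multi-criteria method then picks one final design from the surviving front.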
