Extension to models of coincident failure in multiversion software

Fault-tolerant architectures for software-based systems have been used in various practical applications, including flight control systems for commercial airliners (e.g. Airbus A340, A310) as part of an aircraft's so-called fly-by-wire flight control system [1], the control systems for autonomous spacecraft (e.g. the Cassini-Huygens Saturn orbiter and probe) [2], rail interlocking systems [3] and nuclear reactor safety systems [4, 5]. The use of diverse, independently developed, functionally equivalent software modules in a fault-tolerant configuration has been advocated as a means of achieving highly reliable systems from relatively less reliable system components [6, 7, 8, 9]. In this regard it had been postulated that [6] "The independence of programming efforts will greatly reduce the probability of identical software faults occurring in two or more versions of the program." Experimental evaluation demonstrated that, despite the independent creation of such versions, positive failure correlation between the versions can be expected in practice [10, 11]. The conceptual models of Eckhardt et al. [12] and Littlewood et al. [13], referred to as the EL model and LM model respectively, were instrumental in pointing out sources of uncertainty that determine both the size and sign of such failure correlation. In particular, there are two important sources of uncertainty. The first is the process of developing software: given sufficiently complex system requirements, the particular software version that will be produced from such a process is not known with certainty. Consequently, the failure behaviour of the software is also not known with certainty. The second is the occurrence of demands during system operation: it may not be certain which demand a system will receive next from the environment.
To explain failure correlation between multiple software versions the EL model introduced the notion of difficulty: that is, given a demand that could occur during system operation, there is a chance that a given software development team will develop a software component that fails when handling such a demand as part of the system. A demand with an associated high probability of developed software failing to handle it correctly is considered to be a "difficult" demand for a development team; a low probability of failure would suggest an "easy" demand. In the EL model different development teams, even when isolated from each other, are identical in how likely they are to make mistakes while developing their respective software versions. Consequently, despite the teams possibly creating software versions that fail on different demands, in developing their respective versions the teams find the same demands easy and the same demands difficult. The implication is that the versions developed by the teams do not fail independently: if one observes the failure of one team's version, this could indicate that the version failed on a difficult demand, thus increasing one's expectation that the second team's version will also fail on that demand. Succinctly put, due to correlated "difficulties" between the teams across the demands, "independently developed software cannot be expected to fail independently". The LM model takes this idea a step further by illustrating, under rather general practical conditions, that negative failure correlation is also possible, because the teams may be sufficiently diverse in which demands they find "difficult". This in turn implies better reliability than would be expected under naive assumptions of failure independence between software modules built by the respective teams. Although these models provide such insight, they also pose questions yet to be answered.
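The difficulty argument can be made concrete with a small numerical sketch. The code below is an illustration only, not taken from the thesis: the four-demand space, the demand profile and the difficulty values are all hypothetical, and it assumes (as is commonly done when presenting the EL and LM models) that two versions fail independently once the demand is fixed, so that the probability of both failing on a random demand is the profile-weighted mean of the product of the two teams' difficulty functions.

```python
def marginal_failure(theta, profile):
    """P(a version fails on a random demand) = sum_x theta(x) * P(x)."""
    return sum(t * p for t, p in zip(theta, profile))


def joint_failure(theta_a, theta_b, profile):
    """P(both versions fail on a random demand), assuming the two versions
    fail independently once the demand x is fixed:
    sum_x theta_a(x) * theta_b(x) * P(x)."""
    return sum(a * b * p for a, b, p in zip(theta_a, theta_b, profile))


# Hypothetical demand space: four demands, equally likely in operation.
profile = [0.25, 0.25, 0.25, 0.25]

# Hypothetical difficulty function for team A: demands 3 and 4 are "difficult".
theta_a = [0.01, 0.01, 0.20, 0.20]

# EL-style setting: team B finds exactly the same demands difficult.
theta_b_same = list(theta_a)

# LM-style setting: team B finds the *other* demands difficult.
theta_b_diverse = [0.20, 0.20, 0.01, 0.01]

p_a = marginal_failure(theta_a, profile)
p_b = marginal_failure(theta_b_diverse, profile)  # same marginal as team A here

naive_independent = p_a * p_b
el_joint = joint_failure(theta_a, theta_b_same, profile)
lm_joint = joint_failure(theta_a, theta_b_diverse, profile)

print("naive independence:", naive_independent)
print("same difficulties (EL-style):", el_joint)
print("diverse difficulties (LM-style):", lm_joint)
```

With these made-up numbers both teams have the same marginal failure probability, yet the joint failure probability lies above the naive independence product when the difficulties coincide and below it when the difficulties are negatively related across demands. The gap in each case is the covariance of the two difficulty functions over the demand profile.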

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:555445
Date: January 2012
Creators: Salako, Kizito Oluwaseun
Publisher: City University London
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: http://openaccess.city.ac.uk/1302/
