Return to search

Automatic Reasoning Techniques for Non-Serializable Data-Intensive Applications

<div>
<div>
<div>
<p>The performance bottlenecks in modern data-intensive applications have induced
database implementors to forsake high-level abstractions and trade-off simplicity and
ease of reasoning for performance. Among the first casualties of this trade-off are the
well-known ACID guarantees, which simplify the reasoning about concurrent database
transactions. ACID semantics have become increasingly obsolete in practice due
to serializable isolation – an integral aspect of ACID, being exorbitantly expensive.
Databases, including the popular commercial offerings, default to weaker levels of
isolation where effects of concurrent transactions are visible to each other. Such weak
isolation guarantees, however, are extremely hard to reason about, and have led to
serious safety violations in real applications. The problem is further complicated
in a distributed setting with asynchronous state replications, where high availability
and low latency requirements compel large-scale web applications to embrace weaker
forms of consistency (e.g., eventual consistency) besides weak isolation. Given the
serious practical implications of safety violations in data-intensive applications, there
is a pressing need to extend the state-of-the-art in program verification to reach non-
serializable data-intensive applications operating in a weakly-consistent distributed
setting.
</p>
<p>This thesis sets out to do just that. It introduces new language abstractions, program logics, reasoning methods, and automated verification and synthesis techniques
that collectively allow programmers to reason about non-serializable data-intensive
applications in the same way as their serializable counterparts. The contributions
</p>
</div>
</div>
<div>
<div>
<p>xi
</p>
</div>
</div>
</div>
<div>
<div>
<div>
<p>made are broadly threefold. Firstly, the thesis introduces a uniform formal model to
reason about weakly isolated (non-serializable) transactions on a sequentially consistent (SC) relational database machine. A reasoning method that relates the semantics
of weak isolation to the semantics of the database program is presented, and an automation technique, implemented in a tool called ACIDifier is also described. The
second contribution of this thesis is a relaxation of the machine model from sequential
consistency to a specifiable level of weak consistency, and a generalization of the data
model from relational to schema-less or key-value. A specification language to express
weak consistency semantics at the machine level is described, and a bounded verification technique, implemented in a tool called Q9 is presented that bridges the gap
between consistency specifications and program semantics, thus allowing high-level
safety properties to be verified under arbitrary consistency levels. The final contribution of the thesis is a programming model inspired by version control systems that
guarantees correct-by-construction <i>replicated data types</i> (RDTs) for building complex distributed applications with arbitrarily-structured replicated state. A technique
based on decomposing inductively-defined data types into <i>characteristic relations</i> is
presented, which is used to reason about the semantics of the data type under state
replication, and eventually derive its correct-by-construction replicated variant automatically. An implementation of the programming model, called Quark, on top of
a content-addressable storage is described, and the practicality of the programming
model is demonstrated with help of various case studies.
</p>
</div>
</div>
</div>

  1. 10.25394/pgs.8977562.v1
Identiferoai:union.ndltd.org:purdue.edu/oai:figshare.com:article/8977562
Date14 August 2019
CreatorsGowtham Kaki (7022108)
Source SetsPurdue University
Detected LanguageEnglish
TypeText, Thesis
RightsCC BY 4.0
Relationhttps://figshare.com/articles/Automatic_Reasoning_Techniques_for_Non-Serializable_Data-Intensive_Applications/8977562

Page generated in 0.0022 seconds