Consistency Models

Serializability


Informally, serializability means that transactions appear to have occurred in some total order.

Serializability is a transactional model: operations (usually termed “transactions”) can involve several primitive sub-operations performed in order. Serializability guarantees that operations take place atomically: a transaction’s sub-operations do not appear to interleave with sub-operations from other transactions.

It is also a multi-object property: operations can act on multiple objects in the system. Indeed, serializability applies not only to the particular objects involved in a transaction, but to the system as a whole—operations may act on predicates, like “the set of all cats”.

Serializability cannot be totally or sticky available; in the event of a network partition, some or all nodes will be unable to make progress.

Serializability implies repeatable read, snapshot isolation, etc. However, it does not impose any real-time, or even per-process, constraints. If process A completes write w, then process B begins a read r, r is not necessarily guaranteed to observe w. For those kinds of real-time guarantees, see strict serializable.

Moreover, serializability does not require a per-process order between transactions. A process can observe a write, then fail to observe that same write in a subsequent transaction. In fact, a process can fail to observe its own prior writes, if those writes occurred in different transactions.

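<aside> 💡 Serializability does not require a per-process order between transactions

Even within a single process, no ordering across transactions is guaranteed: a read in a process's later transaction may fail to see a write made by its earlier transaction.

</aside>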

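<aside> 💡 To get strong consistency that respects real-time order across processes, or the execution order within any single process, the only option is Strict Serializability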

</aside>
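The per-process point above can be checked mechanically on a tiny history. The following is a minimal sketch, not a production checker: it assumes a hypothetical encoding in which each transaction is a list of `("w", key, value)` or `("r", key, observed_value)` operations, and it brute-forces every serial order. The names `replays_cleanly` and `serial_orders` are made up for this illustration.

```python
from itertools import permutations

def replays_cleanly(order, txns):
    """Run the transactions one after another in `order`.
    Return True if every read observes the most recent write (or None if unwritten)."""
    state = {}
    for name in order:
        for kind, key, value in txns[name]:
            if kind == "w":
                state[key] = value
            elif state.get(key) != value:
                return False
    return True

def serial_orders(txns, must_precede=()):
    """List every serial order that explains the observed reads.
    `must_precede` holds optional (earlier, later) pairs, e.g. a per-process order."""
    found = []
    for order in permutations(txns):
        pos = {name: i for i, name in enumerate(order)}
        if all(pos[a] < pos[b] for a, b in must_precede) and replays_cleanly(order, txns):
            found.append(order)
    return found

# Process A commits T1, then runs T2, which fails to see A's own write.
history = {
    "T1": [("w", "x", 1)],
    "T2": [("r", "x", None)],
}

print(serial_orders(history))                               # [('T2', 'T1')]
print(serial_orders(history, must_precede=[("T1", "T2")]))  # []
```

The history is serializable because the order T2, T1 explains it, even though that order contradicts what process A actually did. Once the process order is added as a constraint, no serial order remains, which is the gap that stronger models close.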

The requirement for a total order of transactions is strong, but it still allows pathological orderings. For instance, a serializable database can always return the empty state for any reads, by appearing to execute those reads at time 0. It can also discard write-only transactions by reordering them to execute at the very end of the history, after any reads. Operations like increments can also be discarded, assuming the result of the increment is never observed. Luckily, most implementations don’t seem to take advantage of these optimization opportunities.
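To make the pathological case concrete, here is a small sketch of a hypothetical "lazy" database that answers every read with the empty state. It assumes transactions are either read-only or write-only, encoded as lists of `("r", key)` and `("w", key, value)` requests; `lazy_database`, `pathological_order`, and `replays_cleanly` are illustrative names, not part of any real system.

```python
def lazy_database(requests):
    """Answer every read with None (the empty state); accept writes unchanged."""
    return [
        [("r", op[1], None) if op[0] == "r" else op for op in txn]
        for txn in requests
    ]

def pathological_order(observed):
    """Place read-only transactions first ("time 0") and write-only ones last.
    Assumption: no transaction mixes reads and writes."""
    reads = [t for t in observed if all(op[0] == "r" for op in t)]
    writes = [t for t in observed if all(op[0] == "w" for op in t)]
    return reads + writes

def replays_cleanly(serial):
    """True if running the transactions in this order reproduces every observed read."""
    state = {}
    for txn in serial:
        for kind, key, value in txn:
            if kind == "w":
                state[key] = value
            elif state.get(key) != value:
                return False
    return True

# Requests in the order clients actually issued them.
requests = [
    [("w", "x", 1)],
    [("r", "x")],
    [("w", "y", 2)],
    [("r", "x"), ("r", "y")],
]
observed = lazy_database(requests)

print(replays_cleanly(observed))                      # False: not explained by issue order
print(replays_cleanly(pathological_order(observed)))  # True: explained by reads-first order
```

The second check passes because a valid serial order exists, which is all serializability asks for. Nothing in the model forces that order to match the order in which clients issued their requests.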

Formally

The ANSI SQL 1999 spec says:

The execution of concurrent SQL-transactions at isolation level SERIALIZABLE is guaranteed to be serializable. A serializable execution is defined to be an execution of the operations of concurrently executing SQL-transactions that produces the same effect as some serial execution of those same SQL-transactions. A serial execution is one in which each SQL-transaction executes to completion before the next SQL-transaction begins.

… and goes on to define its isolation levels in terms of proscribed anomalies: serializable is repeatable read, but without phenomenon P3:

P3 (“Phantom”): SQL-transaction T1 reads the set of rows N that satisfy some <search condition>. SQL-transaction T2 then executes SQL-statements that generate one or more rows that satisfy the <search condition> used by SQL-transaction T1. If SQL-transaction T1 then repeats the initial read with the same <search condition>, it obtains a different collection of rows.
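The interleaving described in P3 can be imitated with an in-memory table. This is a toy sketch under stated assumptions: the table of cats, the `is_young` predicate, and the `run_interleaving` helper are invented for illustration, and the "snapshot" is just a deep copy taken when T1 starts. It shows one way a phantom can be hidden from T1 in this toy; it does not show how any real database implements SERIALIZABLE.

```python
import copy

def run_interleaving(t1_uses_snapshot):
    """T1 reads a predicate, T2 inserts a matching row, T1 repeats the read."""
    table = [{"name": "Tom", "age": 3}]
    is_young = lambda row: row["age"] < 5  # stands in for the <search condition>

    # T1 either reads the live table or a private copy taken at its start.
    t1_view = copy.deepcopy(table) if t1_uses_snapshot else table

    first = [row["name"] for row in t1_view if is_young(row)]   # T1: initial read
    table.append({"name": "Felix", "age": 2})                   # T2: insert and commit
    second = [row["name"] for row in t1_view if is_young(row)]  # T1: repeated read
    return first, second

print(run_interleaving(t1_uses_snapshot=False))  # (['Tom'], ['Tom', 'Felix'])
print(run_interleaving(t1_uses_snapshot=True))   # (['Tom'], ['Tom'])
```

In the first run the repeated read returns a different collection of rows, which is exactly the phantom that P3 describes. In the second run T1 sees a stable result because its reads never touch the row T2 added.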

<aside> 💡 Database dirty reads, non-repeatable reads, phantom reads, and the isolation levels that correspond to them

</aside>

However, as Berenson, Bernstein, et al. observed, the ANSI specification allows multiple interpretations, and one of those interpretations (the “anomaly interpretation”) admits nonserializable histories.

Adya’s formalization of transactional isolation levels provides a more thorough summary of the preventative interpretation of the ANSI levels, defining serializability as the absence of four phenomena. Serializability prohibits: