Repeatable read is closely related to serializability, but unlike serializable, it allows phantoms: if a transaction T1 reads a predicate, like "the set of all people with the name “Dikembe”, then another transaction T2 may create or modify a person with the name “Dikembe” before T1 commits. Individual objects are stable once read, but the predicate itself may not be.
<aside> 💡 从 ANSI SQL 标准来说,Repeatable Read 允许幻读。
但是现实中 DBMS 普遍在实现 RR 的时候避免了幻读,比如 MySQL 的 RR 允许幻读,但是 InnoDB 的 RR 通过 gap lock 避免了幻读。在 DBMS 领域,为了对标 ANSI SQL 的真 RR,又发明了 Read Stability 。
幻读和不可重复读的区别在于,不可重复读读取的是同一个数据,而幻读是读取的不同的数据,只是这些数据作为一个整体的完整性被破坏了。
使用一致性快照可以解决幻读。
‣ - P84
</aside>
Repeatable read is a transactional model: operations (usually termed “transactions”) can involve several primitive sub-operations performed in order. It is also a multi-object property: operations can act on multiple objects in the system.
Repeatable read cannot be totally available; in the presence of network partitions, some or all nodes may be unable to make progress. For total availability, at the cost of allowing fuzzy reads, consider read committed.
Repeatable read implies Cursor Stability , read committed, etc.
Note that repeatable read does not impose any real-time constraints. If process A completes write w, then process B begins a read r, r is not necessarily guaranteed to observe w. For a transactional model that provides real-time constraints, consider Strict Serializability .
Moreover, repeatable read does not require a per-process order between transactions. A process can observe a write, then fail to observe that same write in a subsequent transaction. In fact, a process can fail to observe its own prior writes, if those writes occurred in different transactions.
<aside> 💡 not requre a per-process order between transactions
即使在同一个 process 内也不保证任何跨事务的顺序,一个 process 的后一个事务的读可能看不见上一个事务的写。
</aside>
Like Serializability , repeatable read allows Pathological Ordering . For instance, a repeatable-read database can always return the empty state for any reads, by appearing to execute those reads at time 0. It can also discard write-only transactions by reordering them to execute at the very end of the history, after any reads. Operations like increments can also be discarded, assuming the result of the increment is never observed. Luckily, most implementations don’t seem to take advantage of these optimization opportunities.
The ANSI SQL 1999 spec specifies repeatable read as read committed but disallowing phenomenon P2:
P2 (“Non-repeatable read”): SQL-transaction T1 reads a row. SQL-transaction T2 then modifies or deletes that row and performs a COMMIT. If T1 then attempts to reread the row, it may receive the modified value or discover that the row has been deleted.
However, as Berenson, Bernstein, et al observed, the ANSI specification allows multiple intepretations, and one of those interpretations (the "anomaly interpretation) still admits nonserializable histories for “serializable” systems. Instead, we prefer Adya’s formalization of transactional isolation levels, which provides a concise definition of the preventative interpretation. In this model, repeatable read prohibits:
$P0(Dirty Write): w_1(x)...w_2(x)$:两个 tx 同时修改了某个数据,然后其中一个 tx 执行回滚,无法确认该回滚到什么值。(P0 Dirty Write)
$P1(Dirty Read): w_1(x)...r_2(x)$:读未提交
$P2(Fuzzy Read): r_1(x)...w_2(x)$:两次读到的值不一致(不可重复读)