<aside> 📘 Series:

Reflections on Solana's Sept 14 Outage

solana: Tower BFT

Solana Validator

</aside>


When a vote is added to the tower, the lockouts of all the previous votes in the tower are doubled. With each new vote, a validator commits the previous votes to an ever-increasing lockout. At 32 votes we can consider the vote to be at max lockout; any votes with a lockout equal to or above 1<<32 are dequeued (FIFO). Dequeuing a vote is the trigger for a reward. If the vote on the top of the tower expires before it is dequeued, it and any subsequent expired votes are popped from the vote tower in LIFO order. The validator then needs to rebuild the tower from that point.
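The doubling described above can be written as a small piece of arithmetic. This is only a sketch: the names `INITIAL_LOCKOUT`, `MAX_CONFIRMATIONS`, `lockout`, `lock_expiration_slot`, and `is_max_lockout` are made up for illustration, and the base lockout of 2 is taken from the tables later in this note rather than from any validator code.

```python
INITIAL_LOCKOUT = 2      # assumption: a fresh vote starts with a lockout of 2 slots
MAX_CONFIRMATIONS = 32   # from the text: at 32 votes the lockout reaches 1 << 32


def lockout(confirmation_count: int) -> int:
    """Lockout in slots once a vote has `confirmation_count` confirmations.

    The vote itself counts as the first confirmation, and each later vote
    stacked on top of it doubles the lockout.
    """
    return INITIAL_LOCKOUT ** confirmation_count


def lock_expiration_slot(vote_slot: int, confirmation_count: int) -> int:
    """Last slot at which the vote is still locked out."""
    return vote_slot + lockout(confirmation_count)


def is_max_lockout(confirmation_count: int) -> bool:
    """True once the lockout is equal to or above 1 << 32."""
    return lockout(confirmation_count) >= 1 << MAX_CONFIRMATIONS


# Print how the lockout of a vote cast at slot 1 grows with each confirmation.
for count in (1, 2, 3, 4, 32):
    print(count, lockout(count), lock_expiration_slot(1, count), is_max_lockout(count))
```

With four confirmations, a vote at slot 1 has a lockout of 16 and a lock expiration slot of 17, which is the bottom row of the first table in the example below.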

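<aside> 📘 Every validator tries to vote on as many slots as possible and to push its votes to max lockout (32 stacked votes, i.e. a lockout of 2^32) as quickly as possible in order to collect the reward. This resembles building a tower that is complete once it is 32 votes high, which is why the algorithm is called the vote tower, or Tower Byzantine Fault Tolerance (Tower BFT).

Lockout is also called timeout. Both refer to the time remaining before a vote expires, measured in slots. A validator produces a slot roughly every 400 ms, so lockout (timeout) × 0.4 s gives the expiration time in seconds.

A lockout is not only the expiration time of a vote. It is also the validator's commitment that it will not vote on another fork until that vote expires.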

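<aside> ❓ One question: the lockout/timeout doubles every time, and the docs say 2^32 == 136 yrs, which is long enough to assume that a rollback will never happen. But that does not mean a validator actually has to spend that long. Five consecutive votes, about 2 s, are enough to produce a lockout of 32. Isn't that the real attack window?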

</aside>

</aside>
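To put numbers on the note above, the conversion from a lockout in slots to wall-clock time can be sketched as follows. It assumes the nominal 0.4 s slot time mentioned in the note (real slot times vary), and `SLOT_SECONDS`, `SECONDS_PER_YEAR`, and `lockout_seconds` are illustrative names only.

```python
SLOT_SECONDS = 0.4                    # assumption: nominal slot time from the note above
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def lockout_seconds(lockout_slots: int) -> float:
    """Wall-clock duration of a lockout, assuming a fixed slot time."""
    return lockout_slots * SLOT_SECONDS


# Five consecutive votes take about 5 * 0.4 = 2 s to cast and leave the
# first of them with a lockout of 2**5 = 32 slots.
print(lockout_seconds(2 ** 5))                       # about 12.8 seconds

# Max lockout: 2**32 slots at 0.4 s per slot.
print(lockout_seconds(2 ** 32) / SECONDS_PER_YEAR)   # about 54.4 years

# For comparison, 2**32 seconds (not slots).
print(2 ** 32 / SECONDS_PER_YEAR)                    # about 136.1 years
```

Under these assumptions a 32-slot lockout lasts about 12.8 s, 2^32 slots come to roughly 54 years, and the 136-year figure corresponds to 2^32 seconds rather than 2^32 slots.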

Vote Tower

Before a vote is pushed to the tower, votes are popped from the top for as long as their lock expiration slot is lower than the slot of the new vote. After such a rollback, lockouts are not doubled until the validator catches up to the rollback height of votes.
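The pop, push, and double steps can be sketched as a small simulation. This is a simplified model, not the validator's implementation. `Vote` and `record_vote` are hypothetical names, and the rule "double a lockout only when the tower is deeper than that vote's position plus its confirmation count" is an assumption, chosen because it expresses "not doubled until the validator catches up to the rollback height". Dequeuing at max lockout and rewards are left out.

```python
from dataclasses import dataclass


@dataclass
class Vote:
    slot: int
    confirmation_count: int = 1   # the vote itself is the first confirmation

    @property
    def lockout(self) -> int:
        return 2 ** self.confirmation_count

    @property
    def lock_expiration_slot(self) -> int:
        return self.slot + self.lockout


def record_vote(tower: list[Vote], slot: int) -> None:
    """Apply a vote for `slot` to `tower` (index 0 is the oldest vote, the end is the top)."""
    # 1. Rollback: pop votes off the top while they expired before `slot`.
    while tower and tower[-1].lock_expiration_slot < slot:
        tower.pop()
    # 2. Push the new vote; it starts with a lockout of 2.
    tower.append(Vote(slot))
    # 3. Double a lockout only where the tower has grown past the depth that
    #    vote has already seen (assumed rule, see the lead-in).
    depth = len(tower)
    for index, vote in enumerate(tower):
        if depth > index + vote.confirmation_count:
            vote.confirmation_count += 1


# Replay the slots used in the worked example: 1-4 build the tower, 9 rolls back.
tower: list[Vote] = []
for slot in (1, 2, 3, 4, 9, 10, 11, 18):
    record_vote(tower, slot)
    # Print top of the tower first: (vote slot, lockout, lock expiration slot).
    print(slot, [(v.slot, v.lockout, v.lock_expiration_slot) for v in reversed(tower)])
```

Printing the tower after every vote makes it easy to compare each state with the tables in the example that follows.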

For example, a vote tower with the following state:

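<aside> 📘 About the table: each row is one vote in the tower, with the newest vote on top, and lock expiration slot = vote slot + lockout.

</aside>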

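| vote | vote slot | lockout | lock expiration slot |
| ---: | --------: | ------: | -------------------: |
| 4 | 4 | 2 | 6 |
| 3 | 3 | 4 | 7 |
| 2 | 2 | 8 | 10 |
| 1 | 1 | 16 | 17 |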

Vote 5 is at slot 9, and the resulting state is:

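<aside> 📘 Vote 5 goes straight to slot 9, which suggests the validator was idle between slots 4 and 9. During that gap every vote with a lock expiration slot < 9 expired, so votes 3 and 4 are removed.

Every new vote starts with a lockout of 2.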

</aside>

| vote | vote slot | lockout | lock expiration slot |
| ---: | --------: | ------: | -------------------: |
| 5 | 9 | 2 | 11 |
| 2 | 2 | 8 | 10 |
| 1 | 1 | 16 | 17 |

Vote 6 is at slot 10:

<aside> 📘 Vote 6 doubles the lockout of vote 5, and the lock expiration slot of vote 5 grows accordingly.

Vote 6 is cast on slot 10. Note that the lock expiration slot of vote 2 is exactly 10: unless a later vote doubles the lockout of vote 2, vote 2 is about to expire.

</aside>

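| vote | vote slot | lockout | lock expiration slot |
| ---: | --------: | ------: | -------------------: |
| 6 | 10 | 2 | 12 |
| 5 | 9 | 4 | 13 |
| 2 | 2 | 8 | 10 |
| 1 | 1 | 16 | 17 |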

At slot 10 the new votes have caught up to the previous votes. When vote 7 at slot 11 is applied, we scan from the top down to pop expired votes. Vote 2 has expired (its lock expiration slot is 10), but vote 6 at the top of the tower has not, so the scan stops immediately and nothing is popped. Pushing vote 7 brings the tower to a new stack depth, so the lockouts are doubled:

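<aside> 📘 Vote 7 lands on slot 11 without popping anything, so the lockouts of all its ancestors are doubled.

If vote 7 had instead landed on slot 14 or later, votes 6 and 5 would have expired and been popped, which would expose vote 2 and expire it as well.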

</aside>

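| vote | vote slot | lockout | lock expiration slot |
| ---: | --------: | ------: | -------------------: |
| 7 | 11 | 2 | 13 |
| 6 | 10 | 4 | 14 |
| 5 | 9 | 8 | 17 |
| 2 | 2 | 16 | 18 |
| 1 | 1 | 32 | 33 |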
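One way to read the "new stack depth" condition is that a vote's lockout doubles only when the tower is deeper than that vote's position plus the confirmations it already has. The sketch below isolates that check. `doubled_positions` is a hypothetical helper, and the rule itself is an assumption that happens to reproduce the tables in this note.

```python
def doubled_positions(confirmation_counts: list[int]) -> list[int]:
    """Return the positions (0 = bottom of the tower) whose lockout doubles.

    `confirmation_counts` lists the tower bottom to top right after the new
    vote has been pushed, so the last entry is always 1.
    """
    depth = len(confirmation_counts)
    return [
        position
        for position, count in enumerate(confirmation_counts)
        if depth > position + count
    ]


# Vote 6 pushed on top of votes 1, 2, 5 (lockouts 16, 8, 2 -> counts 4, 3, 1):
print(doubled_positions([4, 3, 1, 1]))     # [2] -> only vote 5 doubles

# Vote 7 pushed on top of votes 1, 2, 5, 6 (counts 4, 3, 2, 1):
print(doubled_positions([4, 3, 2, 1, 1]))  # [0, 1, 2, 3] -> every older vote doubles
```

Under this reading, vote 6 only restores the depth the tower had before the rollback, so just vote 5 doubles. Vote 7 is the first vote to exceed that depth, so every older vote doubles.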

Finally, vote 8 lands at slot 18, which leads to the expiry of vote 7, vote 6, and vote 5.

<aside> 📘 Here the validator was idle for a long time again: vote 8 goes straight to slot 18, so votes 5, 6, and 7 all expire.

</aside>