<aside> 📘 TL;DR

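This article explores three unexpected situations that can arise between the two ends of a TCP connection. The two ends are called alice and bob here, and we assume they have already established a connection.

If alice loses power and reboots, bob still believes the connection is open. When bob sends data, the rebooted alice answers with a RST, bob sees ECONNRESET, and bob closes the connection.
If alice loses power and stays off, bob still believes the connection is open. When bob sends data, nothing answers, so bob keeps retransmitting until it times out (typically about 5 minutes, though the exact value depends on the operating system) and then closes the connection.
If alice sends a FIN to bob, bob may ACK it without sending a FIN of its own. alice is then in FIN_WAIT_2 and can only receive, not send, while bob is in CLOSE_WAIT. The two states are asymmetric, but data can still flow in one direction, so bob can keep sending to alice. Once the gap between messages exceeds a timeout, alice closes the connection. If bob sends again after that, the result is the same as in the first situation: alice answers with a RST and bob sees ECONNRESET. </aside>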
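
The third situation depends on a half-closed connection, which is easy to reproduce locally. The following is a minimal Python sketch, not taken from the article: it assumes both ends run in one process over the loopback interface, the names alice and bob are just labels, and it uses shutdown(SHUT_WR) rather than close() so that alice's side stays open for reading after her FIN is sent.

```python
import socket

# Loopback stand-ins for the two hosts (an assumption for illustration).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
listener.listen(1)

alice = socket.create_connection(listener.getsockname())
bob, _ = listener.accept()
listener.close()

# alice sends her FIN but keeps the socket open for reading (half-close).
alice.shutdown(socket.SHUT_WR)

# bob sees end-of-stream: recv() returns b"" once the FIN has arrived.
eof = bob.recv(1024)

# bob has not closed his side, so he can still send and alice can still receive.
bob.sendall(b"still here\n")
late_data = alice.recv(1024)

print(repr(eof), repr(late_data))

bob.close()
alice.close()
```

The sketch only shows that data keeps flowing from bob to alice after alice's FIN. It does not wait for any timeout, so it says nothing about when alice's side eventually gives up.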

It's been said that we don't really understand a system until we understand how it fails. Despite having written a (toy) TCP implementation in college and then working for several years in industry, I'm continuing to learn more deeply how TCP works — and how it fails. What's been most surprising is how basic some of these failures are. They're not at all obscure. I'm presenting them here as puzzlers, in the fashion of Car Talk and the old Java puzzlers. Like the best of those puzzlers, these are questions that are very simple to articulate, but the solutions are often surprising. And rather than focusing on arcane details, they hopefully elucidate some deep principles about how TCP works.

Prerequisites

These puzzlers assume some basic knowledge about working with TCP on Unix-like systems, but you don't have to have mastered any of this before diving in. As a refresher:

TCP states, the three-way handshake used to establish a connection, and the way connections are terminated are described pretty concisely on the TCP Wikipedia page.
Programs typically interact with sockets using read, write, connect, bind, listen, and accept. There's also send and recv, but for our purposes, these work the same way as read and write.
I'll be talking about programs that use poll. Although most systems use something more efficient like kqueue, event ports, or epoll, these are all equivalent for our purposes. As for applications that use blocking operations instead of any of these mechanisms: once you understand how TCP failure modes affect poll, it's pretty easy to understand how they affect blocking operations as well.

You can try all of these examples yourself. I used two virtual machines running under VMware Fusion. The results match my experiences in our production systems. I'm testing using the nc(1) tool on SmartOS, and I don't believe any of the behavior shown here is OS-specific. I'm using the illumos-specific truss(1) tool to trace system calls and to get some coarse timing information. You may be able to get similar information using dtruss(1m) on OS X or strace(1) on GNU/Linux.

nc(1) is a pretty simple tool. We'll use it in two modes:

As a server. In this mode, nc will set up a listening socket, call accept, and block until a connection is received.
As a client. In this mode, nc will create a socket and establish a connection to a remote server.

In both modes, once connected, each side uses poll to wait for either stdin or the connected socket to have data ready to be read. Incoming data is printed to the terminal. Data you type into the terminal is sent over the socket. Upon CTRL-C, the socket is closed and the process exits.
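
The loop described above can be sketched in a few lines of Python. This is a simplified stand-in, not nc's actual source: the function name relay and its parameters are hypothetical, it stops on end-of-input from either side, and the demo uses a socketpair and pipes in place of a real TCP connection and terminal.

```python
import os
import select
import socket

def relay(sock, in_fd=0, out_fd=1):
    """Copy bytes between a connected socket and a pair of file descriptors."""
    poller = select.poll()
    poller.register(sock, select.POLLIN)
    poller.register(in_fd, select.POLLIN)
    while True:
        for fd, _events in poller.poll():     # block until something is ready
            if fd == sock.fileno():
                data = sock.recv(4096)
                if not data:                  # peer finished sending: stop
                    return
                os.write(out_fd, data)        # show incoming data
            else:
                data = os.read(in_fd, 4096)
                if not data:                  # end of local input: stop
                    return
                sock.sendall(data)            # send typed data to the peer

# Demo with stand-ins: a socketpair plays the connection, pipes play the terminal.
local, remote = socket.socketpair()
term_in_r, term_in_w = os.pipe()
term_out_r, term_out_w = os.pipe()

os.write(term_in_w, b"typed locally\n")       # what the user "typed"
remote.sendall(b"sent by peer\n")             # what the peer sent
remote.shutdown(socket.SHUT_WR)               # peer finishes sending

relay(local, term_in_r, term_out_w)

shown = os.read(term_out_r, 4096)             # what would reach the terminal
received = remote.recv(4096)                  # what the peer got from us
print(shown, received)
```

With the default arguments, relay(sock) would read from stdin and print to stdout, which is the shape of the interactive sessions in the examples that follow.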

In these examples, my client is called kang and my server is called kodos.

Warmup: Normal TCP teardown

This one demonstrates a very basic case just to get the ball rolling. Suppose we set up a server on kodos:

[root@kodos ~]# truss -d -t bind,listen,accept,poll,read,write nc -l -p 8080
Base time stamp: 1464310423.7650 [ Fri May 27 00:53:43 UTC 2016 ]
 0.0027 bind(3, 0x08065790, 32, SOV_SOCKBSD) = 0
 0.0028 listen(3, 1, SOV_DEFAULT) = 0
accept(3, 0x08047B3C, 0x08047C3C, SOV_DEFAULT, 0) (sleeping...)

(Remember, in these examples, I'm using truss to print out the system calls that nc makes. The -d flag prints a relative timestamp and the -t flag selects which system calls we want to see.)

Now on kang, I establish a connection: