Socket shutdown versus socket close on cpython, jython and java.
So, here we are again with translating cpython socket semantics to java socket semantics, in order to correctly implement the cpython socket API on jython.
The latest issue is socket shutdown.
Cpython sockets have two separate methods related to terminating a socket connection: the close method and the shutdown method.
Before we get into the difference between the two, we should note that when a TCP socket connection is to be terminated, a network packet containing a FIN is sent to the peer, informing them to tear down the connection (although sometimes an RST packet can be sent instead). A practice that is sometimes used is to shut down only one side of a connection. For example, when a client knows that its request has been fully transmitted, it shuts down its transmit stream to the server, but leaves the receive stream open for the response. This makes the client's intention for future communications more explicit, and may lighten the load on the server. However, this is a controversial topic, because it can also lead asynchronous servers to mistakenly think that a socket is readable when it is not. You can find a detailed and thorough discussion of this topic in ActiveState Recipe 408997: When to not just use socket.close(); make sure to read all of the comments.
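To make the half-close pattern concrete, here is a minimal cpython sketch. The helper names (serve_once, half_close_request) and the upper-casing "protocol" are invented for illustration, and the whole exchange runs over the loopback interface; treat it as a demonstration under those assumptions rather than a recommended server design.

```python
import socket
import threading

def serve_once(listener):
    """Accept one connection, read until the client half-closes, then reply."""
    conn, _ = listener.accept()
    chunks = []
    while True:
        data = conn.recv(4096)
        if not data:              # empty read: the client shut down its transmit side
            break
        chunks.append(data)
    conn.sendall(b"".join(chunks).upper())
    conn.close()

def half_close_request(payload):
    """Send payload, half-close, and return everything the server sends back."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(1)
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()

    client = socket.create_connection(listener.getsockname())
    client.sendall(payload)
    client.shutdown(socket.SHUT_WR)   # request fully sent; receive side stays open
    reply = b""
    while True:
        data = client.recv(4096)
        if not data:                  # server closed: response is complete
            break
        reply += data
    client.close()
    worker.join()
    listener.close()
    return reply
```

The server here relies on the empty read to know that the request is complete, which is exactly the signal that shutting down the transmit stream provides.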
The close method of sockets closes the file descriptor to which the socket is connected. When the last file descriptor connected to the socket is closed, the operating system also closes the underlying socket, which results in the transmission of the FIN packet. That wording is important: "when the last file descriptor .. is closed" has a specific meaning in many unix network server architectures. The archetypal design for a network server on unix is to accept incoming socket connections in one process, and then spawn a subprocess to service the connection. Since the child process inherits the file descriptor table of its parent, it also inherits a file descriptor for the socket. This means that there can be multiple file descriptors open on a socket, and so closing a socket is not guaranteed to close the connection; that will not happen until all file descriptors for the socket are closed.
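The "last file descriptor" behaviour can be observed on cpython without forking, by duplicating the descriptor with dup(). This is a sketch under some assumptions: it uses socket.socketpair() as a local stand-in for a TCP connection, the helper names are made up, and "end-of-stream seen by the peer" is used as the application-level sign that the connection has been torn down.

```python
import socket

def peer_sees_eof(sock, timeout=0.2):
    """Return True if a recv on sock reports end-of-stream within the timeout."""
    sock.settimeout(timeout)
    try:
        return sock.recv(1) == b""
    except socket.timeout:
        return False              # nothing arrived: the connection is still up

def close_with_duplicate():
    a, b = socket.socketpair()
    a_copy = a.dup()              # second file descriptor on the same socket
    a.close()                     # closes one descriptor only
    eof_after_first_close = peer_sees_eof(b)
    a_copy.close()                # last descriptor gone: the socket itself closes
    eof_after_last_close = peer_sees_eof(b)
    b.close()
    return eof_after_first_close, eof_after_last_close
```

On a platform that behaves as described above, the peer should see nothing after the first close and end-of-stream only after the second.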
The shutdown method, on the other hand, only sends the FIN packet, regardless of how many file descriptors are open on the socket; moreover, it doesn't affect any of those file descriptors, which stay open until they are closed.
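The contrast with close can be sketched the same way on cpython. Again this assumes socket.socketpair() as a local stand-in for a TCP connection, and the function name is hypothetical; the point is only that shutdown acts on the socket itself while the descriptors remain usable.

```python
import socket

def shutdown_with_duplicate():
    a, b = socket.socketpair()
    a_copy = a.dup()                  # second file descriptor on the same socket
    a.shutdown(socket.SHUT_WR)        # acts on the socket, not on one descriptor
    b.settimeout(1.0)
    eof_seen = b.recv(1) == b""       # peer sees end-of-stream despite a_copy
    still_open = a.fileno() != -1 and a_copy.fileno() != -1
    # Only the transmit direction was shut down, so the peer can still answer.
    b.sendall(b"ok")
    a_copy.settimeout(1.0)
    reply = a_copy.recv(2)
    for s in (a, a_copy, b):          # the descriptors still need closing
        s.close()
    return eof_seen, still_open, reply
```

Note that every descriptor is still closed explicitly at the end: shutdown does not release them.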
There are some problems with implementing shutdown on java, and thus on jython. Java servers are almost always multi-threaded, and process incoming requests in spawned threads, not processes. For this reason, and because java does not expose C file descriptors, there are no socket methods that differentiate between the socket and the file descriptor; the two are one and the same in java. This applies to all types of socket.
Implementing shutdown is discussed below for each of the different socket types.
- TCP client sockets are fine: java.net.Socket objects have the shutdownInput and shutdownOutput methods, which shut down the receive stream and the transmit stream respectively.
- TCP server sockets are different: they don't have transmit and receive streams; they exist only to accept incoming socket connections. So when a TCP server socket is shut down, the result of the call should be to close the listen queue for the socket, and discard all queued incoming connections. In java, server sockets do not have a shutdown method (as you can see from their javadocs: java.net.ServerSocket and java.nio.channels.ServerSocketChannel; remember, both classes are required to fully implement all cpython socket semantics in jython). So, how should shutdown be implemented for TCP server sockets? In java, the same effect is achieved by calling the close method. So the choice is whether or not to make shutdown call close. Production code that uses the shutdown method will also call the close method immediately, or soon afterwards; if shutdown had already closed the socket, that series of calls would fail. Therefore, the right choice (IMHO) is to make the shutdown method a no-op, i.e. do nothing, and to document in the jython socket module documentation that the user must call close on the socket.
- UDP sockets similarly don't have input and output streams; calling shutdown on the socket should stop the acceptance of incoming packets. But in java, neither java.net.DatagramSocket nor java.nio.channels.DatagramChannel has a shutdown method; instead, the same effect is achieved by calling the close method. So, the right choice for the shutdown of UDP sockets is to make it a no-op, and document that fact.
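The decisions in the list above can be summarised as a small dispatch sketch. This is not the code that was checked in: RecordingSocket is a hypothetical stand-in for a java.net.Socket that merely records which methods were called, and the "kind" labels and the function name are invented for illustration. Only shutdownInput, shutdownOutput and the SHUT_* constants are real names.

```python
import socket

class RecordingSocket:
    """Hypothetical stand-in for a java.net.Socket; it only records calls."""
    def __init__(self):
        self.calls = []

    def shutdownInput(self):
        self.calls.append("shutdownInput")

    def shutdownOutput(self):
        self.calls.append("shutdownOutput")

def jython_style_shutdown(kind, java_socket, how):
    """Apply the policy described above; 'kind' is a made-up label."""
    if kind != "tcp_client":
        return                    # TCP server and UDP sockets: no-op
    if how in (socket.SHUT_RD, socket.SHUT_RDWR):
        java_socket.shutdownInput()
    if how in (socket.SHUT_WR, socket.SHUT_RDWR):
        java_socket.shutdownOutput()
```

Whatever the socket type, the caller is still expected to call close afterwards; for server and UDP sockets that call is the only one that has any effect.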
I have checked an implementation of the above into the jython SVN: into the release 2.2 maintenance branch at revision 6000, and into trunk at revision 6001. I have put together this page to record my reasoning for implementing things this way, and the research that I've done in trying to arrive at the right implementation. Or at least, an implementation that is as right as possible for the jython programmer.
On a related note, I see that the good people over in Twisted Matrix are making great progress with getting twisted running on jython. It will be fantastic to see one of the premier python asynchronous frameworks working on jython!
Lastly, thanks to Glyph Lefkowitz of twisted for reporting this situation in the jython bug tracker: listening socket shutdown expects the wrong kind of socket.