Jython Journeys

Notes about my work with jython and python

Asynchronous networking for jython: the socket module

Over the Christmas holidays in 2003, I wrote a design for how Asynchronous Socket I/O might be implemented in jython, using java.nio. I wrote up some notes in HTML, and placed them on a group of pages on xhaus.com. These notes were the basis of the design which I later used to implement asynchronous I/O for jython sockets, which is now a part of the jython distribution.

Although these notes are now out-of-date, having been surpassed by the actual implementation itself, I am publishing them here for historical purposes. The notes are broken down into four main areas.

  1. Overview
  2. Socket module
  3. Select module
  4. Asyncore module

For documentation on how to use jython’s asynchronous socket I/O, see the documentation for the socket, select, asyncore, and asynchat modules.


Modifications to the existing jython socket module

There are a number of modifications that would need to be made to the existing jython socket implementation in order to facilitate support for timeouts and for non-blocking operations. These modifications are discussed in this document.


Socket objects vs. SocketChannel objects

Jython needs a facility that permits the programmer to “watch” the readiness status of multiple I/O channels. In cpython, file descriptors are the handles used to represent watchable channels. A typical code sequence might look like this

import select
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
fd = s.fileno()
po = select.poll()
po.register(fd)
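
As a rough illustration of what "watching" means on cpython, the sequence above can be extended into a complete run. This is only a sketch: it assumes a platform where both select.poll and socket.socketpair are available, and it uses a local socket pair in place of a real network connection.

```python
import select
import socket

# A connected pair of local sockets stands in for a network connection.
a, b = socket.socketpair()

fd = b.fileno()                  # on cpython this is a plain integer
po = select.poll()
po.register(fd, select.POLLIN)   # ask to be told when b becomes readable

a.send(b"x")                     # writing to a makes b readable
events = po.poll(1000)           # wait for at most 1000 milliseconds

# poll() reports (file descriptor, event mask) pairs
ready_fds = [ready_fd for ready_fd, _mask in events]

a.close()
b.close()
```

The value handed to register() is just an integer; it is this handle that has no direct counterpart on the java platform.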

However, the concept of a file descriptor has no meaning on the java platform, since the JVM does not use file descriptors (they are too platform specific). Instead, in a jython implementation, it will be necessary to obtain a java.nio.channels.SelectableChannel representing the channel to be watched.

Fortunately, all socket objects in the java.net package have a getChannel() method (as of java 1.4 anyway).

Unfortunately, this method returns a null value when used with a socket created by the constructors java.net.Socket, java.net.ServerSocket, java.net.DatagramSocket. The only time when java.net.*Socket.getChannel() returns a SelectableChannel is when the socket itself was created through the java.nio.channels package. In all cases, this is done by calling a static method of the relevant class. The creation methods for the different channel types are

Channel type      | java.net constructor      | java.nio construction method
TCP client socket | java.net.Socket()         | java.nio.channels.SocketChannel.open()
TCP server socket | java.net.ServerSocket()   | java.nio.channels.ServerSocketChannel.open()
UDP socket        | java.net.DatagramSocket() | java.nio.channels.DatagramChannel.open()

This restriction means that all sockets that are to be processed using this proposed non-blocking socket I/O API must be created through the java.nio construction method outlined above.

However, it is not simply a case of switching over to the new methods for all socket creation, since it is hoped to also support as many operations as possible (e.g. timeouts) on JVMs prior to version 1.4. Therefore, it is necessary to separate out socket creation operations into a single set of factory methods or functions, which can be altered depending on the runtime environment in which they are run. The following is one proposed code sequence for creating socket factory functions, which should work on all JVMs. Another possible approach might be to create a factory class which dynamically modifies its behaviour.

import java.net  # needed by the fallback branch below

try:
    import java.nio.channels
    print "using java.nio"
 
    def _createclientsock():
        return java.nio.channels.SocketChannel.open().socket()
 
    def _createserversock():
        return java.nio.channels.ServerSocketChannel.open().socket()
 
    def _createdatagramsock():
        return java.nio.channels.DatagramChannel.open().socket()
 
except ImportError:
    print "using java.net"
 
    def _createclientsock():
        return java.net.Socket()
 
    def _createserversock():
        return java.net.ServerSocket()
 
    def _createdatagramsock():
        return java.net.DatagramSocket()
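
The factory class alternative mentioned above might be sketched roughly as follows. Everything here is hypothetical: the class name SocketFactory, the injectable importer argument, and the constructor_name() method are invented for illustration. So that the sketch can run outside jython, it returns the name of the java call that would be made rather than a real socket.

```python
class SocketFactory:
    """Hypothetical factory that decides once, at construction, which API to use."""

    # Names of the java calls from the table of creation methods.
    _constructors = {
        "java.nio": {
            "client": "java.nio.channels.SocketChannel.open().socket()",
            "server": "java.nio.channels.ServerSocketChannel.open().socket()",
            "datagram": "java.nio.channels.DatagramChannel.open().socket()",
        },
        "java.net": {
            "client": "java.net.Socket()",
            "server": "java.net.ServerSocket()",
            "datagram": "java.net.DatagramSocket()",
        },
    }

    def __init__(self, importer=__import__):
        # importer is injectable only so that this sketch can be exercised
        # outside jython; real code would use a plain import statement.
        try:
            importer("java.nio.channels")
            self.api = "java.nio"
        except ImportError:
            self.api = "java.net"

    def constructor_name(self, kind):
        """Return the name of the java call that would create a socket of this kind."""
        return self._constructors[self.api][kind]


def _nio_missing(name):
    # Simulates a pre-1.4 JVM, where java.nio cannot be imported.
    raise ImportError(name)


old_jvm = SocketFactory(importer=_nio_missing)
fallback_call = old_jvm.constructor_name("client")
```

Compared with the module-level functions, a class keeps the decision in one place and makes it possible to substitute a different factory in tests.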

Note that it should be sufficient to return simple java.net socket objects from these functions, because

  1. It requires less code modification to the existing jython socket module: the existing code is designed to work with java.net.*Socket objects.
  2. In the case of 1.4 JVMs, where support for non-blocking operations is provided, the SelectableChannel corresponding to a socket can always be retrieved from the socket’s getChannel() method.
  3. In the case of pre-1.4 JVMs, which do not support non-blocking I/O, calls to the getChannel() method will fail. This does not matter, since non-blocking support is not provided on pre-1.4 JVMs.

In summary

  1. Sockets should be created using the java.nio APIs where possible
  2. Sockets should be created using the java.net APIs if java.nio is not available
  3. Both these factory methods should return objects which implement the same interface, i.e. the interface presented by socket objects in the java.net package.
  4. When a java.nio.channels.SelectableChannel is required, it can be retrieved from the socket using the getChannel() method.

Socket creation in existing jython socket module

The existing jython socket module constructs all sockets by using constructors from the java.net package. However, as discussed above, if a socket channel is to be watchable, a java.nio.channels.SelectableChannel object is required. Thus, the existing jython socket module will have to be modified to use the factory approach outlined above, which uses java.nio where possible, and java.net where not. The only parts of the socket module that should require modification are the parts that actually create sockets.

For example, the following is the current code for the connect method of socket objects

def connect(self, addr, port=None):
    "This signifies a client socket"
    if port is not None:
        addr = (addr, port)
    assert not self.sock
    host, port = addr
    if host == "":
        host = java.net.InetAddress.getLocalHost()
    self._setup(java.net.Socket(host, port))

Which could be replaced with code similar to the following

def connect(self, addr, port=None):
    "This signifies a client socket"
    if port is not None:
        addr = (addr, port)
    assert not self.sock
    host, port = addr
    if host == "":
        host = java.net.InetAddress.getLocalHost()
    # Create an unconnected socket through the factory, then connect it
    sock = _createclientsock()
    self._setup(sock)
    self.sock.connect(java.net.InetSocketAddress(host, port))

Note that this code is not yet sufficient: there is still timeout and blocking configuration to be taken into account. See below under Implementing the cpython 2.3 blocking/timeout model.


Types of channels that can be examined

In cpython, when a channel is to be watched, the file descriptor representing the channel is registered with a poll object. However, whether or not the channel is watchable is determined by the operating system, and is thus platform dependent. This model fits very well with cpython’s late-binding ethos.

In contrast, java has a very precise object model which differentiates watchable channels from other channel types by having watchable channel types extend the abstract class java.nio.channels.SelectableChannel. If a java object is not a subclass of this abstract class, it cannot be “watched”.

This means that a mechanism is required for obtaining the underlying SelectableChannel from any given socket, through a public method, preferably re-using a known and comparable method from the cpython 2.3 API.

It is proposed to use the fileno() method (currently unimplemented in jython) to enable the programmer to obtain a SelectableChannel object which can be passed to poll objects. A file descriptor is the closest analogy to a SelectableChannel in the cpython model, and the return value of fileno() is most frequently used as a SelectableChannel would be. However, since the two concepts are different, it is proposed to provide two separately named methods that share one implementation. Thus, the fileno method for sockets could be defined as follows in the jython socket module.

class _tcpsocket:
 [....]
    def getchannel(self):
        return self.sock.getChannel()
 
    fileno = getchannel

The getchannel() method could be used internally to the new jython socket module, since it has inherent knowledge that the returned value is not a file descriptor but a SelectableChannel.

A similar method would be required for the _udpsocket class.


Implementing the cpython 2.3 blocking/timeout model

The following cpython socket methods (as of cpython 2.3) are currently not available on jython socket objects, mostly because they relate to non-blocking I/O or timeouts.

Method      | Description                                                                                   | Notes
setblocking | Permits the programmer to configure a socket to be in blocking or non-blocking mode.          | This is the direct equivalent of the java.nio.channels.SelectableChannel.configureBlocking() method.
settimeout  | Permits the programmer to set the timeout value for a timeout mode socket.                    | The closest java equivalent is the setSoTimeout() method of the java.net socket classes.
gettimeout  | Permits the programmer to retrieve the timeout value for a timeout mode socket.               | This should return the currently configured timeout value.

In the cpython socket model, the first two methods above are closely related. The cpython module reference for the socket module says the following


Some notes on socket blocking and timeouts: A socket object can be in one of three modes: blocking, non-blocking, or timeout. Sockets are always created in blocking mode. In blocking mode, operations block until complete. In non-blocking mode, operations fail (with an error that is unfortunately system-dependent) if they cannot be completed immediately. In timeout mode, operations fail if they cannot be completed within the timeout specified for the socket. The setblocking() method is simply a shorthand for certain settimeout() calls.


Timeout mode internally sets the socket in non-blocking mode. The blocking and timeout modes are shared between file descriptors and socket objects that refer to the same network endpoint. A consequence of this is that file objects returned by the makefile() method should only be used when the socket is in blocking mode; in timeout or non-blocking mode file operations that cannot be completed immediately will fail.
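
The relationship between the two methods can be observed directly on cpython through gettimeout(). The following sketch assumes that no default timeout has been set with socket.setdefaulttimeout(), so that a new socket starts in blocking mode.

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
initial = s.gettimeout()             # blocking mode is reported as None

s.setblocking(0)                     # shorthand for settimeout(0.0)
after_nonblocking = s.gettimeout()

s.settimeout(2.5)                    # timeout mode
after_timeout = s.gettimeout()

s.setblocking(1)                     # shorthand for settimeout(None)
after_blocking = s.gettimeout()

s.close()
```

So the three modes are distinguished only by the timeout value: None for blocking, zero for non-blocking, and a positive number for timeout mode.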

The following algorithms for a jython implementation of the above methods could be proposed, in order to reflect the same semantics as the cpython API.

    def settimeout(self, timeout):
        if timeout is None:
            self.setblocking(1)
        else:
            millis = int(timeout * 1000)
            if millis == 0:
                self.setblocking(0)
            else:
                self.sock.setSoTimeout(millis)
 
    def setblocking(self, flag):
        try:
            self.getchannel().configureBlocking(flag)
        except:
            raise JynioException('Channel cannot be configured for blocking')

However, these naive implementations cannot be used in practice, because of the deferred creation of socket implementation objects: the actual underlying implementation socket may not exist when the settimeout() and setblocking() methods are called. Therefore the socket configuration information must be stored locally on the jython socket object, for use in configuring the socket after its deferred creation.

Since the goal of this project is cpython 2.3 compatibility, it is proposed that the three explicit modes mentioned in the cpython documentation be adopted as allowable states of jython socket objects. These modes are blocking, non-blocking and timeout, and could be represented as constants as follows.

MODE_BLOCKING    = 'block'
MODE_NONBLOCKING = 'nonblock'
MODE_TIMEOUT     = 'timeout'

Also, there should be a permitted set of modes for any given socket. Whether or not a socket is in one of the permitted modes can be checked by assertions.

_permittedmodes = (MODE_BLOCKING, MODE_NONBLOCKING, MODE_TIMEOUT)

Note, however, that compatibility with cpython 2.3 on JVM platforms prior to 1.4 is also a goal. On such JVMs there are only two permissible modes of operation, blocking and timeout; non-blocking operation is not supported. Therefore, the code above would have to be written as

try:
    import java.nio.channels
    _permittedmodes = (MODE_BLOCKING, MODE_NONBLOCKING, MODE_TIMEOUT)
except ImportError:
    _permittedmodes = (MODE_BLOCKING, MODE_TIMEOUT)

The definition of the above methods then becomes (including required modifications to other functions/methods).

def socket(family, type, flags=0):
    [.....]
    self.blockmode = MODE_BLOCKING
    self.timeout = None  # blocking mode has no timeout, as in settimeout(None)
    [.....]
 
 
class _tcpsocket:
 
    def _config(self):
        assert self.blockmode in _permittedmodes
        if not self.sock: return
        if self.blockmode == MODE_BLOCKING:
            self.sock.setSoTimeout(0)
        elif self.blockmode == MODE_NONBLOCKING:
            # getchannel() is the jython-level wrapper around getChannel()
            self.getchannel().configureBlocking(0)
        elif self.blockmode == MODE_TIMEOUT:
            self.sock.setSoTimeout(int(self.timeout*1000))
 
    def settimeout(self, timeout):
        if timeout is None:
            self.blockmode = MODE_BLOCKING
            self.timeout = None
        elif int(timeout * 1000) == 0:  # compare in milliseconds, so that 0.5 is not truncated to 0
            self.blockmode = MODE_NONBLOCKING
            self.timeout = 0
        else:
            self.blockmode = MODE_TIMEOUT
            self.timeout = timeout
        self._config()
 
    def setblocking(self, flag):
        if flag:
            self.blockmode = MODE_BLOCKING
            self.timeout = None  # blocking mode has no timeout, as in settimeout(None)
        else:
            self.blockmode = MODE_NONBLOCKING
            self.timeout = 0
        self._config()

Note the introduction of the _config() method, which will be required on the _tcpsocket and _udpsocket classes. When and where this method should be used is discussed below.
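
The mapping from a settimeout() argument to one of the three modes can also be isolated and checked on its own. The helper name mode_for_timeout is hypothetical. The sketch converts to milliseconds before testing for zero, on the assumption that a sub-second timeout such as 0.5 should select timeout mode rather than non-blocking mode.

```python
MODE_BLOCKING = 'block'
MODE_NONBLOCKING = 'nonblock'
MODE_TIMEOUT = 'timeout'


def mode_for_timeout(timeout):
    """Hypothetical helper: map a settimeout() argument to (mode, milliseconds)."""
    if timeout is None:
        return MODE_BLOCKING, None
    millis = int(timeout * 1000)   # java timeouts are whole milliseconds
    if millis == 0:
        return MODE_NONBLOCKING, 0
    return MODE_TIMEOUT, millis


examples = [mode_for_timeout(t) for t in (None, 0, 0.5, 1.0)]
```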


Deferred creation of socket implementation objects

In cpython, after a socket object has been created, it can be used as either a server or a client socket. However, on the java platform, this “late-binding” on the socket implementation object itself cannot be used, because server and client socket objects are implemented through separate classes on the JVM.

The current jython socket module defers the creation of socket objects until it is known for what purpose the socket will be used. For example, when the connect() method is called on a socket object, it is then known that this is to be a client socket.

All such methods that determine the final server/client/datagram nature of a socket (i.e. that both create and configure sockets) must be modified to take the new timeout and blocking configuration options into account. This can be done through the _config() method discussed in the previous section. The sequence of operations should then be

  1. Create the socket
  2. Configure the socket for timeouts, blocking, etc
  3. Carry out the desired operation on the socket (e.g. accept, connect, etc)

Taking the above definition of the socket.connect() method, and modifying it to support blocking configuration gives the following code.

    def connect(self, addr, port=None):
        "This signifies a client socket"
        if port is not None:
            addr = (addr, port)
        assert not self.sock
        host, port = addr
        if host == "":
            host = java.net.InetAddress.getLocalHost()
        sock = _createclientsock()
        self._setup(sock)
        self._config() # Configure timeouts, etc, now that the socket exists
        self.sock.connect(java.net.InetSocketAddress(host, port))

The following is a typical expected call sequence for using sockets with timeouts.

from socket import *
 
s = socket(AF_INET, SOCK_STREAM)
s.settimeout(1.0)
s.connect( ('www.python.org', 80) )

Socket methods that would require similar modification to the connect() method above are

  1. _tcpsocket.accept()
  2. _tcpsocket.listen()
  3. _udpsocket.bind()
  4. _udpsocket.connect()
  5. _udpsocket.sendto()

Exception compatibility with cpython

It is desirable that the proposed new jython socket module present the same “exceptions interface” as the cpython 2.3 socket module. This is currently not the case with the existing jython socket module. Consider the following two logs of interactive sessions with socket, one on cpython 2.3 and the other on jython 2.1.

Python 2.3 (#46, Jul 29 2003, 18:54:32) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from socket import *
>>> s = socket(AF_INET, SOCK_STREAM)
>>> s.connect( ('anamethatdoesnotexist', 80) )
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<string>", line 1, in connect
socket.gaierror: (4, 'getaddrinfo failed')

And

Jython 2.1 on java1.4.2 (JIT: null)
Type "copyright", "credits" or "license" for more information.
>>> from socket import *
>>> s = socket(AF_INET, SOCK_STREAM)
>>> s.connect( ('anamethatdoesnotexist', 80) )
Traceback (innermost last):
  File "<console>", line 1, in ?
  File "C:\jython21\Lib\socket.py", line 135, in connect
java.net.UnknownHostException: anamethatdoesnotexist
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:153)
        [.......]
        at org.python.util.jython.main(jython.java)
 
java.net.UnknownHostException: java.net.UnknownHostException: anamethatdoesnotexist

This difference between the exceptions reported on the two platforms makes it more difficult to write robust portable code, and can lead to unnecessarily increased code density. It is desirable that the jython socket module generate the same exceptions as the cpython 2.3 module, to make the programmer’s job more straightforward.
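
One possible shape for such a translation layer is a lookup table from java exception classes to cpython socket exceptions. The sketch below is speculative: the table, the _translate() helper and the stand-in UnknownHostException class are all invented so that the idea can be run on cpython, and the error tuple is simply copied from the cpython log above.

```python
import socket


class UnknownHostException(Exception):
    """Stand-in for java.net.UnknownHostException."""


# Hypothetical table: java exception class -> function building the cpython exception
_exception_map = {
    UnknownHostException: lambda exc: socket.gaierror(4, 'getaddrinfo failed'),
}


def _translate(exc):
    """Return the cpython-style exception for exc, or exc itself if unmapped."""
    for java_class, make in _exception_map.items():
        if isinstance(exc, java_class):
            return make(exc)
    return exc


def connect(host, port):
    try:
        # A real implementation would call into java here; this sketch
        # always fails, the way the jython session above does.
        raise UnknownHostException(host)
    except Exception as exc:
        raise _translate(exc)


try:
    connect('anamethatdoesnotexist', 80)
    caught = None
except socket.gaierror as err:
    caught = err
```

With this in place, portable code can catch socket.gaierror on both platforms.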

However, this is a non-trivial task. There is a very wide variety of socket exceptions that can be reported, reflecting the wide variety of errors or problems that can be encountered with sockets, especially in relation to resource usage and other physical considerations. Also, the same exception sometimes differs subtly in meaning between platforms.

Another point is that although it may not be trivial to report the same exceptions as cpython, the following question should be asked: Is it correct to raise Java platform exceptions from socket methods?

To be discussed. I know this is not a trivial problem.

Written by alan.kennedy

December 26th, 2003 at 9:10 pm

Posted in jython
