Cleansocks

This semester we will use the Cleansocks library. It provides a C++ interface to the standard sockets library, designed to preserve the concepts and procedures of sockets while hiding some of the C-ish grunginess. It was created under Linux and subsequently dragged over to Windows, so it can be used on either platform. Porting it to other platforms which support some kind of socket-like networking should be possible as well. The current release is version 0.1.4.

Download links and installation instructions are here. This page describes the interface.

I have ported some of Comer's CNAIAPI examples to use Cleansocks. They are here:
Ported CNAI Chat Client Example
Ported CNAI Chat Server Example
Ported CNAI Web Client Example
Ported CNAI Web Server Example

Sockets Proper

This section describes the socket interface “proper”. Sockets are an abstract interface designed to deal with any sort of networking. To use the classes and operations in this section, include the header file cleansocks.h. All types and operations described here are defined in the cleansocks namespace, so you might want to say using namespace cleansocks;, or qualify the names you are using.

class socket s;

A socket is a communications endpoint. All messages or connections travel between two sockets, usually located on different computers. A C++ object of class socket represents one of these endpoints. You won't generally use this class directly, however, since it is abstract and does not belong to any particular type of network. You will want to create concrete sockets, such as the TCPsocket object described below. Concrete sockets are derived from class socket, and can be used with all the operations described here.

class endpoint e;

This class is also abstract. An endpoint is the name of a communications endpoint: essentially, the identifier of a socket. The system uses an endpoint description to direct information to the correct socket. As with sockets, you will not create endpoints directly, but will use the concrete versions for a specific type of network, such as the IPendpoint described later. An IPendpoint is a host name combined with a port number.

connect(socket s, endpoint e)

This call attempts to connect the socket s to the remote endpoint identified by e. It is used for stream-oriented protocols like TCP, usually by client programs. If the connection is successful, the socket can be used in the send and recv operations below. If unsuccessful, the method will throw an exception.

bind(socket s, endpoint e)

This associates socket s with the local endpoint identifier e located on this computer. This is most often used by servers to specify where clients will need to send information to contact them. For instance, a TCP-based server will use bind to specify which port clients should try to contact.

It will succeed or throw an exception.

listen(socket s [, int b])

This places socket s in listen mode. It is used by connection-oriented protocols to allow other sockets to connect to s. Servers use this call to listen for clients to connect. Think of it as turning on your phone so you can receive calls.

The optional parameter b is the backlog size. It tells how many not-yet-accepted connections will be held by the system before new connections are refused. Think of it as the maximum number of calls you are allowed to have waiting.

listen will succeed or throw an exception.

socket s2 = accept(socket s)
socket s2 = accept(socket s, endpoint& e)

Accept receives a connection from a remote socket. The socket s must have been put into listening mode previously. A call to accept suspends the calling thread until some other socket attempts to connect to s, then it resumes the caller and returns. Think of it as answering your phone. Perhaps more accurately, answering your phone after it wakes you up. (This is not unlike reading from the keyboard, where the read waits until you type something.)

The return value s2 is another socket. This one is already connected to the remote socket, and you use s2 (with send and recv) to communicate with the socket that connected to s and woke you up. The e in the second form is an output parameter. After the call returns, it will contain the endpoint id of the socket that connected to s.

accept will succeed or throw an exception.
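Since Cleansocks is meant to preserve the procedures of the standard sockets library, it may help to see the bind, listen, connect and accept sequence once in raw form. The following is a sketch in plain POSIX sockets, not Cleansocks code: the function name loopback_round_trip is made up for illustration, and both ends run in one process over the loopback interface. Here failures show up as return codes, where Cleansocks would throw.

```cpp
// Raw POSIX sockets, NOT Cleansocks: bind / listen / connect / accept
// performed inside a single process over the loopback interface.
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cstring>

// Returns true if the client connected, accept() produced a second socket,
// and a short message sent by the client arrived on that second socket.
bool loopback_round_trip() {
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    int client = socket(AF_INET, SOCK_STREAM, 0);
    int conn = -1;
    bool ok = false;

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                          // let the system pick a free port
    socklen_t len = sizeof addr;
    sockaddr* sa = reinterpret_cast<sockaddr*>(&addr);

    if (listener >= 0 && client >= 0 &&
        bind(listener, sa, sizeof addr) == 0 &&
        listen(listener, 5) == 0 &&             // backlog of 5 waiting calls
        getsockname(listener, sa, &len) == 0 && // learn which port was chosen
        connect(client, sa, sizeof addr) == 0) {
        assert(addr.sin_port != 0);             // getsockname filled in a real port
        // The connection waits in the backlog until accept() takes it.
        conn = accept(listener, nullptr, nullptr);
        const char msg[] = "hello";
        char buf[sizeof msg] = {};
        ok = conn >= 0 &&
             send(client, msg, sizeof msg, 0) == static_cast<ssize_t>(sizeof msg) &&
             recv(conn, buf, sizeof buf, MSG_WAITALL) == static_cast<ssize_t>(sizeof buf) &&
             std::strcmp(buf, msg) == 0;
    }
    if (conn >= 0) close(conn);
    if (client >= 0) close(client);
    if (listener >= 0) close(listener);
    return ok;
}
```

Note that the listening socket is never used to carry data. The message travels between client and the new socket returned by accept, which is the role s2 plays above.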

int i = send(socket s, const void *buf, int size [, int flags ] )

This attempts to send the first size bytes starting from the location given by buf to the socket that s is connected to. That means s must have been passed to a successful connect, or returned by a successful accept. The return value i is the number of bytes actually sent, which should generally be size. Upon failure, send will throw an exception. Whether a return value less than size is bad depends on what protocol you are using. It should not happen in most situations. The optional flags are the options for the standard socket send operation; you won't generally need to provide them. Behavior with flags set may differ a bit from this description.

There is also an alternate version of send which accepts a C++ standard string instead of buf and size.
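If a short return from send does turn up on a stream connection, the usual remedy is to call send again for the remaining bytes. Here is a minimal sketch of that loop. It is not part of Cleansocks: send_fn and send_all are hypothetical names, and send_fn is a stand-in for "send on some connected socket" so that the example compiles on its own.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical stand-in for sending on a connected socket: takes a buffer
// and a size, returns how many bytes were actually accepted.
using send_fn = std::function<int(const void*, int)>;

// Calls send_some repeatedly until all size bytes have been handed over.
// Returns the total sent, which equals size on a normal return.
int send_all(const send_fn& send_some, const void* buf, int size) {
    assert(size >= 0);
    const char* p = static_cast<const char*>(buf);
    int total = 0;
    while (total < size) {
        int n = send_some(p + total, size - total);   // send what is left
        if (n <= 0) throw std::runtime_error("send made no progress");
        total += n;
    }
    return total;
}
```

With a real socket, the callable would simply forward to send on that socket.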

int i = recv(socket s, void *buf, int size [, int flags ] )

This receives up to size bytes from the socket that s is connected to and places them into the location given by buf. That means s must have been passed to a successful connect, or returned by a successful accept. The return value i is the number of bytes actually received, which might be less than size. Upon failure, recv will throw an exception. If there is no data available, recv will cause your program to wait until some arrives. Even then, return values less than size are common, and simply indicate that size bytes have not yet arrived. A return value of zero indicates that the remote socket has been closed, and indicates a normal shutdown.

Same story for flags.
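Because recv often returns fewer bytes than requested, code that needs a fixed number of bytes has to keep asking. The sketch below shows that loop, including the zero return that signals a normal shutdown. It is not Cleansocks code: recv_fn and recv_exact are hypothetical names, and recv_fn stands in for "recv on some connected socket" so the example is self-contained.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for receiving on a connected socket: fills the
// buffer with up to the given number of bytes and returns the count,
// or 0 once the remote socket has been closed.
using recv_fn = std::function<int(void*, int)>;

// Keeps receiving until size bytes have arrived or the remote side closes.
// Returns the number of bytes stored in buf: size normally, fewer if the
// connection was shut down first.
int recv_exact(const recv_fn& recv_some, void* buf, int size) {
    assert(size >= 0);
    char* p = static_cast<char*>(buf);
    int total = 0;
    while (total < size) {
        int n = recv_some(p + total, size - total);   // ask for the remainder
        if (n <= 0) break;                            // 0: remote socket closed
        total += n;
    }
    return total;
}
```

A caller can tell the two outcomes apart by comparing the return value with size.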

int i = sendto(socket s, const void *buf, int size [, int flags ], endpoint e)

The sendto call is used with message-oriented protocols (like UDP), in which sockets are not connected. It sends a single message, size bytes long, from the location indicated by buf to the endpoint e. The function returns the size of the message sent. The sent message size may be less than size if size is too large to be handled by the protocol in use.

int i = recvfrom(socket s, void *buf, int size [, int flags ] [, endpoint & e])

The recvfrom call is also used with message-oriented protocols. It receives a single message and places it into the location indicated by buf, which must be at least size bytes long. If the message is longer than size, the extra bytes are discarded. The function returns the (original) size of the message. If e is provided, it is an output parameter and recvfrom fills it in with the endpoint identifier of the socket which sent the message.

close(s)

This closes the socket, indicating it is no longer used and cleaning up associated system resources. Sockets should be closed when no longer needed (but see below).

Assigning and copying sockets is generally allowed, but can be problematic. A copy of a socket is not an independent resource, but refers to the same underlying object as the original. If the original or the copy is closed, the other will be invalid, and operations will fail. Closing both is usually safe, but could be problematic, at least on a Unix style OS. This reflects a behavior of the underlying socket layer which cleansocks does not attempt to modify. This might be done in a later version.
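The behavior is easiest to see at the level of the underlying socket layer, where a socket is just a small integer descriptor. The sketch below uses raw POSIX calls, not Cleansocks, and assumes a copied Cleansocks socket shares its descriptor in the same way. The function name close_original_then_copy is made up for illustration.

```cpp
// Raw POSIX, NOT Cleansocks: copying a descriptor copies the number,
// not the socket behind it.
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>

// Closes a socket through the original descriptor, then through a copy.
// Returns the errno value produced by the second close (0 if it succeeded).
int close_original_then_copy() {
    int original = socket(AF_INET, SOCK_STREAM, 0);
    assert(original >= 0);
    int copy = original;              // same number, same underlying socket

    int first = close(original);      // releases the one underlying socket
    assert(first == 0);
    (void)first;

    errno = 0;
    int second = close(copy);         // the number no longer names anything
    return second == -1 ? errno : 0;
}
```

Here the second close just fails with EBADF. The danger on a Unix-style system is that descriptor numbers are reused: if the program opens another socket or file between the two closes, the second close can shut down that unrelated object instead.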

IP 4 Sockets

This section describes the part of Cleansocks which deals with IP version 4. Include cleanip.h.

class TCPsocket
class UDPsocket

These are the classes representing sockets that use the TCP and UDP protocols, respectively. They are derived from the socket class, so the socket operations described there can be applied to them.

class IPaddress a;
class IPaddress a("w.x.y.z");
class IPaddress a(unsigned int v);

This class represents an IP version 4 address. It can be constructed from a string giving an address in dotted-decimal notation, or from an unsigned integer value. The default constructor produces an invalid address, but you may want to use it to create a variable which you can assign later. Use lookup_host (below) to get an address from a host name. This succeeds or throws an exception.
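To see how a dotted-decimal string relates to a single unsigned integer, here is a sketch using the standard inet_pton call rather than Cleansocks. The function dotted_to_value is a hypothetical helper. It produces a value with w in the most significant byte; which byte order the IPaddress integer constructor expects is not stated here, so treat that part as an assumption.

```cpp
// Standard POSIX calls, NOT Cleansocks.
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <cassert>
#include <cstdint>

// Converts "w.x.y.z" to a 32-bit value with w in the most significant byte.
// Returns false if text is not a valid dotted-decimal IPv4 address.
bool dotted_to_value(const char* text, std::uint32_t& value) {
    assert(text != nullptr);
    in_addr raw{};
    if (inet_pton(AF_INET, text, &raw) != 1) return false;
    value = ntohl(raw.s_addr);        // network byte order to host integer
    return true;
}
```

For example, "127.0.0.1" comes out as 0x7F000001, one byte per dotted component.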

class IPport;
class IPport(int p)

This class represents an IP port number. It may be constructed by specifying the number, or by default to assign later. Use lookup_service (see below) to look up a port number by its service name.

class IPendpoint;
class IPendpoint(IPaddress a, IPport p)

This is the IP derivative of an endpoint, built from an IP address and a port number.

IPaddress a = lookup_host(string hn)

The lookup_host method takes a host name and returns its IP address. It takes a C++ standard string. It will succeed or throw an exception.

IPport a = lookup_service(string hn)

The lookup_service method takes the name of a standard service and looks up the standard port number. For instance, if you look up http you will get port 80.

IPaddress::any()

This is the so-called wild-card address. It is used to build an IPendpoint that represents a specified port at any address. This is used so that a server can bind to a port on any local address.

Buffered Stream Sockets

The buffered_socket is a convenience which extends the functionality of the basic socket interface. The buffered socket class can be used with any connected socket. It supports send and recv, but the object may read more bytes than the user requests, and save them to return on the next receive. This can increase efficiency of applications which want to do small reads, and allows an efficient recvln method which is useful for line-oriented protocols. Include cleanbuf.h to use this facility.

class buffered_socket q(socket s [, int size ])

This constructs a buffered_socket object, q. The size is the amount of local storage allocated, and defaults to 1024 bytes. The constructor assumes that s supports the send and recv operations, and later operations will fail if this is not true.

int i = recv(buffered_socket s, void *buf, int size [, int flags ] )

This has the same client semantics as the regular socket recv, except that the underlying implementation may ask for more than size bytes, which it then retains and returns upon subsequent calls.

int i = recvln(buffered_socket s, void *buf, int size [, char term ] [, int flags ] )

The recvln method behaves like recv, except that it stops reading at the first occurrence of the terminator character term, which is newline by default. It will continue to try to get data until buf is full, or until term is found. This is very useful for protocols which use line breaks, such as HTTP, FTP, and the email protocols.
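The idea behind buffered line reading can be sketched in a few lines: read a large chunk once, then hand it out one line at a time. This is an illustration of the technique, not the Cleansocks implementation. The names line_buffer and recv_fn are hypothetical, recv_fn stands in for recv on a connected socket, and the choice to copy the terminator into the caller's buffer is an assumption of this sketch.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for recv on a connected socket: fills the buffer
// with up to the given number of bytes, returns the count (0 when closed).
using recv_fn = std::function<int(void*, int)>;

class line_buffer {
public:
    explicit line_buffer(recv_fn source, int capacity = 1024)
        : source_(source), store_(capacity), begin_(0), end_(0) {
        assert(capacity > 0);
    }

    // Copies bytes into buf until term has been copied, buf holds size
    // bytes, or the source reports that the remote side closed.
    // Returns the number of bytes copied.
    int recvln(void* buf, int size, char term = '\n') {
        char* out = static_cast<char*>(buf);
        int n = 0;
        while (n < size) {
            if (begin_ == end_) {     // nothing saved: read a big chunk
                int got = source_(store_.data(), static_cast<int>(store_.size()));
                if (got <= 0) break;  // remote socket closed
                begin_ = 0;
                end_ = got;
            }
            char c = store_[begin_++];
            out[n++] = c;
            if (c == term) break;     // bytes after term stay saved for later
        }
        return n;
    }

private:
    recv_fn source_;
    std::vector<char> store_;         // local storage for bytes read ahead
    int begin_, end_;                 // unread bytes are store_[begin_, end_)
};
```

Two lines that arrive in one network read cost only one call to the source. The second recvln is served entirely from the saved bytes.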

int i = send(buffered_socket s, const void *buf, int size [, int flags ] )

This just calls send on the socket object contained in s. A future version of the library may attempt to buffer sending, but this one does not.

Exceptions

Cleansocks throws exceptions to report errors (the regular socket interface does not use exceptions). The library has several, all derived from the class socket_error, which is derived from the standard C++ runtime_error. Their inheritance relationship is like this:

std::runtime_error
    socket_error
        socket_sys_error
            socket_would_block
        socket_db_error

The socket_error class is the base of any exception detected by this library. There are a few places which throw exceptions of exactly this type. For most purposes, if you need to catch anything, catch socket_error &. The socket_sys_error exceptions represent errors reported from the basic (generic) socket calls described in the first section of this document. The socket_db_error is thrown by failures of host or service name lookups performed by lookup_host or lookup_service. This division reflects one in the underlying socket interface. The std::runtime_error class and all its children have a .what() method which returns a descriptive string. The socket_sys_error and socket_db_error also implement a method .code() which returns the integer error code reported by the system. These values will differ based on the underlying operating system, so you should probably avoid using them.

The socket_would_block error wraps the system error return of the same type (hence it's a socket_sys_error). This occurs only when using non-blocking sends or receives (which is not the default). It is represented by a separate exception since it will often need separate treatment by programs which use it.

The .what() strings for socket_sys_error and socket_db_error are the error messages provided by the underlying system, and will have different text on different OSes. Error codes for socket_db_error are error return values from the underlying socket functions, mapped to strings by the socket call gai_strerror(). For socket_sys_error, under Unix the codes come from errno and are mapped by strerror_r; on Winsock, they come from WSAGetLastError() and are interpreted by FormatMessage(), a seven-parameter monstrosity that could have come from nowhere but Redmond.
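The practical consequence of the hierarchy is the order of your catch clauses: most specific first, socket_error last. The sketch below shows this with stand-in classes placed in a namespace called sketch. They mirror the relationships described above but are not the Cleansocks definitions; the constructors, the classify helper, and the error codes and messages used in demo are all made up for illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

namespace sketch {   // stand-ins only; the real classes come from Cleansocks

struct socket_error : std::runtime_error {
    explicit socket_error(const std::string& msg) : std::runtime_error(msg) {}
};

struct socket_sys_error : socket_error {
    socket_sys_error(const std::string& msg, int code)
        : socket_error(msg), code_(code) {}
    int code() const { return code_; }   // system-specific value
private:
    int code_;
};

struct socket_db_error : socket_error {
    socket_db_error(const std::string& msg, int code)
        : socket_error(msg), code_(code) {}
    int code() const { return code_; }
private:
    int code_;
};

struct socket_would_block : socket_sys_error {
    explicit socket_would_block(int code)
        : socket_sys_error("operation would block", code) {}
};

// Runs op and reports which handler caught its exception, if any.
template <class Op>
std::string classify(Op op) {
    try {
        op();
        return "ok";
    } catch (const socket_would_block&) {        // most specific first
        return "retry later";
    } catch (const socket_db_error& e) {
        return std::string("lookup failed: ") + e.what();
    } catch (const socket_error& e) {            // everything else
        return std::string("socket error: ") + e.what();
    }
}

// Usage: each thrown type lands in the most specific matching handler.
inline void demo() {
    assert(classify([] {}) == "ok");
    assert(classify([] { throw socket_would_block(11); }) == "retry later");
    assert(classify([] { throw socket_db_error("unknown host", 1); })
           == "lookup failed: unknown host");
    assert(classify([] { throw socket_sys_error("connection refused", 111); })
           == "socket error: connection refused");
}

}  // namespace sketch
```

If the socket_error handler came first, it would catch everything and the more specific handlers would never run.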

Thread Safety

The Cleansocks library is implemented using Winsock on Windows and the standard socket and IP lookup calls on Unix, as well as the standard C++ libraries. AFAIK, all the underlying calls are thread-safe. In particular, the host and service lookups on Linux are done using the newer getaddrinfo() call, rather than the older interface.