An Introduction to Networking with Tango
Here we discuss the network components provided by the Tango API. Anyone with even a vague recollection of the old C socket interfaces will recall why network software development has always been such a "trial by ordeal". It is just not fun working with cryptic APIs, seemingly inconsistent conventions, and confusing preprocessor macros. Tango tries to make it all easier. The approach, once again, is to make the network API clear and consistent across platforms.
While claiming an API to be easy does not make it so, we hope the examples and explanations in this chapter will be adequate testimony that Tango "gets it right." The Tango network interface is foundational to many powerful and more complex network objects which will be discussed later.
First, however, let us present a little history of networking. This should give a background as to why fundamental entities like sockets, ports, and internet addresses exist in the first place.
A Brief History of Networking
In the 1970s, the Berkeley Software Distribution community made substantial API contributions through its locally developed derivative of Bell Labs UNIX, called BSD UNIX.
Already a luminary in the academic realm, BSD UNIX garnered more attention in the corporate world when Sun Microsystems released its own derivative called SunOS. It took only a few years for numerous other proprietary UNIXes to appear on the scene, each exchanging, borrowing and contributing to the cacophony of computer and software technologies that already permeated the industry. Yet BSD, rising above this sea of proprietary UNIXes, had made its mark as a standard setter as important to the industry as the original AT&T UNIX itself.
In the network programming sector alone, one of the most important of the BSD contributions became known as "BSD Sockets". Years later, as networks became prevalent everywhere, BSD Sockets (the high level rendition of network communication) became a de facto standard in practically every modern operating system, UNIX-derived or not.
A Summary of Networking
Simply, a network is a collection of computers that communicate with each other through a transmission medium. The communication mechanism must be managed like any other resource to prevent complete pandemonium amongst the connected computers, much like a group of children must be made to talk in turn in order for their speech to be intelligible. Even so, communication is usually managed on several levels using hardware and software protocols. Some protocols help prioritize transmissions; others route the communication to the correct computer; still others help guarantee that connections are established and maintained between specific points.
These protocols are rules that state how network components agree to communicate. They are integrated into hardware or software and are present in multiple levels of any network model.
Networks were coming into such prominence that eventually a standard model was designed to describe how to connect combinations of devices for communication (Complete Encyclopedia of Networking, p 727). This model, the Open Systems Interconnection (OSI) Reference Model, divides network communication into seven layers, each responsible for a distinct set of functions.
A layer represents a functional contributor to a portion of the network model. For example, a software application with its protocols represents the highest level, the Application Layer of the OSI model (FTP and Telnet are application layer protocols); the TCP and UDP protocols represent a middle layer called the Transport Layer; and the transmission line (wired or wireless) represents the lowest layer, called the Physical Layer (Ibid. p 734). See the figure for a full view of the seven layer model.
The OSI model acts as a guideline only, an abstract breakdown of how network systems operate. Most network systems will implement most of these layers, but they won't be so clearly defined. Some layers may be logically merged for convenience.
What the OSI model should indicate is that communication happens in layers. The model helps us understand how network systems work: when we make a connection with another computer, there are several processes that go on, each playing an important part in carrying the payload (the packet) to the destination. Each level generates overhead in carrying the payload; this overhead is the price paid by the delivery system in exchange for reliability and functionality.
Each layer at the destination sees its equivalent layer at the source. This means that packets are passed down the layers at the source and back up the layers at the destination; the destination levels strip off the pertinent information designated for their specific layer until the packet is presented in its naked form to the Application Layer. This process will be useful to understand when we discuss the Tango Socket in the next section.
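The trip down the layers at the source and back up at the destination can be sketched in a few lines. This is only an illustration, written in Python rather than D: the three layer names, the bracketed text headers, and the functions send_down and pass_up are all hypothetical and do not correspond to any real wire format or to any Tango call.

```python
# Hypothetical three-layer stack, listed from the highest layer to the lowest.
# The bracketed text headers are illustrative only, not a real wire format.
LAYERS = ["transport", "network", "link"]


def send_down(payload: bytes) -> bytes:
    """Source side: each layer, top to bottom, prepends its own header."""
    frame = payload
    for layer in LAYERS:
        frame = f"[{layer}]".encode() + frame
    return frame


def pass_up(frame: bytes) -> bytes:
    """Destination side: each layer, bottom to top, strips its own header."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not frame.startswith(header):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(header):]
    return frame


wire = send_down(b"hello")
overhead = len(wire) - len(b"hello")  # bytes added by the headers
delivered = pass_up(wire)             # the payload in its "naked" form
```

The difference between the length of wire and the length of the payload is the overhead mentioned earlier: the price each layer charges for its services.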
As the name implies, a socket is a medium for connecting one entity to another. The entities involved are always processes (software running on an operating system) that may exist on local or distributed network nodes (computers). Each process must create a socket in order to establish a communication portal. From the application perspective, the location of the communicating nodes is unimportant so long as the nodes are members of an operational network. Since the socket interface facilitates a form of distributed communication, the mechanism is often referred to as interprocess communication (IPC).
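As a rough illustration of two endpoints exchanging data, the sketch below uses Python's standard socket module, which wraps the same BSD API; it is not Tango code. For brevity both sockets live in a single process: socket.socketpair() hands back two already connected endpoints, standing in for the sockets that two separate processes would each create.

```python
import socket

# socketpair() returns two connected endpoints. In a real program each
# endpoint would normally belong to a different process.
left, right = socket.socketpair()
try:
    left.sendall(b"ping")        # one side writes...
    received = right.recv(1024)  # ...and the other side reads
    right.sendall(b"pong")       # the channel works in both directions
    reply = left.recv(1024)
finally:
    left.close()
    right.close()
```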
Data that is sent or received in this model is called a packet. A packet contains useful information that one process delivers to another. Examples of packets would be anything from emails sent out on the Internet to web pages being downloaded from a server. Though packets may be almost any size, practicality dictates that information transferred must be broken up into smaller parts before transmission. While the application may see the received web page as a whole, lower level layers will usually partition large payloads into smaller ones. These packets are sections of the whole, transmitted in designated byte sizes.
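The partitioning of a large payload can be pictured with a small sketch. It is written in Python purely for illustration; the function names fragment and reassemble and the 4-byte packet size are hypothetical, and real lower layers also attach headers and sequence information that are omitted here.

```python
from typing import List


def fragment(payload: bytes, size: int) -> List[bytes]:
    """Split a payload into consecutive pieces of at most `size` bytes."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [payload[i:i + size] for i in range(0, len(payload), size)]


def reassemble(packets: List[bytes]) -> bytes:
    """Join the pieces back together, assuming they arrive in order."""
    return b"".join(packets)


packets = fragment(b"abcdefghij", 4)  # the last piece may be shorter
whole = reassemble(packets)           # what the application finally sees
```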
In Tango the Socket class constitutes a complete wrapper for the BSD Socket API. In that sense, it is a low level network interface in Tango: despite simplifying the interface of the BSD Socket in general, using it still requires a thorough knowledge of the BSD mechanism. Here we will give an overview of how to apply the Tango Socket mechanism. For a more thorough discussion of network socket specifics, see the references at the end of this chapter. In later sections, this chapter will introduce high level network interfaces that eliminate much of the hassle of working with low-level sockets.
Creating the Socket
Before we can create a socket, we will need to describe how it will be used and how it will communicate with another socket. Most network programs use maybe one or two types of sockets for the majority of tasks, but flexibility demands that many options be available to the programmer. Therefore, during creation of a socket, we need to specify several socket features that define how we will later communicate through the socket. There are three types of attributes to set:
- address family
- socket type
- protocol type
The address family describes how all communication points or nodes will be addressed or accessed through the new socket. For example, if we create a socket with AddressFamily.INET, then all communications through the socket are expected to use a 32-bit internet address (e.g. 127.0.0.1) and a 16-bit port number. Using AddressFamily.UNIX means that we expect to communicate through the socket using a local UNIX file path as an address. Here is a list of the address families provided by the Tango Socket:
- AddressFamily.UNSPEC -- not specified
- AddressFamily.UNIX -- addresses are file paths for this socket
- AddressFamily.INET -- addresses are 32-bit internet addresses and a port number
- AddressFamily.IPX -- addresses use the Netware format
- AddressFamily.APPLETALK -- addresses use the AppleTalk format
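To make the "32-bit internet address and 16-bit port number" of the INET family concrete, here is a minimal sketch using Python's standard socket and struct modules rather than Tango. The port 8080 is an arbitrary example value.

```python
import socket
import struct

# inet_aton packs a dotted-quad string into the 4 bytes (32 bits) that an
# INET address occupies.
packed_address = socket.inet_aton("127.0.0.1")

# Read those 4 bytes as one unsigned 32-bit integer in network byte order.
(address_as_int,) = struct.unpack("!I", packed_address)

# A port is an unsigned 16-bit value, so it packs into exactly 2 bytes.
# 8080 is an arbitrary example port.
packed_port = struct.pack("!H", 8080)
```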
Next we describe the socket type. A type specifies how the socket manages the connection or controls the flow of outgoing or incoming data. For example, choosing a STREAM type indicates that the socket will try to guarantee a reliable two-way transmission of data bytes. On the other hand, selecting the datagram (DGRAM) type indicates that we want a socket that is connectionless; that is, the socket knows nothing of an outside connection, yet sends bytes out to the specified address without any guarantee that the data is actually received or even arrives in order. These are the socket types (see references for more details):
- SocketType.STREAM -- reliable two way connection
- SocketType.DGRAM -- connectionless unreliable datagrams
- SocketType.RAW -- raw protocol access
- SocketType.RDM -- reliably-delivered message datagrams
- SocketType.SEQPACKET -- sequenced, reliable, two-way connection-based datagrams
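The two most common socket types can be requested as in the sketch below. It uses Python's standard socket module, which wraps the same BSD API; we assume here, without showing Tango's own constructor, that socket.SOCK_STREAM and socket.SOCK_DGRAM play the roles of SocketType.STREAM and SocketType.DGRAM. No connection is made and no data is sent.

```python
import socket

# A stream socket: connection-oriented, reliable, ordered bytes.
stream_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# A datagram socket: connectionless, with no delivery or ordering guarantee.
dgram_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

try:
    # Each socket remembers the type it was created with.
    stream_type = stream_sock.type
    dgram_type = dgram_sock.type
finally:
    stream_sock.close()
    dgram_sock.close()
```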
The last attribute of a Socket setup selects the protocol to be used. Protocols determine how processes communicate with each other: the rules involved in the transmission and acquisition of data. The Socket, therefore, must set up the protocol before communication can commence.
We recall that the OSI Reference Model describes several layers. Protocols of different varieties exist in many of these layers. Fortunately, the Socket connection is most interested in the protocol specific to the Transport Layer. But since protocols often exist in families (i.e. associated protocols from different OSI levels that are used together), choosing a Transport Layer protocol automatically ensures that another protocol will be used on a lower level (e.g. the Network Layer). This is one interpretation of a protocol "family". For example, choosing the Transport Layer protocol TCP for our Socket guarantees that we will be using the Network Layer protocol IP, and any associated lower level protocols that go with it. The protocol types available for the Socket are as follows:
- ProtocolType.IP -- Internet Protocol 4 (Network Layer)
- ProtocolType.ICMP -- Internet Control Message Protocol (part of the IP family)
- ProtocolType.IGMP -- Internet Group Management Protocol (part of the IP family)
- ProtocolType.GGP -- Gateway to Gateway Protocol (obsolete transport layer protocol that is part of the IP family)
- ProtocolType.TCP -- Transmission Control Protocol (the mainstay of the internet - connection oriented transport layer protocol that is part of the IP family)
- ProtocolType.PUP -- PARC Universal Packet Protocol (old protocol family, precursor to TCP/IP)
- ProtocolType.UDP -- User Datagram Protocol (important transport layer protocol of the IP family that is connectionless and stateless, in contrast to TCP)
- ProtocolType.IDP -- Xerox NS Protocol family (early protocol suite)
Several of these transport layer protocols belong to the IP family; this makes sense since the Socket API has its origins in UNIX. It is therefore apparent that protocol types are mostly oriented to choosing specific transport layer protocols within the IP family.
ICMP and IGMP are closely associated with the network layer IP protocol, although they are not technically transport layer protocols (see the Wikipedia articles for details on the above protocols). We also see that several of these protocols are obsolete on modern systems, yet their names remain part of the original Socket API for compatibility (GGP, the PUP suite, the IDP suite). The common protocols for Sockets remain IP, TCP, and UDP. The few other IP family protocols are specialized and less commonly used.
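Putting the three attributes together, a socket is created from an address family, a socket type, and a protocol type. The sketch below shows this with Python's standard socket module, not Tango. The helper open_socket is hypothetical and covers only the two common pairings, TCP with a stream socket and UDP with a datagram socket, in the INET family.

```python
import socket


def open_socket(transport: str) -> socket.socket:
    """Hypothetical helper: create an INET socket for "tcp" or "udp"."""
    choices = {
        "tcp": (socket.SOCK_STREAM, socket.IPPROTO_TCP),
        "udp": (socket.SOCK_DGRAM, socket.IPPROTO_UDP),
    }
    if transport not in choices:
        raise ValueError(f"unsupported transport: {transport!r}")
    sock_type, protocol = choices[transport]
    # The three attributes in order: address family, socket type, protocol.
    return socket.socket(socket.AF_INET, sock_type, protocol)


# The socket objects close themselves when the "with" block ends.
with open_socket("tcp") as tcp_sock:
    tcp_settings = (tcp_sock.family, tcp_sock.type, tcp_sock.proto)

with open_socket("udp") as udp_sock:
    udp_settings = (udp_sock.family, udp_sock.type, udp_sock.proto)
```

Keeping the type and protocol paired in one table avoids asking for a mismatched combination, such as a datagram socket with TCP, which the operating system would reject.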
-- continued --
More than Sockets
Tango exposes this kind of low-level access via tango.net.Socket, but that is just too low-level for general usage. Thus, Tango provides a number of higher level wrappers to make Internet programming more productive.
TO BE WRITTEN:
This is the internet equivalent of FileConduit:
An excellent guide to low-level socket programming is right here