Copyright © 2000 Paul Sheer

Transmission Control Protocol (TCP) and User Datagram Protocol (UDP)

In the previous chapter we talked about communication between machines in a generic sense. However, when you have two applications on either side of the Atlantic Ocean, being able to send a packet that may or may not reach the other side is not sufficient. What you need is reliable communication.
Ideally, a programmer wants to be able to establish a link to a remote machine and then feed bytes in one at a time, and be sure that the bytes are being read on the other end, and vice versa. Such communication is called reliable stream communication.
Implementing a reliable stream with only data packets at our disposal is tricky. You can send single packets and then wait for the remote machine to indicate that it has received each one, but this is inefficient (packets can take a long time to get to and from their destination) -- you really want to be able to send as many packets as possible at once, and then have some means of negotiating with the remote machine when to resend packets that were not received. What TCP does is to send data packets one way, and acknowledgement packets the other way, saying how much of the stream has been properly received.
We hence say that TCP is implemented on top of IP. This is why Internet communication is sometimes called TCP/IP.
TCP communication has three stages: negotiation, transfer and detachment.
Negotiation: The client application (say a web browser) first initiates the connection using the C connect() function (see connect(2)). This causes the kernel to send a SYN (SYNchronisation) packet to the remote TCP server (in this case a web server). The web server responds with a SYN-ACK packet (ACKnowledge), and finally the client responds with a final ACK packet. This packet negotiation is invisible to the programmer.
Transfer: The programmer will use the send() and recv() C function calls (see send(2) and recv(2)) to send and receive an actual stream of bytes. The stream of bytes will be broken into packets and the packets sent individually to the remote application. In the case of the web server, the first bytes sent would be the line GET /index.html HTTP/1.0<CR><NL><CR><NL>. On the remote side, reply packets (also called ACK packets) are sent back as the data arrives, indicating whether parts of the stream went missing and require retransmission. Communication is full-duplex -- meaning that there are streams in both directions -- both data and acknowledge packets are going both ways simultaneously.
Detachment: The programmer will use the C function call close() and/or shutdown() (see close(2) and shutdown(2)) to terminate the connection. A FIN packet will be sent and TCP communication will cease.
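The three stages can be tried out without touching a real web server. The following is only a minimal sketch in Python, whose socket module wraps the same connect(), send(), recv() and close() calls. The tiny server, the reply text and the names serve_once, fetch and demo are all made up for the illustration; everything runs over the loopback interface.

```python
import socket
import threading

REPLY = b"HTTP/1.0 200 OK\r\n\r\nhello\n"   # made-up reply for this demo


def serve_once(listener, reply):
    """Accept one connection, read one request, send a reply, then close."""
    conn, _addr = listener.accept()
    with conn:
        request = b""
        while not request.endswith(b"\r\n\r\n"):   # blank line ends the request
            chunk = conn.recv(1024)
            if not chunk:                          # client closed early
                break
            request += chunk
        conn.sendall(reply)
    # Leaving the "with" block closes the server's side of the connection.


def fetch(host, port, request):
    """Client side: one pass through negotiation, transfer and detachment."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(5)
    try:
        sock.connect((host, port))     # Negotiation: the kernel does the handshake
        sock.sendall(request)          # Transfer: bytes in, packets out
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:               # empty read: the peer has closed its side
                break
            chunks.append(data)
    finally:
        sock.close()                   # Detachment
    return b"".join(chunks)


def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)
    listener.bind(("127.0.0.1", 0))    # port 0: let the kernel pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_once, args=(listener, REPLY))
    server.start()
    try:
        return fetch("127.0.0.1", port, b"GET /index.html HTTP/1.0\r\n\r\n")
    finally:
        server.join()
        listener.close()


if __name__ == "__main__":
    print(demo().decode())
```

Note that the program never sees a SYN, ACK or FIN packet: it only sees a stream of bytes and, at the end, an empty read.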

The TCP header

TCP packets are encapsulated within IP packets: the TCP packet sits inside the Data begins at... part of the IP packet. A TCP packet has a header part and a data part. The data part may sometimes be empty (such as in the Negotiation stage).
Here is the full TCP/IP header:
Bytes              Description
IP header:
0                  bits 0-3: Version, bits 4-7: Internet Header Length (IHL)
1                  Type of service (TOS)
2-3                Length
4-5                Identification
6-7                bits 0-2: Flags, bits 3-15: Offset
8                  Time to live (TTL)
9                  Type
10-11              Checksum
12-15              Source IP address
16-19              Destination IP address
20 to IHL*4-1      Options + padding to round up to four bytes
TCP header (bytes counted from the start of the TCP header):
0-1                Source Port
2-3                Destination Port
4-7                Sequence Number
8-11               Acknowledgement Number
12                 bits 0-3: Data Offset (length of the TCP header, including options, in bytes / 4)
13                 Control
14-15              Window
16-17              Checksum
18-19              Urgent Pointer
20 to Offset*4-1   Options + padding to round up to four bytes
TCP Data begins at IHL*4+Offset*4 and ends at Length-1
The minimum combined TCP/IP header is thus 40 bytes.
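To make the byte offsets concrete, here is a hedged sketch that unpacks the fixed part of both headers with Python's struct module. The function names parse_tcp_ip and build_sample are invented for the example, and the sample packet is hand-built (its checksums are left as zero rather than computed). It relies on two facts: bit 0 in the table is the most significant bit of the byte, and the top four bits of TCP byte 12 hold the TCP header length in four-byte units.

```python
import socket
import struct

IP_FORMAT = "!BBHHHBBH4s4s"    # 20 bytes, network byte order
TCP_FORMAT = "!HHIIBBHHH"      # 20 bytes, network byte order


def parse_tcp_ip(packet):
    """Split a raw IP packet carrying TCP into a dictionary of fields."""
    (ver_ihl, _tos, length, _ident, _flags_off, ttl, proto, _ip_sum,
     src, dst) = struct.unpack(IP_FORMAT, packet[:20])
    ihl = ver_ihl & 0x0F                  # low nibble: header length / 4
    tcp_start = ihl * 4
    (sport, dport, seq, ack, off_byte, control, window, _tcp_sum,
     _urgent) = struct.unpack(TCP_FORMAT, packet[tcp_start:tcp_start + 20])
    data_start = tcp_start + (off_byte >> 4) * 4
    return {
        "version": ver_ihl >> 4,          # high nibble
        "ttl": ttl,
        "type": proto,                    # 6 means TCP
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
        "sport": sport,
        "dport": dport,
        "seq": seq,
        "ack": ack,
        "control": control,
        "window": window,
        "data": packet[data_start:length],
    }


def build_sample(data):
    """Hand-build a packet with no options (checksums NOT computed)."""
    ip = struct.pack(IP_FORMAT, 0x45, 0, 40 + len(data), 0, 0x4000, 64, 6, 0,
                     socket.inet_aton("192.168.3.9"),
                     socket.inet_aton("207.25.71.20"))
    tcp = struct.pack(TCP_FORMAT, 4064, 80, 1, 1, 0x50, 0x18, 32120, 0, 0)
    return ip + tcp + data


if __name__ == "__main__":
    for name, value in parse_tcp_ip(build_sample(b"GET / HTTP/1.0\r\n")).items():
        print(name, value)
```

With no options in either header, the data starts at byte 40, which matches the minimum header size given above.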
With Internet machines, several applications often communicate simultaneously. The Source Port and Destination Port fields are used to identify a particular stream. In the case of web communication, the destination port (from the client's point of view) is port 80, and hence all outgoing traffic will have the number 80 filled in this field. The source port (from the client's point of view) is chosen randomly from the unused port numbers above 1024 before the connection is negotiated -- this too is filled into outgoing packets. No two streams between the same pair of machines have the same combination of source and destination port numbers. The kernel uses the port numbers on incoming packets to determine which application requires those packets, and similarly for the remote machine.
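The way port numbers keep streams apart can be observed directly. This is a small sketch, assuming only the loopback interface and a throwaway listening socket (the function name two_streams is made up): two clients connect to the same server port, and the kernel gives each a different source port.

```python
import socket


def two_streams():
    """Open two connections to one server port; return their port pairs."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))        # port 0: kernel picks a free port
    listener.listen(2)
    server_addr = listener.getsockname()

    first = socket.create_connection(server_addr, timeout=5)
    second = socket.create_connection(server_addr, timeout=5)
    try:
        # (source port, destination port) as seen from each client
        return [(s.getsockname()[1], s.getpeername()[1])
                for s in (first, second)]
    finally:
        first.close()
        second.close()
        listener.close()


if __name__ == "__main__":
    for source, destination in two_streams():
        print("source port", source, "-> destination port", destination)
```

Both connections share the destination port, so it is the source port that tells the two streams apart.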
Sequence Number is the offset within the stream that this particular packet of data belongs to. The Acknowledgement Number is the point in the stream up to which all data has been received. Control is a set of flag bits (such as SYN, ACK and FIN). Window is the maximum amount of data that the receiver is prepared to accept. Checksum is used to verify data integrity, and Urgent Pointer is for interrupting data. Data needed by extensions to the protocol is appended after the header as options.
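As an illustration of the Control byte, the sketch below names the six standard flag bits (FIN, SYN, RST, PSH, ACK and URG, counting from the least significant bit). The helper name decode_control is an invention for this example.

```python
# Standard TCP flag bits, least significant bit first.
FLAG_BITS = [
    (0x01, "FIN"),   # sender has finished sending
    (0x02, "SYN"),   # synchronise sequence numbers (connection setup)
    (0x04, "RST"),   # reset the connection
    (0x08, "PSH"),   # push data to the application promptly
    (0x10, "ACK"),   # Acknowledgement Number field is valid
    (0x20, "URG"),   # Urgent Pointer field is valid
]


def decode_control(control):
    """Return the names of the flags set in a TCP Control byte."""
    return [name for bit, name in FLAG_BITS if control & bit]


if __name__ == "__main__":
    for value in (0x02, 0x12, 0x10, 0x18, 0x19):
        print(hex(value), decode_control(value))
```

A packet with SYN and ACK both set (0x12) is the SYN-ACK of the Negotiation stage.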

A sample TCP session

It's easy to see TCP working by using telnet. You are probably familiar with using telnet to log in to remote systems, but telnet is actually a generic program that can connect to any TCP socket. Here we will try to connect to cnn.com's web server.
We first need to get an IP address of cnn.com:

[root@cericon]# host cnn.com
cnn.com has address 207.25.71.20

Now in one window we run:

[root@cericon]# tcpdump \
'( src 192.168.3.9 and dst 207.25.71.20 ) or ( src 207.25.71.20 and dst 192.168.3.9 )'
Kernel filter, protocol ALL, datagram packet socket
tcpdump: listening on all devices

which says to list all packets travelling from us (source, src) to CNN (destination, dst), or from CNN to us.

Then we use the HTTP protocol to grab the page. In a second window, telnet to port 80 of that address, type in the HTTP command GET / HTTP/1.0, and then press Enter twice (as required by the HTTP protocol). The first and last few lines of the session are shown below:

[root@cericon root]# telnet 207.25.71.20 80
Trying 207.25.71.20...
Connected to 207.25.71.20.
Escape character is '^]'.
GET / HTTP/1.0

HTTP/1.0 200 OK
Server: Netscape-Enterprise/2.01
Date: Tue, 18 Apr 2000 10:55:14 GMT
Set-cookie: CNNid=cf19472c-23286-956055314-2; expires=Wednesday, 30-Dec-2037 16:00:00 GMT; path=/; domain=.cnn.com
Last-modified: Tue, 18 Apr 2000 10:55:14 GMT
Content-type: text/html

<HTML>
<HEAD>
        <TITLE>CNN.com</TITLE>
        <META http-equiv="REFRESH" content="1800">

        <!--CSSDATA:956055234-->
        <SCRIPT src="/virtual/2000/code/main.js" language="javascript"></SCRIPT>
        <LINK rel="stylesheet" href="/virtual/2000/style/main.css" type="text/css">
        <SCRIPT language="javascript" type="text/javascript">
                <!--//
                if ((navigator.platform=='MacPPC')&&(navigator.ap

..............
..............

</BODY>
</HTML>
Connection closed by foreign host.

The above produces the front page of CNN's web site in raw HTML. This is easy to paste into a file and view off-line.

In the other window, tcpdump is showing us what packets are being exchanged. tcpdump nicely shows us hostnames instead of IP addresses and the letters www instead of the port number 80. The local "random" port in this case was 4064:

[root@cericon]# tcpdump \
'( src 192.168.3.9 and dst 207.25.71.20 ) or ( src 207.25.71.20 and dst 192.168.3.9 )'
Kernel filter, protocol ALL, datagram packet socket
tcpdump: listening on all devices
12:52:35.467121 eth0 > cericon.obsidian.co.za.4064 > www1.cnn.com.www:
   S 2463192134:2463192134(0) win 32120 <mss 1460,sackOK,timestamp 154031689 0,nop,wscale 0> (DF)
12:52:35.964703 eth0 < www1.cnn.com.www > cericon.obsidian.co.za.4064:
   S 4182178234:4182178234(0) ack 2463192135 win 10136 <nop,nop,timestamp 1075172823 154031689,nop,wscale 0,mss 1460>
12:52:35.964791 eth0 > cericon.obsidian.co.za.4064 > www1.cnn.com.www:
   . 1:1(0) ack 1 win 32120 <nop,nop,timestamp 154031739 1075172823> (DF)
12:52:46.413043 eth0 > cericon.obsidian.co.za.4064 > www1.cnn.com.www:
   P 1:17(16) ack 1 win 32120 <nop,nop,timestamp 154032784 1075172823> (DF)
12:52:46.908156 eth0 < www1.cnn.com.www > cericon.obsidian.co.za.4064:
   . 1:1(0) ack 17 win 10136 <nop,nop,timestamp 1075173916 154032784>
12:52:49.259870 eth0 > cericon.obsidian.co.za.4064 > www1.cnn.com.www:
   P 17:19(2) ack 1 win 32120 <nop,nop,timestamp 154033068 1075173916> (DF)
12:52:49.886846 eth0 < www1.cnn.com.www > cericon.obsidian.co.za.4064:
   P 1:278(277) ack 19 win 10136 <nop,nop,timestamp 1075174200 154033068>
12:52:49.887039 eth0 > cericon.obsidian.co.za.4064 > www1.cnn.com.www:
   . 19:19(0) ack 278 win 31856 <nop,nop,timestamp 154033131 1075174200> (DF)
12:52:50.053628 eth0 < www1.cnn.com.www > cericon.obsidian.co.za.4064:
   . 278:1176(898) ack 19 win 10136 <nop,nop,timestamp 1075174202 154033068>
12:52:50.160740 eth0 < www1.cnn.com.www > cericon.obsidian.co.za.4064:
   P 1176:1972(796) ack 19 win 10136 <nop,nop,timestamp 1075174202 154033068>
12:52:50.220067 eth0 > cericon.obsidian.co.za.4064 > www1.cnn.com.www:
   . 19:19(0) ack 1972 win 31856 <nop,nop,timestamp 154033165 1075174202> (DF)
12:52:50.824143 eth0 < www1.cnn.com.www > cericon.obsidian.co.za.4064:
   . 1972:3420(1448) ack 19 win 10136 <nop,nop,timestamp 1075174262 154033131>
12:52:51.021465 eth0 < www1.cnn.com.www > cericon.obsidian.co.za.4064:
   . 3420:4868(1448) ack 19 win 10136 <nop,nop,timestamp 1075174295 154033165>

..............
..............

12:53:13.856919 eth0 > cericon.obsidian.co.za.4064 > www1.cnn.com.www:
   . 19:19(0) ack 53204 win 30408 <nop,nop,timestamp 154035528 1075176560> (DF)
12:53:14.722584 eth0 < www1.cnn.com.www > cericon.obsidian.co.za.4064:
   . 53204:54652(1448) ack 19 win 10136 <nop,nop,timestamp 1075176659 154035528>
12:53:14.722738 eth0 > cericon.obsidian.co.za.4064 > www1.cnn.com.www:
   . 19:19(0) ack 54652 win 30408 <nop,nop,timestamp 154035615 1075176659> (DF)
12:53:14.912561 eth0 < www1.cnn.com.www > cericon.obsidian.co.za.4064:
   . 54652:56100(1448) ack 19 win 10136 <nop,nop,timestamp 1075176659 154035528>
12:53:14.912706 eth0 > cericon.obsidian.co.za.4064 > www1.cnn.com.www:
   . 19:19(0) ack 58500 win 30408 <nop,nop,timestamp 154035634 1075176659> (DF)
12:53:15.706463 eth0 < www1.cnn.com.www > cericon.obsidian.co.za.4064:
   . 58500:59948(1448) ack 19 win 10136 <nop,nop,timestamp 1075176765 154035634>
12:53:15.896639 eth0 < www1.cnn.com.www > cericon.obsidian.co.za.4064:
   . 59948:61396(1448) ack 19 win 10136 <nop,nop,timestamp 1075176765 154035634>
12:53:15.896791 eth0 > cericon.obsidian.co.za.4064 > www1.cnn.com.www:
   . 19:19(0) ack 61396 win 31856 <nop,nop,timestamp 154035732 1075176765> (DF)
12:53:16.678439 eth0 < www1.cnn.com.www > cericon.obsidian.co.za.4064:
   . 61396:62844(1448) ack 19 win 10136 <nop,nop,timestamp 1075176864 154035732>
12:53:16.867963 eth0 < www1.cnn.com.www > cericon.obsidian.co.za.4064:
   . 62844:64292(1448) ack 19 win 10136 <nop,nop,timestamp 1075176864 154035732>
12:53:16.868095 eth0 > cericon.obsidian.co.za.4064 > www1.cnn.com.www:
   . 19:19(0) ack 64292 win 31856 <nop,nop,timestamp 154035829 1075176864> (DF)
12:53:17.521019 eth0 < www1.cnn.com.www > cericon.obsidian.co.za.4064:
   FP 64292:65200(908) ack 19 win 10136 <nop,nop,timestamp 1075176960 154035829>
12:53:17.521154 eth0 > cericon.obsidian.co.za.4064 > www1.cnn.com.www:
   . 19:19(0) ack 65201 win 31856 <nop,nop,timestamp 154035895 1075176960> (DF)
12:53:17.523243 eth0 > cericon.obsidian.co.za.4064 > www1.cnn.com.www:
   F 19:19(0) ack 65201 win 31856 <nop,nop,timestamp 154035895 1075176960> (DF)
12:53:20.410092 eth0 > cericon.obsidian.co.za.4064 > www1.cnn.com.www:
   F 19:19(0) ack 65201 win 31856 <nop,nop,timestamp 154036184 1075176960> (DF)
12:53:20.940833 eth0 < www1.cnn.com.www > cericon.obsidian.co.za.4064:
   . 65201:65201(0) ack 20 win 10136 <nop,nop,timestamp 1075177315 154035895>

103 packets received by filter

The above requires some explanation. Lines 5, 7 and 9 are the Negotiation stage: our SYN (flag S), the server's SYN-ACK (S with an ack), and our final ACK (shown as a dot). tcpdump uses the format <Sequence Number>:<Sequence Number + data length>(<data length>) on each line to show the position of the packet within the stream. The Sequence Number, however, is chosen randomly at the outset, so tcpdump prints relative sequence numbers after the first two packets to make the actual position within the stream clearer. Line 11 is where I pressed Enter the first time (16 bytes: the GET line plus <CR><NL>), and Line 15 is where I pressed Enter on an empty line (2 more bytes). The ack 19 in the server's packets indicates how much of our data CNN's web server has received -- we only ever typed in 18 bytes, so the next byte it expects is number 19. The web server sets this value in every one of its outgoing packets, while our own outgoing packets are mostly empty of data.
Lines 61 and 63 are the Detachment stage: our FIN packet (flag F) and its retransmission. The server's own FIN arrives earlier, on Line 57 (flags FP).
More information about the tcpdump output can be had from tcpdump(8) under the section TCP Packets.
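The arithmetic behind tcpdump's notation can be checked with a few lines of Python. This is only a sketch: the helper names parse_range and relative_seq are made up, and the numbers are copied from the listing above. It assumes the usual rule that a SYN uses up one sequence number, which is why the first data byte has relative sequence number 1.

```python
import re

SEQ_MODULUS = 2 ** 32          # sequence numbers are 32 bits and wrap around


def parse_range(field):
    """Split a tcpdump field such as '1:17(16)' into (first, last, length)."""
    match = re.fullmatch(r"(\d+):(\d+)\((\d+)\)", field)
    if match is None:
        raise ValueError("not a first:last(length) field: %r" % field)
    first, last, length = (int(group) for group in match.groups())
    return first, last, length


def relative_seq(absolute, initial):
    """Offset of an absolute sequence number from the initial one."""
    return (absolute - initial) % SEQ_MODULUS


if __name__ == "__main__":
    # Our SYN carried 2463192134; the server's SYN-ACK acknowledged 2463192135.
    print(relative_seq(2463192135, 2463192134))
    # The two packets that carried what was typed into telnet.
    typed = [parse_range("1:17(16)"), parse_range("17:19(2)")]
    print(sum(length for _first, _last, length in typed), "bytes typed")
    print("next byte expected:", typed[-1][1])
```

The last value printed is the relative sequence number that the server repeats in its ack field for the rest of the session.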

User Datagram Protocol (UDP)

You don't always need reliable communication.
Sometimes you want to have direct control of packets for efficiency reasons, or because you don't really mind if packets get lost. Two examples are nameserver communications, where single packet transmissions are desired, and voice transmissions, where reducing lag time is more important than data integrity. Another is NFS (Network File System), which uses UDP exclusively to implement high-bandwidth data transfer.
With UDP the programmer sends and receives individual packets, again encapsulated within IP. Ports are used in the same way as with TCP, but these are merely identifiers and there is no concept of a stream. The full UDP/IP header is very simple:
Bytes              Description
IP header:
0                  bits 0-3: Version, bits 4-7: Internet Header Length (IHL)
1                  Type of service (TOS)
2-3                Length
4-5                Identification
6-7                bits 0-2: Flags, bits 3-15: Offset
8                  Time to live (TTL)
9                  Type
10-11              Checksum
12-15              Source IP address
16-19              Destination IP address
20 to IHL*4-1      Options + padding to round up to four bytes
UDP header (bytes counted from the start of the UDP header):
0-1                Source Port
2-3                Destination Port
4-5                Length
6-7                Checksum
UDP Data begins at IHL*4+8 and ends at Length-1
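For comparison with TCP, here is a minimal sketch of UDP from the programmer's side, again in Python over the loopback interface. The names udp_header and udp_round_trip are invented for the example. There is no connect step and no stream: one sendto() produces one packet, and one recvfrom() returns one packet. The helper udp_header only illustrates the eight-byte layout (the kernel builds the real header), and its checksum is left as zero.

```python
import socket
import struct


def udp_header(source_port, dest_port, data):
    """Pack the 8-byte UDP header; Length covers header plus data."""
    return struct.pack("!HHHH", source_port, dest_port, 8 + len(data), 0)


def udp_round_trip(payload):
    """Send one datagram to ourselves and return (data, sender's address)."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))    # port 0: kernel picks a free port
        receiver.settimeout(5)             # a lost packet would block forever
        sender.sendto(payload, receiver.getsockname())
        return receiver.recvfrom(2048)
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    data, (host, port) = udp_round_trip(b"ping")
    print(data, "from", host, "port", port)
    print(udp_header(port, 53, b"ping").hex())
```

The timeout matters: with UDP nothing tells the receiver that a packet was lost, so a program that must not hang has to decide for itself how long to wait.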

/etc/services file

There are various standard port numbers used exclusively for particular types of services. Port 80 is used for web traffic, as shown above. Port numbers 1 through 1023 are reserved for such standard services, each of which is given a convenient name.
Many services are defined for both TCP and UDP, even though there is, for example, no such thing as UDP web access.
Ports below 1024 can be bound only by programs running with root privileges, such as mail, DNS, and web services. Programs of ordinary users are not allowed to bind to ports below 1024. The place where these ports are defined is the /etc/services file. The /etc/services file is mostly for descriptive purposes -- programs can look up port names and numbers -- /etc/services has nothing to do with the availability of a service.
An extract of the /etc/services file is:

tcpmux          1/tcp                           # TCP port service multiplexer
echo            7/tcp
echo            7/udp
discard         9/tcp           sink null
discard         9/udp           sink null
systat          11/tcp          users
daytime         13/tcp
daytime         13/udp
netstat         15/tcp
qotd            17/tcp          quote
msp             18/tcp                          # message send protocol
msp             18/udp                          # message send protocol
chargen         19/tcp          ttytst source
chargen         19/udp          ttytst source
ftp-data        20/tcp
ftp             21/tcp
fsp             21/udp          fspd
ssh             22/tcp                          # SSH Remote Login Protocol
ssh             22/udp                          # SSH Remote Login Protocol
telnet          23/tcp
smtp            25/tcp          mail
time            37/tcp          timserver
time            37/udp          timserver
rlp             39/udp          resource        # resource location
nameserver      42/tcp          name            # IEN 116
whois           43/tcp          nicname
re-mail-ck      50/tcp                          # Remote Mail Checking Protocol
re-mail-ck      50/udp                          # Remote Mail Checking Protocol
domain          53/tcp          nameserver      # name-domain server
domain          53/udp          nameserver
mtp             57/tcp                          # deprecated
bootps          67/tcp                          # BOOTP server
bootps          67/udp
bootpc          68/tcp                          # BOOTP client
bootpc          68/udp
tftp            69/udp
gopher          70/tcp                          # Internet Gopher
gopher          70/udp
rje             77/tcp          netrjs
finger          79/tcp
www             80/tcp          http            # WorldWideWeb HTTP
www             80/udp                          # HyperText Transfer Protocol
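The format is simple enough to read by hand: a name, a port/protocol pair, optional aliases, and an optional comment after a #. The sketch below parses a few lines copied from the extract above; the function name parse_services is made up for the illustration. Real programs would normally use the C library's getservbyname(3), or Python's socket.getservbyname(), rather than read the file themselves.

```python
SAMPLE = """\
ftp             21/tcp
telnet          23/tcp
smtp            25/tcp          mail
domain          53/tcp          nameserver      # name-domain server
domain          53/udp          nameserver
www             80/tcp          http            # WorldWideWeb HTTP
"""


def parse_services(text):
    """Map (name or alias, protocol) to a port number."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        name, port_proto, aliases = fields[0], fields[1], fields[2:]
        port, proto = port_proto.split("/")
        for key in [name] + aliases:
            table[(key, proto)] = int(port)
    return table


if __name__ == "__main__":
    services = parse_services(SAMPLE)
    print(services[("http", "tcp")])
    print(services[("domain", "udp")])
```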


Paul Sheer 2000-10-07