TCP/IP is a set of protocols developed to allow computers to share resources across a network. It provides "low-level" functions that many applications use for specific tasks, such as transferring files between computers, sending mail, or logging in to another computer. Here are the most important TCP/IP services:

- File transfer: The File Transfer Protocol (FTP) enables people to get files from or send files to another computer. It is a utility to access a file on another system and copy it to the local system, then work with the local copy.
- Remote login: The Network Terminal Protocol (TELNET) enables people to log in to any other computer on the network. While it is running, the telnet program makes the local computer transparent: every character typed is sent directly to the other system until the session is terminated. Note that both FTP and TELNET transmit user names, passwords and data in cleartext, which is why SSH-based alternatives (SSH, SFTP) have largely replaced them on untrusted networks.
- Computer mail: This allows people to send messages to people on other computers. Mail is normally handled by a mail server that runs all the time on a larger system. Microcomputer mail software then becomes a user interface that retrieves mail from the mail server.[92]
- Videoconferencing: Using TCP/IP, real-time video and audio signals can be sent through the network, with compression and decompression techniques making the multimedia data small enough to be sent through the network. People can then see and talk to one another and cooperate over the network.
The TCP/IP Protocol Architecture

The TCP/IP protocol architecture is shown in Figure A-4. It shows the major protocol and application components common to most commercial TCP/IP software packages and their relationships.

Figure A-4 The TCP/IP Protocol Architecture
The Network Interface Layer

TCP/IP protocols are designed to operate over nearly any underlying local or wide area network. IP datagrams can be transported over all of these technologies, although certain accommodations may need to be made (for example, a small maximum frame size forces fragmentation). The Serial Line Internet Protocol (SLIP) and the Point-to-Point Protocol (PPP) are two underlying network interface protocols particularly relevant to TCP/IP. They provide data link layer services over a serial link and enable a remote computer to attach directly to a host server and connect to the Internet using IP.

Following is a brief description of the operation of PPP:

- After the link is physically established, each host configures and tests the data link by sending Link Control Protocol (LCP) packets. LCP negotiates the maximum receive unit (frame length), the authentication protocol, the link quality protocol, compression and other configuration parameters. Once the link is established, authentication (PAP or CHAP, whichever was negotiated) is performed if it was requested; a peer that fails authentication does not proceed to the next phase.
- After the link is established (and authenticated), one or more Network Control Protocol connections are configured, including PPP's IP Control Protocol (IPCP) if IP is to be used. After this configuration, datagrams from those protocols can be sent over the link. Network control protocols exist for IP, IPX (NetWare), DDP (AppleTalk), DECnet and more.
- The link remains up until it is explicitly closed down (by LCP Terminate packets) or the physical connection is lost.
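As an added worked example of the LCP negotiation in the first step (this is illustrative code written for this appendix, not PPP implementation code from any project), the following Python program encodes and parses LCP Configure-Request packets and plays the role of the peer that decides whether to Ack, Nak or Reject the request. The LCP codes, option types and the PAP/CHAP protocol numbers are the standard values from RFC 1661, RFC 1334 and RFC 1994; the requested option values and the local policy (which refuses cleartext PAP and suggests CHAP instead) are constructed for the example.

```python
# Added example (not code from the original appendix): encode and decode PPP
# Link Control Protocol (LCP) packets and act as the peer that decides whether
# to Ack, Nak or Reject a Configure-Request, following the RFC 1661 rule that a
# Reject takes precedence over a Nak, which takes precedence over an Ack.
# Codes, option types and protocol numbers (PAP 0xC023, CHAP 0xC223) are the
# standard values; the option values and POLICY below are constructed.
import struct

CONFIGURE_REQUEST, CONFIGURE_ACK, CONFIGURE_NAK, CONFIGURE_REJECT = 1, 2, 3, 4
OPT_MRU, OPT_AUTH_PROTOCOL, OPT_MAGIC_NUMBER = 1, 3, 5
PAP, CHAP, CHAP_MD5 = 0xC023, 0xC223, 0x05


def encode_options(options):
    """options: list of (type, value bytes); each option is Type, Length, Data."""
    return b"".join(struct.pack("!BB", t, 2 + len(v)) + v for t, v in options)


def lcp_packet(code, identifier, options):
    """Code (1 byte), Identifier (1), Length (2, whole packet), then options."""
    body = encode_options(options)
    return struct.pack("!BBH", code, identifier, 4 + len(body)) + body


def parse_lcp(packet):
    """Return (code, identifier, [(type, value), ...]). Malformed lengths are
    rejected instead of being read past the buffer, as a receiving stack must."""
    if len(packet) < 4:
        raise ValueError("LCP packet shorter than its fixed header")
    code, identifier, length = struct.unpack("!BBH", packet[:4])
    if length < 4 or length > len(packet):
        raise ValueError("LCP length field inconsistent with packet")
    options, i = [], 4
    while i < length:
        if length - i < 2:
            raise ValueError("truncated option header")
        opt_type, opt_len = packet[i], packet[i + 1]
        if opt_len < 2 or i + opt_len > length:
            raise ValueError("option length out of range")
        options.append((opt_type, packet[i + 2:i + opt_len]))
        i += opt_len
    return code, identifier, options


def respond_to_configure_request(packet, policy):
    """policy: {'max_mru': int, 'auth': set of acceptable auth protocols,
    'preferred_auth': int, 'known': set of option types we implement}."""
    code, identifier, options = parse_lcp(packet)
    if code != CONFIGURE_REQUEST:
        raise ValueError("not a Configure-Request")
    rejected, naked = [], []
    for opt_type, value in options:
        if opt_type not in policy["known"]:
            rejected.append((opt_type, value))            # unknown: cannot negotiate
        elif opt_type == OPT_MRU:
            (mru,) = struct.unpack("!H", value)
            if mru > policy["max_mru"]:                     # Nak with a value we accept
                naked.append((opt_type, struct.pack("!H", policy["max_mru"])))
        elif opt_type == OPT_AUTH_PROTOCOL:
            (proto,) = struct.unpack("!H", value[:2])
            if proto not in policy["auth"]:                 # e.g. refuse cleartext PAP
                suggestion = struct.pack("!H", policy["preferred_auth"])
                if policy["preferred_auth"] == CHAP:
                    suggestion += bytes([CHAP_MD5])        # CHAP option carries the algorithm
                naked.append((opt_type, suggestion))
    if rejected:
        return lcp_packet(CONFIGURE_REJECT, identifier, rejected)
    if naked:
        return lcp_packet(CONFIGURE_NAK, identifier, naked)
    return lcp_packet(CONFIGURE_ACK, identifier, options)  # an Ack echoes the options


# Constructed local policy: accept an MRU up to 1500, insist on CHAP.
POLICY = {"max_mru": 1500, "auth": {CHAP}, "preferred_auth": CHAP,
          "known": {OPT_MRU, OPT_AUTH_PROTOCOL, OPT_MAGIC_NUMBER}}
```

A peer requesting PAP receives a Configure-Nak carrying CHAP/MD5, retries with CHAP and is then acknowledged; an option type this side does not implement is returned in a Configure-Reject so the peer drops it from its next request.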
The Internet Layer

This layer is equivalent to the OSI Network Layer. IP provides a connectionless datagram transport service over the network. Because at this layer the network neither guarantees delivery nor notifies the sender about lost packets, the service is called an unreliable (best-effort) service. An IP datagram can carry up to 65,535 bytes including its header, and IP provides no mechanism for flow control.

IP header: The basic IP packet header format is shown in Figure A-5. Each row represents a single 32-bit word, and an IP header is at least 5 words (20 bytes) in length. The fields of the IP header are described in Table A-3.

Figure A-5 IP header[93]
Table A-3 Description of IP Header

Name | Description
VERS | The version of the IP protocol (4 for the header described here)
LEN | The length of the IP header counted in 32-bit words (not including the data field); the minimum value is 5
Type of Service | The quality of service requested for the datagram
Total Length | The total length of the datagram in bytes, header and data
Identification | A number assigned by the sender that identifies the datagram; all fragments of one datagram carry the same value to aid reassembly
Flags | Control flags: bit 0 reserved (must be zero); DF: Don't Fragment; MF: More Fragments
Fragment Offset | Position of this fragment's data within the original datagram (in 8-byte units), used to reassemble the full datagram
TTL (Time to Live) | Nominally the time (in seconds) the datagram is allowed to travel; in practice every router decrements it by one and discards the datagram (returning an ICMP Time Exceeded message) when it reaches zero, so it behaves as a hop count
Protocol | The higher-level protocol to which IP should deliver the data in this datagram (for example 1 = ICMP, 6 = TCP, 17 = UDP)
Header Checksum | A checksum on the header only; the data is not covered
Source IP Address | The 32-bit IP address of the host sending this datagram
Destination IP Address | The 32-bit IP address of the destination host for this datagram
Options | Variable length, normally absent (for example record route or source routing; source-routing options are widely filtered because they let a sender dictate the path)
Padding | If an option is used, the header is padded with all-zero bytes up to the next 32-bit boundary
Data | The data contained in the datagram, passed to the higher-level protocol specified in the Protocol field
IP Addresses: IP addresses are typically written as a sequence of four decimal numbers separated by periods (dotted-decimal notation), 32 bits in total. A sample IP address is 130.123.96.91.

Figure A-6 IP address format

For routing purposes IP addresses are subdivided into two subfields: the Network Identifier (NET_ID), which identifies the TCP/IP subnetwork and is used for high-level routing between networks, and the Host Identifier (HOST_ID), which indicates the specific host within that network.

IP defines several address classes to address networks of different sizes (Figure A-6). The class is determined by the leading bits of the address, which is why each class corresponds to a range of values of the first octet.

- Class A: 7-bit NET_ID and 24-bit HOST_ID. This is intended for use with very large networks and can address up to 16,777,216 (2^24) host addresses per network, of which 16,777,214 are usable (the all-zeros and all-ones HOST_IDs are reserved). The first octet is a number between 1 and 126 (0 and 127 are reserved; 127.0.0.0/8 is the loopback network). At the time of writing only about 90 or so Class A networks had been assigned.
- Class B: 14-bit NET_ID and 16-bit HOST_ID. It is intended for moderate-sized networks and can address up to 65,536 (2^16) host addresses per network (65,534 usable). The first octet is a number between 128 and 191. This address space has long been threatened with being used up.
- Class C: 21-bit NET_ID and 8-bit HOST_ID. It is intended for small networks and can address only up to 254 usable hosts per network. The first octet is a number between 192 and 223. Most addresses assigned to networks today are Class C (or sub-Class C).
- Class D: addresses begin with a value between 224 and 239 and are used for IP multicasting.
- Class E: addresses begin with a value between 240 and 255 and are reserved for experimental use.

Classes A, B and C are used for host addressing, and the only difference between the classes is the length of the NET_ID subfield, whereas Class D and Class E are used for special purposes only.
The subnet mask: The subnet mask can be used to subdivide a large address space into subnetworks or to combine many small address spaces. To determine the network (or subnet) portion of an address, we simply perform a bit-by-bit logical AND of the IP address and the mask. For example, the Class B address space 130.123.0.0 could be segmented into a 16-bit NET_ID, a 4-bit SUBNET_ID and a 12-bit HOST_ID. In this case, the subnet mask for Internet routing (16-bit NET_ID) would be 255.255.0.0 (11111111 11111111 00000000 00000000 in binary), while the mask for routing to individual subnets within the larger Class B address space (16-bit NET_ID + 4-bit SUBNET_ID) would be 255.255.240.0 (11111111 11111111 11110000 00000000 in binary). Applied to the sample address 130.123.96.91, the second mask yields the subnet 130.123.96.0 (SUBNET_ID 0110 = 6) and HOST_ID 91.

The coarse granularity of class-based addressing wastes address space (an organisation that needs 300 hosts must take a Class B network with 65,534 addresses) and has contributed to IP address exhaustion, which has been a concern since the early 1990s. To make fuller use of the address space, several mechanisms are used.

- Classless Interdomain Routing (CIDR) uses variable-length subnet masks (prefixes) to assign a block of consecutive Class C addresses to an organisation that needs only a few of them, and to advertise the whole block as a single route. For instance, 192.168.128.0, 192.168.129.0, 192.168.130.0 and 192.168.131.0 can be assigned to an organisation as the single block 192.168.128.0 with a 22-bit mask 255.255.252.0 (written 192.168.128.0/22).
- Network Address Translation (NAT) is a mechanism that enables multiple hosts to share a pool of public IP addresses. Every host on the user's network is assigned an IP address from a private address range; those addresses are never seen on the Internet. When a host accesses the Internet, the NAT device translates the "private" source IP address of the host into a "public" IP address from the pool of assigned addresses. This mechanism assumes that, at any one time, only a portion of the hosts access the Internet.
- Port Address Translation (PAT), or Network Address Port Translation (NAPT), is another mechanism that does not rely on NAT's assumption. It allows many hosts to share a single public IP address by rewriting the source port as well, so that each internal connection maps to a distinct {public address, port} pair.
- Dynamic Host Configuration Protocol (DHCP) is also used to deal with renumbering. It assigns IP addresses to host systems dynamically, for a limited lease time. DHCP is suitable for environments where users move around frequently.
The Domain Name System (DNS): For convenience, most IP hosts have both a numeric IP address and a unique name, but for routing purposes the name must be translated back to a numeric address. The DNS is a distributed database that contains host name and IP address information for all domains on the Internet. Every domain has one or more authoritative name servers that hold all DNS-related information about the domain. To obtain a host's IP address from its name, the initial host sends a DNS request to a local (recursive) name server. If the local name server has this information configured or cached, it responds to the request directly; otherwise it forwards the request to one of the root servers. The root server refers the local name server to the name servers for the appropriate top-level domain, which in turn refer it to the domain's authoritative name server, where the request is finally answered. The local name server caches the answer for the time specified by the authoritative server. Because ordinary DNS replies are not authenticated, a resolver must match each reply strictly against its outstanding queries; otherwise forged answers can end up in its cache.
IP Routing: IP is responsible for routing packets over the network. It looks up the NET_ID of a packet's destination IP address in a routing table and forwards the packet according to the matching entry (when several entries match, the one with the longest prefix wins). There are three routing protocols commonly associated with IP and the Internet: RIP, OSPF and BGP.

The Routing Information Protocol (RIP-2) specifies the way routers exchange routing table information using a distance-vector algorithm. Neighbouring routers exchange their entire routing tables periodically. Routing on many of today's LANs is still based on RIP.

Open Shortest Path First (OSPF) is a more robust protocol that is rapidly replacing RIP in the Internet. As a link-state routing protocol, it converges faster, requires less network bandwidth and is better able to scale to larger networks. A router floods only changes in the status of its links (link-state advertisements) rather than its entire routing table.

The Border Gateway Protocol version 4 (BGP-4): As an exterior gateway protocol, it provides routing information between Internet routing domains (autonomous systems). BGP is a path-vector protocol: like a distance-vector protocol it advertises reachable destinations to its neighbours, but each advertisement carries the full list of autonomous systems on the path to the destination network, which lets routers detect loops. It also allows a network's administrator to create routing policies based on political, security, legal or economic issues rather than purely technical ones.
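As an added worked example covering both the subnet-mask/CIDR arithmetic above and the routing-table lookup just described (illustrative code written for this appendix, not taken from any router or library), the following Python program applies masks to the addresses used in the text (130.123.96.91 and the 192.168.128.0 CIDR block), aggregates the four Class C networks into one /22, and performs a longest-prefix-match forwarding decision with the TTL handling of the Internet layer. The routing table entries, next hops and interface names are constructed for illustration.

```python
# Added example (not code from the original appendix): address classes, subnet
# masks, CIDR aggregation and a longest-prefix-match routing lookup with TTL
# handling. Routes, next hops and interface names below are constructed.


def ip_to_int(addr):
    octets = [int(o) for o in addr.split(".")]
    if len(octets) != 4 or not all(0 <= o <= 255 for o in octets):
        raise ValueError("malformed IPv4 address: %r" % addr)
    return (octets[0] << 24) | (octets[1] << 16) | (octets[2] << 8) | octets[3]


def int_to_ip(n):
    return ".".join(str((n >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def prefix_to_mask(prefix):
    return (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF if prefix else 0


def mask_to_prefix(mask):
    m = ip_to_int(mask)
    prefix = bin(m).count("1")
    if prefix_to_mask(prefix) != m:          # the ones must be contiguous
        raise ValueError("non-contiguous mask: %s" % mask)
    return prefix


def address_class(addr):
    first = ip_to_int(addr) >> 24           # class is decided by the leading bits
    return ("A" if first < 128 else "B" if first < 192 else
            "C" if first < 224 else "D" if first < 240 else "E")


CLASS_PREFIX = {"A": 8, "B": 16, "C": 24}


def split_subnet(addr, mask):
    """Apply a subnet mask to a classful address: (subnet/prefix, SUBNET_ID, HOST_ID)."""
    cls = address_class(addr)
    if cls not in CLASS_PREFIX:
        raise ValueError("class %s is not used for host addressing" % cls)
    prefix = mask_to_prefix(mask)
    subnet_bits = prefix - CLASS_PREFIX[cls]
    if subnet_bits < 0:
        raise ValueError("mask shorter than the classful NET_ID")
    n, m = ip_to_int(addr), prefix_to_mask(prefix)
    subnet_id = (n >> (32 - prefix)) & ((1 << subnet_bits) - 1)
    return "%s/%d" % (int_to_ip(n & m), prefix), subnet_id, n & ~m & 0xFFFFFFFF


def usable_hosts(prefix):
    return max((1 << (32 - prefix)) - 2, 0)  # all-zeros and all-ones are reserved


def supernet(networks, block_prefix):
    """CIDR aggregation: shortest prefix covering every /block_prefix block given."""
    lo = min(ip_to_int(n) for n in networks)
    hi = max(ip_to_int(n) for n in networks) | (~prefix_to_mask(block_prefix) & 0xFFFFFFFF)
    prefix = 32
    while prefix and (lo & prefix_to_mask(prefix)) != (hi & prefix_to_mask(prefix)):
        prefix -= 1
    return "%s/%d" % (int_to_ip(lo & prefix_to_mask(prefix)), prefix)


class RoutingTable:
    """Forwarding table with longest-prefix match; 0.0.0.0/0 is the default route."""

    def __init__(self):
        self.routes = []   # (network as int, prefix length, next hop, interface)

    def add(self, cidr, next_hop, interface):
        net, prefix = cidr.split("/")
        prefix = int(prefix)
        self.routes.append((ip_to_int(net) & prefix_to_mask(prefix), prefix, next_hop, interface))

    def lookup(self, dst):
        d, best = ip_to_int(dst), None
        for net, prefix, next_hop, iface in self.routes:
            if (d & prefix_to_mask(prefix)) == net and (best is None or prefix > best[1]):
                best = (net, prefix, next_hop, iface)
        if best is None:
            raise LookupError("no route to %s" % dst)
        return {"route": "%s/%d" % (int_to_ip(best[0]), best[1]),
                "next_hop": best[2], "interface": best[3]}

    def forward(self, dst, ttl):
        """A router decrements TTL, drops the datagram (ICMP Time Exceeded) at
        zero, and otherwise forwards along the longest matching prefix."""
        ttl -= 1
        if ttl <= 0:
            return {"action": "drop", "icmp": "time exceeded"}
        try:
            route = self.lookup(dst)
        except LookupError:
            return {"action": "drop", "icmp": "destination unreachable"}
        return dict(action="forward", ttl=ttl, **route)
```

With a /16 route for 130.123.0.0 and a more specific /20 for 130.123.96.0 both installed, a datagram for 130.123.96.91 follows the /20 entry; anything not covered falls through to 0.0.0.0/0.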
The Transport Layer Protocols

This layer is equivalent to the OSI Transport and Session Layers. There are two important protocols in this layer, the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP) (Figure A-4).

Here are the concepts used in this layer:

Port: A port is a 16-bit number that is used to address the higher-level protocol or application program to which incoming messages are delivered. Each process that wants to communicate with another process has one or more ports by which it identifies itself to the TCP/IP protocol suite.

Port numbers fall into different ranges:

- 0-1023: Well Known Ports. These are assigned to server applications; on Unix-like systems, binding to them requires a high level of privilege (root or administrator), which is one reason services on these ports have historically run with elevated privileges.
- 1024-49151: Registered Ports. These are used by server or client applications registered with IANA for the Internet community.
- 49152-65535: Dynamic and/or Private Ports. These can be used freely by any client or server, typically as the ephemeral ports the operating system picks for the client side of a connection.

Sockets: A socket is a special type of file handle, which is used by a process to request network services from the operating system. A socket address is the triple {protocol, local-address, local-process}. For example, {tcp, 193.44.234.3, 12345} is a socket address in the TCP/IP suite.[93]
As shown in Figure
A-7, there are two processes for communicating via TCP sockets.
Each side of a TCP connection has a socket identified by the pair
{ IP address, port number }. The two processes communicating
over TCP form a logical connection identifiable by the combination
{ local IP address, local port, remote IP address, remote
port}.

Figure A-7 Two processes
communicating via TCP sockets
TCP: TCP provides a connection-oriented, reliable, byte stream service. Two applications using TCP must establish a TCP connection with each other before they can exchange data. It is a full-duplex protocol: each TCP connection supports a pair of byte streams, one flowing in each direction. There is a flow-control mechanism that enables the receiver to limit the amount of data the sender can transmit, and TCP also implements a congestion-control mechanism.[91]

TCP provides the following functions[91]:

- Stream Data Transfer: TCP transfers a continuous stream of bytes by grouping the bytes into TCP segments and passing them to IP for transmission. It decides how to segment the data and forwards it at its own convenience; segment boundaries are not visible to the application.
- Reliability: Every byte transmitted has a sequence number. After receiving data, the receiving TCP sends a positive acknowledgement (ACK) to the sending TCP. The sending TCP retransmits the data if the ACK is not received within a timeout interval. The sequence numbers are also used to rearrange segments that arrive out of order and to discard duplicate segments.
- Flow Control: The ACK also carries a window value that indicates the number of bytes the receiver can accept beyond the last acknowledged byte without overrunning its internal buffers.
- Multiplexing: TCP provides a set of addresses, or ports, within each host to enable multiple processes to use TCP simultaneously within a host. An {IP address, port} pair is called a socket, and a pair of sockets uniquely identifies each connection.
- Logical Connections: A logical connection is the combination of the state of each data stream, including sockets, sequence numbers and window sizes. Each connection is uniquely identified by a pair of sockets.
- Full Duplex: TCP supports concurrent data streams in both directions.
TCP header: TCP data is encapsulated in an IP datagram. Figure A-8 shows the structure of the TCP header. Its normal size is 20 bytes unless options are present.

Figure A-8 TCP header

The Acknowledgement Number field contains the next sequence number that the receiver expects to receive.

The 6-bit Flags field is used to relay control information between TCP peers. The possible flags are SYN, FIN, RESET, PUSH, URG and ACK. (Later standards use two of the adjacent reserved bits for explicit congestion notification, the ECE and CWR flags.)

- SYN: This synchronises the sequence numbers. It is used when establishing a TCP connection.
- FIN: This indicates that there is no more data from the sender. It is used when terminating a TCP connection.
- ACK: This indicates that the Acknowledgement Number field is valid, i.e. that the receiver should pay attention to it.
- URG: This signifies that this segment contains urgent data. When this flag is set, the Urgent Pointer field indicates where the urgent data ends and the non-urgent data in this segment begins.
- PUSH: This indicates that the sender used the push operation, asking the receiver to deliver the data to the application without waiting for more.
- RESET: This signifies that the sender of the segment has received something it cannot reconcile with any connection it knows (for example a segment for a connection that does not exist) and wants to abort the connection.

The Options field is variable in length and is present only when the header length exceeds 5 words. The most common option is the Maximum Segment Size (MSS), the largest segment the sender wants to receive, which each end of the connection specifies in the first segment it sends (its SYN).

The data portion is optional.
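As an added worked example covering both the IP header of Table A-3 and the TCP header just described (illustrative code written for this appendix, not taken from any network stack), the following Python program builds one IPv4 datagram carrying a TCP SYN segment, then parses both headers back out: it verifies the IP header checksum, decodes the flags and fragment fields, and reads the MSS option from the SYN. The addresses, ports, sequence number and MSS value are constructed sample values.

```python
# Added example (not code from the original appendix): build an IPv4+TCP SYN
# packet as raw bytes, then parse the IP header (with checksum verification)
# and the TCP header (flags and MSS option). All field values are constructed.
import struct

TCP_FLAGS = {0x01: "FIN", 0x02: "SYN", 0x04: "RST", 0x08: "PSH",
             0x10: "ACK", 0x20: "URG", 0x40: "ECE", 0x80: "CWR"}


def ones_complement_checksum(data):
    """RFC 1071 checksum: ones-complement of the ones-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return (~total) & 0xFFFF


def ip_to_bytes(addr):
    return bytes(int(o) for o in addr.split("."))


def build_ipv4_header(src, dst, payload_len, proto=6, ttl=64, ident=0x1234):
    ver_ihl = (4 << 4) | 5              # version 4, header length 5 words (20 bytes)
    flags_frag = 0b010 << 13            # DF set, MF clear, fragment offset 0
    hdr = struct.pack("!BBHHHBBH4s4s", ver_ihl, 0, 20 + payload_len, ident,
                      flags_frag, ttl, proto, 0, ip_to_bytes(src), ip_to_bytes(dst))
    csum = ones_complement_checksum(hdr)           # computed with the checksum field zeroed
    return hdr[:10] + struct.pack("!H", csum) + hdr[12:]


def build_tcp_syn(sport, dport, seq, mss=1460):
    opts = struct.pack("!BBH", 2, 4, mss)          # MSS option: kind 2, length 4
    data_offset = (20 + len(opts)) // 4            # header length in 32-bit words
    return struct.pack("!HHIIBBHHH", sport, dport, seq, 0,
                       data_offset << 4, 0x02, 65535, 0, 0) + opts


def parse_ipv4(pkt):
    (ver_ihl, tos, total_len, ident, flags_frag, ttl, proto, csum,
     src, dst) = struct.unpack("!BBHHHBBH4s4s", pkt[:20])
    ihl = (ver_ihl & 0x0F) * 4
    return {
        "version": ver_ihl >> 4, "ihl": ihl, "total_length": total_len, "id": ident,
        "df": bool(flags_frag & 0x4000), "mf": bool(flags_frag & 0x2000),
        "frag_offset": flags_frag & 0x1FFF, "ttl": ttl, "protocol": proto,
        # summing the header including its checksum field yields 0 when intact
        "checksum_ok": ones_complement_checksum(pkt[:ihl]) == 0,
        "src": ".".join(map(str, src)), "dst": ".".join(map(str, dst)),
        "payload": pkt[ihl:total_len],
    }


def parse_tcp(seg):
    sport, dport, seq, ack, off_res, flags, win, csum, urg = \
        struct.unpack("!HHIIBBHHH", seg[:20])
    hlen = (off_res >> 4) * 4
    info = {"src_port": sport, "dst_port": dport, "seq": seq, "ack": ack,
            "flags": [name for bit, name in TCP_FLAGS.items() if flags & bit],
            "window": win, "header_length": hlen, "options": {}}
    i = 20
    while i < hlen:                                # walk the option list
        kind = seg[i]
        if kind == 0:                              # End of Option List
            break
        if kind == 1:                              # No-Operation (1 byte)
            i += 1
            continue
        length = seg[i + 1]
        if length < 2:                             # malformed option: stop parsing
            break
        if kind == 2:
            info["options"]["mss"] = struct.unpack("!H", seg[i + 2:i + 4])[0]
        i += length
    return info
```

Flipping any header byte (for example the TTL at offset 8) makes `checksum_ok` false, which is the whole job of the header checksum; note that it covers only the header, so TCP carries its own checksum over the segment and a pseudo-header.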
TCP Logical Connections: TCP connections have three main parts: connection establishment, data exchange and connection termination (Figure A-9).

Figure A-9 TCP logical connection phases

Table A-4 lists the descriptions of the messages used in Figure A-9.

Table A-4 The descriptions of the messages used in Figure A-9

Name | Description
syn | The SYN-bit flag
ack | The ACK-bit flag
fin | The FIN-bit flag
SEQ | The sequence number
ACK | The acknowledgement number
src_port | Sender's port number
dst_port | Receiver's port number
DataLen | The size of the data
Data | The information being sent

- The connection establishment phase comprises a three-way handshake (SYN, SYN+ACK, ACK) during which the client and server exchange their initial sequence numbers (ISNs) and each acknowledges the other host's ISN. See Figure A-9. The server treats the connection as established only when the third segment carries the correct acknowledgement of its ISN, so a client that cannot observe the server's ISN cannot complete the handshake; this is why ISNs must be hard to predict.
- Data exchange: This is the second part of the TCP connection (Figure A-9). The SEQ of each segment is the sequence number of the first data byte it carries, and each ACK is the next byte expected from the peer.
- Connection termination is the final phase. TCP treats the logical connection as a pair of simplex links, so terminating it requires four segments or, more properly, two pairs of segments: a FIN and its ACK in each direction.

This is the normal scenario for setting up a TCP connection between a client and a server.
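As an added worked example of the three phases of Figure A-9 (illustrative code written for this appendix, not a TCP implementation from any project), the following Python program simulates the segments exchanged between a client and a server, with the SEQ/ACK bookkeeping and the state each side is in after every segment. Port numbers, initial sequence numbers and the payload are constructed; a real stack picks ISNs unpredictably precisely so that an off-path attacker cannot forge the third segment of the handshake, which the simulation's receiver check illustrates by rejecting a guessed acknowledgement.

```python
# Added example (not code from the original appendix): simulate the TCP
# three-way handshake, one data transfer and the four-segment close, tracking
# SND.NXT/RCV.NXT on both sides. Ports, ISNs and payload are constructed.
from dataclasses import dataclass


@dataclass
class Segment:
    src_port: int
    dst_port: int
    seq: int
    ack: int            # meaningful only when "ACK" is in flags
    flags: frozenset
    data: bytes = b""

    def seq_space(self):
        # SYN and FIN each consume one sequence number; data consumes len(data)
        return len(self.data) + ("SYN" in self.flags) + ("FIN" in self.flags)


@dataclass
class Endpoint:
    name: str
    port: int
    isn: int
    state: str = "CLOSED"
    snd_nxt: int = 0    # next sequence number this side will send
    rcv_nxt: int = 0    # next sequence number expected from the peer

    def send(self, peer, flags, data=b""):
        flags = frozenset(flags)
        seg = Segment(self.port, peer.port, self.snd_nxt,
                      self.rcv_nxt if "ACK" in flags else 0, flags, data)
        self.snd_nxt = (self.snd_nxt + seg.seq_space()) & 0xFFFFFFFF
        return seg

    def receive(self, seg):
        if "ACK" in seg.flags and seg.ack != self.snd_nxt:
            raise ValueError("%s: ACK %d does not match SND.NXT %d"
                             % (self.name, seg.ack, self.snd_nxt))
        # Until the peer's SYN has arrived we have no expectation about its ISN.
        if self.state not in ("LISTEN", "SYN_SENT") and seg.seq != self.rcv_nxt:
            raise ValueError("%s: segment out of order" % self.name)
        self.rcv_nxt = (seg.seq + seg.seq_space()) & 0xFFFFFFFF


def run_connection(client, server, payload):
    """Drive one complete connection; returns a list of
    (sender name, segment, sender state after, receiver state after)."""
    trace = []

    def exchange(sender, receiver, flags, data=b"", sender_state=None, receiver_state=None):
        seg = sender.send(receiver, flags, data)
        if sender_state:
            sender.state = sender_state
        receiver.receive(seg)
        if receiver_state:
            receiver.state = receiver_state
        trace.append((sender.name, seg, sender.state, receiver.state))

    client.snd_nxt, client.state = client.isn, "CLOSED"
    server.snd_nxt, server.state = server.isn, "LISTEN"
    # Phase 1: three-way handshake, each side acknowledging the other's ISN
    exchange(client, server, {"SYN"}, sender_state="SYN_SENT", receiver_state="SYN_RCVD")
    exchange(server, client, {"SYN", "ACK"}, receiver_state="ESTABLISHED")
    exchange(client, server, {"ACK"}, receiver_state="ESTABLISHED")
    # Phase 2: data exchange (client request, server acknowledgement)
    exchange(client, server, {"PSH", "ACK"}, payload)
    exchange(server, client, {"ACK"})
    # Phase 3: termination, one FIN/ACK pair per direction (client closes first)
    exchange(client, server, {"FIN", "ACK"}, sender_state="FIN_WAIT_1", receiver_state="CLOSE_WAIT")
    exchange(server, client, {"ACK"}, receiver_state="FIN_WAIT_2")
    exchange(server, client, {"FIN", "ACK"}, sender_state="LAST_ACK", receiver_state="TIME_WAIT")
    exchange(client, server, {"ACK"}, receiver_state="CLOSED")
    return trace
```

The trace has nine segments: three for the handshake, two for the data exchange and four (two FIN/ACK pairs) for termination, ending with the server in CLOSED and the client in TIME_WAIT.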
UDP (User Datagram Protocol): UDP is designed for applications in which it is not necessary to put sequences of datagrams together. As with TCP, there is a UDP header (Figure A-10). UDP puts its header in front of the data and hands the datagram to IP. UDP does not split data into multiple datagrams, nor does it keep track of what it has sent, so it does not notice if data is lost. The UDP header is shorter than a TCP header: it still has source and destination port numbers, a length and a checksum, but no other fields. UDP is used by the protocols that handle name lookups (DNS) and a number of similar request/response protocols.[92]

Two UDP hosts communicate in a similar fashion: one host sends a UDP datagram to the other, which is presumably listening on the port indicated in the datagram.

This is what happens in special conditions:

- The host is not listening on a TCP port: If Host A attempts to contact Host B on a TCP port that Host B is not listening on, Host B responds with a TCP segment with the reset (RST) and acknowledge (ACK) flags set.
- The host is not listening on a UDP port: If Host A attempts to contact Host B on a UDP port that Host B is not listening on, Host B sends an ICMP port unreachable message to Host A.
- The host does not exist: If Host A attempts to contact Host B and Host B cannot be reached (e.g., Host B's IP address either does not exist or is unavailable), the router on Host B's subnet sends an ICMP host unreachable message to Host A.

These responses are exactly what port scanners rely on to classify a port as closed. A packet filter that silently drops unwanted traffic produces none of them, so the absence of any reply (a timeout) usually indicates filtering rather than a closed port or a missing host.
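As an added worked example covering both the socket 4-tuple of Figure A-7 and the special conditions just listed (illustrative code written for this appendix, not taken from any project), the following Python program opens one TCP connection over the loopback interface and reports {local IP, local port, remote IP, remote port} as each end sees it, then probes a closed TCP port and a closed UDP port to show how the RST and the ICMP port unreachable surface to an application (both as ECONNREFUSED). Ports are chosen by the operating system and nothing leaves the loopback interface; a timeout on the UDP probe is the "no response" outcome that a silently filtering firewall produces.

```python
# Added example (not code from the original appendix): loopback sockets showing
# the connection 4-tuple from both ends, and the application-visible result of
# contacting closed TCP and UDP ports. Ports are OS-assigned; loopback only.
import socket
import threading


def free_port(kind):
    """Ask the OS for an unused port of the given socket kind, then release it."""
    with socket.socket(socket.AF_INET, kind) as probe:
        probe.bind(("127.0.0.1", 0))
        return probe.getsockname()[1]


def tcp_connection_tuples():
    """Open one TCP connection over loopback; return both sides' view of it."""
    result = {}
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    def serve():
        conn, _ = listener.accept()
        with conn:
            # the accepted socket carries the server's {local, remote} pair
            result["server"] = {"local": conn.getsockname(), "remote": conn.getpeername()}
            conn.sendall(b"hello")

    worker = threading.Thread(target=serve)
    worker.start()
    with socket.create_connection(listener.getsockname(), timeout=5) as client:
        result["client"] = {"local": client.getsockname(), "remote": client.getpeername()}
        result["data"] = client.recv(16)
    worker.join(5)
    listener.close()
    return result


def probe_closed_tcp_port():
    """Connect to a port nobody listens on: the peer answers RST+ACK, which the
    kernel reports to the application as ECONNREFUSED."""
    try:
        with socket.create_connection(("127.0.0.1", free_port(socket.SOCK_STREAM)), timeout=5):
            return "connected"
    except ConnectionRefusedError:
        return "refused"


def probe_closed_udp_port():
    """Send to a UDP port nobody listens on. On a connected UDP socket the ICMP
    port unreachable is delivered as ECONNREFUSED; a timeout is what a silently
    filtering firewall (or ICMP rate limiting) looks like to the sender."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(2)
        s.connect(("127.0.0.1", free_port(socket.SOCK_DGRAM)))
        try:
            s.send(b"probe")
            s.recv(16)
            return "reply"
        except ConnectionRefusedError:
            return "refused"
        except socket.timeout:
            return "no response"
```

The client's local pair equals the server's remote pair and vice versa: the two sockets, taken together, identify the connection uniquely, which is what lets one server port serve many clients at once.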
The TCP/IP Application Layer

The TCP/IP Application Layer protocols support the applications and utilities that operate over the network. This is where the user interacts with the network. All network applications, including HTTP clients, FTP, telnet, mail and news, are at the application layer. They use either TCP or UDP to communicate with other machines. Our application is also located at this layer.