Addendum 2. TCP/IP
TCP/IP (Transmission Control Protocol/Internet Protocol) is the set of protocols (rules) governing messages sent over the Internet. Although a typical user does not need to know anything about TCP/IP to use the Internet, an overview is needed to configure firewalls and to understand some of the other threats on the Internet. What follows is a highly simplified description of TCP/IP. If you are already familiar with TCP/IP, you probably do not need to read this addendum.
Internet Addressing
Every device on the Internet has an IP address. In general, this address uniquely identifies that device, just as the mailing address on an envelope uniquely identifies your home. Addresses in the current version of TCP/IP (known as IPv4) are 32-bit binary numbers, so there are 2^32 = 4,294,967,296 possible addresses. To make an address easier to represent and remember, the 32-bit binary number is broken up into four 8-bit sections. Because 2^8 = 256, each 8-bit section can have a value from 0 to 255. These four numbers are normally written one after another, separated by periods. So the lowest Internet address is 0.0.0.0 and the highest one is 255.255.255.255. A typical IP address might be 24.200.195.15. Devices on the Internet called routers keep track of where each IP address is and how to get to it.
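The relationship between the dotted form and the underlying 32-bit number can be sketched in a few lines of Python. This is only an illustration; the function names dotted_to_int and int_to_dotted are invented for this example and are not part of any standard.

```python
def dotted_to_int(address: str) -> int:
    """Convert dotted-decimal IPv4 text into its 32-bit integer value."""
    parts = [int(p) for p in address.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a valid IPv4 address: {address!r}")
    value = 0
    for p in parts:
        # Shift the bits collected so far left by 8 and append the next section.
        value = (value << 8) | p
    return value


def int_to_dotted(value: int) -> str:
    """Convert a 32-bit integer back into dotted-decimal text."""
    if not 0 <= value < 2**32:
        raise ValueError("value does not fit in 32 bits")
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


print(dotted_to_int("255.255.255.255") == 2**32 - 1)  # True
print(int_to_dotted(dotted_to_int("24.200.195.15")))  # 24.200.195.15
```

Python's standard ipaddress module performs the same conversion and would normally be preferred over hand-written code.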
Domain Name System
Because long strings of numbers are not easy to remember, many computers on the Internet are given alphabetic names (called hostnames). An example of such a name is www.infodev.org. When you enter this name into your web browser, for example, your computer sends a message to a special service called the Domain Name System or DNS. The DNS knows how to translate alphabetic names into numeric ones (192.86.99.121 in this case). DNS also allows a web server to be moved to a different location on the Internet. The owner informs the DNS of the new address, and users can continue to use the original hostname.
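As a rough illustration, a program can ask the operating system's resolver to perform this translation. The sketch below uses Python's standard socket module; the function name resolve is made up for this example, and the result for any real hostname depends on the DNS records in force when the code is run.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Return the IPv4 addresses the system resolver reports for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry ends with a (address, port) pair; keep the unique addresses.
    return sorted({info[4][0] for info in infos})
```

Calling resolve("www.infodev.org") on a computer with Internet access would return whatever addresses are currently registered for that hostname, which may differ from the address quoted above. A name that cannot be resolved raises socket.gaierror.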
IP: Internet Protocol
When data is sent over the Internet, it is sent in blocks called packets or datagrams. The IP in TCP/IP stands for Internet Protocol, which defines the internal layout of a packet. The IP packet contains a number of pieces of information. Among them are:
• the size of the packet;
• the IP address of the sender;
• the IP address where the packet is being sent;
• the type of packet.
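The fields listed above sit in a fixed-layout header at the front of every IPv4 packet. The following sketch builds a sample 20-byte header and reads those four fields back out with Python's struct module. The function name parse_ipv4_header is invented for this example, the sample addresses are the ones used elsewhere in this addendum, and the checksum is left at zero purely for illustration.

```python
import socket
import struct

# Protocol numbers carried in the IP header for the two packet types
# discussed in this addendum.
PROTOCOL_NAMES = {6: "TCP", 17: "UDP"}

# Layout of the fixed 20-byte IPv4 header, in network byte order.
IPV4_HEADER = "!BBHHHBBH4s4s"


def parse_ipv4_header(raw: bytes) -> dict:
    """Extract size, sender, destination and packet type from an IPv4 header."""
    if len(raw) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes long")
    (ver_ihl, _tos, total_length, _ident, _flags_frag,
     _ttl, protocol, _checksum, src, dst) = struct.unpack(IPV4_HEADER, raw[:20])
    return {
        "version": ver_ihl >> 4,
        "header_bytes": (ver_ihl & 0x0F) * 4,
        "total_length": total_length,
        "protocol": PROTOCOL_NAMES.get(protocol, str(protocol)),
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }


# A hand-built sample header: version 4, 20-byte header, 40-byte packet,
# time-to-live 64, protocol 6 (TCP), checksum not computed.
sample = struct.pack(
    IPV4_HEADER, 0x45, 0, 40, 0, 0, 64, 6, 0,
    socket.inet_aton("24.200.195.15"), socket.inet_aton("192.86.99.121"),
)
print(parse_ipv4_header(sample))
```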
When a packet leaves your computer, it is sent to the nearest router, which attempts to send it to the next router along the way to its destination. If, due to congestion or some other problem, the packet cannot be delivered, it is simply discarded and the sender is not told. For this reason, IP is called an unreliable protocol. Although IP is unreliable in theory, in most cases the Internet delivers all the packets that are sent.
There are a number of different types of packets that can be sent, but we will look at only two here: TCP and UDP.
TCP: Transmission Control Protocol
TCP is the protocol that is used for most messages, including the web (HTTP), File Transfer Protocol (FTP) and e-mail. In addition to the data being sent, the TCP packet includes:
• a 16-bit sending port number;
• a 16-bit receiving port number;
• sequencing information;
• acknowledgement information.
Because a single computer typically has just one IP address, the port number is used to indicate which program within the computer is sending or receiving the message. This is what allows you to have several web browser windows open on your computer and to have the pages that you request go back to the correct window. For a program to receive a TCP message, it must be listening on the correct port. Typically, a specific port is used for each type of application. For instance, a web server usually listens on port 80. When you open a browser window, it typically picks a semi-random port number (by convention higher than 1023) as its own port, and replies from the server are sent back to that port.
Because IP packets are limited in length, and the data transmitted by an application program may be much larger, the data can be chopped up into smaller segments. Each segment is sent in its own TCP packet. For various reasons, some packets may travel faster than others, which means that they may arrive out of order. The sequencing information allows the receiving program to re-assemble the segments in the correct order. Since IP is potentially unreliable, it is possible that one of the segments never arrives. In this case, the receiving program will notice that there is a gap in the sequence and it can request that the missing packet be resent.
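The re-assembly step can be sketched as follows. This is a deliberately simplified model: each segment is labelled with the byte offset at which it belongs, starting from zero, whereas real TCP sequence numbers start at an arbitrary value and wrap around. The function name reassemble is invented for this example.

```python
def reassemble(segments):
    """Rebuild data from (offset, bytes) segments received in any order.

    Returns (data, missing_offset). missing_offset is None when every
    received segment could be placed, otherwise it is the offset of the
    first gap, which is the point from which a resend would be requested.
    """
    # Index segments by offset; duplicates collapse and empty segments
    # are ignored.
    by_offset = {offset: chunk for offset, chunk in segments if chunk}
    data = bytearray()
    expected = 0
    while expected in by_offset:
        chunk = by_offset[expected]
        data += chunk
        expected += len(chunk)
    # Any segment that starts beyond the contiguous data reveals a gap.
    gap = any(offset > expected for offset in by_offset)
    return bytes(data), (expected if gap else None)


# Segments arriving out of order are put back in sequence.
print(reassemble([(5, b" worl"), (0, b"hello"), (10, b"d")]))
# (b'hello world', None)

# If the middle segment never arrives, the gap at offset 5 is reported.
print(reassemble([(0, b"hello"), (10, b"d")]))
# (b'hello', 5)
```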
When a program sends a TCP packet, it expects the receiving program to acknowledge it. If an acknowledgement does not arrive in a reasonable time, the packet can be re-transmitted. Because of the sequence numbers and the acknowledgements, TCP is a reliable protocol. When it is used, the user application can be sure that if there is an error in transmission or reception, the application will be informed.
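The acknowledge-or-retransmit behaviour can be modelled with a small simulation. It is not how a real TCP implementation is structured (real TCP keeps many packets in flight and uses timers), and the names send_reliably and make_lossy_channel are hypothetical helpers invented for this sketch. A channel is any function that returns True when a segment was acknowledged and False when the acknowledgement timed out.

```python
def send_reliably(segments, channel, max_attempts=5):
    """Send each segment, retransmitting until it is acknowledged.

    Returns the number of attempts each segment needed. Raises
    ConnectionError if a segment is never acknowledged, so the calling
    application is informed of the failure.
    """
    attempts_used = []
    for seq, data in enumerate(segments):
        for attempt in range(1, max_attempts + 1):
            if channel(seq, data):
                attempts_used.append(attempt)
                break
        else:
            raise ConnectionError(
                f"segment {seq} unacknowledged after {max_attempts} attempts"
            )
    return attempts_used


def make_lossy_channel(drop_first_attempt_of):
    """Build a simulated channel that loses the first copy of chosen segments."""
    received = {}
    already_dropped = set()

    def channel(seq, data):
        if seq in drop_first_attempt_of and seq not in already_dropped:
            already_dropped.add(seq)
            return False  # simulated loss: no acknowledgement arrives
        received[seq] = data
        return True  # simulated acknowledgement

    return channel, received


channel, received = make_lossy_channel({1})
print(send_reliably([b"a", b"b", b"c"], channel))  # [1, 2, 1]
```

In this run the second segment is lost once and delivered on its second attempt, so the receiver still ends up with all three segments.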
UDP: User Datagram Protocol
UDP is a simple format that allows data to be transmitted. Each UDP packet includes some information in addition to the data. This includes:
• a 16-bit sending port number, and
• a 16-bit receiving port number.
Just as with TCP, because port numbers are used, there can be several programs sending or receiving UDP messages in parallel. Also like TCP, to receive a message, the program must be listening on the correct port. There are no provisions for sequencing or acknowledgement in UDP, so it (like IP) is an unreliable protocol. In theory, messages can be lost. It is used in cases where it does not matter if an occasional message is lost, or where there is a simple way to recover from a lost message. Because there are no acknowledgements or sequencing, it uses far fewer resources than TCP.
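A minimal sketch of UDP in use, written with Python's standard socket module, is shown below. One socket listens on a port of the local computer and a second socket sends it a single message. The function name udp_roundtrip is invented for this example, and the receiving port is chosen by the operating system rather than fixed in advance.

```python
import socket


def udp_roundtrip(message: bytes, timeout: float = 2.0) -> bytes:
    """Send one UDP message to a local listener and return what it received."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        # Port 0 asks the operating system to pick any free port.
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(timeout)
        port = receiver.getsockname()[1]

        # No connection is set up and no acknowledgement is expected.
        sender.sendto(message, ("127.0.0.1", port))

        # If the message were lost, this call would simply time out.
        data, _sender_address = receiver.recvfrom(4096)
        return data
```

Nothing in the code retransmits or reorders anything. If the message did not arrive, recvfrom would raise a timeout error and it would be up to the program to decide whether to try again.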