The User Datagram Protocol

UP!/Web Service 2008. 8. 21. 14:16

Summary	The User Datagram Protocol provides a low-overhead transport service for application protocols that do not need (or cannot use) the connection-oriented services offered by TCP. UDP is most often used with applications that make heavy use of broadcasts or multicasts, as well as applications that need fast turnaround times on lookups and queries.
Protocol ID	17
Relevant STDs	2 (http://www.iana.org/); 3 (includes RFCs 1122 and 1123); 6 (RFC 768, republished)
Relevant RFCs	768 (User Datagram Protocol); 1122 (Host Network Requirements)

There are two standard transport protocols that applications use to communicate with each other on an IP network. These are the User Datagram Protocol (UDP), which provides a lightweight and unreliable transport service, and the Transmission Control Protocol (TCP), which provides a reliable and controlled transport service.

The majority of Internet applications use TCP, since its built-in reliability and flow control services ensure that data does not get lost or corrupted. However, many applications that do not require the overhead found in TCP—or that cannot use TCP because the application has to use broadcasts or multicasts—will use UDP instead. UDP is more appropriate for any application that has to issue frequent update messages or that does not require every message to get delivered.

The UDP Standard

UDP is defined in RFC 768, which has been republished as STD 6 (UDP is an Internet Standard protocol). However, RFC 768 contained some ambiguities that were clarified in RFC 1122 (Host Network Requirements). As such, UDP implementations need to incorporate both RFC 768 and RFC 1122 in order to work reliably and consistently with other implementations.

RFC 768 states that UDP is a stateless, unreliable transport protocol that does not guarantee delivery. Thus, UDP is meant to provide a low-overhead transport for applications to use when they do not need guaranteed delivery.

RFC 768 also states that the Protocol ID for UDP is 17. When a system receives an IP datagram that is marked as containing Protocol 17, it should pass the contents of the datagram to the local UDP service for further processing.
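As a minimal sketch of that hand-off, the Protocol field can be read from the tenth byte of an IPv4 header and compared against 17. The function name and the hand-built sample header (including its addresses) are illustrative assumptions, not part of any real stack.

```python
import socket
import struct

def ipv4_protocol(ip_header: bytes) -> int:
    """Return the Protocol field, the tenth byte (offset 9) of an IPv4 header."""
    return ip_header[9]

# A hand-built 20-byte IPv4 header; every value here is illustrative.
sample_header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 28, 0, 0,   # version/IHL, TOS, total length, ID, flags/fragment
    64, 17, 0,           # TTL, protocol (17 = UDP), header checksum (left zero)
    socket.inet_aton("192.168.10.10"),
    socket.inet_aton("192.168.10.20"),
)

if ipv4_protocol(sample_header) == socket.IPPROTO_UDP:
    print("hand payload to UDP")
```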

UDP Is an Unreliable, Datagram-Centric Transport Protocol

As we discussed in Chapter 1, An Introduction to TCP/IP, sending a message via UDP is somewhat analogous to sending a postcard in that it is totally untrustworthy, providing no guarantees of any kind of delivery. UDP messages are sent and then forgotten about immediately. As such, applications that need a reliable transport protocol should not use UDP.

However, UDP's lightweight model does provide some distinct benefits, particularly in comparison to TCP's highly managed connection model. While TCP provides high levels of reliability through highly managed virtual circuits, UDP offers high performance from having so little overhead. If reliability comes at the expense of performance, then conversely, performance can be gained by eliminating some of the overhead associated with reliability.

In addition, many applications simply cannot use TCP, since TCP's virtual circuit design requires dedicated end-to-end connections between two (and no more than two) endpoints. If an application needs to use broadcasts or multicasts in order to send data to multiple hosts simultaneously, then that application will have to use UDP to do so.

Limited reliability

Although applications that broadcast information on a frequent basis have to use UDP, they do gain some benefits from doing so. Since broadcasts are sent to every device on the local network, it would take far too long for the sender to establish individual TCP connections with every other system on the network, exchange data with them all, and then disconnect. Conversely, UDP's connectionless service allows the sender to simply send the data to all of the devices simultaneously. If any of the systems do not receive one of the messages, then they will likely receive one of the next broadcasts, and as such will not be substantially harmed by missing one or two of them.
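With the Berkeley sockets API, a sender typically has to opt in to broadcasting before it can address a single datagram to every device. The sketch below assumes IPv4 and the limited broadcast address; the helper names and the port passed to announce() are hypothetical.

```python
import socket

def make_broadcast_socket() -> socket.socket:
    """Create a UDP socket that is permitted to send broadcast datagrams."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    return sock

def announce(sock: socket.socket, message: bytes, port: int) -> None:
    """Send one datagram to every host on the local network segment."""
    sock.sendto(message, ("255.255.255.255", port))
```

One call to announce() replaces the connect, exchange, and disconnect cycle that TCP would need for each recipient.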

Furthermore, streaming applications (such as real-time audio and video) can also benefit from UDP's low-overhead structure. Since these applications are stream-oriented, the individual messages are not nearly as important as the overall stream of data. The user will not notice if a single IP packet gets lost every so often, so it is better to just continually keep sending the next message, rather than stopping everything to resend a single message. These applications actually see error-correction as a liability, so UDP's connectionless approach is a "feature" rather than a "problem."

Similarly, any application that needs only a lightweight query and response service would be unduly burdened by TCP's connection-oriented services and would benefit from UDP's low overhead. Some database and network-lookup services use UDP for just this reason, allowing a client and server to exchange data without having to spend a lot of time establishing a reliable connection when a single query is all that's required.

It should be pointed out that many of the applications that use UDP require some form of error correction, but that this error correction also tends to be specific to the application at hand, and is therefore embedded directly into the application logic. For example, a database client would need to be able to tell when no response came back from a query, and so the database client may choose to just reissue the entire query rather than try to fix a specific part of the datastream (this is how Domain Name System queries work). Applications that use UDP must therefore incorporate any required error-checking and fault-management routines internally, rather than rely on UDP to provide these services.
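The "reissue the entire query" approach might look like the following sketch. The function name, the retry count, and the timeout are assumptions chosen for illustration; real resolvers use their own retry schedules.

```python
import socket

def query_with_retry(server, payload, attempts=3, timeout=1.0):
    """Send payload to server over UDP and resend the whole query on timeout.

    Returns the reply bytes, or None if every attempt went unanswered.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        for _ in range(attempts):
            sock.sendto(payload, server)
            try:
                reply, _peer = sock.recvfrom(4096)
                return reply
            except socket.timeout:
                continue  # UDP reports nothing; the application decides to retry
    return None
```

The retry logic lives entirely in the application; UDP itself never learns that a datagram went missing.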

Another interesting point is that most of the network technologies in use today are fairly reliable to begin with, so datagrams sent with unreliable protocols like UDP (and IP) are likely to reach their destinations without much trouble. Most LANs and WANs are extremely reliable, losing only tiny amounts of data over the course of their lifetime. On these types of networks, UDP can be used without much concern. Even topologies that are unreliable (such as analog modems) typically provide a modicum of error-correction and retransmission services at the data-link layer.

For these reasons, UDP probably shouldn't be considered totally unreliable, although you must always remember that UDP doesn't provide any error-correction or retransmission services within the transport itself. It just inherits any existing reliability that is provided by the underlying medium.

Furthermore, UDP also provides a checksum service that allows an application to verify that whatever data has arrived is probably the same as that which was sent. The use of UDP's checksum service is optional, and not all of the applications that use UDP also use the checksum service (although they are encouraged to do so by RFC 1122). Some applications incorporate their own verification routines within the UDP data segment, augmenting or bypassing UDP's provisional data-verification services with application-specific equivalents.
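A sketch of the checksum calculation follows, based on RFC 768: a 16-bit one's complement sum over a pseudo-header (source address, destination address, protocol 17, UDP length) followed by the UDP message itself. The function names are illustrative assumptions, and the sketch covers IPv4 only.

```python
import socket
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of data."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def _pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 17, udp_length))

def udp_checksum(src_ip: str, dst_ip: str, message: bytes) -> int:
    """Checksum for a UDP message whose Checksum field is currently zero."""
    value = internet_checksum(_pseudo_header(src_ip, dst_ip, len(message)) + message)
    return value or 0xFFFF                   # zero on the wire means "no checksum"

def udp_checksum_ok(src_ip: str, dst_ip: str, message: bytes) -> bool:
    """Verify a received UDP message that carries a non-zero checksum."""
    return internet_checksum(_pseudo_header(src_ip, dst_ip, len(message)) + message) == 0
```

A Checksum field of zero tells the receiver that the sender did not compute one, which is why a computed value of zero is transmitted as all ones.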

Datagram-centric transport services

Another unique aspect of UDP is the way in which it deals with only one datagram at a time. Rather than attempting to manage a stream of application data the way that TCP does, UDP deals with only individual blocks of data, as generated by the application protocols in use. For example, if an application gives UDP a four-kilobyte block of data, then UDP will hand that data to IP as a single datagram, without trying to create efficient segment sizes (one of TCP's most significant traits). The data may be fragmented by IP when it builds and sends IP packets for that four-kilobyte block of data, but UDP does not care if this happens and is not involved in that process whatsoever.

Furthermore, since each IP datagram contains a fully formed UDP datagram, the destination system will not receive any portion of the UDP message until the entire IP datagram has been received. For example, if the underlying IP datagram has been fragmented, then UDP will not receive any portion of the message until all of the fragments have arrived and been reassembled by IP. But once that happens, then UDP (and the application in use with UDP) will get and read the entire four-kilobyte message in one shot.

Some UDP stacks require that the application have enough buffers to read the entire datagram. If the application cannot accept all of the data, then it will not get any of the data, since the datagram will be discarded.
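The one-datagram-per-read behavior can be observed over the loopback interface. This sketch assumes a local IPv4 stack; the buffer size of 65535 is simply chosen to be larger than any UDP datagram.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))          # let the system pick a free port
receiver.settimeout(2.0)
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

sender.sendto(b"x" * 4096, receiver.getsockname())   # one four-kilobyte block
sender.sendto(b"second", receiver.getsockname())     # a separate message

# Each read returns exactly one whole datagram, never part of one
# and never two merged together.
first, _ = receiver.recvfrom(65535)
second, _ = receiver.recvfrom(65535)
print(len(first), second)

sender.close()
receiver.close()
```

What happens when the read buffer is smaller than the datagram depends on the platform: POSIX systems generally truncate the message and discard the remainder, while Windows reports an error.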

Conversely, remember that TCP does not generally cause fragmentation to occur, since it attempts to avoid fragmentation through the use of efficiently sized segments. In that model, TCP would send multiple TCP segments, each of which could arrive independently and be made available to the destination application immediately. Although there are benefits to the TCP design, record-centric applications also have to perform more work when using it instead of UDP, since UDP provides one-shot access to all of the data.

In fact, UDP is particularly useful for applications that have to transfer fixed-length records of data (such as database records or even fixed-length files). For example, if an application needs to send six records from a database to another system, then it can generate six fixed-length UDP datagrams, and UDP will send those datagrams as independent UDP messages (which become independent IP datagrams). The recipient then receives the datagrams as self-contained records and will be able to immediately process them as six unique records.
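The six-record case might be sketched as follows, with each record packed into its own fixed-length payload. The record layout (a 4-byte ID and a 16-byte name) is a hypothetical example, not a format defined by any protocol.

```python
import struct

RECORD = struct.Struct("!I16s")   # hypothetical layout: 4-byte ID, 16-byte name

def encode_record(record_id: int, name: str) -> bytes:
    """Build the payload for one datagram: exactly one fixed-length record."""
    return RECORD.pack(record_id, name.encode("ascii"))

def decode_record(datagram: bytes):
    """A received datagram is already one complete record."""
    record_id, raw_name = RECORD.unpack(datagram)
    return record_id, raw_name.rstrip(b"\x00").decode("ascii")

# Six records become six independent datagram payloads.
datagrams = [encode_record(i, "row-%d" % i) for i in range(1, 7)]
records = [decode_record(d) for d in datagrams]
```

Because the datagram boundary is the record boundary, the receiver needs no delimiter scanning at all.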

In contrast, TCP's circuit-centric model would require the same application to write the data to the TCP virtual circuit, which would then break the data into segments for transport across the network. The recipient would then have to read through the segments as they arrived at the destination system, poking through the data and looking for end-of-record markers until all six records were received and found.
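The extra work on the receiving side of a stream can be sketched like this. The newline delimiter and the function name are assumptions; the point is only that records may be split across reads or bundled into one.

```python
def split_records(chunks, delimiter=b"\n"):
    """Yield complete records from a byte stream that arrives in arbitrary chunks."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # A chunk may complete zero, one, or several records.
        while delimiter in buffer:
            record, buffer = buffer.split(delimiter, 1)
            yield record

# Three records arriving in pieces that ignore the record boundaries.
stream = [b"rec1\nre", b"c2\nrec3", b"\n"]
print(list(split_records(stream)))
```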

For all of these reasons, UDP is a more efficient protocol, although it is still unreliable. As such, application protocols that want to leverage the low-overhead nature of UDP must provide their own reliability services. In addition, these applications typically have to provide their own flow-control and packet-ordering services, ensuring that datagrams are not received out of order. Most applications incorporate a half-duplex data-exchange mechanism in order to provide these services. The application protocol waits for a clear-to-send signal from the remote system, transmits a datagram, and then stops to wait for the clear-to-send signal again.

For example, Trivial File Transfer Protocol (TFTP) clients use acknowledgment messages embedded in UDP datagrams to tell a server that they received the last block of data and that they are ready to receive another block. The TFTP server then sends another block of data as another UDP message and then waits to receive an acknowledgment before sending another block. Although this method is clumsy when compared to TCP's graceful sliding window concept, it has been proven to work over the years.
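The lock-step exchange can be simulated in memory without touching the network. The opcodes (3 for DATA, 4 for ACK) and the 512-byte block size follow the TFTP specification in RFC 1350; the function names and the single-process simulation are illustrative assumptions, and timeouts and retransmission are left out.

```python
import struct

OP_DATA, OP_ACK = 3, 4
BLOCK_SIZE = 512

def data_packet(block: int, payload: bytes) -> bytes:
    return struct.pack("!HH", OP_DATA, block) + payload

def ack_packet(block: int) -> bytes:
    return struct.pack("!HH", OP_ACK, block)

def lockstep_transfer(content: bytes):
    """Simulate sender and receiver with only one block outstanding at a time."""
    received = bytearray()
    log = []
    block, offset = 1, 0
    while True:
        payload = content[offset:offset + BLOCK_SIZE]
        packet = data_packet(block, payload)            # sender transmits
        _opcode, number = struct.unpack("!HH", packet[:4])
        received += packet[4:]                          # receiver stores the block
        ack = ack_packet(number)                        # receiver acknowledges
        _opcode, acked = struct.unpack("!HH", ack)
        if acked != block:                              # sender checks before continuing
            raise RuntimeError("unexpected acknowledgment")
        log.append((block, len(payload)))
        if len(payload) < BLOCK_SIZE:                   # a short block ends the transfer
            break
        block += 1
        offset += BLOCK_SIZE
    return bytes(received), log
```

A 1100-byte file would travel as blocks of 512, 512, and 76 bytes, with the sender idle between each block and its acknowledgment.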

UDP Ports

UDP does very little. In fact, it does almost nothing, acting only as a very basic facilitator for applications to use when they need to send or receive datagrams on an IP network. In order to perform this task, UDP has to provide two basic services: it must provide a way for applications to send data over the IP software, and it must also provide a way to get data that it has received from IP back to the applications that need it.

These services are provided by a multiplexing component within the UDP software. Applications must register with UDP, allowing it to map incoming and outgoing messages to the appropriate application protocols themselves.

This multiplexing service is provided by 16-bit port numbers that are assigned to specific applications by UDP. When an application wishes to communicate with the network, it must request a port number from UDP (server applications such as TFTP will typically request a pre-defined, specific port number, while most client applications will use whatever port number they are given by UDP). UDP will then use these port numbers for all incoming and outgoing datagrams.
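In the sockets API, requesting a port number corresponds to binding a socket. The sketch below uses a hypothetical helper, open_udp_port(); passing port 0 asks the system to assign any free port, which is the usual client behavior.

```python
import socket

def open_udp_port(port: int = 0, host: str = "127.0.0.1") -> socket.socket:
    """Bind a UDP socket; port 0 means "use whatever port the system assigns"."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock

# A client accepts whatever port it is given.
client = open_udp_port()
client_port = client.getsockname()[1]
print("client was assigned port", client_port)

# A server would ask for its pre-defined port instead, for example
# open_udp_port(69) for TFTP, which normally requires a privileged account.
```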

This concept is illustrated in Figure 6-1. Each of the applications that are using UDP has been allocated a dedicated port number by UDP, which it uses for all incoming and outgoing data.


		Figure 6-1. Application-level multiplexing with port numbers

When an application wishes to send data over the network, it gives the data to UDP through the assigned port number, also telling UDP which port on the destination system the data should be sent to. UDP then creates a UDP message, marking the source and destination port numbers, which is then passed off to IP for delivery (IP will create the necessary IP datagram).

Once the IP datagram is received by the destination system, the IP software sees that the data portion of the IP datagram contains a UDP message (as specified in the Protocol Identifier field in the IP header), and hands it off to UDP for processing. The UDP software looks at the UDP header, sees the destination port number, and hands the payload portion of the datagram to whatever application is using the specified port number. Figure 6-2 illustrates this concept using the Trivial File Transfer Protocol (TFTP), a small file transfer protocol that uses UDP.
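The receiving side of this hand-off can be sketched as a table lookup keyed on the destination port. The deliver() function and the handler table are hypothetical; a real stack keeps this mapping inside the operating system, and typically answers with an ICMP Port Unreachable message when no application holds the port.

```python
import struct

def deliver(udp_message: bytes, applications: dict):
    """Hand the payload to whichever application registered the destination port."""
    source_port, destination_port, length, _checksum = struct.unpack(
        "!HHHH", udp_message[:8])
    payload = udp_message[8:length]
    handler = applications.get(destination_port)
    if handler is None:
        return None      # nothing is listening on that port
    return handler(source_port, payload)

# Illustrative table: only a TFTP server is registered, on port 69.
applications = {69: lambda source_port, payload: ("tftp", source_port, payload)}
message = struct.pack("!HHHH", 1068, 69, 8 + 3, 0) + b"abc"
print(deliver(message, applications))
```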


		Figure 6-2. Data being sent from a TFTP client to a TFTP server

Technically, a "port" identifies only a single instance of an application on a single system. The term "socket" is used to identify the port number and IP address concatenated together (i.e., port 80 on host 192.168.10.10 would be referred to as the socket 192.168.10.10:80). Finally, a "socket pair" consists of both endpoints, including the IP addresses and port numbers of both applications on both systems. Multiple connections between two systems must have unique socket pairs, with at least one of the two endpoints having a different port number.

Although the concept of socket pairs with UDP is similar to the same concept as it works with TCP, there are some fundamental differences that must be taken into consideration when looking at how connections work with UDP versus how they work with TCP. Most importantly, while TCP can maintain multiple virtual circuits on a single port number through the use of socket pairs, UDP-based applications do not have this capability at all, and simply treat all data sent and received over a port number as data for a single "connection."

For example, if a DNS server is listening for queries on port 53, then any queries that come in to that port are treated as equal, with the DNS server handling the multiplexing services required to distinguish between the different clients that are issuing the distinct queries. This is the opposite of how TCP works, where the transport protocol would create and manage virtual circuits for each of the connections. With UDP, all data is treated as a single "connection," and the application must manage any multiplexing services required on that port.
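A loopback sketch of this behavior: one server socket receives datagrams from two clients, and the only way to tell them apart is the source address returned with each message. The payloads and the pending table are illustrative assumptions.

```python
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))
server.settimeout(2.0)

clients = [socket.socket(socket.AF_INET, socket.SOCK_DGRAM) for _ in range(2)]
for number, client in enumerate(clients):
    client.sendto(b"query %d" % number, server.getsockname())

# Every datagram arrives on the same socket. The application keys its own
# per-client state on the (address, port) pair reported by recvfrom().
pending = {}
for _ in clients:
    data, peer = server.recvfrom(512)
    pending[peer] = data

for peer, data in pending.items():
    server.sendto(b"answer to " + data, peer)
```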

Well-known ports

Most server-based IP applications use what are referred to as "well-known" port numbers. For example, a TFTP server will listen on UDP port 69 by default, which is the well-known port number for TFTP servers. This way, any TFTP client that needs to connect to any TFTP server can use the default destination of UDP port 69. Otherwise, the client would have to specify the port number of the server that it wanted to connect with (you've seen this in some URLs that use http://www.somehost.com:8080/ or the like; 8080 is the port number of the HTTP server on www.somehost.com).

Most application servers allow you to use any port number you want. However, if you run your servers on non-standard ports, then you would have to tell every user that the server was not accessible on the default port. This would be a hard-to-manage implementation at best. By sticking with the defaults, all users can connect to your server using the default port number, which is likely to cause the least amount of trouble.

Some network administrators purposefully run application servers on nonstandard ports, hoping to add an extra layer of security to their network. However, it is my opinion that security through obscurity is no security at all and that this method should not be relied upon by itself.

Historically, only servers have been allowed to run on ports below 1024, as these ports could be used only by privileged accounts. By limiting access to these port numbers, it was more difficult for a hacker to install a rogue application server. However, this restriction is based on Unix-specific architectures, and is not easily enforced on all of the systems that run IP today. Many application servers now run on operating systems that have little or no concept of privileged users, making this historical restriction somewhat irrelevant.

There are a number of predefined port numbers that are registered with the Internet Assigned Numbers Authority (IANA). All of the port numbers below 1024 are reserved for use with well-known applications, although there are also many applications that use port numbers outside of this range. Some of the more common port numbers are shown in Table 6-1. For a detailed listing of all of the port numbers that are currently registered, refer to the IANA's online registry (accessible at http://www.isi.edu/in-notes/iana/assignments/port-numbers).

Table 6-1. Some of the Port Numbers Reserved for Well-Known UDP Servers
Port Number	Description
53	Domain Name System (DNS)
69	Trivial File Transfer Protocol (TFTP)
137	NetBIOS Name Service (sometimes referred to as WINS)
161	Simple Network Management Protocol (SNMP)

Besides the reserved port numbers that are managed by the IANA, there are also unreserved port numbers that can be used by any application for any purpose, although conflicts may occur with other users who are also using those port numbers. Developers of any application that uses a port number frequently are encouraged to register that port number with the IANA.

To see the well-known ports used on your system, examine the /etc/services file on a Unix host, or the C:\WinNT\System32\Drivers\Etc\SERVICES file on a Windows NT host.
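The services file uses a simple line format: a service name, then port/protocol, then optional aliases and a comment. The sketch below parses that format from an embedded sample rather than from the real file, so the SAMPLE text and the parse_services() helper are illustrative assumptions.

```python
SAMPLE = """\
# name      port/protocol   aliases
domain      53/udp
tftp        69/udp
netbios-ns  137/udp         # NetBIOS Name Service
snmp        161/udp
"""

def parse_services(text: str) -> dict:
    """Map (service name, protocol) to a port number."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        fields = line.split()
        if len(fields) < 2:
            continue
        port, protocol = fields[1].split("/")
        table[(fields[0], protocol)] = int(port)
    return table

services = parse_services(SAMPLE)
print(services[("tftp", "udp")])
```

On systems where the file is present, Python's socket.getservbyname("tftp", "udp") performs the equivalent lookup against the local services database.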

The UDP Header

UDP messages consist of header and body parts, just like IP datagrams. The body part contains whatever data was provided by the application in use, while the header contains the fields that tell the destination UDP software what to do with the data.

A UDP message is made up of five fields (counting the data portion of the message). The total size of the message will vary according to the size of the data in the body part. The fields in a UDP message are shown in Table 6-2, along with their size (in bytes) and their usage.

Table 6-2. The fields in a UDP Message
Field	Bytes	Usage Notes
Source Port	2	Identifies the 16-bit port number in use by the application that is sending the data
Destination Port	2	Identifies the 16-bit target port number of the application that is to receive this data
Length	2	Specifies the size of the total UDP message, including both the header and data segments
Checksum	2	Used to store a checksum of the entire UDP message
Data	varies	The data portion of the UDP message

Notice that the UDP header does not provide any fields for source or destination IP addresses, or for any other services that are not specifically related to UDP. This is because those services are provided by the IP header or by the application-specific protocols (and thus contained within the UDP message's data segment).

Every UDP message has an eight-byte header, as can be seen from Table 6-2. Thus, the theoretical minimum size of a UDP message is eight bytes, although this would not leave any room for any data in the message. In reality, no UDP message should ever be generated that does not contain at least some data.
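The layout in Table 6-2 maps directly onto four 16-bit integers in network byte order. The sketch below builds and parses a message under that layout; the helper names are hypothetical, and the checksum is left at zero, which RFC 768 defines as "no checksum computed."

```python
import struct
from typing import NamedTuple

UDP_HEADER = struct.Struct("!HHHH")   # four 16-bit fields, eight bytes in total

class UdpHeader(NamedTuple):
    source_port: int
    destination_port: int
    length: int
    checksum: int

def build_message(source_port: int, destination_port: int,
                  data: bytes, checksum: int = 0) -> bytes:
    """Prefix data with a UDP header; Length covers header plus data."""
    length = UDP_HEADER.size + len(data)
    return UDP_HEADER.pack(source_port, destination_port, length, checksum) + data

def parse_message(message: bytes):
    """Split a UDP message into its header fields and data portion."""
    if len(message) < UDP_HEADER.size:
        raise ValueError("shorter than the eight-byte UDP header")
    header = UdpHeader(*UDP_HEADER.unpack_from(message))
    return header, message[UDP_HEADER.size:header.length]

header, data = parse_message(build_message(1068, 69, b"testfile.txt"))
print(header, data)
```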

Figure 6-3 shows a UDP message sent from a TFTP client to a TFTP server. In that example, a TFTP session is opened between Greywolf (the client) and Arachnid (the server), with Greywolf sending a file (called testfile.txt) to Arachnid. We'll use this message for further discussion of the UDP header fields.


		Figure 6-3. A simple UDP message

The following sections describe the header fields of the UDP message in detail.

Source Port

Identifies the message's original sender, as referenced by the 16-bit UDP port number in use by the application.

Size
Sixteen bits.

Notes
This field identifies the port number used by the application that created the data.

Note that RFC 768 states "Source Port is an optional field, when meaningful, it indicates the port of the sending process, and may be assumed to be the port to which a reply should be addressed in the absence of any other information. If not used, a value of zero is inserted."
Although Source Port is optional, it should always be used.

Capture Sample
In the capture shown in Figure 6-4, the Source Port field is set to hexadecimal 04 2c, which equates to decimal 1068.
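The conversion can be checked by reading the two bytes as a big-endian (network byte order) integer; the variable names are only illustrative.

```python
raw_field = bytes.fromhex("04 2c")               # the two bytes from the capture
source_port = int.from_bytes(raw_field, "big")   # 0x04 * 256 + 0x2c
print(source_port)
```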

		Figure 6-4. The Source Port field


Posted by 으랏차
