Using UDP sockets

TCP support is built into the core of the Tcl interpreter. To use the UDP protocol, you have to use an external package. The default choice is usually the TclUDP extension, which is available from http://sourceforge.net/projects/tcludp/ (it also comes as a part of the ActiveTcl bundle; if you don't have it, install it with teacup install udp).

In contrast to TCP, which is a connection-oriented protocol, UDP is connectionless. This means that every data packet (datagram) travels from one peer to another on its own, without a return acknowledgement or retransmission in the case of lost packets. What is more, one of the peers may send datagrams that are never received (for example, if the second peer is not listening at the moment), and there is no feedback indicating that something is going wrong. This implies a difference in the design for handling the transmission, which will be illustrated in the following example.

Creating a UDP-based client

Let's consider a simple 'time server', where the server sends the current time over UDP to any client application that subscribes for such notifications. The format of each datagram will be rather simple: it will contain only the current time expressed in seconds.

First, let's have a look at the client code:

package require udp
set s [udp_open]
fconfigure $s -buffering none
fconfigure $s -remote [list 127.0.0.1 9876]
puts -nonewline $s "subscribe"
proc readTime {channel} {
    puts "Time from server: [read $channel]"
}
fileevent $s readable [list readTime $s]
vwait forever
close $s

As you have probably figured out, the first line loads the TclUDP extension. The next line creates a UDP socket, using the udp_open command, and stores its reference in the s variable. The UDP protocol uses ports in the same way as TCP. If we executed udp_open 1234, the port value 1234 would be specified, but if omitted, the operating system would assign a random port. Note that if you specify a port that is already being used by any other program, an error will be generated.

Next, we set the buffering mode to none, meaning that the output buffer will be automatically flushed after every output operation. We will discuss buffering issues more deeply later in this example.

The newly created UDP socket is not connected to anything, as UDP is connectionless. Such a socket can receive packets at any time from any source, without establishing a connection of any type. To have datagrams sent to a specific destination, use the fconfigure command with a new option (introduced by TclUDP), -remote, along with a two-item list containing the target address and port: fconfigure $s -remote [list 127.0.0.1 9876]. In this example the server runs on the local host (so you can run it even if you are not part of a network). Note that you can call this command any time you wish, causing successive datagrams to be sent to different peers.

Now it is time to send a message to the server, in this case simply a string containing 'subscribe'. If -nonewline were omitted, puts would generate two datagrams (the second one containing only the newline character). It is likely that the puts implementation writes data to the buffer twice (the message, and then the newline character), and as the buffering is set to none, the buffer is flushed immediately after each write. The other solution would be to set buffering to full and call flush $s after each socket write.
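The alternative mentioned above, full buffering with an explicit flush, can be sketched as follows. This is only an illustration: sendDatagram is a hypothetical helper name, and the TclUDP part is skipped if the package is not installed.

```tcl
# Hypothetical helper: write one message and flush it, so that with
# -buffering full the whole message leaves the buffer in one piece.
proc sendDatagram {chan data} {
    puts -nonewline $chan $data
    flush $chan
}

# Usage with TclUDP; skipped when the udp package is not available.
if {![catch {package require udp}]} {
    set s [udp_open]
    fconfigure $s -buffering full
    fconfigure $s -remote [list 127.0.0.1 9876]
    sendDatagram $s "subscribe"
    close $s
}
```

With full buffering, nothing is sent until flush is called, so the message and any trailing newline would travel together in one datagram.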

The handling of incoming data is implemented based on event programming. The line:

fileevent $s readable [list readTime $s]

specifies that every time the socket has some data to read (is readable), the command readTime is called with $s as an argument. The command itself is simple: it prints to the screen every piece of data that comes from the socket, read with the read $channel command.

Implementing service using UDP

The code for the server is a bit more complicated, due to a need to track subscribed clients:

package require udp
set clients [list]
proc registerClient {s} {
    global clients
    lappend clients [fconfigure $s -peer]
}
proc sendTime {s} {
    global clients
    foreach peer $clients {
        puts "sending to $peer"
        fconfigure $s -remote $peer
        puts -nonewline $s [clock seconds]
    }
    after 1000 [list sendTime $s]
}
set server [udp_open 9876]
fconfigure $server -buffering none
fileevent $server readable [list registerClient $server]
sendTime $server
vwait forever

The list named clients will hold an entry for each subscribed client; each entry is itself a list containing the IP address and port, so it is perfectly suited to the fconfigure $s -remote command.

The server opens a UDP socket on port 9876. We would like to avoid the word 'listens' in this context, as this socket does not differ in any way from the one used by the client. By contrast, TCP requires a special server type socket, for listening purposes.

On every incoming data event, the registerClient procedure is executed. The command appends to the clients list information about the originator of the data that has just arrived (usually referred to as a peer). This information is retrieved with fconfigure $s -peer. Although it may seem that this data is defined for the socket (represented by $s), in reality it refers to the most recent datagram received by this socket.

Every second the procedure sendTime is called. The purpose of this command is to send the current time to all subscribed clients, so it iterates over the clients list, and for each one it first configures the socket with the target address and port (fconfigure $s -remote $peer), and then sends a datagram containing the output of the clock seconds command.

The server code is simple, it runs forever and there is no way to unsubscribe from receiving the data, but it demonstrates how to work with UDP in Tcl.
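Because registerClient appends a peer for every datagram that arrives, a client that sends 'subscribe' twice would be listed twice. A minimal sketch of how the list could be maintained without duplicates is shown below. The names addClient and removeClient are hypothetical helpers, not part of TclUDP or of the server above; a peer is assumed to be the two-item address and port list returned by fconfigure $s -peer.

```tcl
# Hypothetical helper: return the clients list with peer added,
# unless an identical {address port} entry is already present.
proc addClient {clients peer} {
    if {[lsearch -exact $clients $peer] < 0} {
        lappend clients $peer
    }
    return $clients
}

# Hypothetical helper: return the clients list with peer removed
# (the list is returned unchanged if the peer is not found).
proc removeClient {clients peer} {
    set idx [lsearch -exact $clients $peer]
    if {$idx >= 0} {
        set clients [lreplace $clients $idx $idx]
    }
    return $clients
}
```

A server could call removeClient when it receives some agreed 'unsubscribe' message; that message format would be an application-level convention, which this example does not define.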

The following picture shows an example of the execution of the server (timeServer.tcl) and two clients (timeClient.tcl):

[Figure: Implementing service using UDP]

The first client sends from port 4508, and the second one (started a few seconds later) from port 4509.

The most important observation is that UDP sockets are handled identically on both the client and server, so the name 'server' is merely a convention.

It is worth mentioning that although we do not focus on this feature in this chapter, TclUDP supports multicasting and broadcasting of UDP packets. For details of how to perform this, please consult the package's manual.

Sending reliable messages

The UDP protocol lacks reliability, which is one of its main differences compared to TCP. Applications using UDP must either accept the fact that some of the datagrams may be lost, or implement equivalent functionality on their own. The same is true of topics like the order of incoming packets and data integrity.

Such logic could be implemented as in the following example: the sender calculates the MD5 checksum of the data and sends both to the receiver. The receiver calculates the checksum again and compares it to the received one; if they match, it sends an acknowledgment (in this example, the checksum is sent back). The sender repeatedly attempts to send the data until the confirmation is received or the permitted number of attempts has been reached.

The sender code is as follows:

package require udp
package require md5
set s [udp_open]
fconfigure $s -buffering none
fconfigure $s -remote [list 127.0.0.1 9876]
proc randomData {length} {
    set result ""
    for {set x 0} {$x < $length} {incr x} {
        append result [expr {int(2 * rand())}]
    }
    return $result
}
proc sendPacket {chan contents {retryCount 3}} {
    variable ackArray
    if {$retryCount < 1} {
        puts "packet delivery failure"
        return
    }
    set md5 [md5::md5 -hex $contents]
    # if ack received, remove ack and do not send again
    if {[info exists ackArray($md5)]} {
        puts "packet successfully delivered"
        unset ackArray($md5)
        return
    }
    puts "sending packet, # of retries: $retryCount"
    puts "packet content: $md5$contents"
    puts -nonewline $chan "$md5$contents"
    flush $chan
    # handle retries
    incr retryCount -1
    after 1000 [list sendPacket $chan $contents $retryCount]
}
proc recvAckPacket {chan} {
    variable ackArray
    set md5 [read $chan]
    puts "received ack: $md5"
    set ackArray($md5) 1
}
sendPacket $s [randomData 48]
after 5000 [list sendPacket $s [randomData 48]]
after 10000 [list sendPacket $s [randomData 48]]
fileevent $s readable [list recvAckPacket $s]
vwait forever

The main logic is located in the sendPacket procedure. The last parameter is the number of attempts left to deliver the data. The procedure calculates the MD5 checksum of the data to be sent (stored in the contents variable) and first checks whether the appropriate acknowledgment has already been received: if the array ackArray contains an entry for the checksum (which also serves as the acknowledgment), the entry is removed and the datagram is considered to have been delivered. If it does not, the checksum along with the data is sent to the receiver, and sendPacket is scheduled to be executed again after one second, each time with the retry counter decreased. If the procedure is called when the counter is equal to zero, the delivery is considered to have failed.

The acknowledgments are received by the procedure recvAckPacket, which simply stores each one in ackArray, allowing sendPacket to find it and react appropriately.

The helper procedure randomData generates a random string of zeroes and ones of a given length.

Note that this example does not cover the ordering of received packets.

The receiver code:

package require udp
package require md5
set server [udp_open 9876]
fconfigure $server -buffering none
fileevent $server readable [list recvPacket $server]
proc recvPacket {chan} {
    variable readPackets
    set data [read $chan]
    puts "received: $data"
    # the first 32 characters are the MD5 hex digest, the rest is the payload
    set md5 [string range $data 0 31]
    set contents [string range $data 32 end]
    # compare as strings; != could treat digit-only digests as numbers
    if {$md5 ne [md5::md5 -hex $contents]} {
        # the data is malformed
        puts "malformed data"
        return
    }
    # send an ack anyway, because original
    # might not have been received by other peer
    fconfigure $chan -remote [fconfigure $chan -peer]
    # simulate the ack packet being lost over the network
    if {10*rand() > 7} {
        puts -nonewline $chan $md5
        flush $chan
    }
    # check if this packet is not a duplicate
    if {[info exists readPackets($md5)]} {
        return
    }
    set readPackets($md5) [clock seconds]
    # handle packet here...
}
proc periodicCleanup {} {
    variable readPackets
    set limit [clock scan "-300 seconds"]
    foreach {md5 clock} [array get readPackets] {
        if {$clock < $limit} {
            unset readPackets($md5)
        }
    }
    after 60000 periodicCleanup
}
# start the cleanup cycle; it reschedules itself every minute
periodicCleanup
vwait forever

The receiver sends back the acknowledgement each time a correct datagram is received, that is, when the checksum sent (the first 32 characters) and the one calculated locally are equal. It also stores the time of arrival of each packet in the readPackets array, which allows us to detect duplicated data and process it only once. To make the example more vivid, the loss of about 70% of the acknowledgments is simulated by randomly not sending confirmations.

The receiver also implements some simple logic for periodic cleanup of the log of received datagrams, to prevent it from growing too large and consuming too much memory.
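The framing used by both programs (a 32-character MD5 hex digest followed by the payload) can be isolated into two small procedures. This is a sketch only: frame and unframe are hypothetical names that do not appear in the sender or receiver, and the md5 package is assumed to be the one from tcllib, as in the listings above.

```tcl
package require md5

# Hypothetical helper: prefix the payload with its MD5 hex digest.
proc frame {contents} {
    return "[md5::md5 -hex $contents]$contents"
}

# Hypothetical helper: verify the digest and return the payload,
# or raise an error if the datagram is malformed.
proc unframe {data} {
    set md5 [string range $data 0 31]
    set contents [string range $data 32 end]
    # string comparison, so digests are never interpreted as numbers
    if {$md5 ne [md5::md5 -hex $contents]} {
        error "malformed data"
    }
    return $contents
}
```

A receiver would wrap the call in catch, for example catch {unframe $data} contents, and send the acknowledgment only when no error is raised.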

The result of running the example can be as depicted:

[Figure: Sending reliable messages]

In this example, the first datagram was delivered successfully on the first attempt, the second one's delivery failed despite 3 attempts, and the last one was delivered on the second try.

Comparing TCP and UDP: streams vs. datagrams

Although from the Tcl point of view UDP sockets are identical in usage to normal sockets, you have to be aware of the differences. When you use file or TCP sockets, you operate on streams of data. Issues like buffering or the order of the data do not concern you. TCP features fit perfectly into the channel philosophy: the protocol offers reliable delivery of the stream of bytes in the correct order; transmission errors are detected and corrected automatically (with packet retransmission); flow and congestion control regulate the capacity of the channel.

In the case of UDP, the situation is different. You have to keep in mind that essentially everything you write to the UDP socket will end up in the datagram (UDP packet). In theory, a UDP datagram can carry 65507 bytes of data, but TclUDP has its own internal limit set to 4096 bytes. Therefore you should pay attention to the buffering settings:

  • The buffer size (fconfigure -buffersize) must not be larger than 4096
  • If the buffering mode is set to none, you should not write (puts) data larger than 4096 bytes to a UDP socket in one operation, otherwise the data will be lost
  • In the case of buffering mode being set to full and the buffer size lower than 4096, you can write any amount of data you want, but note that the data will be split into pieces with the maximum size equal to the buffer size

The described behaviour can easily be illustrated with the following example: the sender code uses a buffer of the maximum possible size, 4096 bytes, and sends data of increasing size:

package require udp
set s [udp_open]
fconfigure $s -buffersize 4096
fconfigure $s -remote [list 127.0.0.1 9876]
proc randomData {length} {
    set result ""
    for {set x 0} {$x < $length} {incr x} {
        append result [expr {int(2 * rand())}]
    }
    return $result
}
puts -nonewline $s [randomData 512]
flush $s
puts -nonewline $s [randomData 2048]
flush $s
puts -nonewline $s [randomData 8192]
flush $s
close $s

The function randomData produces strings of the specified length. First, 512 bytes are sent (the flush command makes sure that the buffer content is sent to the channel), and then data of 2048 and 8192 bytes is sent in the same way.

The receiver code prints on the screen the size of every received datagram:

package require udp
set server [udp_open 9876]
fconfigure $server -buffering none
fileevent $server readable [list recvPacket $server]
proc recvPacket {chan} {
    set data [read $chan]
    puts "received length: [string length $data]"
}
vwait forever

When both are executed, the output is as follows:

C:\tcl_book\chapter6\UDPSockets\largeData>tclsh85 receiver.tcl
received length: 512
received length: 2048
received length: 4096
received length: 4096

As you can see, the last block of data sent by the sender has arrived in the form of two datagrams. Now let us modify the sender code and increase the output buffer size to 5000 bytes:

fconfigure $s -buffersize 5000

The receiver's output is:

C:\tcl_book\chapter6\UDPSockets\largeData>tclsh85 receiver.tcl
received length: 512
received length: 2048
received length: 3192

In this case, the 8192-byte block of data is split into two parts:

  • The first one has a size of 5000 bytes (the buffer was completely filled upon writing, and then flushed) and is lost, as it exceeds the maximum datagram data size allowed by TclUDP
  • The second one, of size 3192 bytes (that is, the rest of the data: 8192 - 5000 = 3192), arrives successfully

The example clearly shows some of the pitfalls of using UDP. Although from a Tcl perspective a UDP socket looks like a 'normal' channel able to send data, you must be aware of the underlying mechanism and use it accordingly.
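One way to stay within the limit is to cut the payload into pieces yourself before writing. The following is a minimal sketch, not something TclUDP provides: splitChunks is a hypothetical helper, and its default size is the 4096-byte TclUDP limit mentioned earlier. It counts characters, so it assumes plain ASCII data (one byte per character), such as the strings produced by randomData.

```tcl
# Hypothetical helper: split data into a list of pieces,
# each at most size characters long.
proc splitChunks {data {size 4096}} {
    set chunks [list]
    set len [string length $data]
    for {set i 0} {$i < $len} {incr i $size} {
        lappend chunks [string range $data $i [expr {$i + $size - 1}]]
    }
    return $chunks
}

# Example: 8192 characters become two pieces of 4096 characters each.
foreach chunk [splitChunks [string repeat x 8192]] {
    puts "chunk length: [string length $chunk]"
}
```

Each piece would then be written with puts -nonewline and followed by flush, so that it travels as its own datagram. The receiver still has to reassemble the pieces, and UDP does not guarantee that they arrive, or that they arrive in order.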

In the case of a TCP socket, read would read data until the channel is closed (for example, when the client disconnects). In the case of a UDP socket, read reads data from only one datagram at a time, as the datagrams are not connected to each other and must be treated separately (they do not form a stream of data).

To summarize, treating UDP sockets as channels has the following pros and cons:

Pros:

  • Consistent programming interface
  • Common set of commands to operate on the channel: read/write/configure

Cons:

  • Confusing; even though it seems to be a channel, you have to keep in mind that it is not
  • Creating a datagram is not very intuitive; for example, to address it, you have to use fconfigure as if you were configuring the channel
