TCP support is built into the core of the Tcl interpreter. To use the UDP protocol, you have to use an external package. The usual choice is the TclUDP extension, which is available from http://sourceforge.net/projects/tcludp/ (it also comes as part of the ActiveTcl bundle; if you don't have it, install it with teacup install udp).
In contrast to TCP, which is a connection-oriented protocol, UDP is connection-less. This means that every data packet (datagram) travels from one peer to another on its own, without a return acknowledgment or retransmission in the case of lost packets. What is more, one of the peers may send packets that are never received (for example, if the second peer is not listening at the moment), and there is no feedback indicating that something is going wrong. This implies a difference in the design for handling the transmission, which will be illustrated in the following example.
Let's consider a simple 'time server', where the server sends the current time over UDP to any client application that subscribes for such notifications. The format of each datagram is simple: it contains only the current time expressed in seconds.
First, let's have a look at the client code:

    package require udp

    set s [udp_open]
    fconfigure $s -buffering none
    fconfigure $s -remote [list 127.0.0.1 9876]
    # ask the server to start sending the time
    puts -nonewline $s "subscribe"

    # print every datagram received from the server
    proc readTime {channel} {
        puts "Time from server: [read $channel]"
    }

    fileevent $s readable [list readTime $s]
    vwait forever
    close $s
As you have probably figured out, the first line loads the TclUDP extension. The next line creates a UDP socket using the udp_open command and stores its reference in the s variable. The UDP protocol uses ports in the same way as TCP. If we executed udp_open 1234, the socket would be bound to port 1234; when the port is omitted, the operating system assigns a random free port. Note that if you specify a port that is already in use by another program, an error is generated.
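The port behaviour described above can be handled defensively. The following is a minimal sketch, assuming that 9876 (the port used in this chapter's examples) is the preferred port and that the installed TclUDP supports the -myport option for reporting the local port; it falls back to an OS-assigned port when the fixed one is taken.

```tcl
package require udp

# Try the preferred port first; udp_open raises an error if it is in use.
if {[catch {udp_open 9876} s]} {
    puts "port 9876 is not available: $s"
    # With no argument, the operating system picks a free port.
    set s [udp_open]
}

# -myport is assumed to be available in the installed TclUDP version.
set port [fconfigure $s -myport]
puts "socket $s is bound to local port $port"
close $s
```

Falling back silently is only reasonable for a client; a server that peers must be able to find would normally report the error instead.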
Next, we set the buffering mode to none, meaning that the output buffer will be automatically flushed after every output operation. We will discuss buffering issues more deeply later in this example.
The newly created UDP socket is not connected to anything, as UDP is connection-less. Such a socket is able to receive packets as they arrive at any time from any source, without establishing a data connection of any type. To have datagrams sent to a specific destination, use the fconfigure command with a new option introduced by TclUDP, -remote, along with a two-item list containing the target address and port: fconfigure $s -remote [list 127.0.0.1 9876]. In this example the server runs on the local host (so you can run it even if you are not part of a network). Note that you can call this command any time you wish, causing successive datagrams to be sent to different peers.
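Retargeting one socket between writes can be sketched as follows. This is an illustration only: the helper name sendToAll and the second port, 9877, are hypothetical and do not appear in the chapter's examples.

```tcl
package require udp

# Send the same message to every peer in the list through one socket.
# Each peer is a two-item list: {address port}.
proc sendToAll {s peers message} {
    set sent 0
    foreach peer $peers {
        # point the socket at the next destination
        fconfigure $s -remote $peer
        # with -buffering none, each write leaves as its own datagram
        puts -nonewline $s $message
        incr sent
    }
    return $sent
}

set s [udp_open]
fconfigure $s -buffering none
# Port 9877 is a hypothetical second receiver used only for illustration.
set count [sendToAll $s {{127.0.0.1 9876} {127.0.0.1 9877}} "subscribe"]
puts "datagrams written: $count"
close $s
```

Because UDP gives no feedback, the count only says how many datagrams were written, not how many were received.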
Now it is time to send a message to the server; in this case, simply a string containing 'subscribe'. If -nonewline is omitted, puts would generate two datagrams (the second one containing the newline character): it is likely that the puts implementation writes data to the buffer twice (the message, and then the newline character), and as the buffering is set to none, the buffer is flushed immediately after each write. The other solution would be to set buffering to full and call flush $s after each socket write.
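The alternative with full buffering can be sketched like this. It is a hedged illustration rather than part of the original example: the helper name sendAsOneDatagram is hypothetical, and the sketch relies on the behaviour described in the text, namely that the buffered data leaves the socket when flush is called.

```tcl
package require udp

# Write several fragments and flush once, so that they should travel together.
# Returns the total number of characters written.
proc sendAsOneDatagram {s parts} {
    foreach part $parts {
        puts -nonewline $s $part   ;# accumulates in the output buffer
    }
    flush $s                       ;# the buffer is handed to the socket here
    return [string length [join $parts ""]]
}

set s [udp_open]
# Full buffering: nothing is sent until flush is called (or the buffer fills).
fconfigure $s -buffering full -remote [list 127.0.0.1 9876]
set size [sendAsOneDatagram $s {sub scribe}]
puts "characters flushed: $size"
close $s
```

This form is convenient when a message is assembled from several puts calls, as long as the total stays within the buffer size.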
The handling of incoming data is based on event programming. The line:

    fileevent $s readable [list readTime $s]

defines that every time the socket has some data to read (is readable), the command readTime is called with $s as an argument. The command itself is simple: it prints to the screen every piece of data that comes from the socket, obtained with the read command.
The code for the server is a bit more complicated, due to the need to track subscribed clients:

    package require udp

    # list of {address port} pairs, one per subscribed client
    set clients [list]

    # remember the originator of the datagram that has just arrived
    proc registerClient {s} {
        global clients
        lappend clients [fconfigure $s -peer]
    }

    # send the current time to every subscriber, then reschedule
    proc sendTime {s} {
        global clients
        foreach peer $clients {
            puts "sending to $peer"
            fconfigure $s -remote $peer
            puts -nonewline $s [clock seconds]
        }
        after 1000 [list sendTime $s]
    }

    set server [udp_open 9876]
    fconfigure $server -buffering none
    fileevent $server readable [list registerClient $server]
    sendTime $server
    vwait forever
The list named clients holds an entry for each subscribed client; each entry is itself a list containing the IP address and port, so it is perfectly suited to the fconfigure $s -remote command.
The server opens a UDP socket on port 9876. We would like to avoid the word 'listens' in this context, as this socket does not differ in any way from the one used by the client. By contrast, TCP requires a special server type socket, for listening purposes.
On every incoming data event, the registerClient procedure is executed. The command appends to the clients list information about the originator (usually referred to as a peer) of the data that has just arrived. This information is retrieved with fconfigure $s -peer. Although it may seem that this data is defined for the socket (represented by $s), in reality it refers to the most recent datagram received by this socket.
Every second the procedure sendTime is called. Its purpose is to send the current time to all subscribed clients, so it iterates over the clients list, and for each entry it first configures the socket with the target address and port (fconfigure $s -remote $peer), and then sends a datagram containing the time, in the form of the output of the clock seconds command.
The server code is simple: it runs forever, and there is no way to unsubscribe from receiving the data, but it demonstrates how to work with UDP in Tcl.
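One way the subscription handling could be extended is sketched below. It is not part of the original server: the unsubscribe message and the procedures updateClients and handleMessage are hypothetical. The sketch also reads the datagram before querying -peer, on the assumption that this keeps the peer information in step with the datagram being handled.

```tcl
# Return the updated client list after a message from the given peer.
# peer is an {address port} pair, as returned by fconfigure -peer.
proc updateClients {clients peer message} {
    set idx [lsearch -exact $clients $peer]
    switch -- $message {
        subscribe {
            # add the peer only once, even if it subscribes repeatedly
            if {$idx < 0} { lappend clients $peer }
        }
        unsubscribe {
            # hypothetical message, not part of the original protocol
            if {$idx >= 0} { set clients [lreplace $clients $idx $idx] }
        }
    }
    return $clients
}

# Possible replacement for registerClient in the server above.
proc handleMessage {s} {
    global clients
    set message [read $s]            ;# consume the datagram first
    set peer [fconfigure $s -peer]   ;# originator of that datagram
    set clients [updateClients $clients $peer $message]
}

# The list logic can be exercised without any socket:
set demo [updateClients {} {127.0.0.1 4508} subscribe]
set demo [updateClients $demo {127.0.0.1 4509} subscribe]
set demo [updateClients $demo {127.0.0.1 4508} unsubscribe]
puts "remaining subscribers: $demo"
```

Keeping the list manipulation in a separate procedure makes it testable independently of the network code.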
The following picture shows an example of the execution of the server (timeServer.tcl) and two clients (timeClient.tcl):
The first client connects from the port 4508, and the second one (started a few seconds later) from 4509.
The most important observation is that UDP sockets are handled identically on both the client and the server, so the name 'server' is purely conventional.
It is worth mentioning that although we do not focus on this feature in this chapter, TclUDP supports multicasting and broadcasting of UDP packets. For details of how to perform this, please consult the package's manual.
The UDP protocol lacks reliability, which is one of its main differences compared to TCP. Applications using UDP must either accept the fact that some of the datagrams may be lost, or implement equivalent functionality on their own. The same is true of topics like the order of incoming packets and data integrity.
Such logic could be implemented as in the following example: the sender calculates the MD5 checksum of the data and sends both to the receiver. The receiver calculates the checksum again, compares it to the received one and, if they are equal, sends an acknowledgment (in this example, the checksum is sent back). The sender repeatedly attempts to send the data until the confirmation is received or the permitted number of attempts has been reached.
The sender code is as follows:

    package require udp
    package require md5

    set s [udp_open]
    fconfigure $s -buffering none
    fconfigure $s -remote [list 127.0.0.1 9876]

    # build a random string of zeroes and ones of the given length
    proc randomData {length} {
        set result ""
        for {set x 0} {$x < $length} {incr x} {
            append result [expr {int(2 * rand())}]
        }
        return $result
    }

    proc sendPacket {chan contents {retryCount 3}} {
        variable ackArray
        set md5 [md5::md5 -hex $contents]
        # if ack received, remove ack and do not send again
        # (checked first, so an ack for the last attempt is not missed)
        if {[info exists ackArray($md5)]} {
            puts "packet successfully delivered"
            unset ackArray($md5)
            return
        }
        if {$retryCount < 1} {
            puts "packet delivery failure"
            return
        }
        puts "sending packet, # of retries: $retryCount"
        puts "packet content: $md5$contents"
        puts -nonewline $chan "$md5$contents"
        flush $chan
        # handle retries
        incr retryCount -1
        after 1000 [list sendPacket $chan $contents $retryCount]
    }

    # store every acknowledgment (the checksum sent back by the receiver)
    proc recvAckPacket {chan} {
        variable ackArray
        set md5 [read $chan]
        puts "received ack: $md5"
        set ackArray($md5) 1
    }

    sendPacket $s [randomData 48]
    after 5000 [list sendPacket $s [randomData 48]]
    after 10000 [list sendPacket $s [randomData 48]]
    fileevent $s readable [list recvAckPacket $s]
    vwait forever
The main logic is located in the sendPacket procedure. The last parameter is the number of retries left to deliver the data. The procedure calculates the MD5 checksum of the data to be sent (stored in the contents variable) and first checks whether the corresponding acknowledgment has already been received: if the array ackArray contains an entry for the checksum (which also serves as the acknowledgment), the entry is removed and the datagram is considered to have been delivered. If it does not, the checksum is sent to the receiver along with the data, and sendPacket is scheduled to be executed again after one second, each time with the retry counter decreased. If the procedure is called when the counter has reached zero, the delivery is considered to have failed.
The acknowledgments are received by the procedure recvAckPacket, which simply stores each one in ackArray, allowing sendPacket to find it and react appropriately.
The helper procedure randomData generates a random string of zeroes and ones of a given length.
Note that this example does not address the ordering of received packets.
The receiver code:

    package require udp
    package require md5

    set server [udp_open 9876]
    fconfigure $server -buffering none
    fileevent $server readable [list recvPacket $server]

    proc recvPacket {chan} {
        variable readPackets
        set data [read $chan]
        puts "received: $data"
        # the first 32 characters are the checksum, the rest is the payload
        set md5 [string range $data 0 31]
        set contents [string range $data 32 end]
        if {$md5 ne [md5::md5 -hex $contents]} {
            # the data are malformed
            puts "malformed data"
            return
        }
        # send an ack anyway, because original
        # might not have been received by other peer
        fconfigure $chan -remote [fconfigure $chan -peer]
        # simulate the ack packet being lost over the network
        if {10*rand() > 7} {
            puts -nonewline $chan $md5
            flush $chan
        }
        # check if this packet is not a duplicate
        if {[info exists readPackets($md5)]} {
            return
        }
        set readPackets($md5) [clock seconds]
        # handle packet here...
    }

    proc periodicCleanup {} {
        variable readPackets
        set limit [clock scan "-300 seconds"]
        foreach {md5 clock} [array get readPackets] {
            if {$clock < $limit} {
                unset readPackets($md5)
            }
        }
        after 60000 periodicCleanup
    }

    # start the cleanup cycle (the procedure reschedules itself)
    periodicCleanup
    vwait forever
The receiver sends back an acknowledgment each time a correct datagram is received, that is, when the checksum sent (the first 32 characters) and the one calculated locally are equal. It also stores the time of arrival of each packet in the readPackets array, which allows us to detect duplicated data and process it only once. To make the example more vivid, about 70% data loss is simulated by randomly not sending confirmations.
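The packet layout used by both peers (a 32-character hex MD5 digest followed by the payload) can be isolated into two small helpers. This is a sketch, not the chapter's code: the names framePacket and unframePacket are hypothetical, and the md5 package is assumed to come from Tcllib, as in the examples above.

```tcl
package require md5   ;# from Tcllib

# Sender side: prefix the payload with its 32-character hex MD5 digest.
proc framePacket {contents} {
    return "[md5::md5 -hex $contents]$contents"
}

# Receiver side: split a datagram into digest and payload and verify it.
# Returns the payload, or raises an error if the checksum does not match.
proc unframePacket {data} {
    set digest   [string range $data 0 31]
    set contents [string range $data 32 end]
    if {$digest ne [md5::md5 -hex $contents]} {
        error "malformed data"
    }
    return $contents
}

set packet [framePacket "010011"]
puts "payload: [unframePacket $packet]"
# appending a stray character to the payload makes verification fail
puts "corrupted packet rejected: [catch {unframePacket ${packet}1}]"
```

Raising an error for a bad checksum lets the caller decide whether to log the packet or drop it silently.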
The receiver also implements some simple logic for periodic cleanup of the log of received datagrams, to prevent it from growing too large and consuming too much memory.
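The duplicate detection and the cleanup can be sketched independently of the socket code. The procedures notePacket and pruneLog below are hypothetical helpers, and the timestamps are passed in explicitly only so that the behaviour is easy to follow; the 300-second limit matches the receiver above.

```tcl
# Record a packet by its checksum; return 1 if it is new, 0 if it is a duplicate.
# "now" defaults to the current time but can be supplied explicitly.
proc notePacket {logVar md5 {now ""}} {
    upvar 1 $logVar log
    if {$now eq ""} { set now [clock seconds] }
    if {[info exists log($md5)]} { return 0 }
    set log($md5) $now
    return 1
}

# Drop log entries older than maxAge seconds; return how many were removed.
proc pruneLog {logVar maxAge {now ""}} {
    upvar 1 $logVar log
    if {$now eq ""} { set now [clock seconds] }
    set removed 0
    foreach {md5 seen} [array get log] {
        if {$seen < $now - $maxAge} {
            unset log($md5)
            incr removed
        }
    }
    return $removed
}

array set readPackets {}
notePacket readPackets AAAA 1000   ;# first arrival, recorded
notePacket readPackets AAAA 1001   ;# duplicate, ignored
notePacket readPackets BBBB 1400
puts "entries removed: [pruneLog readPackets 300 1500]"
```

A retransmitted datagram that arrives after its entry has been pruned would be processed again, so the age limit should comfortably exceed the sender's total retry time.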
The result of running the example can be as depicted:
In this example, the first datagram was delivered successfully on the first attempt, the second one's delivery failed despite 3 attempts, and the last one was delivered on the second try.
Although from the Tcl point of view UDP sockets are identical in usage to normal sockets, you have to be aware of the differences. When you use file or TCP sockets, you operate on streams of data. Issues like buffering or the order of the data do not concern you. TCP features fit perfectly into the channel philosophy: the protocol offers reliable delivery of the stream of bytes in the correct order; transmission errors are detected and corrected automatically (with packet retransmission); flow and congestion control regulate the capacity of the channel.
In the case of UDP, the situation is different. You have to keep in mind that essentially everything you write to the UDP socket ends up in a datagram (UDP packet). In theory, a UDP datagram can carry 65507 bytes of data, but TclUDP has its own internal limit set to 4096 bytes. Therefore you should pay attention to the buffering settings:

- The buffer size (fconfigure -buffersize) must not be larger than 4096.
- If buffering is set to none, you should not write (puts) more than 4096 bytes to a UDP socket in one operation, otherwise the data will be lost.
- If buffering is set to full and the buffer size is lower than 4096, you can write any amount of data you want, but note that the data will be split into pieces with a maximum size equal to the buffer size.

The described behaviour can easily be illustrated with the following example: the sender code uses a buffer of the maximum possible size, 4096 bytes, and sends data of increasing size:

    package require udp

    set s [udp_open]
    fconfigure $s -buffersize 4096
    fconfigure $s -remote [list 127.0.0.1 9876]

    # build a random string of zeroes and ones of the given length
    proc randomData {length} {
        set result ""
        for {set x 0} {$x < $length} {incr x} {
            append result [expr {int(2 * rand())}]
        }
        return $result
    }

    puts -nonewline $s [randomData 512]
    flush $s
    puts -nonewline $s [randomData 2048]
    flush $s
    puts -nonewline $s [randomData 8192]
    flush $s
    close $s
The procedure randomData produces strings of the specified length. First, 512 bytes are sent (the flush command makes sure that the buffer content is sent to the channel), and then, similarly, packets of 2048 and 8192 bytes.
The receiver code prints on the screen the size of every received datagram:

    package require udp

    set server [udp_open 9876]
    fconfigure $server -buffering none
    fileevent $server readable [list recvPacket $server]

    # report the length of each datagram as it arrives
    proc recvPacket {chan} {
        set data [read $chan]
        puts "received length: [string length $data]"
    }

    vwait forever
When both are executed, the output is as follows:
C:\tcl_book\chapter6\UDPSockets\largeData>tclsh85 receiver.tcl
received length: 512
received length: 2048
received length: 4096
received length: 4096
As you can see, the last chunk of data sent by the sender has arrived in the form of two datagrams. If we modify the sender code to increase the output buffer size to 5000 bytes:

    fconfigure $s -buffersize 5000

the receiver's output is:
C:\tcl_book\chapter6\UDPSockets\largeData>tclsh85 receiver.tcl
received length: 512
received length: 2048
received length: 3192
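In this case, the 8192-byte block of data is split into two parts: the first, 5000 bytes long, exceeds the TclUDP limit of 4096 bytes and is lost, while the remaining 3192 bytes fit into a single datagram and are delivered.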
The example clearly shows some of the pitfalls of using UDP: although from a Tcl perspective it looks like a 'normal' channel able to send data, you must be aware of the underlying mechanism and use it appropriately.
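One way to stay clear of the size limit is to split the data explicitly before writing it. The following is a minimal sketch under stated assumptions: splitChunks and sendChunked are hypothetical helpers, 4096 is the TclUDP limit quoted above, and the data is assumed to consist of single-byte characters so that string length equals the byte count.

```tcl
# Split data into pieces of at most maxSize characters.
proc splitChunks {data {maxSize 4096}} {
    set chunks {}
    set len [string length $data]
    for {set i 0} {$i < $len} {incr i $maxSize} {
        lappend chunks [string range $data $i [expr {$i + $maxSize - 1}]]
    }
    return $chunks
}

# Write each piece separately and flush, so that no single write exceeds the limit.
proc sendChunked {chan data {maxSize 4096}} {
    foreach chunk [splitChunks $data $maxSize] {
        puts -nonewline $chan $chunk
        flush $chan
    }
}

# The splitting itself can be checked without a socket:
set sizes {}
foreach chunk [splitChunks [string repeat x 8192]] {
    lappend sizes [string length $chunk]
}
puts "chunk sizes: $sizes"
```

The receiver still sees independent datagrams, so reassembly, ordering, and loss handling remain the application's responsibility.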
In the case of a TCP socket, read would read data until the channel is closed (for example, when the client disconnects). In the case of a UDP socket, read reads data from only one datagram at a time, as the datagrams are not connected to each other and must be treated separately (they do not form a stream of data).
To summarize, treating UDP sockets as channels has the following pros and cons:
Pros | Cons |
---|---|
Consistent programming interface | Confusing; even though it seems to be a channel, you have to keep in mind that it is not |
Common set of commands to operate on the channel: read/write/configure | Creating a datagram is not very intuitive; for example, to address it, you have to use fconfigure |