Chapter 2. Advanced Tcl Features

Now that we know Tcl basics a lot better, it is time to learn some of Tcl's more advanced functionality. This chapter focuses on some aspects of Tcl that will be needed for the chapters that follow throughout the book.

This chapter talks about how Tcl works internally, what we need to be aware of and how we can leverage it to our needs. It introduces in detail how Tcl evaluates code, substitutes variables, and evaluates embedded commands. It also talks about the different types of variables and how they can be used within our code.

We also describe how to work with files in Tcl. This chapter covers how to read and write files, and introduces the concept of channels within Tcl and how different types of channels work. We also show how to copy, rename, and delete files and directories, get information about them, and how to learn about platform-specific issues.

After that, you will see how the Tcl packaging system works and how we can extend it and create reusable packages. It mentions the package system, which is how Tcl has been doing things for a long time. It also introduces us to the Tcl modules concept, which was included in Tcl 8.5 and changes the way some of the packages are built and found on a system.

We also dive into the idea of the event loop, one of Tcl's most powerful features—what it is, the different types of events, how Tcl receives them, and how to use them in our applications. We learn about file events, timer events, and scheduling periodic jobs.

This chapter also gives us a good idea of how threads can be used from within Tcl. It talks about how we can create child threads, communicate with them, and share information between threads. It also shows some examples of how we can use Tcl threads to implement a system where a child thread performs data manipulation, while the main thread is responsible for data management.

Tcl features

The previous chapter introduced the basics of the Tcl language. We learned the basics of strings, lists, dictionaries, and integer and floating-point numbers. This section introduces some of the more advanced Tcl features: working with time and date, data types, namespaces, and stack frames.

Learning these features is necessary to understand and use Tcl more effectively, especially metaprogramming, which can be used to create our own syntax and alter a program's flow control. This is one of Tcl's most powerful features.

Working with time and date

When writing applications in Tcl, we'll often need to work in the context of date and time. One example is reading a file's access time, another one calculating how much time is left until, for example, next Sunday at 4 AM. Tcl uses the Unix timestamp for all date and time manipulations. This is a common approach which assumes that all dates and times are specified as a number of seconds since midnight on the 1st January 1970 GMT. This is similar to how most operating systems track time. An interesting fact is that regardless of which time zone you are currently in, the actual number of seconds is the same everywhere. The only thing that changes is that when it gets converted to actual time and date, the local time zone is then taken into account.

The Tcl command clock can be used for the majority of date and time related operations. Finding out the current time is done using the clock seconds command, which returns a Unix timestamp value. Taking this value at various times can be used to calculate how many seconds have passed, for example:

% set earlier [clock seconds]
1253652955
% set now [clock seconds]
1253652998
% expr {$now - $earlier}
43

The timestamp idea, although simple and elegant, produces values that are hard for a person to interpret. The date and time can be converted to a more readable form by using the clock format command, which takes a timestamp as the first argument and formats it as a string. The command also accepts several options, including -gmt, which takes a Boolean value and makes the command work with GMT time and date. We can also specify the format in which the date and/or time should be presented. The format string may include various tokens that will be replaced with the appropriate values. Some of them are:

Format   Description                    Example
%Y       4 digit year                   2009
%m       Month number (01-12)           09
%d       Day of the month (01-31)       22
%H       Hour, 24-hour clock (00-23)    20
%M       Minute (00-59)                 56
%S       Second (00-59)                 38

All formatting possibilities and other options can be found on the clock command's manual page at: http://www.tcl.tk/man/tcl8.5/TclCmd/clock.htm

For example, this is how the same timestamp can be printed out in different ways:

% puts [clock format $now]
Tue Sep 22 22:56:38 CEST 2009
% puts [clock format $now -format "%Y-%m-%d %H:%M"]
2009-09-22 22:56
% puts [clock format $now -gmt 1]
Tue Sep 22 20:56:38 GMT 2009

Tcl can also do the opposite: read a textual representation of a date and convert it to a timestamp. This can be done using the clock scan command, which accepts and automatically detects many different types of date and time specification. The text can also specify only a date or only a time, in which case the missing part is filled in with a default value. For example:

% puts [clock scan "Tue Sep 22 22:56:38 CEST 2009"]
1253652998
% puts [clock scan "2009-09-22 22:56:38"]
1253652998

By default, clock scan parses text based on the current date and time, so if the text contains only a time, today's date will be used. By adding the -base option, we can specify which timestamp should be taken as the base for our operations. For example, scanning just a time can produce different results with different bases:

% puts [clock format [clock scan "12:40"]]
Tue Sep 22 12:40:00 CEST 2009
% set base [clock scan {2009-10-01}]
% puts [clock format $base]
Thu Oct 01 00:00:00 CEST 2009
% puts [clock format [clock scan "12:40" -base $base]]
Thu Oct 01 12:40:00 CEST 2009

Besides parsing actual date and time values, clock scan is also able to parse an argument that specifies a difference in time, in which case the result is calculated relative to the current or provided date and time. The syntax is <number> <units>. For example:

% puts [clock format [clock seconds]]
Tue Sep 22 22:56:38 CEST 2009
% puts [clock format [clock scan "1 hour -15 minutes"]]
Tue Sep 22 23:41:38 CEST 2009
% puts [clock format [clock scan "45 minutes"]]
Tue Sep 22 23:41:38 CEST 2009

Note the flexibility of the input format: we specified the same 45 minute interval in two different ways here. The clock scan command can also parse text according to an explicit format in which the date and time is specified:

% puts [clock scan "20090922225638" -format "%Y%m%d%H%M%S"]
1253652998
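As a small sketch of how clock scan and clock format mirror each other, the following round trip parses a fixed-layout string and formats the result back. It passes -gmt 1 to both commands (an assumption made here so that the result does not depend on the local time zone), which is why the text uses 20:56:38 rather than the local 22:56:38 shown earlier.

```tcl
# Parse a fixed-layout timestamp as GMT
set text "20090922205638"
set ts [clock scan $text -format "%Y%m%d%H%M%S" -gmt 1]
puts $ts

# Format the same timestamp back into a readable GMT string
set readable [clock format $ts -format "%Y-%m-%d %H:%M:%S" -gmt 1]
puts $readable
```

This should print 1253652998 followed by 2009-09-22 20:56:38.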

It is also possible to combine multiple date and time specifications, in which case Tcl will parse each part individually and combine the results based on the order of the statements. For example, we can pass an arbitrary value and then a difference:

% puts [clock format [clock scan "12:45 +3 hours"]]
Tue Sep 22 15:45:00 CEST 2009

Using clock scan for parsing user input allows us to create a powerful mechanism for scheduling, or for letting users specify when certain operations may be performed.
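A minimal sketch of such a scheduling helper is shown below. The procedure name secondsUntil and its optional base argument are hypothetical, introduced only for illustration; the helper relies solely on clock scan with -base and on expr.

```tcl
# Hypothetical helper: how many seconds from the base time until the
# moment described by the free-form text in "when"
proc secondsUntil {when {base ""}} {
    if {$base eq ""} {
        set base [clock seconds]
    }
    set target [clock scan $when -base $base]
    return [expr {$target - $base}]
}

# A fixed base keeps the example repeatable
set base 1253652998
puts [secondsUntil "1 hour -15 minutes" $base]
puts [secondsUntil "45 minutes" $base]
```

Both calls describe the same 45 minute interval, so each should print 2700. The returned value could then be multiplied by 1000 and handed to a timer, a topic covered with the event loop later in this chapter.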

An interesting feature of Tcl's date and time handling is the ability to handle the stardate notation that was used in the Star Trek movies and series. We can use %Q formatting to print out the stardate. For example:

% puts [clock format [clock seconds] -format %Q]
Stardate 63723.9

We can also reverse this operation by running:

% puts [clock format [clock scan "Stardate 63723.9"]]
Mon Sep 21 21:36:00 CEST 2009

Stardates are calculated from 1946-01-01, with each year adding 1000 to the value. The day of the year is divided by the number of days in that year and scaled to the range 0 to 1000 (for example, Dec 31st 1946 maps to a value of 997). The current time of day is then divided by the entire day's duration and added as a value between 0 and 1; for example, noon is specified as 0.5.

Even though this is an interesting gadget in Tcl, it is not a good idea to store a date or time in this way.

For more details on the clock command, please see the corresponding manual page at: http://www.tcl.tk/man/tcl8.5/TclCmd/clock.htm

Tcl data types

As seen in the previous chapter, Tcl provides only a few data types and every data type can be converted to or from a fundamental type, that is, a string. This approach is geared towards the dynamic typing concept where the programming language does not perform strict type checking, therefore, not limiting the way an application manipulates its data. Having a small subset of data types and conforming to a common approach throughout many Tcl extensions helps in achieving this goal.

In essence, Tcl offers the following object types:

  • String / binary data: It consists of zero or more bytes or Unicode characters and can store any data.
  • Integer value: An integer, stored as either 32 bit or 64 bit, depending on needs; starting with Tcl 8.5, it can store integer numbers of any size, which is useful in many algorithms and in encryption code; for example: 12, 7321322.
  • Floating point value: A floating point value, stored as double-precision internally; for example: 3.14159265.
  • List: It consists of zero or more Tcl objects, can always be converted from and to a string, and is accessed via a large number of list-related commands; for example: {element0 {sublist {item with spaces}}}.
  • Dictionary: It allows us to store zero or more key-value relations, where a key can have only one value and the value can be any valid Tcl object (including a Tcl list); it is accessed via the dict command and its subcommands.

Arrays are not first-class objects in Tcl, and therefore, are not mentioned in the list. First class objects are data that can be passed directly. They can be used in commands such as set and return.

Arrays themselves are not such objects—they can be passed by their names using a command such as upvar, but it is not possible to return an array—for example, the following will not work:

proc createArray {} {
    set value(firstValue) 1
    set value(otherValue) 2
    return $value
}

This command will fail with the error that the variable value is an array. Dictionaries can be used as first-class objects which work in the same way as arrays—storing key-value information that can be set and retrieved in a fast way. Dictionaries can be passed between procedures, for example:

proc createDict {} {
    set value [dict create]
    dict set value firstValue 1
    dict set value otherValue 2
    return $value
}
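When existing code already relies on arrays, one possible workaround is to return the array's contents rather than the array itself. The sketch below uses the standard array get and array set commands; the procedure name createArrayContents is hypothetical. The key-value list produced by array get can also be read with dict get, which requires Tcl 8.5 or later.

```tcl
# Hypothetical variant of createArray that returns the array contents
proc createArrayContents {} {
    set value(firstValue) 1
    set value(otherValue) 2
    # array get flattens the array into a key-value list
    return [array get value]
}

# Rebuild an array in the caller from the returned list
array set copy [createArrayContents]
puts $copy(firstValue)

# The same list can be treated as a dictionary
puts [dict get [createArrayContents] otherValue]
```

This should print 1 and then 2.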

Each data type can be represented as a string and each type can be compared with any other type, although sometimes, internally, they are converted to strings when comparing values of different types.

Lists and dictionaries work in such a way that each element they keep is itself an object of any data type. For example, a list contains zero or more elements of any data type, and each value in a dictionary can likewise be of any data type. The same is true for arrays, where each element of an array can be of any data type.

Binary, string, integer, and floating-point values are stored and manipulated through the commands that operate on strings and/or binary data.

Internally, Tcl stores data in the most efficient way and converts it as needed. When programming in Tcl, we don't need to know how data is currently stored. All operations will convert data as needed—converting a string to integer, a list, dictionary, or whatever is needed.

If a conversion to the required data type is not possible, an error is thrown. For example, trying to use text such as abc where an integer is required throws an error, because the text is not an integer value. Otherwise, all conversions are done without us knowing about it.
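The behaviour described above can be observed with a short sketch that uses catch to trap the conversion error. The variable names are arbitrary.

```tcl
# incr needs an integer, so text that is not a number causes an error
set text "abc"
set failed [catch {incr text} message]
if {$failed} {
    puts "Conversion failed: $message"
}

# A string that looks like an integer is converted silently
set number "42"
puts [incr number]
```

The first incr fails and its error message is printed; the second call should print 43.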

For example, we can set a variable by running:

set variable 12345678

When run, it will set the value of the variable to 12345678, which most probably will be stored as a string. Following that, we can run:

incr variable

At this point, Tcl will convert this to an integer and increment the value, using integers for internal calculations. Next, when we do the following:

puts $variable

Tcl will print the value of 12345679, but will keep an integer representation of the value internally.

Similarly, strings are converted to lists whenever list operations are performed, such as invoking commands llength, lindex, or any other command that operates on lists.
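A brief illustration, assuming nothing beyond the core list commands: a plain string is treated as a list as soon as a list command is applied to it, and a string that is not a well-formed list produces an error at that point.

```tcl
# A plain string that happens to be a well-formed list
set text "alpha beta {gamma delta}"
puts [llength $text]
puts [lindex $text 2]

# A string with an unbalanced brace cannot be converted to a list
set broken "unbalanced \{brace"
set failed [catch {llength $broken} message]
puts $failed
```

This should print 3, then gamma delta, and finally 1 to indicate that the last conversion failed.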

Calculations in Tcl are done by using a separate command called expr. Due to how Tcl syntax is defined, it would be difficult to write calculations in a readable way using ordinary commands.

Expressions passed to expr are written in ways similar to any other programming language such as C or Python. They should always be passed as a single argument, enclosed in braces. Variables and commands inside the expression will be substituted. For example:

% set somevar 12 ; puts [expr {4 + $somevar * 13}]
160

Evaluation of expressions takes the priorities of the various operators into account, for example, multiplication is done before addition, as shown in the previous example.

More details about the expr command, acceptable expressions, and operators can be found in its manual page at: http://www.tcl.tk/man/tcl8.5/TclCmd/expr.htm

One of the most interesting aspects of Tcl is how commands are built and evaluated. The main principle of the language is that at evaluation time, all commands are built as lists, with the exception that newline characters and semi-colons are treated as command separators. This means that we can build commands as lists and they can be evaluated using the eval command, which then returns result from the command. For example:

set mycommand [list clock]
lappend mycommand format
lappend mycommand [clock seconds]
lappend mycommand -format "%Y-%m-%d %H:%M:%S"
puts [eval $mycommand]

The preceding code will cause the current date and time to be printed to the standard output. One thing worth noting is that in this example the result from clock seconds is appended to the list during building of the command, so running puts [eval $mycommand] at different times will always print out the time and date from when the mycommand variable was built.

In the preceding example, the command clock seconds is evaluated only once, but clock format is run each time eval $mycommand is invoked. For example, when running:

set value 0
set mycommand [list incr]
lappend mycommand value
puts [eval $mycommand]
puts [eval $mycommand]

The script will invoke the incr value command each time eval $mycommand is run. The output from the script will be as follows:

1
2

The eval command can be used to evaluate lists or scripts. When given a list, it evaluates the list as a single command, where the first element is the command name and each remaining element of the list is passed as an argument. If the specified argument is not a list, it is evaluated as one or more Tcl commands, similar to how the source command loads a script.
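The difference matters as soon as an argument contains spaces. The following sketch builds the same command once as a list and once as a plain string; the file name is made up for illustration.

```tcl
set name "file with spaces.txt"

# Built as a list: the name stays a single argument
set asList [list string length $name]
puts [eval $asList]

# Built as a string: the name is split into three words, so
# string length receives too many arguments and fails
set asString "string length $name"
set failed [catch {eval $asString} message]
puts $failed
```

The first eval should print 20, the length of the name, while the second one fails and 1 is printed as the result of catch.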

One of the major benefits of using lists to build commands is that any Tcl code can easily build other Tcl code. A simple example is building one invocation of a command whose arguments are the results of other operations, for example, invoking file delete with the result of the glob command, which returns the matching files as a list.

The following command will not work as intended:

file delete [glob /path/*] [glob /other/path/*]

Tcl will pass all files in /path as a single argument and all files in /other/path as the second argument. If the files /path/file1 and /path/file2 were present, file delete would be asked to delete a single file named "/path/file1 /path/file2". No such file exists, so the files we meant to delete would be left in place.

In this case, instead of iterating over a list and deleting a single item at a time, we can do:

set command {file delete}
set command [concat $command [glob /path/*]]
set command [concat $command [glob /other/path/*]]
eval $command

As we can see, this method can also be used to add multiple results in a single command invocation. This can be used, for example, to perform batch operations for something such as storage systems, where such an approach results in better performance.

Tcl 8.5 also introduces another way of passing the result of a command as multiple arguments to another command. If a command placed in brackets is preceded by {*}, its result is treated as a list and each element of that list is passed as a separate argument. In all other cases, the result of a command in brackets is passed as a single argument. For this particular example, we can simply run:

file delete {*}[glob /path/*] {*}[glob /other/path/*]

This will also cause the command to get all results from both invocations of the glob command as multiple arguments.
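The effect of {*} can be checked without touching the file system. In this sketch, which requires Tcl 8.5 or later, the list command simply collects whatever arguments it receives, so llength reports how many arguments were passed.

```tcl
set parts [list a b c]

# Without expansion, $parts is one argument: the list has 2 elements
puts [llength [list $parts d]]

# With {*}, each element of $parts becomes its own argument: 4 elements
puts [llength [list {*}$parts d]]
```

This should print 2 and then 4. As the example shows, {*} can precede a variable substitution as well as a command in brackets.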

Global, namespace, and local variables

As in many other languages, Tcl has several types of variables: global, local, and namespace variables. Global variables are the ones that are defined and accessible at the global stack frame, for example, when the Tcl interpreter loads the main script. They can also be made accessible from any place using the global command, passing as arguments one or more global variable names that should be made available. For example:

proc printValue {} {
    global somevalue
    puts "somevalue=$somevalue"
}
set somevalue 1
printValue

Local variables are variables that are used within a procedure and are only available for code in this procedure and for this particular invocation of a procedure. The following example will not work as expected, because the variable used is local:

proc addItem {item} {
    lappend items $item
    puts "All items: $items"
}
addItem "Item1"
addItem "Item2"

Each invocation of addItem will print only the item that was just added instead of all items, because the items variable is local and Tcl does not keep it across invocations.

Namespaces in Tcl make it possible to keep commands, variables, and any other metadata within a namespace context. All namespaces have separate commands and variables, which allow us to create reusable code that will not interfere with code in different namespaces.

This can be used to keep libraries or pieces of our code separated so that they do not interfere with each other. For example, if multiple pieces of our application define a command called addItem, there will be a collision and one of them will overwrite the other. Keeping each part of the application or each reusable component in a separate namespace resolves such conflicts, because each command then has its own qualified name, such as queue::addItem. Namespace names are separated by double colons; in this case, the namespace is called queue and addItem is a command within that namespace.

Namespaces are created by invoking the namespace eval command, which creates a namespace if it does not exist yet and can be used to evaluate code within that namespace. For example:

namespace eval queue {}

A namespace variable can be made accessible to procedures inside this namespace by using the variable command, specifying a variable name as the only argument. This is similar to the global command, which makes a global variable available in a procedure, but the variable is specific to the namespace that current command is created in.

Converting the previous example to namespaces we get:

proc queue::addItem {item} {
    variable items
    lappend items $item
    puts "All items: $items"
}
queue::addItem "Item1"
queue::addItem "Item2"

This example will now work correctly, because the variable items is now a namespace variable in the queue namespace. It is also possible to access the variable outside of the namespace itself by using a fully qualified name for the variable—for example, we can add the following at the end of previous example:

puts "All items after adding: $::queue::items"

This will correctly print both items added to the queue. We have used ::queue for a namespace name as adding :: indicates that we mean the namespace queue within the global namespace. As namespaces can be nested, it is safer to specify the namespace names as fully qualified names.
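To illustrate nesting, here is a small sketch with two hypothetical namespaces, outer and outer::inner, each holding a variable called name. It shows fully qualified access from the global level and relative access from within the outer namespace.

```tcl
namespace eval outer {
    variable name "outer value"
    namespace eval inner {
        variable name "inner value"
    }
}

# Fully qualified names work from anywhere
puts $::outer::name
puts $::outer::inner::name

# Inside outer, a relative name is resolved against the current namespace
namespace eval outer {
    puts [namespace current]
    puts $inner::name
}
```

This should print outer value, inner value, ::outer, and inner value. The relative form inner::name only works because the code runs inside outer, which is why fully qualified names are the safer choice.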

Global variables in this context are variables bound to the global namespace. Therefore, they can always be accessed in the same form as variables in any other namespace. In order to access the global variable somevalue, we can refer to it as ::somevalue. For example:

proc printValue {} {
    puts "somevalue=$::somevalue"
}
set somevalue 1
printValue

Similar to first example in this section, it will write somevalue=1 to standard output.

In addition to this, TclOO objects have their own per-instance variables that can be accessed from within TclOO objects. This subject is described in more detail in the next section.

Stack frames

The stack frame is the context in which the current Tcl code is evaluated, and it defines what variables are available, the namespace in which code is evaluated, and many other things. When Tcl loads the main script, that script is evaluated in the global stack frame, which is stack frame 0. Whenever a procedure is invoked, a new stack frame is created for it. For example:

proc proc1 {} {
    proc2 "world!"
}
proc proc2 {text} {
    puts "Hello $text"
}
proc1

The main code is evaluated in stack frame 0, proc1 is evaluated in stack frame 1, and proc2 is evaluated in stack frame 2. This mechanism handles values such as local variables, which are bound to specific stack frame.

While proc2 is running, the stack frames look as follows:

Stack frame   Command
0 (global)    proc1
1             proc2 "world!"
2             puts "Hello world!"

The preceding table lists all stack frames and the command each of them is running at the time the puts command is invoked. The variable text, which is an argument to proc2, is available in stack frame 2. The puts command is invoked using this variable, so the actual command is run using the contents of this variable.

Stack frames are created only for invocations of procedures, object methods, and other operations that require local variables. Operations such as iterations with the for, foreach, or while commands are performed within the same stack frame.
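The current stack frame number can be inspected with the info level command. The following sketch reuses the names proc1 and proc2 from the example above, with bodies changed for illustration, and also shows that a for loop does not add a frame.

```tcl
proc proc1 {} {
    return [proc2]
}
proc proc2 {} {
    # The loop body runs in the same stack frame as the procedure
    for {set i 0} {$i < 1} {incr i} {
        set level [info level]
    }
    return $level
}

puts [info level]
puts [proc1]
```

This should print 0 for the global stack frame and 2 for the frame of proc2.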

It is possible to evaluate commands or use variables from different stack frames using upvar and uplevel commands. The first command allows us to map a variable from a different stack frame as the local one in the current stack frame.

The commands upvar and uplevel are used for many different purposes, from referencing variables by their names to creating new control structures. In these cases, we need to use variables from the previous stack frame.

For example, if we were to write a simple command called forsequence that takes a variable name, and the minimum and maximum values, we would use it as follows:

forsequence i 1 20 {
    puts "I=$i"
}

We could implement the command so that the command which is provided as the last argument is evaluated in previous stack frame as follows:

proc forsequence {varname min max command} {
    # map the caller's loop variable to the local name var
    upvar $varname var
    for {set var $min} {$var <= $max} {incr var} {
        # run the body in the caller's stack frame
        uplevel 1 $command
    }
}

The uplevel command allows us to run any command in a different stack frame. For both commands, the first argument can specify the level, that is, how many levels up the operation should be done. If the level is not specified, it defaults to 1 level up. The argument can be:

  • A number, in which case it specifies the number of stack frames up in the stack, relative to the current one.
  • A number prefixed by the # character, in which case the number specifies the absolute stack frame number.

For example, while being in the proc2 stack frame (stack frame 2), #1 indicates the stack frame of proc1 (stack frame 1), and 2 indicates the global stack frame (stack frame 0, which is 2 levels up).
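Both forms of the level argument can be compared side by side. In this sketch, the procedures outer and inner are hypothetical, and every stack frame has its own variable called where, so the result reveals which frame each uplevel call reached.

```tcl
proc outer {} {
    set where "outer"
    return [inner]
}
proc inner {} {
    set where "inner"
    set a [uplevel 1 {set where}]   ;# one level up: outer
    set b [uplevel #1 {set where}]  ;# absolute frame 1: outer
    set c [uplevel 2 {set where}]   ;# two levels up: global
    set d [uplevel #0 {set where}]  ;# absolute frame 0: global
    return [list $a $b $c $d]
}

set where "global"
puts [outer]
```

This should print outer outer global global.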

The upvar command requires us to specify one or more pairs of variable names, where the first variable name is the variable name in other stack frame context and the second variable name is the variable name for the local stack frame context.

The following is an example of using upvar to work on variables from different stack frames:

proc proc1 {} {
    set othervalue 1
    proc2
}
proc proc2 {} {
    upvar 2 somevalue sv
    upvar #1 othervalue ov
    puts "sv=$sv ov=$ov"
}
set somevalue 12
proc1

This would cause sv=12 ov=1 to be written to standard output, as proc2 is able to access variables from other stack frames.

Another example can be writing a command that extends the Tcl's incr command. If the variable is already set, it is incremented by the specified number; otherwise, the variable is set to the default value. For example:

proc incrOrSet {variable increment defaultValue} {
    upvar 1 $variable var
    if {[info exists var]} {
        incr var $increment
    } else {
        set var $defaultValue
    }
}

The preceding example first maps the variable whose name is specified as the first argument (variable) to the local variable var. It then checks if that variable exists, referencing it as the local variable var. If it exists, it is incremented by a specified number; otherwise it is set to the default value.
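A short usage sketch follows. The procedure is repeated so that the snippet is self-contained, and the variable name counter is arbitrary.

```tcl
proc incrOrSet {variable increment defaultValue} {
    upvar 1 $variable var
    if {[info exists var]} {
        incr var $increment
    } else {
        set var $defaultValue
    }
}

# counter does not exist yet, so it is set to the default value
set first [incrOrSet counter 5 100]
puts $first

# counter exists now, so it is incremented by 5
set second [incrOrSet counter 5 100]
puts $second
```

This should print 100 and then 105.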

The uplevel command allows us to evaluate commands in different stack frames. This can be used for various reasons—from running commands in the caller's stack frame to creating new control structures entirely in Tcl.

All arguments are concatenated as a command to be evaluated, although, it is recommended to pass the command as one argument. For example:

proc proc1 {} {
    set value 1
    proc2
}
proc proc2 {} {
    uplevel 1 {puts "Value=$value"}
}
proc1

This will cause Value=1 to be printed out as the puts statement is evaluated within the stack frame of proc1, even though it is within proc2.

A more practical example can be creating a procedure that runs code only if certain flags are enabled:

proc runIf {constraint command} {
    global allconstraints
    if {$constraint in $allconstraints} {
        uplevel 1 $command
    }
}

The preceding example will run the command specified as the second argument if the constraint specified by the first argument is set. Constraints are checked against the global variable allconstraints; the in operator returns true if the value of $constraint is in the $allconstraints list.

For instance, the following example would check whether the platform the code is running on is win32 and load the registry package if this is true:

runIf platform-win32 {
    package require registry
}
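The text does not show how allconstraints is filled in, so the following is only a sketch under that assumption. It derives one constraint from the standard tcl_platform array, which yields names such as platform-windows or platform-unix rather than platform-win32, and adds a made-up constraint called feature-logging.

```tcl
proc runIf {constraint command} {
    global allconstraints
    if {$constraint in $allconstraints} {
        uplevel 1 $command
    }
}

# Assumed way of building the constraint list
set allconstraints [list platform-$tcl_platform(platform) feature-logging]

set executed {}
runIf feature-logging { lappend executed logging }
runIf feature-missing { lappend executed missing }
puts $executed
```

Only the first block runs, so this should print logging.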

While this example simplifies things a bit, the same approach is used in Tcl's test suite package called tcltest, where constraints specify whether a certain test is to be run and are used to skip tests in specific cases, for example depending on the platform that the test suite is running on.

While the same thing could be accomplished with a simple if command in this case, this example also applies to more complex scenarios. Creating custom commands that implement flow control and optional execution, similar to the if command, can be used to optimize Tcl performance. For example, we can define the following procedure:

proc debugCode {code} {
    uplevel 1 $code
}

We can then add debug statements as follows:

proc sampleCommand {} {
    set value [calculateValue]
    debugCode { puts "Value = $value" }
    return $value
}

For production code, we can disable running such code by defining the procedure as empty and accepting any arguments—for example:

proc debugCode {args} {}

In this case, when running our code in a test environment, we'll receive information about the value on standard output. When running this code in production nothing will be printed out. Tcl optimizes this so that production code will automatically skip debugCode statements while testing code will evaluate them. This is in turn much faster than using an if command directly. This mechanism is used by Tcl logging packages which are described in more detail in Chapter 4.
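One way to switch between the two definitions is to choose one when the application starts. This is only a sketch: the debugEnabled variable is hypothetical, and calculateValue is replaced by a constant to keep the snippet self-contained.

```tcl
# Hypothetical switch; a real application might read it from configuration
set debugEnabled 0

if {$debugEnabled} {
    # test environment: run the debug code in the caller's stack frame
    proc debugCode {code} {
        uplevel 1 $code
    }
} else {
    # production: accept any arguments and do nothing
    proc debugCode {args} {}
}

proc sampleCommand {} {
    set value 42
    debugCode { puts "Value = $value" }
    return $value
}

puts [sampleCommand]
```

With debugEnabled set to 0, only 42 is printed. Setting it to 1 before the procedures are defined would additionally print Value = 42.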

Both upvar and uplevel operate within the namespace of the stack frame that we are referring to. For example, if our current function is within the namespace myexample::ns2 and the previous stack frame was in the myexample namespace, referencing one level up would cause names to be looked up from the myexample namespace. For example:

namespace eval myexample {}
namespace eval myexample::ns2 {}
proc myexample::testproc {} {
    return [ns2::testproc]
}
proc myexample::ns2::testproc {} {
    variable localvar
    upvar 1 ns2::localvar referencedvar
    set referencedvar "This was set as referenced"
    return $localvar
}
puts [myexample::testproc]

This script will output the text "This was set as referenced", since upvar referenced the myexample::ns2::localvar variable and localvar was mapped to the exact same variable. The uplevel command works in a similar way: if our code were to invoke uplevel with the command set ns2::localvar "Something", the same variable would be modified.

Usually upvar and uplevel are used to work in the context of the previous stack frame and/or reference variables by the names by which they were accessible in the other stack frames, but it is possible to use upvar to map any variable to any other variable name. It is often used for creating state data as an array, passing the array name to different functions, and using upvar to reference it under a different name. For example:

namespace eval myqueue {}
set myqueue::queueid 0
# function to create a myqueue instance
proc myqueue::create {} {
    variable queueid
    # initialize unique identifier
    set id ::myqueue::[incr queueid]
    # map variable to access our data as local variable d
    upvar #0 $id d
    set d(items) {}
    set d(itemcount) 0
    return $id
}
proc myqueue::add {id item} {
    # map variable to access our data as local variable d
    upvar #0 $id d
    lappend d(items) $item
    incr d(itemcount)
}
proc myqueue::get {id} {
    # map variable to access our data as local variable d
    upvar #0 $id d
    # throw an error if the queue is currently empty
    if {$d(itemcount) == 0} {
        error "No items currently in queue"
    }
    # get and remove first item from queue
    set item [lindex $d(items) 0]
    set d(items) [lrange $d(items) 1 end]
    incr d(itemcount) -1
    return $item
}
# sample usage - add 2 items to queue and retrieve them
set id [myqueue::create]
myqueue::add $id item1
myqueue::add $id item2
puts [myqueue::get $id]
puts [myqueue::get $id]

The output from this example would be as follows:

item1
item2

The example above shows how upvar is commonly used to achieve a lightweight mechanism for keeping data across invocations and making it possible to reuse it. The example is trivial (and can be implemented simply by using a list, without identifiers and arrays), but it shows how this mechanism can be used for keeping data.
