Chapter 5. Tcl Lists

This chapter describes Tcl lists. Tcl commands described are: list, lindex, llength, lrange, lappend, linsert, lreplace, lsearch, lset, lsort, concat, join, and split.

Lists in Tcl have the same structure as Tcl commands. All the rules you learned about grouping arguments in Chapter 1 apply to creating valid Tcl lists. However, when you work with Tcl lists, it is best to think of lists in terms of operations instead of syntax. Tcl commands provide operations to put values into a list, get elements from lists, count the elements of lists, replace elements of lists, and so on. It is a good habit to use commands like list and lappend to construct lists, instead of creating them by hand. Lists are used with commands such as foreach that take lists as arguments. In addition, lists are important when you are building up a command to be evaluated later. Delayed command evaluation with eval is described in Chapter 10, and similar issues with Tk callback commands are described in Chapter 30.

However, Tcl lists are not often the right way to build complicated data structures in scripts. You may find Tcl arrays more useful, and they are the topic of Chapter 8. List operations are also not right for handling unstructured data such as user input. Use regular expressions instead, which are described in Chapter 11.

Tcl Lists

A Tcl list is a sequence of values. When you write out a list, it has the same syntax as a Tcl command. A list has its elements separated by white space. Braces or quotes can be used to group words with white space into a single list element. Because of the relationship between lists and commands, the list-related commands described in this chapter are used often when constructing Tcl commands.

Note

Since Tcl 8.0, lists are really 1-dimensional object arrays.

Early versions of Tcl represented all values as strings. Lists were just strings with special syntax to group their elements. The string representation was parsed on each list access, so you could have performance problems with large lists. The performance of lists was improved by the Tcl compiler added in Tcl 8.0. The Tcl runtime now stores lists using a C array of pointers to each element. (The Tcl_Obj type is described on page 694.) Tcl can access any element in the list with the same cost. Appending new elements to a list is made efficient by overallocating the array so there is room to grow. The internal format also records the number of list elements, so getting the length of a list is cheap. However, you can still get into performance trouble if you use a big Tcl list like a string, e.g., for output. Tcl will convert the list into a string representation if you print it to a file, or manipulate it with string commands. Table 5-1 describes Tcl commands for lists.

Table 5-1. List-related commands

list arg1 arg2 ...

Creates a list out of all its arguments.

lindex list ?i ...?

Returns the ith element from list. Specifying multiple index elements allows you to descend into nested lists easily.

llength list

Returns the number of elements in list.

lrange list i j

Returns the ith through jth elements from list.

lappend listVar arg ...

Appends elements to the value of listVar.

linsert list index arg arg ...

Inserts elements into list before the element at position index. Returns a new list.

lreplace list i j arg arg ...

Replaces elements i through j of list with the args. Returns a new list.

lsearch ?options? list value

Returns the index of the element in list that matches the value according to the options. Glob matching is the default. Returns -1 if not found.

lset listVar ?i ...? newValue

Sets the ith element in variable listVar to newValue. Specifying multiple indices sets an element of a nested list. (Tcl 8.4)

lsort ?switches? list

Sorts elements of the list according to the switches: -ascii, -dictionary, -integer, -real, -increasing, -decreasing, -index ix, -unique, -command command. Returns a new list.

concat list list ...

Joins multiple lists together into one list.

join list joinString

Merges the elements of a list together by separating them with joinString.

split string splitChars

Splits a string up into list elements, using the characters in splitChars as boundaries between list elements.

Constructing Lists

Constructing a list can be tricky if you try to write the proper list syntax by hand. The manual approach works for simple cases. In more complex cases, however, you should use Tcl commands that build lists. Using list commands eliminates the struggle to get the grouping and quoting right, and the list is maintained in an efficient internal format. If you create lists by hand with quoting, there is additional overhead to parse the string representation the first time you use the list.

The list Command

The list command constructs a list out of its arguments so that there is one list element for each argument. The simple beauty of list is that any special characters in the list elements do not matter. Spaces inside an element do not cause it to become more than one list element. The list command is efficient, too. It doesn't matter if list is making a list of three single-character values, or three 10 kilobyte values. The cost to make that three element list is the same in either case. The most compelling uses of list involve making lists out of variables that could have arbitrary values, as shown in Example 5-1.

Example 5-1. Constructing a list with the list command

set x {1 2}
=> 1 2
set y \$foo
=> $foo
set l1 [list $x "a b" $y]
=> {1 2} {a b} {$foo}
set l2 [list $l1 $x]
=> {{1 2} {a b} {$foo}} {1 2}

Note

The list command does automatic quoting.

The first list, l1, has three elements. The values of the elements do not affect the list structure. The second list, l2, has two elements, the value of l1 and the value of x. Internally Tcl shares values instead of making copies, so constructing lists out of other values is quite efficient.

When you first experiment with Tcl lists, the treatment of curly braces can be confusing. In the assignment to x, for example, the curly braces disappear. However, they seem to come back again when $x is put into a bigger list. Also, the double quotes around a b get changed into curly braces. What's going on? There are three steps in the process. In the first step, the Tcl parser groups arguments to the list command. In the grouping process, the braces and quotes are syntax that define groups. These syntax characters get stripped off. The braces and quotes are not part of the values being grouped. In the second step, the list command creates an internal list structure. This is an array of references to each value. In the third step the value is printed out. This step requires conversion of the list into a string representation. The string representation of the list uses curly braces to group values back into list elements.
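The three steps can be checked interactively. The following is a minimal sketch that reuses values from Example 5-1; it shows that the braces appear only in the string form of the list, not in the element values themselves.

```tcl
set x {1 2}
set l1 [list $x "a b"]

# The string form of the list uses braces to group elements...
puts $l1                 ;# prints {1 2} {a b}

# ...but the element values contain no braces or quotes.
puts [lindex $l1 0]      ;# prints 1 2
puts [lindex $l1 1]      ;# prints a b
puts [llength $l1]       ;# prints 2
```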

The lappend Command

The lappend command is used to append elements to the end of a list. The first argument to lappend is the name of a Tcl variable, and the rest of the arguments are added to the variable's value as new list elements. Like list, lappend operates efficiently on the internal representation of the list value. It is always more efficient to use lappend than to try to append elements by hand.

Example 5-2. Using lappend to add elements to a list

lappend new 1 2
=> 1 2
lappend new 3 "4 5"
=> 1 2 3 {4 5}
set new
=> 1 2 3 {4 5}

The lappend command, like the lset command described next, differs from the other list-related commands because its first argument is the name of a list-valued variable, while the other commands take list values as arguments. You can call lappend with the name of an undefined variable and the variable will be created.
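A common use of lappend is to accumulate results in a loop. The following is a small sketch; the variable name squares is only illustrative, and it is deliberately left undefined before the loop so that the first lappend call creates it.

```tcl
# Build a list of squares one element at a time.
# The variable "squares" is created by the first lappend call.
for {set i 1} {$i <= 4} {incr i} {
    lappend squares [expr {$i * $i}]
}
puts $squares              ;# prints 1 4 9 16

# A value containing white space is added as a single element.
lappend squares "twenty five"
puts [llength $squares]    ;# prints 5
puts $squares              ;# prints 1 4 9 16 {twenty five}
```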

The lset Command

The lset command was introduced in Tcl 8.4 to make it easier, and more efficient, to set one element of a list or nested list. Like lappend, the first argument to lset is the name of a list variable. The last argument is the value to set. The middle arguments, if any, specify which element to set. If no index is specified, the whole variable is set to the new value. If the index is a single integer, or end-integer, then that element of the list is set. If you have a nested list, then you can specify several indices, and each one navigates into the nested list structure. This is illustrated in Example 5-3. If you specify several indices they can be separate arguments, or grouped into a list. Range checking in lset is strict and an error will be thrown for indices given outside of the list or sublist range. The new value of the list in the variable is returned, although you rarely need this because lset modifies the list variable directly.

Example 5-3. Using lset to set an element of a list

lset new "a b c"
=> a b c
lset new 1 "d e"
=> a {d e} c
lset new 1 0 "g h"
=> a {{g h} e} c
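The different ways of giving indices to lset can be sketched as follows. The variable m and its contents are made up for illustration. The out-of-range case is wrapped in catch because the exact error message varies between Tcl versions.

```tcl
set m {{1 2 3} {4 5 6}}

lset m 1 2 six          ;# indices as separate arguments
lset m {0 0} one        ;# indices grouped into a list
lset m end end-1 five   ;# end and end-N work at each level
puts $m                 ;# prints {one 2 3} {4 five six}

# An index far outside the list raises an error
# and leaves the variable unchanged.
if {[catch {lset m 10 x} msg]} {
    puts "lset failed: $msg"
}
puts $m                 ;# prints {one 2 3} {4 five six}
```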

The concat Command

The concat command is useful for splicing lists together. It works by concatenating its arguments, separating them with spaces. This joins multiple lists into one list where the top-level list elements in each input list become top-level list elements in the resulting list:

Example 5-4. Using concat to splice lists together

set x {4 5 6}
set y {2 3}
set z 1
concat $z $y $x
=> 1 2 3 4 5 6

Double quotes behave much like the concat command. In simple cases, double quotes behave exactly like concat. However, the concat command trims extra white space from both ends of each of its arguments before joining them together with a single separating space character. Example 5-5 compares the use of list, concat, and double quotes:

Example 5-5. Double quotes compared to the concat and list commands

set x {1 2}
=> 1 2
set y "$x 3"
=> 1 2 3
set y [concat $x 3]
=> 1 2 3
set s { 2 }
=>  2
set y "1 $s 3"
=> 1  2  3
set y [concat 1 $s 3]
=> 1 2 3
set z [list $x $s 3]
=> {1 2} { 2 } 3

The distinction between list and concat becomes important when Tcl commands are built dynamically. The basic rule is that list and lappend preserve list structure, while concat (or double quotes) eliminates one level of list structure. The distinction can be subtle because there are examples where list and concat return the same results. Unfortunately, this can lead to data-dependent bugs. Throughout the examples of this book, you will see the list command used to safely construct lists. This issue is discussed more in Chapter 10.
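A data-dependent difference of this kind can be sketched with llength. The variable names below are illustrative. With a two-word value, list keeps it as one element while concat splits it; with a one-word value the two results are identical, which is how the bug can stay hidden during testing.

```tcl
set name "Brent Welch"
set cmd1 [list puts $name]     ;# elements: puts, {Brent Welch}
set cmd2 [concat puts $name]   ;# elements: puts, Brent, Welch
puts [llength $cmd1]           ;# prints 2
puts [llength $cmd2]           ;# prints 3

# With a value that has no white space, both forms look the same.
set name Brent
set same [string equal [list puts $name] [concat puts $name]]
puts $same                     ;# prints 1
```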

Getting List Elements: llength, lindex, and lrange

The llength command returns the number of elements in a list.

llength {a b {c d} "e f g" h}
=> 5
llength {}
=> 0

The lindex command returns a particular element of a list. It takes an index; list indices count from zero.

set x {1 2 3}
lindex $x 1
=> 2

You can use the keyword end to specify the last element of a list, or the syntax end-N to count back from the end of the list. The following commands are equivalent ways to get the element just before the last element in a list.

lindex $list [expr {[llength $list] - 2}]
lindex $list end-1

The lrange command returns a range of list elements. It takes a list and two indices as arguments. Again, end or end-N can be used as an index:

lrange {1 2 3 {4 5}} 2 end
=> 3 {4 5}
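As Table 5-1 notes, lindex accepts several indices to descend into nested lists. The following sketch uses a made-up nested list. It also shows that an out-of-range index to lindex yields an empty string and that lrange clips indices to the size of the list.

```tcl
set nested {a {b {c d}} e}

puts [lindex $nested 1]         ;# prints b {c d}
puts [lindex $nested 1 1]       ;# prints c d
puts [lindex $nested 1 1 0]     ;# prints c
puts [lindex $nested {1 1 0}]   ;# same, with the indices grouped

# Out-of-range indices do not raise errors here.
puts "<[lindex $nested 10]>"    ;# prints <>
puts [lrange $nested 1 100]     ;# prints {b {c d}} e
```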

Modifying Lists: linsert and lreplace

The linsert command inserts elements into a list value at a specified index. If the index is zero or less, then the elements are added to the front. If the index is equal to or greater than the length of the list, then the elements are appended to the end. Otherwise, the elements are inserted before the element that is currently at the specified index. The following command adds to the front of a list:

linsert {1 2} 0 new stuff
=> new stuff 1 2

lreplace replaces a range of list elements with new elements. If you don't specify any new elements, you effectively delete elements from a list.

Note

linsert and lreplace do not modify an existing list like the lappend and lset commands. Instead, they return a new list value. In Example 5-6, the lreplace command does not change the value of x:

Example 5-6. Modifying lists with lreplace

set x [list a {b c} e d]
=> a {b c} e d
lreplace $x 1 2 B C
=> a B C d
lreplace $x 0 0
=> {b c} e d
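To keep the result of linsert or lreplace, assign it back to the variable. A minimal sketch, continuing with the list from Example 5-6:

```tcl
set x [list a {b c} e d]

lreplace $x 1 2 B C            ;# result is discarded
puts $x                        ;# prints a {b c} e d

set x [lreplace $x 1 2 B C]    ;# store the new list value
puts $x                        ;# prints a B C d

set x [linsert $x end E]       ;# insert at the end, then store
puts $x                        ;# prints a B C d E
```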

Searching Lists: lsearch

lsearch returns the index of a value in the list, or -1 if it is not present. lsearch supports pattern matching in its search. Glob-style pattern matching is the default, and this can be disabled with the -exact option. The glob pattern matching lsearch uses is described in more detail on page 53. The -regexp option lets you specify the list value with a regular expression. Regular expressions are described in Chapter 11.

In the following example, the glob pattern l* matches the value list, and lsearch returns the index of that element in the input list:

lsearch {here is a list} l*
=> 3

Example 5-7 shows ldelete as a combination of lreplace and lsearch:

Example 5-7. Deleting a list element by value

proc ldelete { list value } {
   set ix [lsearch -exact $list $value]
   if {$ix >= 0} {
      return [lreplace $list $ix $ix]
   } else {
      return $list
   }
}
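The following sketch exercises ldelete. The procedure is repeated so the block runs on its own, and the sample lists are made up. Only the first matching element is removed, the input variable is not modified, and -exact ensures that a value such as a* is treated literally instead of as a glob pattern.

```tcl
proc ldelete { list value } {
   set ix [lsearch -exact $list $value]
   if {$ix >= 0} {
      return [lreplace $list $ix $ix]
   } else {
      return $list
   }
}

set colors {red green blue green}
puts [ldelete $colors green]   ;# prints red blue green
puts [ldelete $colors pink]    ;# prints red green blue green
puts $colors                   ;# prints red green blue green

# The value a* is matched literally because of -exact.
set patterns {a* ab c}
puts [ldelete $patterns a*]    ;# prints ab c
```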

Tcl 8.4 added several features to lsearch, including typed searching, optimized searches for sorted lists, and the ability to find all matching elements of a list. The lsearch typed searches use the internal object representation for efficiency and speed. For example, if you have a list of numbers, the -integer option tells lsearch to leave the values in their native integer format. Otherwise it would convert them to strings while performing the search. If your list has been sorted, the -sorted option tells lsearch to perform an efficient binary search. Sorting lists is described on page 70.

The -inline option returns the list value instead of the index. This is most useful when you are matching a pattern, and it works well with the -all option that returns all matching indices, or values:

set foo {the quick brown fox jumped over a lazy dog}
lsearch -inline -all $foo *o*
=> brown fox over dog

The lsearch options are described in Table 5-2:

Table 5-2. Options to the lsearch command

-all

Search for all items that match and return a list of matching indices.

-ascii

The list elements are to be compared as ASCII strings. Only meaningful when used with -exact or -sorted.

-decreasing

Assume list elements are in decreasing order. Only meaningful when used with -sorted.

-dictionary

The list elements are to be compared using dictionary-style comparison. Only meaningful when used with -exact or -sorted.

-exact

Do exact string matching. Mutually exclusive with -glob and -regexp.

-glob

Do glob-style pattern matching (default). Mutually exclusive with -exact and -regexp.

-increasing

Assume list elements are in increasing order. Only meaningful when used with -sorted.

-inline

Return the actual matching element(s) instead of the index to the element. An empty string is returned if no elements match.

-integer

The list elements are to be compared as integers. Only meaningful when used with -exact or -sorted.

-not

Negate the sense of the match.

-real

Examine all elements as real (floating-point) values. Only meaningful when used with -exact or -sorted.

-regexp

Do regular expression pattern matching. Mutually exclusive with -exact and -glob. Regular expressions are described in Chapter 11.

-sorted

Specifies that the list is presorted, so Tcl can do a faster binary search to find the pattern.

-start ix

Specify the start index in the list to begin searching.
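A few of the options in Table 5-2 can be combined as in the following sketch. The word list is the one used earlier in this section; the list of numbers is made up and is already in increasing order, which -sorted requires.

```tcl
set foo {the quick brown fox jumped over a lazy dog}

# Indices of all elements that contain the letter o.
set hits [lsearch -all $foo *o*]
puts $hits                                ;# prints 2 3 5 8

# Values of all elements that do NOT match the pattern.
set rest [lsearch -all -inline -not $foo *o*]
puts $rest                                ;# prints the quick jumped a lazy

# First match at index 4 or later.
set later [lsearch -start 4 $foo *o*]
puts $later                               ;# prints 5

# Binary search in a presorted list of integers.
set nums {1 3 5 7 9 11}
set pos [lsearch -sorted -integer $nums 7]
puts $pos                                 ;# prints 3
```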

Sorting Lists: lsort

You can sort a list in a variety of ways with lsort. The list is not sorted in place. Instead, a new list value is returned. The basic types of sorts are specified with the -ascii, -dictionary, -integer, or -real options. The -increasing or -decreasing option indicates the sorting order. The default option set is -ascii -increasing. An ASCII sort uses character codes, and a dictionary sort folds together case and treats digits like numbers. For example:

lsort -ascii {a Z n2 n100}
=> Z a n100 n2
lsort -dictionary {a Z n2 n100}
=> a n2 n100 Z

You can provide your own sorting function for special-purpose sorting. For example, suppose you have a list of names, where each element is itself a list containing the person's first name, middle name (if any), and last name. The default sorts by everyone's first name. If you want to sort by their last name, you need to supply a sorting command.

Example 5-8. Sorting a list using a comparison function

proc NameCompare {a b} {
   set alast [lindex $a end]
   set blast [lindex $b end]
   set res [string compare $alast $blast]
   if {$res != 0} {
      return $res
   } else {
      return [string compare $a $b]
   }
}
set list {{Brent B. Welch} {John Ousterhout} {Miles Davis}}
=> {Brent B. Welch} {John Ousterhout} {Miles Davis}
lsort -command NameCompare $list
=> {Miles Davis} {John Ousterhout} {Brent B. Welch}

The NameCompare procedure extracts the last element from each of its arguments and compares those. If they are equal, then it just compares the whole of each argument.

Tcl 8.0 added a -index option to lsort that sorts a list of sublists by comparing the element at the given index of each sublist. Instead of using NameCompare, you could do this:

lsort -index end $list

Tcl 8.3 added a -unique option that removes duplicates during sort:

lsort -unique {a b a z c b}
=> a b c z
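The effect of the sort type is easiest to see on numbers. In the following sketch the sample values are made up; the list of names is the one from Example 5-8. The last lines confirm that lsort leaves its input untouched.

```tcl
set nums {10 9 100 1}

# The default ASCII sort compares the values as strings.
set asStrings [lsort $nums]
puts $asStrings                    ;# prints 1 10 100 9

set asInts [lsort -integer $nums]
puts $asInts                       ;# prints 1 9 10 100

set reversed [lsort -integer -decreasing $nums]
puts $reversed                     ;# prints 100 10 9 1

# Sort sublists by their last element.
set people {{Brent B. Welch} {John Ousterhout} {Miles Davis}}
set byLast [lsort -index end $people]
puts $byLast   ;# prints {Miles Davis} {John Ousterhout} {Brent B. Welch}

# The original list is not modified.
puts $nums                         ;# prints 10 9 100 1
```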

The split Command

The split command takes a string and turns it into a list by breaking it at specified characters and ensuring that the result has the proper list syntax. The split command provides a robust way to turn input lines into proper Tcl lists:

set line {welch:*:28405:100:Brent Welch:/usr/welch:/bin/csh}
split $line :
=> welch * 28405 100 {Brent Welch} /usr/welch /bin/csh
lindex [split $line :] 4
=> Brent Welch

Note

Do not use list operations on arbitrary data.

Even if your data has space-separated words, you should be careful when using list operators on arbitrary input data. Otherwise, stray double quotes or curly braces in the input can result in invalid list structure and errors in your script. Your code will work with simple test cases, but when invalid list syntax appears in the input, your script will raise an error. The next example shows what happens when input is not a valid list. The syntax error, an unmatched quote, occurs in the middle of the list. However, you cannot access any of the list because the lindex command tries to convert the value to a list before returning any part of it.

Example 5-9. Use split to turn input data into Tcl lists

set line {this is "not a tcl list}
lindex $line 1
=> unmatched open quote in list
lindex [split $line] 2
=> "not

The default separator character for split is white space, which includes spaces, tabs, and newlines. If there are multiple separator characters in a row, these result in empty list elements; the separators are not collapsed. The following command splits on commas, periods, spaces, and tabs. The backslash-space sequence is used to include a space in the set of characters. You could also group the argument to split with double quotes:

set line "\tHello, world."
split $line \ ,.\t
=> {} Hello {} world {}
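Because separators are not collapsed, a run of spaces produces empty elements. One way to discard them, shown here only as a sketch with made-up variable names, is to filter the result of split in a loop:

```tcl
set line "a  b   c"        ;# two spaces, then three spaces

set raw [split $line]
puts $raw                  ;# prints a {} b {} {} c

# Keep only the non-empty elements.
set words {}
foreach w $raw {
    if {$w ne ""} {
        lappend words $w
    }
}
puts $words                ;# prints a b c
```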

A trick that splits each character into a list element is to specify an empty string as the split character. This lets you get at individual characters with list operations:

split abc {}
=> a b c

However, if you write scripts that process data one character at a time, they may run slowly. Read Chapter 11 about regular expressions for hints on really efficient string processing and using regexp for a multi-character split routine.

The join Command

The join command is the inverse of split. It takes a list value and reformats it with specified characters separating the list elements. In doing so, it removes any curly braces from the string representation of the list that are used to group the top-level elements. For example:

join {1 {2 3} {4 5 6}} :
=> 1:2 3:4 5 6

If the treatment of braces is puzzling, remember that the first value is parsed into a list. The braces around element values disappear in the process. Example 5-10 shows a way to implement join in a Tcl procedure, which may help you understand the process:

Example 5-10. Implementing join in Tcl

proc join {list sep} {
   set s {}  ;# s is the current separator
   set result {}
   foreach x $list {
      append result $s $x
      set s $sep
   }
   return $result
}
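The inverse relationship between join and split can be sketched with the password-file record from the split example. The variable names are illustrative. The round trip works here because no field contains the separator character.

```tcl
set fields {welch * 28405 100 {Brent Welch} /usr/welch /bin/csh}

# join drops the grouping braces and inserts the separator.
set line [join $fields :]
puts $line   ;# prints welch:*:28405:100:Brent Welch:/usr/welch:/bin/csh

# split turns the line back into a list with the same elements.
set back [split $line :]
puts [llength $back]                 ;# prints 7
puts [string equal $back $fields]    ;# prints 1
```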

Related Chapters

  • Arrays are the other main data structure in Tcl. They are described in Chapter 8.

  • List operations are used when generating Tcl code dynamically. Chapter 10 describes these techniques when using the eval command.

  • The foreach command loops over the values in a list. It is described on page 79 in Chapter 6.
