TIP #148: Correct [list]-Quoting of the '#' Character


TIP:148
Title:Correct [list]-Quoting of the '#' Character
Version:$Revision: 1.5 $
Author:Don Porter <dgp at users dot sf dot net>
State:Final
Type:Project
Tcl-Version:8.5
Vote:Done
Created:Friday, 08 August 2003

Abstract

This TIP proposes the correction of a long-standing bug in the [list]-quoting of the # character.

Background

Tcl has a bug in its [list]-quoting function. The bug is recorded as Tcl Bug 489537 (http://sf.net/tracker/?func=detail&aid=489537&group_id=10894&atid=110894).

Briefly, one of the documented functions of [list] is quoting of arguments so the result can be passed to [eval] with each list element becoming exactly one word of the command to be evaluated. Consider the example script:

 # FILE: demo.tcl
 set cmdName [lindex $argv 0]
 proc $cmdName {} {puts Success!}
 set command [list $cmdName]
 puts "Evaluating \[$command]..."
 eval $command

This script expects one argument on the command line, and should write Success! to stdout. This script works correctly for most input:

 $ tclsh demo.tcl foo
 Evaluating [foo]...
 Success!
 $ tclsh demo.tcl "with space"
 Evaluating [{with space}]...
 Success!

But it fails for any argument beginning with the # character:

 $ tclsh demo.tcl '#bar'
 Evaluating [#bar]...

This is because, contrary to the documentation, [list] does not quote the leading # in a way that makes the list safe for passing to [eval]. The Tcl parser sees the unquoted # as the start of a comment, so the whole script is skipped and the command is never invoked.
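
The parser behavior behind this failure can be reproduced without [list] at all. The following is a minimal sketch, not part of the proposal: the command name #bar comes from the example above, and the variable names are assumptions made for the illustration.

```tcl
# Define a command whose name begins with '#', as demo.tcl does.
proc #bar {} {return Success!}

# Unquoted: the script text starts with '#', so the parser treats the
# whole script as a comment and evaluates nothing.
set unquoted [eval "#bar"]

# Quoted with braces: the first word is now the command name #bar.
set quoted [eval "{#bar}"]

puts "unquoted: '$unquoted'"
puts "quoted:   '$quoted'"
```

The first [eval] returns the empty string and the second returns Success!, which is the difference a correctly quoted list element has to make.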

Starting in Tcl 8.3, optimizations for evaluating pure lists (list values that have no string representation) were added. These made inconsistency due to Tcl_ObjType shimmering a new symptom of this bug. If we adapt the example script to remove the [puts], so that a pure list is maintained:

 # FILE: demo2.tcl
 set cmdName [lindex $argv 0]
 proc $cmdName {} {puts Success!}
 set command [list $cmdName]
 eval $command

We get a script that actually works with the troublesome input:

 $ tclsh demo2.tcl '#bar'
 Success!
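
The two evaluation paths can be compared side by side in one script. This is a sketch under stated assumptions: [format %s] is used here only as one way to obtain a separate string copy of the list, and the variable names are hypothetical. On an interpreter that includes the fix proposed in this TIP, both paths should succeed; on an affected interpreter the second path is expected to return the empty string.

```tcl
proc #bar {} {return Success!}

# A pure list: the result of [list] before any string
# representation has been generated for it.
set command [list #bar]
set pureResult [eval $command]

# Build a plain string copy of the list; evaluating it forces the
# parser to read the text, including any leading '#'.
set text [format %s $command]
set textResult [eval $text]

puts "string form:  $text"
puts "pure list:    '$pureResult'"
puts "parsed text:  '$textResult'"
```

The pure-list evaluation is done first on purpose, because generating the string representation of $command means the value is no longer a pure list.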

This bug in [list]-quoting is present in every released version of Tcl from Tcl 7.4 onward. It may go back further.

There is no question that Tcl's behavior disagrees with its documentation on this point. I believe the documentation to be correct. From that viewpoint, this is a bug whose fix does not require a TIP. Because the bug has been around for so long, though, it seems prudent to make the TIP proposal, if only as fair warning to those who might have bug-dependent scripts to fix. The particular fix proposed also adds a single #define to Tcl's public header file.

A large number of tests have been added to the Tcl test suite in the HEAD, demonstrating this bug in several ways.

Proposal

Tcl_ConvertCountedElement() is modified to have the default behavior of quoting any leading # character in a list element. With this default quoting, any string representation of a list generated by Tcl will not begin with the # character, so cannot be mis-parsed as a comment.

Tcl_ConvertCountedElement() is also modified to recognize a new bit flag value in its flags argument, TCL_DONT_QUOTE_HASH, which is defined in Tcl's public header file so that it may be used by extensions. When the TCL_DONT_QUOTE_HASH bit is set in the flags argument, Tcl_ConvertCountedElement() will not quote the leading # character. Quoting of the leading # character is only necessary for the first element of a list. Those callers of Tcl_ConvertCountedElement() that can be sure they are not generating the first element of a list can pass in the TCL_DONT_QUOTE_HASH bit to produce the simplest quoting required. The behavior of the TCL_DONT_QUOTE_HASH bit flag is added to the documentation. The Tcl_ConvertElement() routine is similarly modified (trivially, since it is just a wrapper).

All callers of Tcl_ConvertCountedElement() in the Tcl source code are modified to use the TCL_DONT_QUOTE_HASH flag as appropriate, so that Tcl continues to generate the simplest possible string representations of lists that do not suffer from Bug 489537.
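
At the script level, the effect of this position-dependent quoting can be observed as follows. This is a sketch that assumes an interpreter including the fix from this TIP (Tcl 8.5 or later); the variable names are hypothetical.

```tcl
# A leading '#' on the first element is quoted...
set first [list #a b]
puts $first                 ;# expected: {#a} b

# ...but a later element needs no quoting, so none is added.
set later [list a #b]
puts $later                 ;# expected: a #b

# Quoting follows position, not content: once #b becomes the
# first element of a new list, it is quoted.
set tail [lrange $later 1 end]
puts $tail                  ;# expected: {#b}

# In every case the element values themselves are unchanged.
puts [lindex $first 0]      ;# expected: #a
```

Only the string representation differs between the cases; list operations such as [lindex] see the same element values throughout.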

Prototype

A patch implementing this proposal is attached to Tcl Bug 489537 at SourceForge.

Compatibility

After acceptance of this patch, the string representation of some lists will change, though as little as possible while still fixing the bug. Scripts that perform string comparisons on lists may see different results. In particular, a test whose body generates a list, but whose expected result is written as a literal string, may start to fail. The minimal quoting changes should keep this incompatibility to a minimum, but it may happen.

Notably, there are no such compatibility problems in either the Tcl or Tk test suites. Any such incompatibility in other test suites can be easily remedied by using [list] to generate the expected result.
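
The remedy can be sketched as follows. The proc makeCommand is a hypothetical stand-in for whatever code under test produces a list, and the variable names are likewise assumptions made for the illustration.

```tcl
# Hypothetical code under test: builds a command as a list.
proc makeCommand {name args} {
    return [linsert $args 0 $name]
}

set actual [makeCommand #bar baz]

# Fragile: the expected value is a hand-written string, so the
# comparison depends on exactly how Tcl quotes the list.
set fragileExpected "#bar baz"
set fragileOk [string equal $actual $fragileExpected]

# Robust: the expected value is built with [list], so both sides
# are quoted by the same rules, whatever those rules are.
set robustExpected [list #bar baz]
set robustOk [string equal $actual $robustExpected]

puts "fragile comparison: $fragileOk"
puts "robust comparison:  $robustOk"
```

The robust comparison reports 1 regardless of the quoting rules in effect, while the result of the fragile comparison depends on whether the interpreter includes this change.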

Scope

Some Tcl users have observed that [list] is used for two conceptually distinct purposes. The first is adding quoting to list elements as required, so that element boundaries can be rediscovered from the string representation. The second is adding quoting so that the string representation can be passed to [eval] with the original element boundaries becoming the argument boundaries in the evaluation. One can imagine a Tcl where these two functions were separated. However, this TIP does not propose such a separation; further arguments on that point are out of scope and should be considered in another TIP, if at all.
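
The two purposes can be seen in a short script. This is an illustrative sketch only: the proc countArgs and the variable names are hypothetical, and [format %s] is used simply to obtain plain string copies of the lists.

```tcl
# Hypothetical helper: reports how many arguments it received.
proc countArgs {args} {return [llength $args]}

set elements [list "with space" #hash {nested {a b}}]

# Purpose 1: element boundaries can be recovered from the string.
set text [format %s $elements]
set recovered [llength $text]

# Purpose 2: the same quoting preserves argument boundaries when
# the string is evaluated as a command.
set script [format %s [linsert $elements 0 countArgs]]
set argc [eval $script]

puts "elements recovered from string: $recovered"
puts "arguments seen by command:      $argc"
```

Both counts are 3: the three original elements survive as three list elements and as three command arguments.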

Acknowledgements

The author acknowledges the discovery of this bug by Joe English, analysis by Donal Fellows, and a first draft patch from Miguel Sofer.

Copyright

This document has been placed in the public domain.

