TIP #190: Implementation Choices for Tcl Modules


TIP: 190
Title: Implementation Choices for Tcl Modules
Version: $Revision: 1.3 $
Authors: Andreas Kupries <akupries at shaw dot ca>
         Jean-Claude Wippler <jcw at equi4 dot com>
         Jeff Hobbs <jeffh at activestate dot com>
State: Draft
Type: Informative
Vote: Pending
Created: Wednesday, 24 March 2004

Abstract

This document is an informational adjunct to TIP #189 "Tcl Modules". It describes a number of choices for the implementation of Tcl Modules, whether pure-Tcl, binary, or mixed. It lists these choices and then discusses their relative merits and problems, especially their interaction with wrapping, i.e. their use in a wrapped application. The main point of the document is to dispel the illusion that the restriction to the "source" command for loading Tcl Modules is an actual limitation. A secondary point is to make recommendations regarding preferred implementations, based on the merits and weaknesses of the various possibilities.

Implementation Choices

A small recap first: Tcl Modules are Packages in a single file, and only source is used to import them into the running interpreter. These restrictions are the backdrop to all implementations discussed here.
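As a minimal sketch of this recap, the following script writes a one-file package and registers it so that "package require" imports it with nothing but source. The package name hello_demo, the proc name, and the file location are hypothetical and chosen only for illustration; the actual module search mechanism is the subject of TIP #189 and is not reproduced here.

```tcl
# Hypothetical one-file package, written to the current directory.
set modfile [file join [pwd] hello_demo-1.0.tm]
set chan [open $modfile w]
puts $chan {
    package provide hello_demo 1.0
    proc ::hello_demo_greet {} { return "hello from a module" }
}
close $chan

# The only action needed to import the package is to source the file.
package ifneeded hello_demo 1.0 [list source $modfile]
set loaded_version [package require hello_demo]
```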

Packages Written in Tcl

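These are easy: in the more complex case of a package spread over several files, the files are simply concatenated into a single one.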

The usage of pure-Tcl Modules within wrapped applications poses no problems at all.

Also note that all of the choices available to binary packages, as explained in the next section, are available to pure-Tcl packages as well.

Binary Packages

A binary package consists of a shared library, possibly with adjunct Tcl and data files. These have to be bundled into a single file to be a Tcl Module.

The general approach is to combine an init script written in Tcl with binary data attached to it, with the two sections separated by a ^Z character. This has been possible since Tcl 8.4, where source was changed to read only up to the first ^Z and ignore the remainder of the file, whereas other file and channel operations still see it.
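The behaviour described above can be tried with a small sketch. It builds a file consisting of a script, a ^Z (\x1a), and some bytes that are not valid Tcl, sources it, and then reads the attached bytes back through a channel in binary mode. The file name, the variable ::demo_loaded, and the helper read_payload are hypothetical names used only in this example.

```tcl
# Build a demo file: script, ^Z separator, binary payload.
set path [file join [pwd] ctrlz_demo.tm]
set chan [open $path w]
fconfigure $chan -translation binary
puts -nonewline $chan "set ::demo_loaded 1\n\x1a\x00\x01\x02PAYLOAD"
close $chan

# source stops at the first ^Z, so the payload is never evaluated.
source $path

# Hypothetical helper: return everything after the first ^Z.
# Binary translation disables end-of-file character handling,
# so the channel sees the whole file.
proc read_payload {path} {
    set chan [open $path r]
    fconfigure $chan -translation binary
    set data [read $chan]
    close $chan
    set idx [string first \x1a $data]
    if {$idx < 0} { error "no ^Z separator found in $path" }
    return [string range $data [expr {$idx + 1}] end]
}
```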

Embedding a Virtual Filesystem

The most obvious way of doing this is the any-kit approach: a small initialization script at the front loads all required supporting packages and then uses them to mount an attached virtual filesystem containing all the other files. After the mount, any package-specific initialization can be performed, either in the initialization script itself or in a separate script file stored in the filesystem. The latter is the recommended form, as it keeps the main initialization script small and package-neutral, i.e. it will be specific to the filesystem only, not to the package. Keeping these two tasks separate is good design in general, and becomes more important later on as well.

   +-------------||------------------------------------+
   | Init header ^Z VFS +------------+ +-----...-----+ |
   |             ^Z     | Shared lib | | Other files | |
   |             ^Z     +------------+ +-----...-----+ |
   +-------------||------------------------------------+
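A minimal sketch of such an init header follows, under stated assumptions. The proc name anykit_init, the file name pkginit.tcl for the package-specific script, and the convention of passing the mount operation as a command prefix are all hypothetical. With tclvfs and Metakit, the command prefix would plausibly be vfs::mk4::Mount, which takes the archive and the mount point as its first two arguments; that choice is an assumption and is not exercised here.

```tcl
# Hypothetical init header logic for the any-kit layout.
#   modfile  - path of the module file (normally [info script])
#   mountCmd - command prefix called as: mountCmd archive mountpoint
#   initName - package-specific script stored inside the filesystem
proc anykit_init {modfile mountCmd {initName pkginit.tcl}} {
    set modfile [file normalize $modfile]
    # Filesystem-specific part: mount the attached filesystem
    # over the path of the module file itself.
    uplevel #0 [linsert $mountCmd end $modfile $modfile]
    # Package-specific part: lives inside the mounted filesystem.
    set init [file join $modfile $initName]
    if {[file exists $init]} {
        uplevel #0 [list source $init]
        return 1
    }
    return 0
}

# Inside a module this might be invoked as (assumption, needs tclvfs):
#   package require vfs::mk4
#   anykit_init [info script] ::vfs::mk4::Mount
```

Only the mount command depends on the filesystem type, which mirrors the separation of filesystem-specific and package-specific initialization recommended above.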

Starkits are a concrete example of this, except that they use the technique to wrap an application, not a package, into a single file.

This approach runs into problems when it interacts with wrapping. It is not possible to simply copy the module file into the wrapped application and then use it. The problem is a limitation in most implementations of alternate filesystems: they are not able to mount a file that itself lives in a virtual filesystem as a directory, i.e. they do not support nested mounting. This, however, is required when a Tcl Module using the any-kit approach is placed into a wrapped application. It was thought for a while that this could be a limitation in the VFS core of Tcl itself, but further investigation proved this to be wrong. The proof came in the form of TROFS, the Tcl Read Only Filesystem, by Don Porter. This filesystem supports nesting and thus shows that the Tcl core is capable of it. It is the implementation of a filesystem which determines whether nesting is possible, and most implementations do not support it.

See also SF bug report [941872] path<->FS function limitations [3] for more details on this and other problems.

There are two ways to work around this limitation while it exists. They are explained below. Note, however, that even if the limitation is removed we may still run into performance problems, because a file is then accessed through several layers of filesystems, each with its own overhead. The workarounds we are about to discuss help with this as well by removing layers of indirection, and are therefore of general importance.

More a matter of taste might be that Tcl Modules in this form require additional packages which implement the filesystem they use. This can be remedied in the future by adding reflection capabilities to the Tcl core which would allow the implementation of channel drivers, channel transformations, and filesystems in pure Tcl, and then implementing simple filesystems based on that. This would also allow the Tcl core itself to make use of filesystems attached to its shared libraries and executables.

Appending a Shared Library

Should the binary package consist of only one shared library, we can forgo the use of a full-blown virtual filesystem and simply attach the shared library to the init script as is. Instead of mounting anything, the init script just has to copy the library to a temporary place and then "load" it.

   +------------------------------||----------------+
   | Init script (p name, p size) ^Z Shared library |
   +------------------------------||----------------+

Tcl Modules implemented in this way will have no problems when used in a wrapped application as they will always copy their relevant file to the native filesystem before using it.
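The copy step can be sketched as follows, assuming that everything after the first ^Z in the module file is the shared library. The proc name extract_library, the destination directory, and the library and package names in the trailing comment are hypothetical. The actual "load" call is shown only as a comment because it needs a real shared library.

```tcl
# Hypothetical helper: copy the bytes after the first ^Z of a module
# file into destdir/libname on the native filesystem; return the path.
proc extract_library {modfile destdir libname} {
    set in [open $modfile r]
    fconfigure $in -translation binary
    set data [read $in]
    close $in
    set idx [string first \x1a $data]
    if {$idx < 0} { error "no ^Z separator found in $modfile" }
    set dest [file join $destdir $libname]
    set out [open $dest w]
    fconfigure $out -translation binary
    puts -nonewline $out [string range $data [expr {$idx + 1}] end]
    close $out
    return $dest
}

# In the init script of a module this might be used as (names assumed):
#   set lib [extract_library [info script] $tmpdir \
#                mypkg[info sharedlibextension]]
#   load $lib Mypkg
```

Because the library is always copied to the native filesystem before "load" sees it, the sketch does not care whether the module file itself lives inside a wrapped application.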

The disadvantage is that this is not a very general scheme. There are not that many packages which consist of only one shared library and nothing else.

Note: Should we ever get loading of a shared library directly from memory or from a location in another file, then copying the library to the filesystem won't be necessary anymore either.

Appending a Library and a VFS

An extension of the last approach is to attach the virtual filesystem not to the init script, but to the shared library.

   +-------------||----------------++---------------------+
   | Init script ^Z Shared library // VFS +-----...-----+ |
   |             ^Z                //     | Other files | |
   |             ^Z                //     +-----...-----+ |
   +-------------||----------------++---------------------+

This approach has the same advantage as the last with regard to its interaction with wrapping, i.e. no problems, and it also handles any further files that come with the shared library. The initialization of the VFS happens in the init script, but after the shared library has been loaded.
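In this layout the init script can no longer copy everything after the ^Z, since the VFS follows the library. The sketch below assumes that the init script knows the size of the library in bytes and that the ^Z lies within the first 64 KiB of the file. Both assumptions, as well as the proc name copy_library_section, are hypothetical and not part of the scheme as described.

```tcl
# Hypothetical helper: copy exactly libsize bytes following the first
# ^Z of modfile to dest, leaving the trailing VFS section untouched.
proc copy_library_section {modfile dest libsize} {
    set in [open $modfile r]
    fconfigure $in -translation binary
    # Assumption: the init script is shorter than 64 KiB.
    set head [read $in 65536]
    set idx [string first \x1a $head]
    if {$idx < 0} {
        close $in
        error "no ^Z separator found in $modfile"
    }
    seek $in [expr {$idx + 1}] start
    set out [open $dest w]
    fconfigure $out -translation binary
    set copied [fcopy $in $out -size $libsize]
    close $in
    close $out
    if {$copied != $libsize} { error "library section is truncated" }
    return $dest
}
```

After "load"-ing the copied library, the init script would mount the VFS and hand its location to the library; how that hand-over would look is left open here.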

I have to admit that I am not sure if this will truly work. In essence the library will have to be told about the directory for its files after its C level initialization has been run.

If the VFS is required during the C level initialization then the VFS has to be initialized and mounted from within the shared library, i.e. at the C level. This is not very convenient as we need an embedded Tcl script for this, and that makes the code of the library more complicated than required.

Recommendations

We currently recommend usage of the any-kit approach for binary packages, despite its problem with nested mounting. This approach has an existing implementation in the Metakit-based starkits and is thus well tested in general. The other two approaches are currently purely theoretical, with neither an implementation nor any testing.

Regarding pure-Tcl packages, no recommendation is necessary, as there is in essence only one possibility for the more complex case: the simple concatenation of multiple files into a single one.
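The concatenation mentioned above can be sketched as below. The proc name build_module and the marker comments it writes are hypothetical. The sketch assumes that the parts do not depend on their own file location, for example through "info script", and do not end early with a top-level "return"; parts that do would need adjusting first.

```tcl
# Hypothetical helper: concatenate several Tcl files, in the order
# given, into one module file and return the path of that file.
proc build_module {outfile args} {
    set out [open $outfile w]
    foreach part $args {
        set in [open $part r]
        puts $out "# ---- from [file tail $part] ----"
        puts -nonewline $out [read $in]
        # Make sure the next part starts on a fresh line.
        puts $out ""
        close $in
    }
    close $out
    return $outfile
}
```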

Questions

Comments

[ Add comments on the document here ]

Copyright

This document has been placed in the public domain.

