TIP #240: An Ensemble Command to Manage Processes

TIP:	240
Title:	An Ensemble Command to Manage Processes
Version:	$Revision: 1.14 $
Author:	Steve Bold <stevebold at hotmail dot com>
State:	Draft
Type:	Project
Tcl-Version:	8.7
Vote:	Pending
Created:	Tuesday, 22 February 2005
Obsoletes:	TIP #88
Keywords:	Tcl

Abstract

This TIP proposes some new commands through which Tcl scripts can create and monitor child processes.

Rationale

This TIP is intended to overcome the following limitations of the existing exec and open commands:

1. While the stderr stream of a child process can be redirected to a file, it cannot be directed to a pipe and so cannot be captured progressively as the process runs. TIP #202 has partially addressed this issue, but only for the case where the child's stderr stream is directed to the same pipe as its stdout stream. Independent progressive capture of both stdout and stderr is still not possible.
2. In the (admittedly rare) case that a program has a significant delay between closing its standard streams and the process itself terminating, a Tcl script running that program as a background process cannot determine the exit status without blocking until the process terminates.
3. The existing exec and open commands impose a special interpretation on the characters <>|&. This causes two kinds of problems:
   - scripts that invoke a command on a remote computer using rsh or a similar command will sometimes need characters such as <>|& to be interpreted on the remote machine
   - scripts may pass a user-entered string as an argument to exec. Such scripts may break unexpectedly if that string contains one of the special characters. Such problems could be considered a security weakness in Tcl.
4. Multiple child processes can be launched together, with pipes used to link the streams of adjacent processes. However, little flexibility is provided in such cases; for example, only the exit status of the last process in the pipeline can be captured.
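
Item (3) can be reproduced with the existing exec command. The following is a minimal sketch, assuming a standard Tcl interpreter; the argument <hello> is an arbitrary illustration. Because exec treats an argument that begins with < as an input redirection, the string is not passed to the program, and the call is expected to fail with an error about reading a file (assuming no file named hello> exists in the current directory).

```tcl
# The argument "<hello>" is meant to be passed through literally, but exec
# interprets the leading "<" as "redirect stdin from the file named hello>".
set failed [catch {exec echo <hello>} msg]
puts "failed=$failed"
puts $msg
```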

A more general problem is that each process-related command is a separate top-level command. This is inconsistent with much else in Tcl, makes it harder to find the related commands in some forms of documentation, and increases the risk of name clashes as new process-related commands are introduced.

The BLT toolkit contains the command bgexec which addresses items (1) and (2) in the above list. However, the resulting implementation is complex and does not appear easy to transfer to the Tcl core. In addition, it is not clear to the author how bgexec could be extended to address items (3) and (4).

A variety of other approaches to addressing these problems are listed on the Wiki [1]. This suggests that it may be difficult to achieve a consensus on what the ideal command(s) for launching processes should look like. This TIP provides a basis through which many of these approaches could be implemented in pure Tcl. The commands specified in this TIP map easily onto the existing low level process related functions in the Tcl core, so the implementation cost is low.

Specification

There shall be a new ensemble command, process, with at least four subcommands.

The sub-command invoke takes four arguments and invokes a sub-process, returning the process id of the child process. The arguments are (in order):
- a list containing the name of the program to invoke and its arguments
- a channel to be connected to the stdin stream of the child process (or an empty string if the stream is to be disconnected in the child process).
- a channel to be connected to the stdout stream of the child process (or an empty string if the stream is to be disconnected in the child process).
- a channel to be connected to the stderr stream of the child process (or an empty string if the stream is to be disconnected in the child process).
The sub-command pipe takes no arguments and returns a two element list containing the input and output channels of the pipe in that order.
The sub-command status takes a single argument, which is a process id, and returns a two element list. The first element is either running or completed. The second element is the exit status of the process; while the process is still running, an arbitrary exit status of zero is reported.
The sub-command wait is similar to status but blocks until the child process has completed.
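
The four sub-commands might be combined as follows. This is a sketch only: the process ensemble is the one proposed in this TIP and is not available in released Tcl interpreters, so the procedure can be defined but will only run against the reference implementation. The name captureStderr is hypothetical, and, following the Examples section, the first channel returned by process pipe is treated as the readable end.

```tcl
# Sketch only: [process] is the ensemble proposed by this TIP.
# captureStderr is a hypothetical helper name.
proc captureStderr {cmdLine} {
    # First element: end read by the parent; second: end handed to the child.
    lassign [process pipe] errRead errWrite

    # No stdin for the child, stdout shared with the parent,
    # stderr directed into the pipe.
    set pid [process invoke $cmdLine "" stdout $errWrite]

    # Close the parent's copy of the write end so that the read end
    # reaches end-of-file once the child has closed its stderr.
    close $errWrite

    set errText [read $errRead]
    close $errRead

    # Block until the child has terminated, then report its exit status.
    lassign [process wait $pid] state exitCode
    return [list $exitCode $errText]
}
```

A call such as captureStderr {ls not-found} would then return the exit status together with whatever the child wrote to stderr. Only one pipe is read here, so the blocking read cannot stall on a second, unread pipe; progressive capture of several streams calls for fileevent handlers, as in the Examples section.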

Examples

The following shows how the commands proposed here can be used to produce a bgexec-like command in pure Tcl. Not all the bgexec options are included, and the implementation lacks the error handling needed for robust use.

proc bgExecCloseHandler {pid cmd} {
   lassign [process status $pid] status exitCode
   if {$status eq "running"} {
      puts "... deferring close handling for $pid"
      after 1000 [list bgExecCloseHandler $pid $cmd]
   } else {
      if {$cmd ne ""} {
         {*}$cmd $pid $exitCode
      }
   }
}

proc bgExecReadHandler {chan cmd} {
   if {[gets $chan line] == -1} {
      close $chan
      
      if {[info exists ::bgExecCloseInfo($chan)]} {
         lassign $::bgExecCloseInfo($chan) pid cmd
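         # Build the callback with [list] so that an empty or multi-word
         # $cmd is passed as exactly one argument.
         after 0 [list bgExecCloseHandler $pid $cmd]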
         unset ::bgExecCloseInfo($chan)
      }
   } else {
      {*}$cmd $line
   }
}

proc bgExecLike {args} {
   set outChan ""; set errChan ""
   set i 0
   set exitCmd ""; set parentOutChan ""
   while {$i < [llength $args]} {
      set arg [lindex $args $i]
      switch -glob -- $arg {
      	
      	-onoutput {
      	   incr i
      	   set cmd [lindex $args $i]
      	   lassign [process pipe] parentChan outChan
      	   fileevent $parentChan readable [list \
                  bgExecReadHandler $parentChan $cmd]
      	   set outCmd $cmd
      	   set parentOutChan $parentChan
      	}
      	
      	-onerror {
      	   incr i
      	   set cmd [lindex $args $i]
      	   lassign [process pipe] parentChan errChan
      	   fileevent $parentChan readable [list \
                  bgExecReadHandler $parentChan $cmd]
      	}
      	
      	-onexit {
      	   incr i
      	   set exitCmd [lindex $args $i]
      	}
      	
      	-* {
      	   error "Unknown switch $arg"
      	}
      	
      	* {
      	   break
      	}
      }
      incr i
   }
   
   set cmdLine [lrange $args $i end]

   # puts [list process invoke $cmdLine "" $outChan $errChan]
   set pid [process invoke $cmdLine "" $outChan $errChan]

   # Close the child end of the pipes - if we opened them.
   foreach var {outChan errChan} {
      if {[set $var] ne ""} {
         close [set $var]
      }
   }
   
   if {$parentOutChan eq ""} {
      # Poll for child process exit then notify client, or at least
      # clean up the zombie.
      after 0 [list bgExecCloseHandler $pid $exitCmd]
   } else {
      # We copy BLT's trick of deferring polling till the stdout pipe
      # closes. This is marginally more efficient; more importantly,
      # it stops clients being notified of process exit until the stdout
      # channel has closed.
      set ::bgExecCloseInfo($parentOutChan) [list $pid $exitCmd]
   }   
}


# now show bgExecLike in action ...

proc showExit {pid code} {
   puts "$pid terminated with code $code"
}

proc showLine {channel line} {
   puts "$channel: $line"
}

proc runLs {args} {
   puts "invoking ls on $args"
   bgExecLike -onoutput "showLine stdout" -onerror "showLine stderr" \
           -onexit showExit ls {*}$args
}

# Sample invocations. Note that when running under tclsh there is no event
# loop; use 'update' to process events and see the output.

# successful listing
runLs .

# Unsuccessful listing
runLs not-found

# Listing of (non existent) files containing exec/open meta characters
runLs < > | &
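
As the comment on the sample invocations notes, the handlers only fire while the event loop is running. In a plain tclsh script, one common way to enter it is vwait. The following is a minimal sketch in which the variable ::finished and the 100 ms timer are placeholders standing in for an -onexit callback; neither is part of the proposal.

```tcl
# ::finished is a hypothetical flag; in real use an -onexit callback
# would set it once the child process has terminated.
set ::finished 0
after 100 {set ::finished 1}

# Enter the event loop until ::finished is written; fileevent and after
# handlers are serviced while waiting.
vwait ::finished
puts "event loop exited, finished=$::finished"
```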

Limitations

1. For convenient use, the functionality proposed here needs to be supplemented with additional commands providing a higher level interface, perhaps one of them being similar to the bgExecLike example given previously. The author has decided to omit this feature from the TIP because:
   - such commands can be implemented in pure Tcl using the commands described here
   - the exact nature of the high level commands may produce lengthy discussions
   - it could even be argued that such commands are more appropriate in tcllib rather than the Tcl core
2. As with the current implementation of exec, each channel passed to process invoke must have a valid underlying OS file handle. Consequently, when running on Windows:
   - use of a wish standard channel will be immediately rejected
   - use of a socket will be accepted but will trigger an error in the child process.
3. Efficiency - The author has not yet attempted a detailed performance study, but this proposal does have some theoretical inefficiencies when compared to a pure C implementation, such as bgexec:
   - an intermediate Tcl procedure is used to capture output from a pipe
   - each end of each pipe has to be wrapped in a CommandChannel before it can be passed back to the calling script, even if the pipe is just going to be used to link together two processes in a pipeline.

Related Possibilities for Future Enhancements

1. For Windows, an important limitation that is not addressed by this TIP is the lack of control over the console window settings when invoking a process. This will require changes to TclpCreateProcess.
2. A kill subcommand would be a useful addition to the process ensemble. On Windows, the ability to kill a child console process cleanly is related to the choice of console mode, so this issue would ideally be addressed in conjunction with item (1) above.
3. A command to categorise an exit status obtained from process status or process wait along similar lines to the data placed in $errorCode by TclCleanupChildren().
4. Some aspects of the existing exec command depend on use of temporary files. Since this TIP transfers the high level implementation of process launching into Tcl scripts, support for creation of uniquely named temporary files, as proposed in TIP #210, would be useful.
5. The wish console on Windows could be improved, using this mechanism, so that program names typed interactively will run in the background, allowing output to be seen before the process completes.
6. The ability to define argv[0] independently from the program name would occasionally be useful. For example, some UNIX shells run as login shells when argv[0] begins with a dash.
7. Public C functions for invoking a process and creating a pipe wrapped in command channels.
8. Support for detaching processes and for reaping detached processes.
9. The existing exit command could be duplicated in the process ensemble.
10. The basic form of the existing pid command, which obtains the process id of the current process, could be added to the process ensemble.
11. In some cases, it is more appropriate to run a child process with its streams connected to a null file rather than disconnected. Since the name of the null file is platform specific, it would be helpful to have a platform independent way of accessing the name.
12. An option to obtain full status information. On Windows, process exit codes are 32 bit. On UNIX, higher bits of a waitpid() status value distinguish termination via exit() from termination via an uncaught signal.
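
For comparison with the suggested command to categorise an exit status, the following shows how the existing exec command already reports an abnormal exit through the error code. It is a minimal sketch, assuming a Unix-like system with sh on the PATH; the exit value 3 is arbitrary.

```tcl
# Run a child that exits with status 3; exec raises an error for any
# non-zero exit status.
set failed [catch {exec sh -c {exit 3}} msg opts]

# The -errorcode entry is a list of the form: CHILDSTATUS pid exitCode.
lassign [dict get $opts -errorcode] category pid exitCode
puts "category=$category exitCode=$exitCode"
```

Other categories used by exec, such as CHILDKILLED and CHILDSUSP, describe termination or suspension by a signal; a categorising command for the process ensemble could report the same distinctions.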

Reference Implementation

Submitted as patch 1315115 [2]

Copyright

This document has been placed in the public domain.

