exec and error information

Arjen Markus (27 February 2003) The exec command raises an error whenever the program (or process) that was invoked writes to standard error or exits with a non-zero status. Some programs, like compilers, use this output channel not only to report errors but to report progress as well.

A naive script like this:

   exec f77 -o myprog myprog.f

fails (at least on Sun/Solaris), simply because the now venerable Fortran 77 compiler writes the names of the subroutines it encounters to standard error.

You can use catch to prevent the script from failing, but if there truly is an error in the compilation process, how do you know?

Well, that is where the Tcl standard variables errorCode and errorInfo come in. They are described in extenso in the man page on tclvars. But here is an example:

 # Attempt to catch the return code from f77
 #
 set rc [catch { exec f77 -c myff2.f } msg ]
 set errc $errorCode; set erri $errorInfo
 puts "rc: $rc"
 puts "errc: $errc"
 puts "erri: $erri"
 puts "msg: $msg"

The file "myff2.f" does not exist and here is the result:

 rc: 1
 errc: CHILDSTATUS 7612 1
 erri: myff2.f:
 Error: Cannot open file myff2.f 

     while executing
 "exec f77 -c myff2.f "
 msg: myff2.f:
 Error: Cannot open file myff2.f 

Changing the script (to compile the existing file "myff.f") gives:

 rc: 1
 errc: NONE
 erri: myff.f:
         myff:
     while executing
 "exec f77 -c myff.f "
 msg: myff.f:
         myff:

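Note that the return value from catch is 1 in both cases; the big difference is in errorCode. In the first case it is CHILDSTATUS, followed by the process ID and the exit status (1), so the compiler really failed. In the second case it is NONE: the compiler exited with status 0, and exec reported an error only because something was written to standard error.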
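
The distinction can be wrapped in a small helper. What follows is a minimal sketch, assuming Tcl 8.5 or later (for {*} and the options dictionary of catch). The name run_checked and its three-element return value are made up for this illustration and are not part of Tcl.

```tcl
# Hypothetical helper: run an external command and classify the outcome
# by looking at the -errorcode entry rather than at the catch result alone.
# Returns a list: {status exitcode output}
proc run_checked {args} {
    if {![catch {exec {*}$args} output options]} {
        # Exit status 0 and nothing written to standard error
        return [list ok 0 $output]
    }
    set errc [dict get $options -errorcode]
    switch -- [lindex $errc 0] {
        NONE {
            # Exit status 0, but the program wrote to standard error
            return [list ok 0 $output]
        }
        CHILDSTATUS {
            # errorCode has the form {CHILDSTATUS pid exitcode}
            return [list failed [lindex $errc 2] $output]
        }
        default {
            # Anything else (killed by a signal, program not found, ...)
            return [list error $errc $output]
        }
    }
}

# Possible use with the compiler from the example above:
#   lassign [run_checked f77 -c myff.f] status exitcode output
```

If only the exit status matters, Tcl 8.5 and later also offer exec -ignorestderr, which stops exec from treating output on standard error as an error.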


male 14th Sep. 2004:

Again: the exit code of a piped executable

Our problem is that an external company provides a C executable (on Windows) that uses old FORTRAN functionality wrapped in a DLL. The FORTRAN code returns failure codes as 4-byte integers (INTEGER*4). These failure codes are used as the exit code of the C executable.

Catching the error raised by close on the blocking command channel to the C executable leaves the exit code in the global errorCode variable.
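
A sketch of what catching the close can look like, assuming Tcl 8.5 or later. The proc name pipe_status is invented for this illustration, and the channel has to be in blocking mode (the default) for close to report the exit status.

```tcl
# Hypothetical helper: run a command pipeline, collect its standard output
# and return a list {exitcode output}.
proc pipe_status {command} {
    set chan [open |$command r]
    set output [read $chan]
    # On a blocking channel, close waits for the child and raises an error
    # if it exited with a non-zero status or wrote to standard error.
    if {[catch {close $chan} msg options]} {
        set errc [dict get $options -errorcode]
        switch -- [lindex $errc 0] {
            CHILDSTATUS { return [list [lindex $errc 2] $output] }
            NONE        { return [list 0 $output] }
            default     { return -options $options $msg }
        }
    }
    return [list 0 $output]
}
```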

So far so good!

Now the problem!

On Windows, the C exit function accepts a 4-byte integer (int or int32), but we get only the lowest byte of the exit code.

Example: the original failure and exit code is 655, but the exit code that Tcl reports via errorCode is 143 (655 & 255 = 143).

Does anybody have a tip on how to avoid this? Any hint or suggestion? Please keep in mind that the executable is not maintained by us; it belongs to an external company! Or is it a Tcl peculiarity, meant to keep things platform independent, that command channels support only a 1-byte exit code?
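
The numbers in the example are consistent with the exit code being masked to its lowest 8 bits. A short sketch of that arithmetic follows. The proc name low_byte is invented here; it only reproduces the calculation and says nothing about where the truncation happens.

```tcl
# Keep only the lowest 8 bits of an integer
proc low_byte {code} {
    expr {$code & 0xFF}
}

puts [low_byte 655]   ;# 655 = 2*256 + 143, so this prints 143
```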


male - 14th Sep. 2004: (my own answer)

It's a pity that on the Windows platform the exit code is strictly reduced from 4 bytes to 1 byte!

The comp.lang.tcl thread http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&threadm=1c8670d2.0405260607.5a4c08ce%40posting.google.com&rnum=1&prev=/groups%3Fhl%3Den%26lr%3D%26ie%3DUTF-8%26selm%3D1c8670d2.0405260607.5a4c08ce%2540posting.google.com describes this.


samoc: here are some wrappers for exit and exec that simplify working with high level exit status codes:

https://github.com/samoconnor/oclib.tcl

set ex_codes {
    EX_OK           0
    EX_USAGE       64
    EX_DATAERR     65
    EX_NOINPUT     66
    EX_NOUSER      67
    EX_NOHOST      68
    EX_UNAVAILABLE 69
    EX_SOFTWARE    70
    EX_OSERR       71
    EX_OSFILE      72
    EX_CANTCREAT   73
    EX_IOERR       74
    EX_TEMPFAIL    75
    EX_PROTOCOL    76
    EX_NOPERM      77
    EX_CONFIG      78
}

foreach {name code} $ex_codes {
    dict set ex_names $code $name
}


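# Keep the built-in exit available under a new name
rename exit tcl_exit

# exit that also accepts a symbolic name from ex_codes and an optional message.
# The code defaults to 0, as it does for the built-in exit.
proc exit {{code 0} {message {}}} {

    if {[dict exists $::ex_codes $code]} {
        set code [dict get $::ex_codes $code]
    }
    if {$message ne {}} {
        flush stderr
        puts $message
        flush stdout
    }
    tcl_exit $code
}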


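# Keep the built-in exec available under a new name
rename exec tcl_exec

# exec that replaces the CHILDSTATUS error code by the symbolic name of the
# exit status (for example EX_TEMPFAIL) when that status is listed in ex_names
proc exec {args} {

    try {

        uplevel 1 [list tcl_exec {*}$args]

    } trap CHILDSTATUS {result info} {

        set status [lindex [dict get $info -errorcode] 2]
        if {[dict exists $::ex_names $status]} {
            dict set info -errorcode [dict get $::ex_names $status]
        }
        return -options $info $result
    }
}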

example:

try {
    ...
} trap {AWS.SimpleQueueService.QueueDeletedRecently} {
    exit EX_TEMPFAIL "Can't reuse queue name immediately after deletion. Please wait..."
}

retry count 2 {
    ...
    exec ./create_queue.tcl "foobar"
    ...
} trap {EX_TEMPFAIL} {
    after 60000
}

see retry