Throw it all away!

I’ve recently been experimenting with calling external commands (mplayer and lame, so you might be able to guess what I’ve been doing) from within a scripting language (Python, although it needn’t have been, as it turns out). Bizarrely, the external commands (argumentae intactae) worked absolutely fine on their own, chained together by me, by hand. However, when executed in the scripted environment, the command that produced large volumes of output was stalling at around 86.something MB, whereas the other command was stalling at 7.737MB of output.

Very odd, I thought. Odder still that the higher-volume command was being permitted to write much bigger files before it stopped. So… it can’t be a limit imposed by Python on commands creating runaway output. Unless it’s a bandwidth limit, so the second command was having its input stream throttled… No documentation, though, and very few reports of the error on the web. So what could it be?

The external commands are opened with the subprocess.Popen method in Python, which lets you specify destinations for the three standard I/O streams: input, output and error. I tried setting the latter two to None. All the output from the child processes was splatted onto the screen, and lo! the Python script ran to its natural conclusion.
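For reference, here is a minimal sketch of the stream destinations that subprocess accepts. The child command is a hypothetical stand-in (a one-line Python script, not mplayer or lame), and the function name is mine, so treat it as an illustration rather than the script described here.

```python
import subprocess
import sys

# Hypothetical stand-in for a chatty external command.
CHILD = [sys.executable, "-c", "print('progress: 100%')"]


def demo_stream_destinations():
    # None (the default): the child inherits the parent's streams,
    # so its output lands straight on the screen.
    subprocess.call(CHILD, stdout=None, stderr=None)

    # PIPE: the output is held for the parent to collect;
    # communicate() reads both streams and waits for the child to exit.
    proc = subprocess.Popen(CHILD, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()

    # DEVNULL: the output is thrown away.
    subprocess.call(CHILD, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)

    return out, err
```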

It turned out that both commands were producing an on-screen counter or ticker to denote progress, and as this was writing to standard output, the pipe buffers set up by Python to collect output and errors were filling up. Once a buffer is full, the operating system blocks the command’s next write, so the command can end up stalling indefinitely until the buffer is cleared!
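The stall can be reproduced in a few lines. This is only a sketch under assumptions: the child is a hypothetical one-liner that writes about 1 MiB to standard output, which should be more than a typical pipe buffer holds, and the timeout is there purely so that the demonstration gives up waiting instead of hanging.

```python
import subprocess
import sys

# Hypothetical stand-in for a chatty command: writes about 1 MiB to stdout.
CHILD = "import sys; sys.stdout.write('x' * (1 << 20))"


def run_without_reading(timeout=1.0):
    proc = subprocess.Popen([sys.executable, "-c", CHILD], stdout=subprocess.PIPE)
    try:
        # Waiting without reading: the pipe fills up and the child
        # blocks on its next write, so it never gets to exit.
        proc.wait(timeout=timeout)
        stalled = False
    except subprocess.TimeoutExpired:
        stalled = True
    # Draining the pipe lets the child finish its writes and exit.
    out, _ = proc.communicate()
    return stalled, len(out)
```

If you do want to keep the output inside the program, communicate() is the safer call, since it keeps reading while it waits.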

When I added command-line parameters to turn the ticker off, and piped the output into files rather than into any temporary storage, the whole system ran very nicely indeed. I hope to publish it here soon.

Exit gracefully: it’s not always possible to run external commands in ultra-quiet mode. The diagnostics they produce might be handy, so you can’t always suppress them. However, unless you have a good reason to hang onto the output within your program (post-processing, say, to extract meaningful error messages) then you should be directing it to files. Also, look out for progress counters that don’t seem to cause very much output: just because the screen isn’t scrolling past doesn’t mean that the command-line program isn’t generating reams and reams of diagnostics onto the standard output and error streams. And if you direct a counter like that into a file then you could end up with a log of gibberish, because every rewrite of the counter is stored one after another. Try to think in a single dimension and a single direction, like a stream of text: there’s no support for such cute rewind-and-rewrite diagnostics in logfiles.
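If a ticker does end up in a logfile, it can be tidied afterwards. The sketch below assumes the counter rewinds with a carriage return, which is common but not something I have checked for every tool; the function name is made up. It keeps only the last rewrite of each line, which approximates what a terminal would have shown.

```python
def collapse_carriage_returns(text):
    """Keep only the final rewrite of each line in a captured log."""
    cleaned = []
    for line in text.split("\n"):
        line = line.rstrip("\r")  # tolerate CRLF line endings
        # Everything before the last carriage return was overwritten on screen.
        cleaned.append(line.rsplit("\r", 1)[-1])
    return "\n".join(cleaned)
```

For example, a captured "10%\r50%\r100%\nDone\n" collapses to "100%\nDone\n".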