maxcpu Architecture & Design
----------------------------

I am frequently trying to take full advantage of my SMP desktop machine,
but have been repeatedly disappointed by my shell's inability to execute
jobs on multiple CPUs.  There is no way with a shell's "for" loop to
tell it that all the commands in the for loop can be run separately on
different CPUs.

The example I use most frequently is that of image manipulation.  When I
download all the images I've taken on my digital camera, they're huge:
1600x1200 pixels.  I'm not in the habit of making people download such
giant images with their browser, so for the images that I post on the web,
I resize them to 640x480, which is a reasonable size for most web sites.
To do the resizing quickly, I use the ImageMagick tool "convert" to do all
the work.  I don't need to mouse around and click things to make it happen
(as I would if I were using the GIMP).  So, I generally type in my shell:

	for i in *.jpg; do
		convert -geometry 640x480 -quality 70 "$i" "$i"
	done

This gets the job done, but the work is mostly CPU-intensive.  There is
disk access, but only for reading the original file and writing back the
result.  The rest is pure number crunching.  Watching my CPU monitor on a
dual-CPU machine is sad, because that "for" loop executes one "convert"
at a time, leaving an entire CPU idle.  As a result, I have to wait roughly
twice as long as I would if both CPUs were working on the task.

I couldn't find anything that would do batch execution of command
lines based on the number of available CPUs.  The closest thing was
"make"'s "-j" option, which will let you specify how many parallel tasks
to execute.  However, "make" does not detect how many CPUs there are,
and to use it, the tasks must be defined in a Makefile.  As a result,
maxcpu was born.  It reads command lines from STDIN, and executes as
many in parallel as there are detected CPUs.

So now, I can use both CPUs for my image processing (or anything else, for
that matter).  Note the added "echo" in the "for" loop:

	(for i in *.jpg; do
		echo convert -geometry 640x480 -quality 70 $i $i
	done) | maxcpu
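
To make the idea concrete, here is a rough shell sketch of "read command
lines from STDIN and run N of them at a time".  It is not maxcpu's actual
implementation (maxcpu is written in Perl); the function name run_parallel,
its optional argument, and the fallback to a single job are assumptions made
for illustration.  It also simplifies the scheduling: it starts a batch of N
jobs and waits for the whole batch before starting the next one.

```shell
# Hypothetical stand-in for maxcpu: run each line of stdin as a shell
# command, at most N at a time.  N is $1, or the online CPU count if
# getconf can report it, or 1 otherwise.
run_parallel() {
	max=${1:-$(getconf _NPROCESSORS_ONLN 2>/dev/null)}
	case "$max" in
		''|0|*[!0-9]*) max=1 ;;       # empty, zero or non-numeric: one job
	esac
	running=0
	while IFS= read -r cmd; do
		[ -n "$cmd" ] || continue     # skip blank lines
		sh -c "$cmd" </dev/null &     # child must not eat the command list
		running=$((running + 1))
		if [ "$running" -ge "$max" ]; then
			wait                      # batch is full: wait for all of it
			running=0
		fi
	done
	wait                              # wait for the last, partial batch
}
```

It would be used the same way as above, with the "for" loop piped into
run_parallel instead of maxcpu.  A more careful scheduler would start a new
job as soon as any single job finishes, so that one slow image does not
hold up the rest of its batch.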

To portably detect how many CPUs are available, I use the POSIX function
"sysconf".  Unfortunately, Perl's POSIX module doesn't include the
constant required for CPU detection (_SC_NPROCESSORS_ONLN).  As a result,
I use Inline::C to execute the detection function.
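
For a quick check from the shell, the same number is available on many
systems through "getconf".  The sketch below is only an illustration and is
not how maxcpu does it (maxcpu calls sysconf through Inline::C).  The helper
name detect_cpus and the fallback value of 1 are assumptions, and the
getconf variable _NPROCESSORS_ONLN is a common extension that may be missing
on some platforms.

```shell
# Hypothetical helper: print the number of online CPUs, or 1 if the
# system cannot report it.
detect_cpus() {
	n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=
	case "$n" in
		''|0|*[!0-9]*) n=1 ;;   # empty, zero or non-numeric: assume one CPU
	esac
	echo "$n"
}

detect_cpus
```

On the dual-CPU machine described above I would expect this to print 2,
provided getconf supports the variable there.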


