Distributing Work

During November and December of last year I developed my checkers program to the point where I could run all three of the experiments conducted by Fogel and Chellapilla.  The longest ran for 5 days on my two-processor laptop, which suggests that some of the experiments I have in mind would take months on that machine alone.

Since I'm doing this exercise for fun, and I like building "systems", I decided to make a system for distributing work (games to be played) to other machines in the house.  I've developed a somewhat complicated Java web app (using embedded Jetty) that manages a jar repository and runs, in subprocesses, Java programs whose descriptions are uploaded.  A description consists of jar names and jar hashes (to avoid running the wrong version of a jar while I'm developing), a main class name, and a command line.  Each Java program is given a work area in the file system.  The remote client can upload any necessary files to it before starting the program, and can download any files from it after the program is done, including files containing the output of the stdout and stderr streams.
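To make the description concrete, here is a minimal sketch of how one might be represented and checked. The class and field names (JobDescription, jarHashes, jarMatches) are hypothetical, and SHA-256 is an assumption, since the hash algorithm isn't specified above.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical program description: jars (with hashes), main class, arguments. */
final class JobDescription {
    final Map<String, String> jarHashes; // jar name -> expected SHA-256 (hex); algorithm is an assumption
    final String mainClass;
    final List<String> args;

    JobDescription(Map<String, String> jarHashes, String mainClass, List<String> args) {
        // Defensive copies so a description cannot change after it is created.
        this.jarHashes = new LinkedHashMap<>(jarHashes);
        this.mainClass = mainClass;
        this.args = new ArrayList<>(args);
    }

    /** Returns the lowercase hex SHA-256 digest of the given bytes. */
    static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required of every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    /** True only if the jar is named in this description and its bytes hash to the expected value. */
    boolean jarMatches(String jarName, byte[] jarBytes) {
        String expected = jarHashes.get(jarName);
        return expected != null && expected.equals(sha256Hex(jarBytes));
    }
}
```

A server holding a jar repository could call jarMatches before launching the subprocess and refuse to run if a jar on disk is stale.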

I say "somewhat complicated" because I didn't make the decision to use embedded Jetty until late in the process, and if I'd made it earlier, I would have used more of Jetty's facilities to handle the file management.

To simplify creating the description of a Java program, I created a mechanism for walking the current process's classpath, creating jars for unpacked entries (e.g. the bin folders of my Eclipse projects), and uploading them to the various servers.
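The classpath walk can be sketched roughly as follows. This is only an illustration under assumptions: ClasspathPacker, collectJars, and packDirectory are made-up names, the generated jar names are arbitrary, and the upload step is left out entirely.

```java
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: turns a classpath string into a list of jar files. */
final class ClasspathPacker {

    /** Existing jars are returned as-is; each directory entry is packed into a new jar under outDir. */
    static List<Path> collectJars(String classpath, Path outDir) throws IOException {
        Files.createDirectories(outDir);
        List<Path> jars = new ArrayList<>();
        int counter = 0;
        for (String entry : classpath.split(File.pathSeparator)) {
            if (entry.isEmpty()) {
                continue;
            }
            Path p = Paths.get(entry);
            if (Files.isDirectory(p)) {
                // Unpacked entry, e.g. an Eclipse project's bin folder.
                Path jar = outDir.resolve("dir-" + (counter++) + ".jar");
                packDirectory(p, jar);
                jars.add(jar);
            } else if (Files.isRegularFile(p) && entry.endsWith(".jar")) {
                jars.add(p);
            }
            // Anything else (missing paths, other file types) is skipped in this sketch.
        }
        return jars;
    }

    /** Writes every regular file under dir into the jar, using '/'-separated relative names. */
    static void packDirectory(Path dir, Path jar) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(dir)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out)) {
            for (Path f : files) {
                String name = dir.relativize(f).toString().replace('\\', '/');
                jos.putNextEntry(new JarEntry(name));
                Files.copy(f, jos);
                jos.closeEntry();
            }
        }
    }
}
```

Calling collectJars(System.getProperty("java.class.path"), someTempDir) would cover the current process; hashing each resulting jar then gives the name and hash pairs a description needs.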

Next up I'm working on how to transfer units of work from the main program to the workers, and how to get the responses back.  My current plan is to have the main program run an embedded web server (Jetty) with a servlet that a worker contacts to get a unit of work (e.g. two checkers players to compete), and then contacts again to report the result of that unit of work (e.g. the game outcome).
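Leaving the HTTP layer aside, the bookkeeping behind such a servlet might look something like the sketch below. Everything here is an assumption for illustration: the names WorkBoard, WorkUnit, nextUnit, and submitResult are invented, and the requeue method covers a case (a worker that disappears mid-game) that the plan above doesn't address.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical in-memory state that a work-distribution servlet could delegate to. */
final class WorkBoard {

    /** One unit of work: two players to compete. Identifiers are illustrative. */
    static final class WorkUnit {
        final int id;
        final String playerA;
        final String playerB;

        WorkUnit(int id, String playerA, String playerB) {
            this.id = id;
            this.playerA = playerA;
            this.playerB = playerB;
        }
    }

    private final Queue<WorkUnit> pending = new ConcurrentLinkedQueue<>();
    private final Map<Integer, WorkUnit> inFlight = new ConcurrentHashMap<>();
    private final Map<Integer, String> results = new ConcurrentHashMap<>();

    /** Called by the main program to queue a game. */
    void add(WorkUnit unit) {
        pending.add(unit);
    }

    /** What a "get work" request would call; returns null when nothing is pending. */
    WorkUnit nextUnit() {
        WorkUnit unit = pending.poll();
        if (unit != null) {
            inFlight.put(unit.id, unit);
        }
        return unit;
    }

    /** What a "report result" request would call; rejects unknown or already-reported ids. */
    boolean submitResult(int id, String outcome) {
        if (inFlight.remove(id) == null) {
            return false;
        }
        results.put(id, outcome);
        return true;
    }

    /** Puts an in-flight unit back on the queue, e.g. after a worker goes silent. */
    boolean requeue(int id) {
        WorkUnit unit = inFlight.remove(id);
        if (unit == null) {
            return false;
        }
        pending.add(unit);
        return true;
    }

    /** Returns the recorded outcome, or null if none has been reported yet. */
    String resultOf(int id) {
        return results.get(id);
    }
}
```

The servlet itself would then only need to serialize a WorkUnit on the way out and parse an id and outcome on the way in.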

This mechanism is certainly sufficient for the task, but it feels a bit awkward.  I think I'd like some variant of a ThreadPoolExecutor that supports distributed threads, but I've neither located nor invented such a beast.  Sigh.

More later.
