CSc 422 Assignment 1

Inmates Running the Asylum

Assigned: Sep 16
Due: Oct 8
90 pts

This program involves using the Unix process management primitives. You should be able to run it on Linux or Mac (though I have no way to test it there), but Windows folks will have to find one of the above. You can install Linux, use a bootable USB, use the WSL, or I'll happily give you a login on Sandbox.

You are given a dangerous program loony.cpp. The loony program is rather unstable. Mostly it sleeps, but when it does wake up, it tries to throw something at one of its siblings (with a Unix signal), after which it may do something else deadly stupid (division by zero or illegal memory reference). The assignment is to create a program, loonybin, which manages a set of five loonies. Your program must create five loony processes, each with its own name (pick any five names you like). It then maintains a collection of five by replacing ones that die. Your program reports each demise and its cause, and reports each replacement of a dead loony.

Your program should take a single number on the command line, which is the total number of loonies to be started. Start the first five loonies, one for each of the five names. Then, whenever a loony dies, start another to replace it with a new loony running under the same name. Run until the specified number of loonies has been started, then continue running until they have all died, but without replacing the departed ones. Report all process deaths with cause, and report all initial and replacement creations.

I'll discuss the requirements in more detail below, but here's what mine looks like. The lines starting with === are reports generated by the loonybin program; the other lines are generated by the loonies themselves:

bennet@bennet$ ./loonybin 20
=== Frank (pid 502657) started ===
=== Fred (pid 502658) started ===
=== Alice (pid 502659) started ===
=== George (pid 502660) started ===
=== Sally (pid 502661) started ===
Alice throwing Quit at 502658
=== Fred (pid 502658): Signaled: Quit ===
=== Fred (pid 502658 -> 502664) restarted ===
George throwing Interrupt at 502661
George exiting code 8.
=== George (pid 502660): Error exit code 8 ===
=== George (pid 502660 -> 502672) restarted ===
=== Sally (pid 502661): Signaled: Interrupt ===
=== Sally (pid 502661 -> 502673) restarted ===
Frank throwing Terminated at 502664
Frank exiting code 0.
=== Frank (pid 502657): Normal exit ===
=== Frank (pid 502657 -> 502674) restarted ===
=== Fred (pid 502664): Signaled: Terminated ===
=== Fred (pid 502664 -> 502675) restarted ===
Fred throwing Interrupt at 502674
=== Frank (pid 502674): Signaled: Interrupt ===
=== Frank (pid 502674 -> 502676) restarted ===
Alice throwing Terminated at 502673
=== Sally (pid 502673): Signaled: Terminated ===
=== Sally (pid 502673 -> 502677) restarted ===
George throwing Interrupt at 502677
George exiting code 0.
=== George (pid 502672): Normal exit ===
=== George (pid 502672 -> 502678) restarted ===
=== Sally (pid 502677): Signaled: Interrupt ===
=== Sally (pid 502677 -> 502679) restarted ===
Fred throwing Interrupt at 502678
=== George (pid 502678): Signaled: Interrupt ===
=== George (pid 502678 -> 502680) restarted ===
George throwing Terminated at 502676
=== Frank (pid 502676): Signaled: Terminated ===
=== Frank (pid 502676 -> 502682) restarted ===
Alice throwing Quit at 502680
Alice exiting code 9.
=== Alice (pid 502659): Error exit code 9 ===
=== Alice (pid 502659 -> 502684) restarted ===
=== George (pid 502680): Signaled: Quit ===
=== George (pid 502680 -> 502686) restarted ===
Fred throwing Interrupt at 502682
Fred exiting code 0.
=== Fred (pid 502675): Normal exit ===
=== Fred (pid 502675 -> 502693) restarted ===
=== Frank (pid 502682): Signaled: Interrupt ===
=== Frank (pid 502682 -> 502694) restarted ===
Sally throwing Interrupt at 502686
=== George (pid 502686): Signaled: Interrupt ===
Fred throwing Terminated at 502684
=== Alice (pid 502684): Signaled: Terminated ===
Sally throwing Terminated at 502693
=== Fred (pid 502693): Signaled: Terminated ===
Frank throwing Terminated at 502679
=== Sally (pid 502679): Signaled: Terminated ===
=== Frank (pid 502694): Signaled: Floating point exception ===
=== 20 processes started. ===
=== 20 processes ended. ===

Specific Requirements

Your program must do the following:
  1. Choose five names, whatever you like. The program tries to keep a process running under each of the names.
  2. Accept a number of loonies to create from the command line. If no number is provided, default to 50. If a number less than five is given, round up to five.
  3. Create five processes running the given loony program, one for each name. When you start each of them, send it its name as its command-line parameter.
  4. As each is created, print its name and process id.
  5. Enter a loop which contains a wait call, probably near the top of the body. This will suspend the caller until some child process terminates. When it returns (when a child process ends), it reports which process died, and what caused termination. You should collect and print this information. The process can exit normally (code zero), exit abnormally (non-zero code), or be killed (signaled) by any of several signals. Say which happened, and print the exit code number or the name of the terminating signal.
  6. Figure out the name of the process which exited, and start a new one running under the same name. Report that you have restarted the process for this name, and give both the old and new process ids.
  7. Any of the various process management calls (fork, exec or wait) may fail. If so, you should print an appropriate message and exit. The die function in loony.cpp prints such a message, asking the system for a description of the error. (You can't just call it from your program, but you can copy it if you like.) The exec reports failure just by returning; the others return negative values for failure.
  8. Your main loop must run until it has created the specified number of loonies. The first five are included in this count. After creating the needed number, your program must continue to run until all have completed. Make sure you don't accidentally start more loonies while you are waiting for the last five to finish. Make sure to check before restarting that you haven't already started enough.
  9. When all children have completed, your program should exit normally.

How To Do All This Stuff

Write in C++. You may write in plain C if you like, but there's not much advantage, unless you just happen to be much better at plain C.

You will need some sort of list to keep the id of the current process running under each name. I used a C++ map, pid to name, like

std::map<pid_t,std::string> running;
But anything you like is fine. Since the size is fixed at five, no great search efficiency is required. When you start your initial five, fill the list. When a process terminates, use the list to look up its name, perform the restart, and replace the old record with the up-to-date one.
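As a rough illustration of the bookkeeping just described, here is one way the replace step might look with the map above. The helper name rename_pid is made up for this sketch and is not part of the assignment files; treat it as one possible shape, not the required one.

```cpp
#include <map>
#include <string>
#include <sys/types.h>

// Hypothetical helper (not in the handout): move the name recorded under
// old_pid to new_pid. Returns the name, or an empty string if old_pid
// was not being tracked.
std::string rename_pid(std::map<pid_t, std::string>& running,
                       pid_t old_pid, pid_t new_pid) {
    auto it = running.find(old_pid);
    if (it == running.end()) return "";
    std::string name = it->second;   // copy before erasing the entry
    running.erase(it);               // drop the dead process
    running[new_pid] = name;         // record its replacement
    return name;
}
```

After a restart you might call rename_pid(running, dead_pid, new_pid) and use the returned name in the restart report.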

Create processes using Unix fork and exec. You will want to look at our runner.cpp, and you can find many others online. There are several variations of exec, with different signatures. You will probably want to use the same execl form as runner.cpp. Your call should be like the first one, running the command "loony" with the parameter of the name it is running under. The loony.cpp program is given for download; see more on that below.

Your code should differ from runner.cpp by checking for the error return from fork. Store fork's return in a variable, then test it. For -1, it failed, and you should complain and exit. For 0 you run exec, and for positive, you continue as the parent. Be careful:
  1. Don't manage to call fork twice. You won't like it.
  2. If exec fails, print a message and exit with a non-zero code. If your program works otherwise, it will eventually report this exit as a terminated child and try to restart it.

After creating your initial five children, your program basically enters a loop which contains a wait. Whenever wait returns, you should first collect and print the cause of the termination. The wait call returns to you the process id of the process which ended. The runner.cpp ignores this, but you will need to capture it in a variable. The example also shows how wait returns status information through its parameter. Read the manual to see how to extract the exit code or signal number, which you need for reporting.

When a program crashes, the status will indicate signaled. The signal number, which you extract from the status returned by wait, indicates the cause of the crash. Then use the strsignal function to find the printable name that goes with the number.
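Decoding the status could go something like the following sketch, which uses the standard macros from sys/wait.h. The function name describe_status is invented here, and the message wording simply mimics the sample output above; yours need not match.

```cpp
#include <string>
#include <string.h>     // strsignal (POSIX)
#include <sys/wait.h>   // WIFEXITED, WEXITSTATUS, WIFSIGNALED, WTERMSIG

// Hypothetical helper: turn the status filled in by wait() into text.
std::string describe_status(int status) {
    if (WIFEXITED(status)) {                 // child called exit or returned
        int code = WEXITSTATUS(status);
        if (code == 0) return "Normal exit";
        return "Error exit code " + std::to_string(code);
    }
    if (WIFSIGNALED(status)) {               // child was killed by a signal
        return std::string("Signaled: ") + strsignal(WTERMSIG(status));
    }
    return "Unknown status";                 // stopped/continued: not expected here
}
```

Given pid_t pid = wait(&status), the report line can then be built from pid, the name looked up for that pid, and describe_status(status).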

Your loop should have some counters to control it. The simplest thing is probably to have a count of child processes created, and another of child processes exited. Then keep looping while the number finished is below the limit, but stop restarting once the number created has reached the limit.
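To check the counter logic on its own, here is a small simulation with no real processes in it: each pass of the loop stands in for one return from wait. The struct and function names are made up for this sketch, and it assumes the limit is at least the initial five.

```cpp
// Counter logic only; no fork or wait happens here.
struct Counts {
    int created;    // children started so far
    int finished;   // children reported dead so far
    int restarts;   // replacements started
};

// Hypothetical simulation: assumes limit >= initial.
Counts run_counters(int limit, int initial = 5) {
    Counts c{initial, 0, 0};
    while (c.finished < limit) {     // one pass per (simulated) wait return
        ++c.finished;
        if (c.created < limit) {     // replace only while under the limit
            ++c.created;
            ++c.restarts;
        }
    }
    return c;
}
```

For a limit of 20 this gives 15 restarts and 20 finished, which agrees with the sample run above: five initial starts, fifteen restarts, and twenty deaths.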

General Advice

You should be prepared for failure of any of fork, exec or wait, and generally print a message and exit. (As discussed above, for exec this won't actually make your program exit.) Fork generally will not fail, unless the system is stressed. You probably won't see that happen unless you manage to create an infinite loop containing a fork. The usual reason for wait to fail is that there are no child processes to wait for. This may just mean that you flubbed your loop counters and didn't exit even though everyone went home. This could also happen if you somehow manage to fork your main program without a proper exec and somehow reach a wait.

You will be a lot happier if one of the first things you do is create a function which creates one child loony for you. Have it take the name as a parameter, run fork, run the exec in the child and return the child's process id in the parent. Then you can call this whenever you need to run a new child process.

You will probably want to write this program incrementally, as a series of partial versions. You might start by writing a program to create one loony and simply wait for its demise. Then, modify this program to correctly report the status of the loony's demise. A third version might just start the five loonies and report each demise. A fourth version might try to replace the processes when they end.

If you manage to run loony at the wrong place in the process tree, it will try to kill whatever siblings it finds, which could be almost anything. Unless you are (foolishly) running as the administrator, this will only involve destruction of your own processes, since your loony does not have permission to terminate others. This is unlikely to create any damage that can't be fixed by logging out and back in. Loony also tries to figure out if it's been run directly from the command shell, in which case it just crashes immediately, and doesn't try to take anyone else along.

You may be unfamiliar with the process of getting a number from the command line, as required and shown in the example. It's not too hard. To wit:
#include <cstdlib>
...
int main(int argc, char **argv)
{
    int num_kids = 50;                      // default when no number is given
    if(argc > 1) num_kids = atoi(argv[1]);  // argv[1] exists only when argc > 1
    if(num_kids < 5) num_kids = 5;          // round up to five
    ...
Now, that wasn't so hard, was it?

Building Loony

The assignment is to write the manager program, loonybin, which starts and manages the child processes. Each child program runs loony.cpp, which you are given. Download loonypack.zip and unpack it. Now, a wrinkle: The loony is programmed to throw things at its siblings, so it must figure out which processes those are. Unfortunately, this depends on your Unix flavor. Use find_siblings_proc.cpp on Linux, and use find_siblings_kvm.cpp on BSD or Mac. They do pretty much the same thing, but in very different ways.

If you want to create a project in an IDE, the simplest thing is probably just to rename either find_siblings_proc.cpp or find_siblings_kvm.cpp to find_siblings.cpp. Then add to your project loony.cpp, find_siblings.h and find_siblings.cpp. You should be able to build loony. The KVM version requires a library flag, -lkvm. You may need to configure that somewhere in the bowels of your IDE configuration to make that one work. You will probably want to then create your loonybin program in the same project and directory.

If you want to just use make, move the unpacked files to a reasonable working directory, and say
make loony_proc
or
make loony_kvm
as appropriate, and it will create a loony of the correct type. Do not run make loony. It won't know what to do. You can then create loonybin.cpp in the same directory, and just make loonybin to build it.

There is also a find_siblings_test.cpp program there, if you want to test the function on your system.

Submission

When your program works, and is properly commented and indented, submit it over the web using this form.