Inmates Running the Asylum
This program involves using the Unix process management primitives.
You should be able to run it on Linux or Mac (though I have no way to
test it there), but Windows folks will have to find one of the above.
You can install Linux, use a bootable USB, use the
WSL,
or I'll happily give you a login on Sandbox.
You are given a dangerous program loony.cpp. The loony program is
rather unstable. Mostly it sleeps, but when it does wake up, it tries
to throw something at one
of its siblings (with a Unix signal), after which
it may do something else deadly stupid (division by zero or
illegal memory reference). The assignment is to create a program,
loonybin,
which manages a set of five loonies. Your program must create five
loony processes, each with its own name (pick any five names you like).
It then maintains a collection of five by replacing ones that die.
Your program reports each demise and its cause, and reports each
replacement of a dead loony.
Your program should
take a single number
on the command line, which is the total number of
loonies to be started.
Start the first five loonies, one for each of the five names.
Then, whenever a loony dies, start another to replace it with a new
loony running under the same name. Run until the specified number of
loonies has been started, then continue running until they have all
died, but without replacing the departed ones. Report all process
deaths with cause, and report all initial and replacement creations.
I'll discuss the requirements in more detail below, but here's
what mine looks like. The lines starting with === are reports
generated by the loonybin program, the other lines are generated
by the loonies themselves:
bennet@bennet$ ./loonybin 20
=== Frank (pid 502657) started ===
=== Fred (pid 502658) started ===
=== Alice (pid 502659) started ===
=== George (pid 502660) started ===
=== Sally (pid 502661) started ===
Alice throwing Quit at 502658
=== Fred (pid 502658): Signaled: Quit ===
=== Fred (pid 502658 -> 502664) restarted ===
George throwing Interrupt at 502661
George exiting code 8.
=== George (pid 502660): Error exit code 8 ===
=== George (pid 502660 -> 502672) restarted ===
=== Sally (pid 502661): Signaled: Interrupt ===
=== Sally (pid 502661 -> 502673) restarted ===
Frank throwing Terminated at 502664
Frank exiting code 0.
=== Frank (pid 502657): Normal exit ===
=== Frank (pid 502657 -> 502674) restarted ===
=== Fred (pid 502664): Signaled: Terminated ===
=== Fred (pid 502664 -> 502675) restarted ===
Fred throwing Interrupt at 502674
=== Frank (pid 502674): Signaled: Interrupt ===
=== Frank (pid 502674 -> 502676) restarted ===
Alice throwing Terminated at 502673
=== Sally (pid 502673): Signaled: Terminated ===
=== Sally (pid 502673 -> 502677) restarted ===
George throwing Interrupt at 502677
George exiting code 0.
=== George (pid 502672): Normal exit ===
=== George (pid 502672 -> 502678) restarted ===
=== Sally (pid 502677): Signaled: Interrupt ===
=== Sally (pid 502677 -> 502679) restarted ===
Fred throwing Interrupt at 502678
=== George (pid 502678): Signaled: Interrupt ===
=== George (pid 502678 -> 502680) restarted ===
George throwing Terminated at 502676
=== Frank (pid 502676): Signaled: Terminated ===
=== Frank (pid 502676 -> 502682) restarted ===
Alice throwing Quit at 502680
Alice exiting code 9.
=== Alice (pid 502659): Error exit code 9 ===
=== Alice (pid 502659 -> 502684) restarted ===
=== George (pid 502680): Signaled: Quit ===
=== George (pid 502680 -> 502686) restarted ===
Fred throwing Interrupt at 502682
Fred exiting code 0.
=== Fred (pid 502675): Normal exit ===
=== Fred (pid 502675 -> 502693) restarted ===
=== Frank (pid 502682): Signaled: Interrupt ===
=== Frank (pid 502682 -> 502694) restarted ===
Sally throwing Interrupt at 502686
=== George (pid 502686): Signaled: Interrupt ===
Fred throwing Terminated at 502684
=== Alice (pid 502684): Signaled: Terminated ===
Sally throwing Terminated at 502693
=== Fred (pid 502693): Signaled: Terminated ===
Frank throwing Terminated at 502679
=== Sally (pid 502679): Signaled: Terminated ===
=== Frank (pid 502694): Signaled: Floating point exception ===
=== 20 processes started. ===
=== 20 processes ended. ===
Specific Requirements
Your program must do the following:
- Choose five names, whatever you like. The program tries to keep
a process running under each of the names.
- Accept a number of loonies to create from the command line. If no
number is provided, default to 50. If a number less than five is given,
round up to five.
- Create five processes running the given loony program, one
for each name. When you start each of them, send it its name as its
command-line parameter.
- As each is created, print its name and process id.
- Enter a loop which contains a wait call, probably
near the top of the body. This will suspend
the caller until some child process terminates. When it returns
(when a child process ends), it reports
which process died, and what caused termination. You should
collect and print this information. The process can exit normally (code
zero), exit abnormally (non-zero code), or be killed (signaled) by any
of several signals.
Say which happened, and print the exit code number or the name of
the terminating signal.
- Figure out the name of the process which exited, and start a new one
running under the same name. Report that you have restarted the process for
this name, and give both the old and new process ids.
- Any of the various process management calls (fork,
exec or wait) may fail. If so, you should print an
appropriate message and exit.
The die
function in loony.cpp prints such a message, asking the system for
a description of the error. (You can't just call it from your program,
but you can copy it if you like.) The exec reports failure just
by returning; the others return negative values for failure.
- Your main loop must run until it has created the
specified number of loonies. The first five are included in this
count. After creating the needed number, your program must continue to run
until all have completed. Make sure you don't
accidentally start more loonies while you are waiting for the last
five to finish. Make sure to check before restarting that you
haven't already started enough.
- When all children have completed, your program should exit normally.
How To Do All This Stuff
Write in C++. You may write in plain C if you like, but there's not
much advantage, unless you just happen to be much better at plain C.
You will need some sort of list to keep the id of the current process
running under each name. I used a C++ map, pid to name, like
std::map<pid_t,std::string> running;
But anything you like is fine. Since the size is fixed at five, no
great search efficiency is required.
When you start your initial five, fill the list. When a process
terminates, use the list to look up its name, perform the restart,
and replace the old record with the up-to-date one.
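The bookkeeping described above might look something like the following. This is only a sketch, assuming the pid-to-name map shown earlier; the helper name replace_record is made up for illustration and is not part of the handout.

```cpp
#include <map>
#include <string>
#include <sys/types.h>   // pid_t

// Hypothetical helper: swap the record of a dead child for the record of
// its replacement, and return the name the two share.
std::string replace_record(std::map<pid_t, std::string>& running,
                           pid_t old_pid, pid_t new_pid)
{
    std::string name = running.at(old_pid);  // throws if old_pid is unknown
    running.erase(old_pid);                  // the dead process is gone
    running[new_pid] = name;                 // same name, new process id
    return name;
}
```

In a real loonybin you would look up the name first, start the replacement under that name, and only then update the map with the new process id.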
Create processes using Unix fork and exec.
You will want to look at our
runner.cpp, and you can find many others online.
There are several variations of exec, with different signatures.
You will probably want to use the same execl form as runner.cpp.
Your call should be like the first one, running the command
"loony" with the parameter of the name it is running under.
The loony.cpp program is given for download;
see more on that below.
Your code should differ from
runner.cpp by
checking for the error return from
fork.
Store
fork's return in a variable, then test it. For -1, it failed,
and you should complain and exit. For 0 you run exec, and for positive,
you continue as the parent. Be careful:
- Don't manage to call fork twice. You won't like it.
- If exec fails, print a message and exit with a non-zero code.
If your program works otherwise, it will eventually report this exit as
a terminated child and try to restart it.
After creating your initial five children, your program basically enters
a loop which contains a wait. Whenever wait returns,
you should first collect and print the cause of the termination.
The wait call returns to
you the process id of the process which ended. The runner.cpp
ignores this, but you will need to capture it in a variable. The example
also shows how wait returns status information through its
parameter. Read the manual to see how to extract
the exit code or signal number, which you need for reporting.
When a program crashes, the status will indicate signaled. The
signal number indicates the cause of the crash, and the status
returned from wait will tell you the number. Then use the
strsignal utility to find the printable
name that goes with the code number.
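For what it's worth, decoding the status might go roughly as below. This is a sketch rather than the required form: the function name describe_status is invented here, and the wording of the messages simply imitates the sample output above. The macros WIFEXITED, WEXITSTATUS, WIFSIGNALED and WTERMSIG come from the wait manual page.

```cpp
#include <string>
#include <string.h>     // strsignal
#include <sys/wait.h>   // WIFEXITED, WEXITSTATUS, WIFSIGNALED, WTERMSIG

// Hypothetical helper: turn the status filled in by wait into a short
// description of how the child ended.
std::string describe_status(int status)
{
    if (WIFSIGNALED(status))       // killed by a signal (crashes land here too)
        return std::string("Signaled: ") + strsignal(WTERMSIG(status));
    if (WIFEXITED(status)) {       // the child called exit or returned from main
        int code = WEXITSTATUS(status);
        if (code == 0)
            return "Normal exit";
        return "Error exit code " + std::to_string(code);
    }
    return "Unknown status";       // should not occur with a plain wait
}
```

The text strsignal gives for a particular signal can differ a little from one system to another, so don't expect your output to match mine letter for letter.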
Your loop should have some counters to control it. The simplest
thing is probably to have a count of child processes created, and another
of child processes exited. Then keep looping while the number
finished is below the limit, but stop restarting once the number
created has reached the limit.
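The counting scheme can be tried out without any loonies at all. The sketch below uses stand-in children that exit immediately; the names start_stub_child and run_counted are made up for illustration, and it assumes the limit is at least as large as the initial count. The real loop would also report each death and restart.

```cpp
#include <cstdio>       // perror
#include <cstdlib>      // exit
#include <sys/types.h>  // pid_t
#include <sys/wait.h>   // wait
#include <unistd.h>     // fork, _exit

// Stand-in for starting a loony: the child simply exits at once.
static pid_t start_stub_child()
{
    pid_t pid = fork();
    if (pid < 0) { perror("fork"); exit(1); }
    if (pid == 0) _exit(0);        // child: nothing to do
    return pid;
}

// Start `initial` children, keep replacing them until `limit` have been
// created in total, then wait out the rest. Returns the number reaped.
int run_counted(int limit, int initial)
{
    int created = 0, exited = 0;
    for (; created < initial; ++created)
        start_stub_child();

    while (exited < limit) {       // loop until everyone has gone home
        int status;
        if (wait(&status) < 0) { perror("wait"); exit(1); }
        ++exited;                  // a real loonybin reports the death here
        if (created < limit) {     // only restart while under the limit
            start_stub_child();
            ++created;
        }
    }
    return exited;
}
```

With a limit of 8 and 5 initial children, this starts three replacements and then reaps the last five without starting any more.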
General Advice
You should be prepared for failure of any of
fork,
exec or
wait, and generally print a message and exit.
(As discussed above, for
exec this won't actually
make your program exit.)
Fork generally will not fail, unless the
system is stressed. You probably won't see that happen unless you
manage to create an infinite loop containing a
fork.
The usual reason for
wait to fail is that there are no
child processes to wait for. This may just mean that you flubbed your
loop counters and didn't exit even though everyone went home.
This could also happen if you somehow manage to
fork your main
program without a proper exec and somehow reach a wait.
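A small reporting helper keeps the failure checks short. The sketch below is not the die function from loony.cpp; the names error_text and fail are hypothetical, and it simply asks the system for the description of errno using strerror.

```cpp
#include <cerrno>      // errno
#include <cstdlib>     // exit
#include <iostream>
#include <string>
#include <string.h>    // strerror

// Build "what: system description" for a given errno value.
std::string error_text(const std::string& what, int err)
{
    return what + ": " + strerror(err);
}

// Hypothetical stand-in for a die-style helper: report the most recent
// system error and quit with a non-zero code.
[[noreturn]] void fail(const std::string& what)
{
    int err = errno;   // save it before any other call can change it
    std::cerr << error_text(what, err) << std::endl;
    exit(1);
}
```

You would call something like fail("wait") immediately after the failing call, before anything else has a chance to overwrite errno.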
You will be a lot happier if one of the first things you do is
create a function which creates one child loony for you. Have it
take the name as a parameter, run fork, run the exec
in the child and return the child's process id in the parent.
Then you can call this whenever you need to run a new child process.
You will probably want to write this program incrementally, as a
series of partial versions. You might start by writing a program
to create one
loony and simply wait for its demise. Then, modify this program to
correctly report the status of the loony's demise.
A third version might just start the five loonies and report each
demise. A fourth version might try to replace the processes
when they end.
If you manage to run loony at the wrong place in the process
tree, it will try to kill whatever siblings it finds, which could be
almost anything. Unless you are (foolishly) running as the administrator,
this will only involve destruction of your own processes, since your
loony does not have permission to terminate others. This is unlikely to
create any damage that can't be fixed by logging out and back in.
Loony also tries to figure out if it's been run directly from the command
shell, in which case it just crashes immediately, and doesn't try to
take anyone else along.
You may be unfamiliar with the process of getting a number from the
command line, as required and shown in the example. It's not
too hard. To wit:
#include <cstdlib>
...
int main(int argc, char **argv)
{
    int num_kids = 50;                        // default when no number is given
    if (argc > 1) num_kids = atoi(argv[1]);   // argv[1] is the first argument
    if (num_kids < 5) num_kids = 5;           // round up to five
...
Now, that wasn't so hard, was it?
Building Loony
The assignment is to write the manager program,
loonybin, which
starts and manages the child processes. Each child program runs
loony.cpp, which you are given. Download
loonypack.zip and unpack it.
Now, a wrinkle:
The loony is programmed to throw things
at its siblings, so it must figure out which processes those are.
Unfortunately,
this depends on your Unix flavor. Use find_siblings_proc.cpp
on Linux, and use find_siblings_kvm.cpp on BSD or Mac. They do
pretty much the same thing, but in very different ways.
If you want to create a project in an IDE, the simplest thing is
probably just to rename either find_siblings_proc.cpp or
find_siblings_kvm.cpp to find_siblings.cpp.
Then add to your
project loony.cpp, find_siblings.h and
find_siblings.cpp.
You should be able to build loony. The KVM version requires
a library flag, -lkvm. You may need to configure that somewhere
in the bowels of your IDE configuration to make that one work.
You will probably want to then create your loonybin program in
the same project and directory.
If you want to just use make, move the unpacked files to a
reasonable working directory, and say
make loony_proc
or
make loony_kvm
as appropriate,
and it will create a loony of the correct type. Do
not
run
make loony. It won't know what to do.
You can then create
loonybin.cpp in the same directory, and
just
make loonybin to build it.
There is also a find_siblings_test.cpp program there, if you want
to test the function on your system.
Submission
When your program works, and is properly commented and indented,
submit it over the web using
this
form.