CSc 422 Assignment 2

The Prime Thread

Assigned: Oct 28
Due: Nov 12
80 pts

This assignment is about the old academic pastime of generating primes. If we were really serious about doing this quickly, we'd get a better algorithm, but this is a threading exercise, so we're going to try to use threads to be inefficient faster.

Here are two working C++ programs. The first one, simple.cpp, finds prime numbers by the straightforward and inefficient method of trying numbers in sequence and testing each one for divisors by simply trying all the possibilities.
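To make the method concrete, here is a minimal sketch of trial division in that spirit. It is not the text of simple.cpp; the function names is_prime_naive and first_primes are invented for illustration.

```cpp
#include <vector>

// Hypothetical helper: true if n has no divisor other than 1 and itself.
// Deliberately naive: tries every possible divisor from 2 up to n - 1.
bool is_prime_naive(int n) {
    if (n < 2) return false;
    for (int d = 2; d < n; ++d) {
        if (n % d == 0) return false;
    }
    return true;
}

// Hypothetical helper: collect the first `count` primes by testing
// 2, 3, 4, ... in sequence.
std::vector<int> first_primes(int count) {
    std::vector<int> primes;
    for (int candidate = 2; static_cast<int>(primes.size()) < count; ++candidate) {
        if (is_prime_naive(candidate)) primes.push_back(candidate);
    }
    return primes;
}
```

Calling first_primes(5) should give 2, 3, 5, 7, 11. The inner loop in is_prime_naive is the "elimination" work that the threaded versions hand to other threads.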

The second one, thr2simp.cpp, farms out the elimination operation to a second thread in a perfectly useless way. The main thread creates an eliminator thread that finds the divisors. The main thread sends it candidates and receives results back using this C++ bounded-buffer implementation. This bit of threading is useless, since nothing happens in parallel.
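If you have not met a bounded buffer before, the general idea can be sketched as below. This is not the supplied boundbuf.h, and its interface may well differ; the class name SketchBuffer and the methods put and get are assumptions made for this illustration only.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical bounded buffer: put() blocks while the buffer is full,
// get() blocks while it is empty. Items come out in FIFO order.
template <typename T>
class SketchBuffer {
public:
    explicit SketchBuffer(std::size_t capacity) : capacity_(capacity) {}

    void put(T item) {
        std::unique_lock<std::mutex> lock(m_);
        not_full_.wait(lock, [this] { return items_.size() < capacity_; });
        items_.push(std::move(item));
        not_empty_.notify_one();   // wake one waiting consumer, if any
    }

    T get() {
        std::unique_lock<std::mutex> lock(m_);
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        T item = std::move(items_.front());
        items_.pop();
        not_full_.notify_one();    // wake one waiting producer, if any
        return item;
    }

private:
    std::size_t capacity_;
    std::queue<T> items_;
    std::mutex m_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
};
```

One thread calls put and another calls get; the buffer does the waiting for both. Read the real boundbuf.h for the interface you will actually use.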

The threaded program here is written using the standard threading facility in C++, which was introduced in the 2011 standard. You should be able to run it on any platform which supports a modern C++ compiler. But you may need a special compile command. On my Linux box, I need to compile with something like

g++ thr2simp.cpp -o thr2simp -lpthread
(With the boundbuf.h in the same directory.) The last option loads the pthreads library. Though we are not using the pthreads interface for this program, the C++ compiler uses it to implement the language threading features, so the library must be loaded to make the program run. If you have an older version of gcc, you may also need the option -std=c++11, or a later standard, but I think most installations will be up-to-date by now.

I had no trouble running thr2simp.cpp under CodeBlocks on Windows. You will want to create a project, delete the main.cpp that it “helpfully” adds to the new project, then add the thr2simp.cpp and boundbuf.h to it. You can then compile and run. (I did get a couple of warnings.) There is also a menu entry Project / Set programs' arguments..., which will allow you to provide command line arguments when the beast runs.

The assignment is to create a modified version of the threaded primes program in which the main thread creates multiple elimination worker threads which run, and do actual work, at the same time. If multiple CPUs are present, this arrangement can allow the program to finish more quickly. Your main thread will need to start multiple worker threads, and stop them after the printing is done. You will, of course, need several candidates active at once, perhaps by making sure the main thread always has several candidates outstanding.
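As a reminder of the mechanics of starting several threads and waiting for them, here is a small sketch. It is not a solution to the assignment: it only counts primes below a limit, uses no bounded buffers, and prints nothing. The names has_no_divisor and count_primes_below are invented for this example.

```cpp
#include <thread>
#include <vector>

// Hypothetical naive test: tries every divisor from 2 up to n - 1.
static bool has_no_divisor(int n) {
    if (n < 2) return false;
    for (int d = 2; d < n; ++d)
        if (n % d == 0) return false;
    return true;
}

// Hypothetical sketch: worker w tests candidates 2+w, 2+w+nworkers, ...
// below `limit`, so no candidate is tested twice. Each worker writes
// only its own slot of `counts`, so no locking is needed here.
int count_primes_below(int limit, int nworkers) {
    std::vector<int> counts(nworkers, 0);
    std::vector<std::thread> workers;
    for (int w = 0; w < nworkers; ++w) {
        workers.emplace_back([&counts, w, limit, nworkers] {
            for (int c = 2 + w; c < limit; c += nworkers)
                if (has_no_divisor(c)) ++counts[w];
        });
    }
    for (auto& t : workers) t.join();   // wait for every worker to finish
    int total = 0;
    for (int n : counts) total += n;
    return total;
}
```

Note that every std::thread must be joined (or detached) before it is destroyed, which is why the join loop comes before the function returns.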

Your program should optionally take two command line parameters. The first is the number of primes to generate, defaulting to 200, the same as the example programs. You should also take a second parameter which is the number of worker threads to create, defaulting to five. You will also need to choose the sizes of your bounded buffers. Some small multiple of the number of threads is usually reasonable.
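One way to handle the two optional parameters might look like the sketch below. The struct name Options, the function name parse_options, and the choice to fall back to the defaults on bad input are all assumptions of this example, not requirements.

```cpp
#include <cstdlib>

// Hypothetical option record holding the two command line parameters.
struct Options {
    int nprimes = 200;   // default number of primes, per the assignment
    int nworkers = 5;    // default number of worker threads, per the assignment
};

Options parse_options(int argc, char* argv[]) {
    Options opt;
    if (argc > 1) opt.nprimes = std::atoi(argv[1]);
    if (argc > 2) opt.nworkers = std::atoi(argv[2]);
    // Assumption: non-positive or unparsable values (atoi returns 0 for
    // text that is not a number) fall back to the defaults.
    if (opt.nprimes < 1) opt.nprimes = 200;
    if (opt.nworkers < 1) opt.nworkers = 5;
    return opt;
}
```

In main, you would call parse_options(argc, argv) first and size your bounded buffers from opt.nworkers.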

In thr2simp.cpp, the main thread feeds candidates to the worker. You can generalize this to several workers, being careful not to send the same candidate to more than one worker. Alternatively, you can let your workers generate their own candidates, but again, organize them so candidates are not repeated. The primes must be printed in order. Depending on how you distribute work, you may receive results out of order. In that case, you may use a data structure, such as a priority queue, to re-order them. (A simpler structure with poorer abstract performance will suffice if you're sure it never gets very large.) There are also reasonable organizations which deliver the results in order. However you organize it, make sure that all your threads keep busy. Avoid having them idle waiting for candidates. There are a number of solutions.
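The re-ordering idea can be sketched with std::priority_queue as below. This assumes that candidates are consecutive integers and that each one is reported exactly once as a (candidate, is_prime) pair; your design may differ. The names Result and Reorderer are invented for this example.

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical result record: the candidate tested and the verdict.
using Result = std::pair<int, bool>;   // (candidate, is_prime)

class Reorderer {
public:
    explicit Reorderer(int first_candidate) : next_(first_candidate) {}

    // Accept one result in any order. Returns the primes that can now be
    // released in increasing order (possibly none).
    std::vector<int> add(Result r) {
        pending_.push(r);
        std::vector<int> ready;
        while (!pending_.empty() && pending_.top().first == next_) {
            if (pending_.top().second) ready.push_back(next_);
            pending_.pop();
            ++next_;
        }
        return ready;
    }

private:
    int next_;   // smallest candidate whose result has not been released yet
    // Min-heap: the smallest pending candidate is always on top.
    std::priority_queue<Result, std::vector<Result>, std::greater<Result>> pending_;
};
```

For example, if results for 4 and 3 arrive before the result for 2, nothing is released until 2 arrives, and then 2 and 3 come out together. Only the main thread touches this structure, so it needs no locking of its own.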

On Linux, and probably the Mac command line, you can see how long a program runs by simply adding the word time in front of the command.

time ./thr2simp 5000
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89
...
48413 48437 48449 48463 48473 48479 48481 48487 48491 48497 48523 48527 48533 48539 48541 48563 48571 48589 48593 48611

real 0m0.597s
user 0m0.452s
sys 0m0.123s
The first figure is elapsed time. You will find that the simple threaded program given here takes longer than the non-threaded one because it just adds overhead. Your solution should be able to run faster than the simple one on a machine with multiple cores, but it may only do so for fairly large runs. The threaded solution will always have more overhead, which may dominate for small problems.
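If the time command is not available to you (for instance, when running inside an IDE), elapsed time can also be measured from within the program using the standard chrono library. The helper below is a sketch for that purpose; the name elapsed_seconds is invented, and nothing in the assignment requires it.

```cpp
#include <chrono>

// Hypothetical helper: run `work` once and return the elapsed
// wall-clock time in seconds.
template <typename F>
double elapsed_seconds(F work) {
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

You would pass it a lambda wrapping the whole prime-generating run. This measures elapsed time only, which corresponds to the "real" figure above, not to user or sys.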

The goal is to get the same output faster. Your speedup should be near the number of cores on your machine.
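To judge your speedup, it helps to know how many hardware threads your machine reports. The standard library can tell you, as in this sketch; the wrapper name core_count_or is invented here, and the fallback is needed because the standard allows the call to return 0 when the value cannot be determined.

```cpp
#include <thread>

// Hypothetical wrapper: number of hardware threads the implementation
// reports, or `fallback` when it reports 0 (meaning "unknown").
unsigned core_count_or(unsigned fallback) {
    unsigned n = std::thread::hardware_concurrency();
    return n != 0 ? n : fallback;
}
```

The figure counts hardware threads, which on some processors is larger than the number of physical cores, so treat it as a rough guide.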

Using this approach with a better method of finding primes is an interesting thing to think about, but I haven't spent much time at it.

Here is a short tutorial on the C++ 2011 thread interface. You can find much additional material online.

Submission

When your program works, and is properly commented and indented, submit it over the web using this form.