Note: if you need additional or different load libraries (-l option), look at the LDFLAGS setting at the top of Makefile. You can add any needed -l options here.
This project is a small threading exercise. We'll start with a simple program to guess a password by exhaustive enumeration, and convert it to use threads. This can make the search faster when multiple cores are available. We'll be using C++ 2011 standard threading. It's a nice interface which retains the pattern of the low-level operating system interface, but makes good use of C++ templates and the type system to make it much cleaner than either pthreads or the native win32 interface. We'll use the Unix-style crypt function to encrypt passwords. It will be available on Unix and Linux, and probably on Macs (but see below).
The starting download contains two complete programs: passfile maintains a file of account and encrypted password pairs, and scan1 is a non-threaded password guesser that reads the same format. The file format is simply a series of lines, each containing an account and an encrypted password separated by a colon, the same as the first two fields of a Unix passwd or shadow file. Each line has the form account:encrypted-password.
The starting distro will build on Linux and (probably) Mac. Windows folks will have to bear with some Unix for this one. You may use Sandbox remotely, fire up WSL, use a VM, install on a flash drive, or any of several other things. It should also be possible to build libxcrypt or some other crypt on Windows, though I have not tried.
The above-mentioned executables, passfile and scan1, should build. The download includes two password files, misc.pwd and pins.pwd, and files that can rebuild them, misc.txt and pins.txt. The .pwd files will probably work for you on Linux. Check like this:
The existing main function contains a loop (near the bottom) that gets all of the possible password guesses, each of which it sends to the scan function to check if it matches any existing password. This loop uses an enumerator object, part of the download, which generates all possible passwords in the specified character set and range. The code is there for your interest, but for this assignment we can just use it.
First change: move this loop into a separate function, (this will become your thread function). I'll call it scandrive for this discussion, but you can call it what you like. You'll need to send in the password list and the enumerator object. Send the password list by const reference just as it is sent to scan. You might pass the enumerator in the same way, or you could use a global.
Since plain C++ thread functions can't hand back a return value, make your scandrive return void. Scandrive should now contain the loop which runs through all the passwords provided by the enumerator and tests each. You also need to return a boolean to the caller that tells if any password was found, so it knows whether to print the final No passwords matched. message. This is a pain since the function needs to be void, so you will need to resort to an additional reference parameter or a global to return this value to main. Replace the loop in main with a call to scandrive, and have it collect the success boolean. After all this work, you should have a program that does just the same thing as the one you started with :-). Test it and see that it is.
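Here is a rough sketch of the shape scandrive might take. The DigitEnumerator and scan shown here are hypothetical stand-ins invented for this illustration; the real enumerator and scan in the download have their own interfaces, so adapt the parameter types and the loop condition to match. The enumerator is taken by non-const reference in this sketch because asking for the next guess changes it.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the download's enumerator: produces "0" through "9".
struct DigitEnumerator {
    int current = 0;
    // Stores the next guess in out; returns false when there are no more.
    bool next(std::string& out) {
        if (current > 9) return false;
        out = std::to_string(current++);
        return true;
    }
};

// Hypothetical stand-in for scan: true if the guess matches any stored entry.
bool scan(const std::vector<std::string>& passwords, const std::string& guess) {
    for (const auto& p : passwords)
        if (p == guess) return true;
    return false;
}

// Returns void so it can later be used as a thread function.
// The result comes back through the found reference parameter.
void scandrive(const std::vector<std::string>& passwords,
               DigitEnumerator& guesses, bool& found) {
    std::string guess;
    while (guesses.next(guess)) {
        if (scan(passwords, guess)) found = true;
    }
}
```

In main, you would declare a bool initialized to false, pass it to scandrive, and check it afterward to decide whether to print the final message.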
Next, replace the simple call of scandrive in the main with a single thread execution. Here is an example of calling a function as a thread. Start by including the header thread. Change the scandrive call to be a thread creation, then immediately run join to wait for the thread to finish. Now, for some technical reasons which I'll attempt to explain if you foolishly ask, you can't directly send a non-const reference to a thread function. The simplest thing is to use a utility provided for this: replace any parameter x sent by non-const reference with std::ref(x). (Alternatively, you can use the & operator and send pointers to the objects.) If you don't take care of any reference parameters this way, you will get one of the most opaque error messages the C++ compiler is able to create. And that's saying something.
Compile this version and verify that it works. You will still have a program that does exactly the same thing, now with a bit more overhead and no speedup, because you're only running one thread.
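A minimal sketch of the create-then-join pattern with a reference parameter follows. The worker count_to and the wrapper run_in_thread are made-up names for illustration, not part of the download; the point is only the std::ref wrapper around the argument that the worker receives by non-const reference.

```cpp
#include <functional>  // std::ref
#include <thread>

// Hypothetical worker: sums 1..limit and reports through a reference.
void count_to(int limit, int& result) {
    int total = 0;
    for (int i = 1; i <= limit; ++i) total += i;
    result = total;
}

// Runs the worker as a single thread and waits for it.
int run_in_thread(int limit) {
    int result = 0;
    // Without std::ref, the thread constructor would try to copy result,
    // and the call would not compile against the int& parameter.
    std::thread worker(count_to, limit, std::ref(result));
    worker.join();  // wait for the thread before reading result
    return result;
}
```

Reading result is only safe after join returns, since before that the thread may still be writing to it.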
We want to change the program to use multiple threads in order to search faster. Before actually doing that, change the parameter passing logic at the top of main to collect a number of threads after the file name. The first if in main checks that there are five words on the command line. Change that to six, since we're going to add one. Look just below the if where the file name and character set are collected into string variables, and between those collect an integer into a new variable called nthread. The min and max are collected just below, so that's how it's done. Also change the Usage help message (inside the if block) to indicate that the number of threads should appear after the file name. Your solution will now accept the command arguments shown in the threaded execution above, taking a thread count after the file name, which is then ignored. So you still haven't sped anything up.
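As a sketch of the argument handling described above, assuming the order program name, file name, thread count, character set, min, max. The Options struct and parse_args function are hypothetical names for this illustration, and std::atoi is an assumption; the starting code may convert min and max some other way, in which case follow that.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical bundle of the command-line settings.
struct Options {
    std::string file;
    int nthread = 1;
    std::string charset;
    int min = 0;
    int max = 0;
};

// Returns false when the word count is wrong or the thread count is not positive,
// which is when the caller should print the Usage message.
bool parse_args(int argc, const char* const argv[], Options& opts) {
    if (argc != 6) return false;        // was five before the thread count was added
    opts.file = argv[1];
    opts.nthread = std::atoi(argv[2]);  // new: number of threads, after the file name
    opts.charset = argv[3];
    opts.min = std::atoi(argv[4]);
    opts.max = std::atoi(argv[5]);
    return opts.nthread > 0;
}
```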
Since we will be running multiple copies of scandrive at once, we will need to synchronize the data it will share. That would be the enumerator object and (probably) the success flag. The enumerator object will need to be locked when its next method is called. The example linked above shows how to create a mutex object and use lock and unlock to place the data operation in a critical section. There are several ways to apply that here. The example declares the mutex as a global. You can do that, but if you send the enumerator by reference, it is probably not a good idea to let the mutex be global. The two are closely related, and should stay together. You can do both global, or declare the mutex in the main and send both it and the enumerator by reference, but the cleanest way is probably to add it to the enumerator class, so that its next method acquires the lock before getting the next guess, then releases it before returning. (Be careful not to let a return statement miss the unlock.) Alternatively, you could add a new method which calls the existing next under lock and returns the result.
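One way to keep the mutex together with the data it protects is sketched below. LockedCounter is a hypothetical stand-in for the enumerator, not the class from the download. Note that this sketch uses std::lock_guard instead of explicit lock and unlock calls: the guard releases the mutex automatically on every return path, which takes care of the "return statement misses the unlock" problem.

```cpp
#include <mutex>
#include <string>

// Hypothetical stand-in for the enumerator, with its own mutex as a member.
class LockedCounter {
public:
    explicit LockedCounter(int limit) : limit_(limit) {}

    // Stores the next guess in out; returns false when exhausted.
    // Safe to call from several threads at once.
    bool next(std::string& out) {
        std::lock_guard<std::mutex> hold(lock_);  // locked from here to the end of next
        if (current_ >= limit_) return false;     // the guard unlocks on this return too
        out = std::to_string(current_++);
        return true;
    }

private:
    std::mutex lock_;
    int current_ = 0;
    int limit_;
};
```

A class holding a std::mutex can't be copied, which is one more reason to pass the enumerator to threads by reference.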
If your success flag is shared, it must be synchronized as well. You can use an additional mutex, or check out std::atomic. If you have a global flag, it will be shared; likewise if you have a single flag in the main which you send by reference. Sharing can be avoided, but is probably not worth the trouble.
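For the std::atomic route, a sketch along these lines may help. The functions mark_if_match and either_matches are hypothetical and exist only to show two threads writing the same flag without a mutex.

```cpp
#include <atomic>
#include <functional>  // std::ref
#include <thread>

// Hypothetical worker: sets the shared flag if its candidate hits the target.
void mark_if_match(int candidate, int target, std::atomic<bool>& found) {
    if (candidate == target) found.store(true);
}

// Two threads share one atomic flag; no extra mutex is needed for it.
bool either_matches(int a, int b, int target) {
    std::atomic<bool> found(false);
    std::thread first(mark_if_match, a, target, std::ref(found));
    std::thread second(mark_if_match, b, target, std::ref(found));
    first.join();
    second.join();
    return found.load();
}
```

Since threads only ever set the flag to true and never back to false, it doesn't matter which thread writes first.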
After adding the synchronization, you might make sure your program still compiles and runs, though it still won't be any faster since you still haven't created any extra threads.
Now, time for that. In place of the single thread creation in main, make a loop, and store the threads in some container. Here is a page containing an example (the second code block), which creates 20 simple threads and stores them in a vector. (The & before the name of the function in the thread call is not required, and doesn't change the meaning.) Pay attention and create the number of threads specified in the nthread parameter, not the 20 from the example. Note also the second loop, which iterates through the vector of threads, and performs a join on each one. (The ampersand there has a different meaning, and is needed. Ah, C++!) Make sure you do not write a loop that creates a thread then waits within each iteration. That's not creating any parallelism. Now you should have a threaded program.
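The two-loop pattern might look roughly like this. The worker drain and the wrapper sum_with_threads are invented for this sketch; in the assignment the worker would be your scandrive, and the shared work source would be the enumerator instead of an atomic counter.

```cpp
#include <atomic>
#include <functional>  // std::ref
#include <thread>
#include <vector>

// Hypothetical worker: claims numbers from a shared counter until the limit,
// adding each claimed number to a shared sum.
void drain(std::atomic<int>& next, int limit, std::atomic<long>& sum) {
    for (int i = next.fetch_add(1); i < limit; i = next.fetch_add(1))
        sum.fetch_add(i);
}

long sum_with_threads(int nthread, int limit) {
    std::atomic<int> next(0);
    std::atomic<long> sum(0);
    std::vector<std::thread> workers;

    // First loop: start every thread. No join in here.
    for (int t = 0; t < nthread; ++t)
        workers.emplace_back(drain, std::ref(next), limit, std::ref(sum));

    // Second loop: wait for all of them. The & is needed, since threads can't be copied.
    for (auto& w : workers) w.join();

    return sum.load();
}
```

Whatever the thread count, each number is claimed exactly once, so the result matches the single-threaded answer.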
If you want to avoid sharing the found flag, you must send a separate one to each thread. This involves creating a vector or array of booleans so you can send one to each thread. (And don't extend such a vector in the body of the thread starting loop, since that may invalidate your references.) Then, after the join, the main must loop through to see if any password was found. This loop must be done after all the joins complete, so only the main is using the values.
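A sketch of the one-flag-per-thread approach is below, with hypothetical names check_slice and any_found. One wrinkle worth knowing: std::vector<bool> is a packed specialization whose elements can't be bound to a bool&, so this sketch uses a std::vector<char> instead. A plain array of bool would also work.

```cpp
#include <functional>  // std::ref
#include <thread>
#include <vector>

// Hypothetical worker: each thread writes only to its own flag.
void check_slice(int id, int target, char& found) {
    if (id == target) found = 1;
}

bool any_found(int nthread, int target) {
    // Sized once, before any thread starts, so the element references stay valid.
    std::vector<char> flags(nthread, 0);
    std::vector<std::thread> workers;

    for (int t = 0; t < nthread; ++t)
        workers.emplace_back(check_slice, t, target, std::ref(flags[t]));

    for (auto& w : workers) w.join();

    // Only main touches the flags from here on, so no locking is needed.
    for (char f : flags)
        if (f) return true;
    return false;
}
```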
When you get your program working, you should be able to see good speedup with multiple threads. Use the time command to find out. Speedup will depend on the number of cores on the underlying machine. You probably will get a good return on each additional thread, up to the number of cores, and very little after. (VMs may give screwy results, since it will depend on how they map VM threads to hardware cores.)
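If you're curious how many hardware threads your machine reports, the standard library can tell you. This is only a side sketch and not required for the assignment; suggested_threads is a hypothetical helper name. The standard allows hardware_concurrency to return 0 when the value can't be determined, so the sketch falls back to 1 in that case.

```cpp
#include <thread>

// Returns the number of hardware threads the library reports, or 1 if unknown.
unsigned suggested_threads() {
    unsigned n = std::thread::hardware_concurrency();
    return n == 0 ? 1 : n;
}
```

Trying thread counts from 1 up to a bit past this number with the time command should show where the speedup levels off.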
Here's what I think I know about running on a Mac. This is just from Googling, since I don't have one.
The original Unix crypt is quite obsolete because of the improvement in hardware over the years. Macs are built on a Unix base, and Apple apparently keeps the original crypt function around for compliance with the Posix standard, which requires one to exist. But they don't use it in their software, since it is obsolete. (Posix just requires the function to exist in the API, but doesn't say much about how it works. Apparently, one that always returns an error and does nothing else would be compliant.)
Linux systems generally use an updated crypt, which is actually used to store system credentials. The specific choice would be distro-dependent, but the usual one seems to be something called libxcrypt. For Mac, that library seems available here. I wouldn't be able to tell you how to install it, or how to link your code to it.
But, you should be able to build on a Mac just using the standard crypt. This should, in fact, let your project find keys much faster, since the encryption algorithm is now way too easy. That is sufficient for this assignment, since the threading, and resulting speedup, is of most interest.