
Using The Cluster

  Parallel Programming

There is a small cluster located behind the Perlie server. The hostnames are cities. At the moment, five nodes are working: atlanta, chicago, nashville, seattle, and stlouis. The /home partition is shared, so files you save on one node are saved on all of them. (The disk actually resides on chicago.)

Setting Up

You should do the following to set up your account.

  1. Log on to sandbox.

  2. Log on to one of the cluster machines.

  3. Create a password-less SSH credential for use within the cluster. (If you prefer to use a passphrase, you may, but you will then need to run an SSH agent whenever you want to do anything.) Use the ssh-keygen program and simply press return when it asks for a passphrase:
      test@stlouis:~$ ssh-keygen -t dsa
    Generating public/private dsa key pair.
    Enter file in which to save the key (/home/test/.ssh/id_dsa): 
    Enter passphrase (empty for no passphrase): 
    Enter same passphrase again: 
    Your identification has been saved in /home/test/.ssh/id_dsa.
    Your public key has been saved in /home/test/.ssh/id_dsa.pub.
    The key fingerprint is:
    61:58:af:01:f6:29:80:01:f2:5b:10:4d:51:51:c1:07 test@stlouis
    test@stlouis:~$

  4. Now, copy the public key you just created as an authorized key.
      test@stlouis:~$ cd .ssh
    test@stlouis:~/.ssh$ ls
    id_dsa  id_dsa.pub  known_hosts
    test@stlouis:~/.ssh$ cp id_dsa.pub authorized_keys
    test@stlouis:~/.ssh$ ls
    authorized_keys  id_dsa  id_dsa.pub  known_hosts
    test@stlouis:~/.ssh$ 

  5. Since the home areas are shared, this authorized_keys file immediately exists on all hosts. Therefore, you will be able to run commands elsewhere on the cluster without a password.
      test@stlouis:~$ ssh seattle hostname
    seattle
    test@stlouis:~$

    In fact, there is a simple script, allhosts, that lets you run the same command on all of the hosts:
      test@stlouis:~$ allhosts hostname
    atlanta
    chicago
    nashville
    seattle
    stlouis
    test@stlouis:~$

Once this is set up, you can log into any of the cluster machines and run code on any of them. Presently, I haven't set up a reasonable way of unifying passwords, so your password on each node is set separately. What you probably want to do is set up SSH credentials on Perlie that will let you connect to a node. You can create an id_dsa.pub on perlie as above (only you should probably use a passphrase), then append a copy to the authorized_keys file on one of the cluster nodes. This will give you access to any node from perlie with the passphrase you created.

To run a parallel program:

  1. Log onto your favorite node.

  2. Create your program, say fred.c, with your favorite editor.
      #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
            int nproc, myid;
            MPI_Init(&argc,&argv);
            MPI_Comm_size(MPI_COMM_WORLD,&nproc);
            MPI_Comm_rank(MPI_COMM_WORLD,&myid);
            printf("Process %d of %d.\n", myid, nproc);
            MPI_Finalize();
            return 0;
    }

  3. Compile like so: mpicc -o fred fred.c. There's also mpicxx for C++ code.

  4. Create a file .mpd.conf in your home directory which contains just the line secretword=whatever. You get to choose the secret word.

  5. Make sure no one else can read your secret word: chmod 0600 .mpd.conf

  6. Create a file mpd.hosts which is a list of the machines to use, one per line. Like this:
      atlanta
    chicago
    nashville
    seattle
    stlouis

    This needs to be in the directory where you are working.

  7. Now start the manager: mpdboot -n 5. The 5 is the number of hosts to use; it cannot be more than the number of hosts listed in mpd.hosts, but it can be less. You can use a different file for your host list with the -f switch.

  8. Now you can run your job:
      test@stlouis:~$ mpiexec -n 8 fred
    Process 0 of 8.
    Process 2 of 8.
    Process 1 of 8.
    Process 3 of 8.
    Process 4 of 8.
    Process 5 of 8.
    Process 6 of 8.
    Process 7 of 8.
    test@stlouis:~$ 

    The 8 is the number of tasks, which may be more or fewer than the number of hosts given to mpdboot.

  9. When you are done, stop the managers with mpdallexit.