On Question 11

The MFT practice sheet includes question 11, shown below. It is a little bit of programming languages, computer organization, and compilers all rolled into one, and I'm not sure any of our courses really covers all the pieces. So here's my guess about how to get the answer they give in the key.

  1. Following is a definition of widget and a declaration of an array A that contains 10 widgets. The sizes of a byte, short, int, and long are 1, 2, 4, and 8 respectively. Alignment is restricted so that an n-byte field must be located at an address divisible by n. The fields in a struct are not rearranged; padding is used to ensure alignment. All widgets in A must have the same size.

    struct widget
      short s
      byte b
      long l
      int i
    end widget
    widget A[10]

    Assuming that A is located at a memory address divisible by 8, what is the total size of A, in bytes?

Modern computer architectures impose alignment requirements on data. Put simply, any quantity must be stored at an address that is a multiple of its size. The main reason is to prevent any single store or fetch instruction from straddling two memory pages or cache lines, which simplifies the hardware implementation. In practice, this applies to any architecture except the Pentium and friends (the x86 family), since all the others of its age died long ago. But even Pentium programs run faster when alignment is obeyed. Note that, since all the sizes are powers of two, an address aligned for a larger size is also aligned for all the smaller ones.

When laying out a struct containing data of different sizes, each field must have correct alignment. That may forbid simply packing the fields tightly together in memory, since an item preceding, say, a four-byte integer might not end at an address divisible by four. When a struct or class is allocated, compilers do not try to determine padding for each instance individually depending on context; that would be way too much work. They assume the struct is stored at an address with maximal alignment (8 in this case) and determine padding within the struct from that assumption. Then all instances of the struct must be allocated at max-aligned addresses, and all have the same padding. No padding is needed before the first field. Padding is added after a field where needed to align the following one, and at the end to make the size of the whole struct a multiple of the largest field's size.

So we'll pad the struct like this: none in front of s, since it's already at an 8-aligned address. Then b needs no padding, since it can align on any address (divisible by 1). That takes us to an address three bytes past the start. Since l needs an address divisible by 8, five bytes of padding are needed to align it (at the starting address plus 8). Then i takes up four. No padding is needed between l and i, but we add four at the end to make the struct size a multiple of 8. So the total size of the struct is 2 (for s) + 1 (for b) + 5 (padding to reach an offset of 8) + 8 (for l) + 4 (for i) + 4 (padding to reach an offset of 24) = 24 bytes.
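The padding rule can also be run mechanically. Here is a minimal sketch in Python, assuming exactly the rule stated in the question (an n-byte field sits at an offset divisible by n, fields keep their declared order, and the struct size is rounded up to a multiple of the largest field). The function name `layout` and the list `widget` are my own, for illustration only; this is not how any particular compiler is written.

```python
def layout(fields):
    """Return (offsets, total_size) for a list of (name, size) pairs.

    Assumes the rule from the question: an n-byte field must start at an
    offset divisible by n, fields are not rearranged, and the struct itself
    starts at an address aligned for its largest field.
    """
    offsets = {}
    offset = 0
    for name, size in fields:
        # Pad up to the next multiple of this field's size (0 if aligned).
        offset += -offset % size
        offsets[name] = offset
        offset += size
    # Tail padding: round the total up to a multiple of the largest field.
    max_align = max(size for _, size in fields)
    offset += -offset % max_align
    return offsets, offset


# The widget from question 11: short, byte, long, int.
widget = [("s", 2), ("b", 1), ("l", 8), ("i", 4)]
offsets, size = layout(widget)
print(offsets, size)
```

For the widget this puts s at offset 0, b at 2, l at 8, and i at 16, for a total of 24 bytes, matching the hand count.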

Since the array starts at an 8-aligned address, and every struct is a multiple of 8 in size, all we need to do is string them together to make the array. Its size is 10 × 24 = 240 bytes.
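If you'd like to see what a real toolchain does, Python's ctypes module lays out a `Structure` by the native C compiler's rules, so it makes a quick check. The sketch below assumes a typical 64-bit platform where an 8-byte integer is 8-byte aligned; some 32-bit ABIs align it to 4, and there the numbers come out smaller. The class name `Widget` and the fixed-width types are my choice of stand-ins for the question's short, byte, long, and int.

```python
import ctypes


class Widget(ctypes.Structure):
    # Fields in the order given by the question; ctypes does not reorder them.
    _fields_ = [
        ("s", ctypes.c_int16),  # short, 2 bytes
        ("b", ctypes.c_int8),   # byte, 1 byte
        ("l", ctypes.c_int64),  # long, 8 bytes
        ("i", ctypes.c_int32),  # int, 4 bytes
    ]


# An array type holding 10 widgets, like A in the question.
WidgetArray = Widget * 10

for name, _ in Widget._fields_:
    print(name, "at offset", getattr(Widget, name).offset)
print("size of one widget:", ctypes.sizeof(Widget))
print("size of A:", ctypes.sizeof(WidgetArray))
```

On a platform that matches the question's assumptions, the reported sizes should agree with the 24 and 240 worked out above.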

The problem specifies that the fields may not be rearranged, which is also the usual rule for a compiler. But suppose the programmer moved l to the first position. How large would the struct be then? What about moving l to the end? What happens if you move b to the end?

Could the programmer add another short for “free” in terms of space? Where could it go to make that happen?