Paging
  1. Paging and Virtual Memory
    1. Paging is a technique for memory management.
    2. Requires significant hardware support.
    3. Eliminates the external fragmentation problem (some internal fragmentation remains within each process's last page).
    4. Most often used to implement virtual memory.
      1. User software sees more memory than is actually present.
      2. Usually this means that the total memory seen by all running programs exceeds the installed memory, not that each individual program sees more.
      3. Part of the memory contents actually resides on disk and is moved into real RAM on demand.
  2. A forerunner: Overlays.
    1. An early (1960s) solution allowing a program to be larger than main memory.
    2. Programmer divides the program into a main part and one or more overlays.
    3. The main part commands the O/S to move overlays into or out of memory.
    4. A lot of work.
    5. Which can suddenly be wasted when the boss buys a memory upgrade.
  3. Paging is a system of address translation.
    1. A program's address space is broken into equal-sized pages.
    2. The size of a page is some power of two.
    3. The physical memory is broken up into page frames (or real pages) of the same size.
    4. A program's pages are placed into frames in any order.
    5. Virtual addresses break into page number and offset.
    6. Translation replaces the page number with the correct frame number to produce the real address.
    7. For instance, if the program generates address A7, the page number (A) is replaced with the frame number where the page is located (5) to produce real address 57. That address is sent to the main memory unit. (A small C sketch follows.)
      Translation (diagram): each virtual page (0-F) is mapped to some page
      frame; in this example, page A resides in frame 5.

      virtual = A7 → A 7 → 5 7 → 57 = real
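      A minimal C sketch of this split-and-substitute translation for the toy
      scheme above (4-bit page number, 4-bit offset). The table contents are
      hypothetical, except that page A is placed in frame 5 to match the example:

        #include <stdint.h>
        #include <stdio.h>

        /* Toy translation: 8-bit virtual addresses with a 4-bit page number
         * and a 4-bit offset (16-byte pages).  Table contents are made up,
         * except that page 0xA sits in frame 0x5 as in the example above. */
        #define OFFSET_BITS 4
        #define OFFSET_MASK ((1u << OFFSET_BITS) - 1u)

        static const uint8_t frame_of_page[16] = {
            0x4, 0x1, 0x6, 0x2, 0x0, 0x9, 0x7, 0x3,
            0x8, 0xC, 0x5, 0xB, 0xA, 0xD, 0xE, 0xF
        };

        static uint8_t translate(uint8_t vaddr)
        {
            uint8_t page   = vaddr >> OFFSET_BITS;    /* high hex digit */
            uint8_t offset = vaddr & OFFSET_MASK;     /* low hex digit  */
            return (uint8_t)((frame_of_page[page] << OFFSET_BITS) | offset);
        }

        int main(void)
        {
            printf("virtual A7 -> real %02X\n", (unsigned)translate(0xA7));  /* 57 */
            return 0;
        }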
  4. Page tables.
    1. In-memory table similar to the translation listing above.
    2. Each item in the page table describes one page, and is known, shockingly, as a page table entry (PTE).
    3. Array indexed by the virtual page number.
    4. Contains a bit to say if the page is in memory, the location in memory, and other information.
      1. The present/absent bit indicates whether the page is currently in real memory.
      2. The permission bits indicate what may be done with the page, typically read, write or execute (instruction fetch).
      3. The referenced and modified bits record when the page in memory is used or modified. More on these later.
    5. When a virtual address is generated (the full procedure is sketched in C at the end of this section):
      1. Hardware divides it into page number and offset parts, and uses the page number as an index into the page table.
      2. This locates the PTE for the virtual page.
      3. Make sure the permission bits are satisfied by the operation, or generate a fault.
      4. If the present bit is false, generate a page fault (OS takes over).
      5. Hardware sets the referenced bit.
      6. If the memory reference is a store, set the modified bit.
      7. Combine the frame number with the offset to produce the real address and complete the operation.
    6. When the present bit generates a fault
      1. If paging is being used only for memory management, this is a fatal error.
      2. For virtual memory.
        1. The OS finds the page on disk.
        2. Brings it into memory (see next section).
        3. Restarts the program.
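    A sketch in C of the translation procedure above: permission check, present
    bit test, referenced/modified updates, and the final frame+offset combination.
    The 32-bit address, 4K page size, and the pte_t field layout are assumptions
    for illustration, not any particular machine's PTE format:

      #include <stdbool.h>
      #include <stdint.h>

      #define PAGE_SHIFT 12
      #define PAGE_MASK  ((1u << PAGE_SHIFT) - 1u)

      typedef struct {
          uint32_t frame      : 20;   /* frame number (valid only if present)  */
          uint32_t present    : 1;    /* page is in real memory                */
          uint32_t readable   : 1;    /* permission bits                       */
          uint32_t writable   : 1;
          uint32_t executable : 1;
          uint32_t referenced : 1;    /* set on any access                     */
          uint32_t modified   : 1;    /* set on stores                         */
      } pte_t;

      typedef enum { ACCESS_READ, ACCESS_WRITE, ACCESS_EXEC } access_t;

      /* Returns true and fills *real on success.  A false return means a fault:
       * a protection violation or a page fault for the O/S to handle. */
      static bool translate(pte_t *page_table, uint32_t vaddr, access_t op,
                            uint32_t *real)
      {
          uint32_t page   = vaddr >> PAGE_SHIFT;    /* index into the page table */
          uint32_t offset = vaddr & PAGE_MASK;
          pte_t   *pte    = &page_table[page];

          if ((op == ACCESS_READ  && !pte->readable)  ||
              (op == ACCESS_WRITE && !pte->writable)  ||
              (op == ACCESS_EXEC  && !pte->executable))
              return false;                     /* protection fault            */

          if (!pte->present)
              return false;                     /* page fault: O/S loads page  */

          pte->referenced = 1;                  /* done by hardware in reality */
          if (op == ACCESS_WRITE)
              pte->modified = 1;

          *real = ((uint32_t)pte->frame << PAGE_SHIFT) | offset;
          return true;
      }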
  5. Processes and page tables.
    1. Generally, each process has its own page table.
    2. A special register denotes the location of the current page table.
    3. Switching between processes involves changing the active page table (a sketch follows this section).
    4. The OS no longer needs to assign a process to a region. Each one has its own virtual address space starting at zero. No base register.
    5. No limit register, either, since a process cannot generate addresses within the space of another.
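    A sketch of the per-process arrangement: the process structure records where
    its page table lives, and a context switch simply activates it. The names
    below are invented for illustration; on real hardware the load is a
    privileged register write (CR3 on x86, for example):

      #include <stdint.h>

      typedef struct process {
          uintptr_t page_table_base;   /* location of this process's page table */
          /* ... saved registers, scheduling state, and so on ... */
      } process_t;

      static uintptr_t mmu_page_table_base;   /* stands in for the MMU register */

      static void load_page_table_base(uintptr_t base)
      {
          mmu_page_table_base = base;   /* real hardware: privileged register load */
      }

      /* Called during a context switch; no base or limit registers to set up. */
      static void switch_address_space(const process_t *next)
      {
          load_page_table_base(next->page_table_base);
      }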
  6. Translation and the Memory Management Unit (MMU).
    1. Sits between the CPU and memory.
    2. Performs the above translation.
    3. Efficiency
      1. The page table is in memory.
      2. Each translation therefore requires an extra memory reference.
      3. This essentially doubles the time of each memory reference.
    4. Translation Look-aside Buffer (TLB).
      1. An associative cache of PTEs.
      2. Located inside the MMU.
      3. Lookups.
        1. Query the TLB first, which can respond more quickly than main memory.
        2. If absent from the TLB, look in the page table and place the result into the TLB (a small simulation in C follows this section).
    5. TLB management.
      1. The page translation procedure will use the TLB.
      2. The CPU architecture may manage the TLB in hardware, or let the O/S do it.
      3. When the TLB is managed in hardware, address translation goes something like this:
        1. Hardware divides the address into a page number and offset parts, and performs a parallel hardware search to find the PTE in the TLB.
        2. If not found
          1. Hardware finds the PTE in the in-memory page table.
          2. If the page is absent, generate a page fault. The O/S updates the page table and restarts the instruction.
          3. Otherwise, add the PTE to the TLB. If an entry must be replaced, copy its contents back to the page table.
        3. Make sure the permission bits are satisfied by the operation, or generate a fault.
        4. Hardware sets the referenced bit, and possibly the modified bit, in the TLB copy of the PTE.
        5. Combine the frame number with the offset to produce the real address and complete the operation.
      4. When the TLB is managed in software, more like this:
        1. Hardware divides the address into a page number and offset parts, and performs a parallel hardware search to find the PTE in the TLB.
        2. If not found, generate a TLB miss fault and trap to the O/S:
          1. OS finds the PTE in the in-memory page table.
          2. If the page is not present, fetch it from disk and update the page table.
          3. Update the TLB and restart the instruction.
        3. Make sure the permission bits are satisfied by the operation, or generate a fault.
        4. Hardware sets the referenced bit, and possibly the modified bit, in the TLB copy of the PTE.
        5. Combine the frame number with the offset to produce the real address and complete the operation.
      5. A software-managed TLB simplifies the hardware, and leaves the page table layout up to the O/S designer.
      6. A hardware-managed TLB is faster, since it generates fewer faults and requires less work in software.
      7. Hardware-managed is generally the older approach.
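    A small software simulation of TLB lookup and refill, to make the hit/miss
    path concrete. The size (8 entries) and the round-robin replacement policy
    are arbitrary illustrative choices, not taken from any particular MMU:

      #include <stdbool.h>
      #include <stdint.h>

      #define TLB_ENTRIES 8

      typedef struct {
          bool     valid;
          uint32_t page;     /* virtual page number (the tag) */
          uint32_t frame;    /* page frame number             */
      } tlb_entry_t;

      static tlb_entry_t tlb[TLB_ENTRIES];
      static unsigned    next_victim;      /* round-robin replacement pointer */

      /* The MMU searches all entries in parallel; a loop stands in for that. */
      static bool tlb_lookup(uint32_t page, uint32_t *frame)
      {
          for (unsigned i = 0; i < TLB_ENTRIES; i++) {
              if (tlb[i].valid && tlb[i].page == page) {
                  *frame = tlb[i].frame;
                  return true;             /* TLB hit                          */
              }
          }
          return false;                    /* TLB miss: consult the page table */
      }

      /* After a miss is resolved from the page table, install the PTE,
       * evicting an older entry if the TLB is full. */
      static void tlb_insert(uint32_t page, uint32_t frame)
      {
          tlb[next_victim] = (tlb_entry_t){ .valid = true, .page = page,
                                            .frame = frame };
          next_victim = (next_victim + 1) % TLB_ENTRIES;
      }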
  7. Two-level page tables.
    1. Break the address into three parts.
    2. For 32-bit addresses, typically two levels: 10 + 10 + 12 bits. This is the layout of the 32-bit Pentium (index extraction is sketched in C after this section).
    3. Page tables beyond the first
      1. Might simply not need to exist, since we won't usually need all 2^32 = 4G of virtual space.
      2. Might be paged out. One fetch could cause multiple page faults.
    4. Some 32-bit Pentiums extend the physical address size to 36 bits (Physical Address Extension).
      1. Does not increase the number of virtual addresses available to any one program.
      2. Allows the system to spread more frames among all the programs running.
    5. x86-64 uses four levels.
      1. The sixteen high bits are not used for translation (64-bit is really 48-bit).
      2. Four groups of nine bits index the four levels of tables, each level pointing to the next.
      3. Twelve-bit page offset for 4K pages.
      4. Larger pages may also be used.
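    A sketch of the index extraction for both splits described above: 10 + 10 + 12
    for the 32-bit two-level case and 9 + 9 + 9 + 9 + 12 for the x86-64 four-level
    case:

      #include <stdint.h>

      /* 32-bit two-level split: page directory index, page table index, offset. */
      static void split32(uint32_t va, uint32_t *dir, uint32_t *table, uint32_t *off)
      {
          *dir   = (va >> 22) & 0x3FF;     /* bits 31..22 */
          *table = (va >> 12) & 0x3FF;     /* bits 21..12 */
          *off   =  va        & 0xFFF;     /* bits 11..0  */
      }

      /* x86-64 four-level split: four 9-bit indices and a 12-bit offset;
       * bits 63..48 are not used for translation. */
      static void split64(uint64_t va, uint32_t idx[4], uint32_t *off)
      {
          for (int level = 0; level < 4; level++)
              idx[level] = (uint32_t)(va >> (39 - 9 * level)) & 0x1FF;
          *off = (uint32_t)(va & 0xFFF);
      }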
  8. Inverted page tables.
    1. Instead of using the virtual page number as an index, hash it.
    2. The page table is organized as a hash table, with one entry per page frame rather than one per virtual page (a lookup sketch in C follows).
    3. Works best with a large, software-managed TLB.
    4. Often used in 64-bit systems, since the table size is proportional to physical memory rather than to the enormous virtual address space.
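    A sketch of an inverted page table lookup: one entry per physical frame,
    found by hashing the (process, page) pair and following a collision chain.
    The sizes and the hash function are arbitrary illustrative choices:

      #include <stdbool.h>
      #include <stdint.h>

      #define NFRAMES  1024
      #define NBUCKETS 1024

      typedef struct {
          bool     used;
          uint32_t pid;      /* owning process                       */
          uint64_t page;     /* virtual page number                  */
          int32_t  next;     /* next frame in this hash chain, or -1 */
      } ipt_entry_t;

      static ipt_entry_t ipt[NFRAMES];       /* indexed by frame number         */
      static int32_t     bucket[NBUCKETS];   /* head frame of each chain, or -1 */

      static void ipt_init(void)
      {
          for (unsigned b = 0; b < NBUCKETS; b++)
              bucket[b] = -1;                /* all chains start out empty      */
      }

      static unsigned hash(uint32_t pid, uint64_t page)
      {
          return (unsigned)((page * 2654435761u) ^ pid) % NBUCKETS;
      }

      /* Returns the frame holding (pid, page), or -1 if the page is not in
       * memory, i.e. a page fault for the O/S to resolve. */
      static int32_t ipt_lookup(uint32_t pid, uint64_t page)
      {
          for (int32_t f = bucket[hash(pid, page)]; f != -1; f = ipt[f].next)
              if (ipt[f].used && ipt[f].pid == pid && ipt[f].page == page)
                  return f;
          return -1;
      }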