- Types of memory.
- Random-Access Memory (RAM).
- Volatile.
- Static RAM (SRAM)
- Transistor memory, much like registers.
- Retain contents so long as power is applied.
- Dynamic RAM (DRAM).
- Collection of small capacitors.
- Loses contents after a few ms. Motherboard refreshes to maintain
contents.
- Slower than SRAM.
- Denser and cheaper than SRAM.
- Comprises most of a PC's main memory.
- Read-Only Memory (ROM).
- Holds fixed contents and cannot be stored to by the CPU.
- Also random-access. (Unfortunate naming.)
- Non-volatile (of course).
- Primarily needed for booting.
- Types
- Basic ROM chips are manufactured with their contents, which are unchangeable.
- Programmable ROMs (PROMs) can be “burned” once with
appropriate equipment.
- Erasable-Programmable ROMs (EPROMs) can also be erased and
re-programmed.
- Electrically-Erasable PROMs (EEPROMs) can be erased by the computer that
contains them, under program control, though they don't respond to plain
store operations.
- Flash is a variety of EEPROM.
- Memory Hierarchy
- Speed up a large, slow, cheap (per-byte) storage technology by keeping
the most active contents in a small, fast, expensive memory called
a cache.
- Check in the cache.
- If present, produce the answer.
- If absent, fetch from the larger store.
- Save the value in the cache, perhaps replacing an older entry there.
- Produce the answer.
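The lookup steps above can be sketched in a few lines. This is a minimal illustration, not a model of real hardware: the class name `SimpleCache`, the dict standing in for the slow store, and the choice to evict the oldest entry are all assumptions made for the example.

```python
class SimpleCache:
    """Tiny cache in front of a slow backing store (a plain dict here)."""

    def __init__(self, backing_store, capacity):
        self.backing_store = backing_store  # the large, slow store
        self.capacity = capacity            # number of entries the cache holds
        self.entries = {}                   # address -> value, in insertion order
        self.hits = 0
        self.misses = 0

    def read(self, address):
        # Check in the cache; if present, produce the answer.
        if address in self.entries:
            self.hits += 1
            return self.entries[address]
        # If absent, fetch from the larger store.
        self.misses += 1
        value = self.backing_store[address]
        # Save the value, replacing the oldest entry if the cache is full
        # (an assumed policy, chosen only to keep the sketch short).
        if len(self.entries) >= self.capacity:
            oldest = next(iter(self.entries))
            del self.entries[oldest]
        self.entries[address] = value
        return value


store = {addr: addr * 10 for addr in range(100)}
cache = SimpleCache(store, capacity=4)
values = [cache.read(a) for a in [1, 2, 1, 3, 1, 2]]
```

The first reference to each address misses and every repeat hits, which is the behavior the hierarchy relies on.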
- Primary hierarchy
- Each level is a cache for the one on its right.
- Cache performance terms.
- Hit: The information was found in the cache.
- Miss: It wasn't.
- Hit rate: The proportion of references which are hits.
- Miss rate: The proportion of references which are misses.
- Hit time: The time required to access information at a given level.
- Miss penalty: The time required to process a miss, including the
overhead of adding it to the cache.
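These terms are commonly combined into an average access time: every reference pays the hit time, and the fraction that miss also pay the miss penalty. The notes do not state this formula, so treat the sketch below as a standard textbook combination; the function name and the timing numbers are illustrative assumptions.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average time per reference, in the same units as the inputs."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time + miss_rate * miss_penalty


# Hypothetical counts and timings, chosen only for illustration.
hits, misses = 950, 50
hit_rate = hits / (hits + misses)
miss_rate = 1.0 - hit_rate
amat = average_access_time(hit_time=1.0, miss_rate=miss_rate, miss_penalty=100.0)
```

With these made-up numbers a 5% miss rate makes the average access about six times slower than a hit, which shows why small changes in hit rate matter.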
- Locality of reference
- Temporal locality: Recently accessed locations tend to be
accessed again soon.
- Spatial locality: After a reference, nearby locations are
more likely to be next.
- Sequential locality: Spatial locality that results from
sequential instruction execution.
- Cache memory.
- Made of SRAM. Holds entries from RAM.
- Managed automatically by hardware.
- PCs typically have at least two, one on the CPU chip, and one
on the motherboard.
- Arrangements.
- Direct-Mapped Cache.
- Address division: [ tag, block (line number), offset (within block) ]
- Item is located on the indicated cache line, at the indicated
offset.
- The valid bit is set to indicate the line contents are valid.
- A new reference must use that line, and replace anything that's
already there.
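A minimal sketch of the direct-mapped arrangement follows. The field widths (16-byte blocks, 8 lines) and the names `split_address` and `DirectMappedCache` are assumptions for illustration; only tags and valid bits are modeled, not the cached data.

```python
OFFSET_BITS = 4   # assumed: 16-byte blocks
LINE_BITS = 3     # assumed: 8 cache lines


def split_address(address):
    """Return (tag, line, offset) for a direct-mapped cache."""
    offset = address & ((1 << OFFSET_BITS) - 1)
    line = (address >> OFFSET_BITS) & ((1 << LINE_BITS) - 1)
    tag = address >> (OFFSET_BITS + LINE_BITS)
    return tag, line, offset


class DirectMappedCache:
    def __init__(self):
        lines = 1 << LINE_BITS
        self.valid = [False] * lines   # valid bit per line
        self.tags = [0] * lines        # tag stored per line

    def access(self, address):
        """Return True on a hit; on a miss, take over the indicated line."""
        tag, line, _ = split_address(address)
        if self.valid[line] and self.tags[line] == tag:
            return True
        # A new reference must use that line, replacing what was there.
        self.valid[line] = True
        self.tags[line] = tag
        return False


cache = DirectMappedCache()
# 0x1A7 and 0x1A0 share a block; 0x227 maps to the same line with another tag.
results = [cache.access(a) for a in (0x1A7, 0x1A0, 0x227, 0x1A7)]
```

The last access misses even though the block was cached a moment earlier, because 0x227 was forced onto the same line.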
- Fully-Associative Cache.
- Address division: [ tag, offset (within block) ]
- Any cache row can take any location.
- Look-up by search, not an index number from the address.
- Search is done in parallel.
- Set-Associative Cache
- Address division: [ tag, block (line number), offset (within block) ]
- A compromise.
- Divide the address into three parts and choose a line.
- Each line is a small associative cache.
- Gives more flexibility, so a few rows that map to the same line may
be retained.
- Replacement policies: What to evict?
- Least-Recently Used (exact tracking is not practical in hardware; usually approximated).
- First-In First-Out.
- Random.
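The set-associative arrangement with First-In First-Out replacement can be sketched as below. The sizes (16-byte blocks, 4 sets, 2 rows per set) and the class name are assumptions; what the notes call a "line" is a set here, and only tags are modeled.

```python
from collections import deque

OFFSET_BITS = 4   # assumed: 16-byte blocks
SET_BITS = 2      # assumed: 4 sets ("lines" in the notes' terms)
WAYS = 2          # assumed: each set holds 2 rows


class SetAssociativeCache:
    def __init__(self):
        # Each set is a small FIFO queue of tags, oldest first.
        self.sets = [deque() for _ in range(1 << SET_BITS)]

    def access(self, address):
        """Return True on a hit, False on a miss (after filling the set)."""
        index = (address >> OFFSET_BITS) & ((1 << SET_BITS) - 1)
        tag = address >> (OFFSET_BITS + SET_BITS)
        rows = self.sets[index]
        if tag in rows:            # hardware would compare all rows in parallel
            return True
        if len(rows) == WAYS:      # set is full: evict the first-in row
            rows.popleft()
        rows.append(tag)
        return False


cache = SetAssociativeCache()
# 0x000, 0x040 and 0x080 all select set 0 but carry different tags.
results = [cache.access(a) for a in (0x000, 0x040, 0x000, 0x040, 0x080, 0x040, 0x000)]
```

Two conflicting blocks now coexist in one set, which a direct-mapped cache of the same size could not manage; the third conflicting block evicts the oldest.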
- Can start the cache and memory access in parallel.
- When does caching not work well?
- A program might not exhibit good locality.
- An array scan where the step size equals the row size can be very bad.
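The bad array scan can be demonstrated with a small miss counter. This is a sketch under assumed sizes: a direct-mapped cache of 8 lines with 16-byte blocks, and a hypothetical 8x32 array of 4-byte items stored row by row, so one row is exactly as large as the cache.

```python
def count_misses(addresses, num_lines=8, block_size=16):
    """Count misses for a direct-mapped cache (sizes are assumed values)."""
    tags = [None] * num_lines
    misses = 0
    for address in addresses:
        block = address // block_size
        line = block % num_lines
        tag = block // num_lines
        if tags[line] != tag:
            tags[line] = tag
            misses += 1
    return misses


ROWS, COLS, ELEM = 8, 32, 4   # hypothetical array shape and item size


def addr(r, c):
    """Byte address of item (r, c) in a row-major array starting at 0."""
    return (r * COLS + c) * ELEM


row_order = [addr(r, c) for r in range(ROWS) for c in range(COLS)]
col_order = [addr(r, c) for c in range(COLS) for r in range(ROWS)]
row_misses = count_misses(row_order)
col_misses = count_misses(col_order)
```

Scanning along rows misses once per block, while scanning down columns steps by one full row each time, lands on the same cache line every time, and misses on every reference.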
- Write policy: What to do with written data.
- Write-through: Stores sent both to cache and memory.
- Write-back.
- Stores just go to the cache.
- Copied from cache to memory when the entry is replaced.
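The difference between the two write policies shows up in how many stores reach memory. The sketch below uses a deliberately tiny one-entry cache; the class and function names and the `dirty` flag are assumptions for illustration.

```python
class WriteCountingCache:
    """One-entry cache that counts writes reaching memory (illustrative only)."""

    def __init__(self, write_back):
        self.write_back = write_back
        self.address = None      # address currently cached, if any
        self.value = None
        self.dirty = False       # cached value not yet copied to memory
        self.memory = {}
        self.memory_writes = 0

    def _write_memory(self, address, value):
        self.memory[address] = value
        self.memory_writes += 1

    def store(self, address, value):
        replacing = self.address is not None and self.address != address
        if replacing and self.write_back and self.dirty:
            # Write-back: the old entry reaches memory only when replaced.
            self._write_memory(self.address, self.value)
        self.address, self.value = address, value
        if self.write_back:
            self.dirty = True                    # stores just go to the cache
        else:
            self._write_memory(address, value)   # write-through: memory too


def memory_writes_for(write_back, stores):
    cache = WriteCountingCache(write_back)
    for address, value in stores:
        cache.store(address, value)
    return cache.memory_writes


stores = [(0, 1), (0, 2), (0, 3), (1, 9)]
through = memory_writes_for(False, stores)
back = memory_writes_for(True, stores)
```

Repeated stores to one address cost a memory write each under write-through, but only one write, at replacement time, under write-back.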
- Special caches.
- Separate instruction and data caches.
- Victim cache: a small associative cache that holds entries evicted
by conflict.
- Trace cache: holds decoded instructions.
- Cache levels.
- Usually multiple levels of cache.
- Level 1 is on the chip with the CPU.
- Level 2 is on the system motherboard.
- Some computers have a high-speed level 2, and a slower level 3.
- Inclusive: An entry is also replicated in the higher-numbered cache levels.
- Exclusive: Entry in just one place.
- Virtual Memory
- Name refers to a system which allows a program to use more memory
than the system has RAM.
- Implemented by keeping memory contents on disk, and moving things
into and out of memory as needed.
- Uses RAM as a cache for information stored on disk.
- Page Mapping.
- Addresses used by the program are not the true addresses of the
information in RAM.
- The program uses virtual addresses
- The data are located in RAM at real addresses
- Memory contents are divided up into fixed-size blocks called pages.
A typical page size is 4K.
- Addresses are broken into a page number and an in-page offset:
[ page number, offset (within page) ]
- The page size is a power of two, and the number of offset bits is the
exponent (4K = 2^12, so 12 offset bits).
- All addresses within the page have the same page number.
- Offsets are from 0 to page size minus one.
- Main memory is divided into blocks of the same size, called
page frames.
- Each frame holds one page (or is perhaps empty).
- Each address in memory is divided in a similar way.
- For each reference:
- The address is divided into a page number and offset.
- The page number is translated to the frame number where the page
is located.
- The frame number and the offset are sent to the actual RAM.
- Program sizes must be rounded up to a multiple of the page size. This waste
is called internal fragmentation.
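The address arithmetic for page mapping can be sketched with shifts and masks, assuming the 4K page size mentioned above. The function names are hypothetical, and the page-to-frame lookup itself is left out here.

```python
PAGE_SIZE = 4096                           # 4K pages
OFFSET_BITS = PAGE_SIZE.bit_length() - 1   # 12, since 4096 == 2**12


def split_virtual_address(address):
    """Return (page_number, offset) for a virtual address."""
    return address >> OFFSET_BITS, address & (PAGE_SIZE - 1)


def join_real_address(frame_number, offset):
    """Build the real address sent to RAM."""
    return (frame_number << OFFSET_BITS) | offset


def internal_fragmentation(program_size):
    """Bytes wasted when program_size is rounded up to whole pages."""
    pages = -(-program_size // PAGE_SIZE)   # ceiling division
    return pages * PAGE_SIZE - program_size


page, offset = split_virtual_address(0x12345)
real = join_real_address(0x7, offset)       # pretend page 0x12 sits in frame 7
wasted = internal_fragmentation(10000)      # a 10000-byte program needs 3 pages
```

Because the page size is a power of two, the split needs no division, and the offset passes through translation unchanged.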
- The page map.
- The mapping from page number to frame number is stored in a
page table.
- An entry is located by using the page number as an offset (index) into the table.
- A page table entry contains
- A valid bit indicating if the entry contains valid data.
- The frame number where the page is located.
- The referenced and modified bits (later).
- Permission bits specifying the operations allowed on the page.
- References are translated by looking up the PTE in the table, and
using the frame number to build the real address.
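A page table lookup along these lines might look like the sketch below. It assumes 4K pages, a flat list as the table, and a single `writable` flag standing in for the permission bits; `PageFault`, `PageTableEntry`, and `translate` are hypothetical names.

```python
from dataclasses import dataclass

OFFSET_BITS = 12  # assumed 4K pages


class PageFault(Exception):
    """Raised when the PTE is not valid (hypothetical name)."""


@dataclass
class PageTableEntry:
    valid: bool = False
    frame: int = 0
    referenced: bool = False
    modified: bool = False
    writable: bool = True   # stand-in for the permission bits


def translate(page_table, address, is_store=False):
    """Translate a virtual address to a real one, or raise."""
    page = address >> OFFSET_BITS
    offset = address & ((1 << OFFSET_BITS) - 1)
    pte = page_table[page]          # entry located by page number
    if not pte.valid:
        raise PageFault(page)
    if is_store and not pte.writable:
        raise PermissionError(f"store to read-only page {page}")
    pte.referenced = True           # bookkeeping bits kept in the PTE
    if is_store:
        pte.modified = True
    return (pte.frame << OFFSET_BITS) | offset


table = [PageTableEntry() for _ in range(4)]
table[2] = PageTableEntry(valid=True, frame=5)
real = translate(table, 0x2ABC)     # page 2 lives in frame 5
```

In real systems this lookup is done by hardware on every reference; the Python exception merely stands in for the trap.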
- Translation Look-Aside Buffer (TLB) speeds up translation.
- A special cache of PTEs.
- Fully associative.
- Avoid needing an extra memory access (for the page table)
for every memory access.
- Demand paging.
- There are more pages than page frames. Extra pages are kept on disk.
- When such a page is needed,
- The program waits until the page can be copied in from disk.
- It is placed in a free frame, if any, or a page is removed
from memory to make room.
- Translation is handled by hardware.
- If a page is absent from RAM, its PTE is not valid, and the
translation fails.
- This failure causes a page fault, which is a trap that
invokes the O/S.
- The O/S must:
- Suspend the running program.
- Choose a frame and read the page into it.
- Update the page table.
- Restart the program.
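The fault-handling steps can be sketched as a toy pager. Everything here is an assumption made for illustration: the class name, the dict used as a page table holding only resident pages, and the choice to evict the oldest resident page. Disk transfers are only indicated by a comment.

```python
class DemandPager:
    """Toy pager: pages not in the table are considered to be on disk."""

    def __init__(self, num_frames):
        self.free_frames = list(range(num_frames))
        self.page_table = {}     # page -> frame, only for pages in RAM (valid)
        self.resident = []       # pages in RAM, oldest first
        self.faults = 0

    def access(self, page):
        """Return the frame holding the page, faulting it in if needed."""
        if page not in self.page_table:      # PTE not valid: page fault
            self.faults += 1
            self._handle_fault(page)
        return self.page_table[page]

    def _handle_fault(self, page):
        if self.free_frames:
            frame = self.free_frames.pop(0)  # use a free frame, if any
        else:
            victim = self.resident.pop(0)    # assumed policy: oldest page
            frame = self.page_table.pop(victim)
        # (A real O/S would read the page from disk into the frame here.)
        self.page_table[page] = frame        # update the page table
        self.resident.append(page)


pager = DemandPager(num_frames=2)
frames = [pager.access(p) for p in [0, 1, 0, 2, 0]]
```

With two frames and three pages in use, page 0 is evicted to make room for page 2 and then faults again on its next use.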
- Full Procedure.
- Note that part of this is performed by hardware, and part by software.
- Hardware produces a page fault to make it the software's problem.
- Depending on the CPU, this will be after the failure of either the TLB lookup or
the page table lookup.
- Referenced and modified bits.
- When a PTE is used, the hardware sets its referenced bit.
- If the use is a store, it also sets the modified bit.
- When the O/S brings a page in from disk, it clears both.
- When a page is removed from RAM, the modified bit tells if it
needs to be copied back to disk.
- The referenced bit is used by the O/S to help choose which
page to replace.
- The O/S has some replacement policy.
- Most involve periodically clearing the referenced bit so it
tells if the page was referenced recently.
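One well-known policy of this kind is the clock (second-chance) algorithm. The notes do not name it, so the sketch below is offered only as an example of how referenced bits can drive replacement; the dict layout and function name are assumptions.

```python
def choose_victim(pages, hand):
    """Clock (second-chance) sweep over a non-empty list of frames.

    pages: list of dicts with a "referenced" flag, one per frame.
    hand:  index where the sweep starts.
    Returns (victim_index, new_hand).
    """
    while True:
        page = pages[hand]
        if page["referenced"]:
            page["referenced"] = False      # clear it; give a second chance
            hand = (hand + 1) % len(pages)
        else:
            return hand, (hand + 1) % len(pages)


frames = [
    {"page": 10, "referenced": True, "modified": False},
    {"page": 11, "referenced": False, "modified": True},
    {"page": 12, "referenced": True, "modified": False},
]
victim, hand = choose_victim(frames, hand=0)
needs_write_back = frames[victim]["modified"]   # copy to disk before reuse?
```

The sweep skips recently referenced pages while clearing their bits, and the victim's modified bit then decides whether it must be copied back to disk.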
- Virtual memory as caching.
- Fully associative (a page may go anywhere in memory).
- Write-back.
- Segmentation.
- A segment is a logical division of the program: a single function, library, object or
data structure.
- Segments differ in size.
- Addresses are explicitly two-part: [ segment number, offset (within segment) ],
denoting a location inside a specified segment.
- A segment table contains a descriptor for each segment.
- Offset into the table to find the segment descriptor.
- Descriptor contains the base, limit (size), and permissions for the segment.
- Verify the offset < the segment size, or fault.
- Verify the permissions allow the operation, or fault.
- Add the base to the offset to form the real address.
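The segment translation steps can be sketched as follows. The names `SegmentFault`, `SegmentDescriptor`, and `translate`, and the string encoding of permissions, are assumptions made for the example.

```python
from dataclasses import dataclass


class SegmentFault(Exception):
    """Raised on a limit or permission violation (hypothetical name)."""


@dataclass
class SegmentDescriptor:
    base: int
    limit: int            # segment size in bytes
    permissions: str      # assumed encoding, e.g. "r", "rw", "rx"


def translate(segment_table, segment, offset, operation):
    """Map [segment, offset] to a real address, or fault."""
    descriptor = segment_table[segment]     # index the table by segment number
    if offset >= descriptor.limit:
        raise SegmentFault(f"offset {offset} outside segment {segment}")
    if operation not in descriptor.permissions:
        raise SegmentFault(f"'{operation}' not allowed on segment {segment}")
    return descriptor.base + offset


table = [
    SegmentDescriptor(base=0x1000, limit=0x200, permissions="rx"),
    SegmentDescriptor(base=0x8000, limit=0x100, permissions="rw"),
]
real = translate(table, 1, 0x10, "w")
```

Unlike paging, the offset is checked against a per-segment limit, since segments differ in size.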
- Variable-sized segments are difficult to manage.
- Segments must be given contiguous slots in RAM.
- Segments are frequently added or removed.
- Over time, unusably small chunks accumulate.
- This is called external fragmentation.
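A tiny numeric illustration of external fragmentation: plenty of memory is free in total, yet no single chunk is large enough. The chunk sizes and the function name are made up for the example.

```python
def can_place(free_chunks, segment_size):
    """A segment fits only if some single free chunk is big enough."""
    return any(chunk >= segment_size for chunk in free_chunks)


free_chunks = [30, 20, 25]          # hypothetical free chunk sizes, in KB
total_free = sum(free_chunks)       # 75 KB free in total
fits = can_place(free_chunks, 60)   # but no single chunk holds 60 KB
```

The 60 KB segment cannot be placed even though 75 KB is free, because segments need contiguous slots.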
- Segmentation with paging.
- Eliminates external fragmentation in favor of internal.
- Classic: Instead of a base address,
the segment descriptor contains the location of a page table.
Treat the offset as a virtual address using this table.
- Pentium: Add the base to the offset as above to produce a virtual address, and
process it against a global page table.
- History.
- Used in GE-645 supporting the Multics O/S.
- Pentium supports a Multics-inspired segmenting system. No one uses it.
- Pentium.