Essay on Computer Architecture

Submitted By cy105212393
Words: 4018
Pages: 17

CS [46]290 ECE [46]100 Conte
Memory Hierarchies, Cache Memories

CACHE-1

Who Cares About the Memory Hierarchy?
Processor-DRAM Memory Gap (latency)

1000!
Moore s Law

CPU!

µProc 60%/yr. (2X/1.5yr)

Performance

100! 10!

Processor-Memory Performance Gap: (grows 50% / year) Less Law?
DRAM

1!
1980! 1981! 1982! 1983! 1984! 1985! 1986! 1987! 1988! 198 9! 1990! 1991! 1992! 1993! 1994! 1995! 1996! 1997! 1998! 1999! 2000!

DRAM 9%/yr. (2X/10 yrs)

Time
CACHE-2

Random Access Memory Technology


Random is good: access time is “the same” for all locations ! As opposed to sequential access, as in tapes, disk ! SRAM: Static Random Access Memory

◊ Low density, high power, $expensive$, fast ◊ “Two inverters arguing” ◊ Static: content will last forever (until lose power) ◊ 4 transistors per memory cell (2 per inverter, 2 for selecting the cell/driving the output)

◊ SRAM cell about 10X larger than DRAM cell ◊ Burns A LOT more power than DRAM
! DRAM: Dynamic Random Access Memory

◊ High density, low power, cheap, slow ◊ FORGETS– “dynamic” is marketing: need to be refreshed regularly- memory that forgets!

◊ 2 Transistor per memory cell
CACHE-3

Structure of a random access memory
Memory cells: stores 1 bit each Word (select) lines

Additional signals: “Read/~Write” line Connects to all cells Memory “strobe” line Instructs memory to perform read/write decoder address Bit (input/output) lines
CACHE-4

Predictable Program Behavior: Locality of Reference


Temporal locality:
! Recently accessed items are likely to be re-accessed in the near future ! If access X at time t, then highly likely also access X at time t ± D for small D



Spatial locality:
! Items with addresses near one another in memory are likely to be referenced close together in time ! If access X at time t, then highly likely also access X ± A at time t ± D for small A, D



Sequential locality
! Important special case of spatial locality ! next item accessed is likely to be at the next sequential address in memory ! If access X at time t, then also access X + B at time t + 1 for a constant B

CACHE-5

How are caches organized and accessed
❍ ❍ ❍

For simplicity, data is stored in fixed-length parcels called blocks or lines Data is tagged with its address in memory (so that it can be found and identified) As a black box:

CPU asks for data @ location X

Cache searches for data @ X

CPU gets data (it never talks to main memory)

If not there, cache asks main memory for data (called a cache miss)
CACHE-6

A 64B direct mapped cache
31 Tag (26) 6 5 index (3) 3 2 block offset 0 (3) MAR tag (26 bits) blocks (8 bytes) set (holds 1 block)

row dec

=?

word select (mux)

hit/miss signal

MDR CACHE-7

A 64B 2-way set associative cache
31 5 4 3 2 tag (27) index (2) 0 block (3) offset MAR blocks (8 bytes) set (holds 2 blocks) tag (27bits)

row dec

LATCH =?

LATCH =? block select (mux)

Notes: this cache is the same size as the (previous) direct mapped cache, but the index field is 1 bit shorter (index: 2 bits total) A direct-mapped cache is really a 1-way set associative cache

word select (mux)

MDR

CACHE-8

A 64B fully associative cache (also called a Content Addressable Memory or CAM)
31 tag tag (29bits) =? tag =? tag =? tag =? tag =? tag =? tag =? tag =? tag 3 2 block offset 0 MAR

Notes: Same size as previous caches, but no row decoder and no index at all! Comparators (one per block) take the place of the row decoder

word select (mux) MDR CACHE-9

Cache performance: terms




Hit: data is in the cache ! Hit Time: Time to get the data from the cache to the processor Miss: data is not in the cache ! Miss Ratio = number of misses / total number of accesses ! Miss Penalty: Time to get a missing block and get the data to the processor

CACHE-10

Cache performance and AAT
❍ ❍ ❍

To measure: (1) Run a program and