I am planning to study new things every week and writing blogs about these occasionally.
Week 1
Course link - https://onlinecourses.nptel.ac.in/noc23_cs61
5 stage RISC pipeline
- Fetch (IF = Instruction Fetch)
- Decode (ID = Instruction Decode)
- Execute (EX = Execute)
- Memory (MEM = Memory access)
- Write (WB = Register write back)
CSIC -
MUL addr1 addr2 addr3;
RISC -
LOAD R2 addr2;
LOAD R3 addr3;
MUL R1 R2 R3;
STORE addr1 R1;
More number of instructions for RISC implies that more instructions are to be kept in memory. On the other hand, less transistors are required to build RISC and has more physical space on chip.
All instructions execute uniform in temporal space (eg. in one clock cycle), these instructions can be pipelined.
Pipeline Hazards
- Structural hazards: multiple instructions compete for the same resource
- Data hazards: a dependent instruction cannot proceed because it needs a value that hasn’t been produced
- Control hazards: the next instruction cannot be fetched because the outcome of an earlier branch is unknown
Data path
Locality
- Temporal
If at one point a particular memory location is referenced, then it is likely that the same location will be referenced again in the near future. There is temporal proximity between adjacent references to the same memory location. In this case it is common to make efforts to store a copy of the referenced data in faster memory storage, to reduce the latency of subsequent references. Temporal locality is a special case of spatial locality (see below), namely when the prospective location is identical to the present location.
- Spatial
If a particular storage location is referenced at a particular time, then it is likely that nearby memory locations will be referenced in the near future. In this case it is common to attempt to guess the size and shape of the area around the current reference for which it is worthwhile to prepare faster access for subsequent reference.
- Branch
If there are only a few possible alternatives for the prospective part of the path in the spatial-temporal coordinate space. This is the case when an instruction loop has a simple structure, or the possible outcome of a small system of conditional branching instructions is restricted to a small set of possibilities. Branch locality is typically not spatial locality since the few possibilities can be located far away from each other.
- Equidistant
Halfway between spatial locality and branch locality. Consider a loop accessing locations in an equidistant pattern, i.e., the path in the spatial-temporal coordinate space is a dotted line. In this case, a simple linear function can predict which location will be accessed in the near future.
Cache mapping
Caches are divided into blocks, which may be of various sizes.
- The number of blocks in a cache is usually a power of 2.
Where should we put data in the cache?
Direct-mapped cache
Each main memory address maps to exactly one cache block. Idea - is to use the mod (remainder) operator. Example - If the cache contains 2^k blocks, then the data at memory address i would go to cache block index i mod 2^k. For instance, with the four-block cache (assuming 1 byte block size and 16 byte main memory), address 14 would map to cache block 2. // 14 mod 4 = 2
least-significant bits
Idea - look at the least significant k bits of the address Example - (assuming 1 byte block size and 16 byte main memory) address 14 (1110 in binary) maps to cache block 2 (10 in binary).
Note - Taking the least k bits of a binary value is the same as computing that value mod 2^k
Additional resources
- https://cs.stanford.edu/people/eroberts/courses/soco/projects/risc/risccisc/
- https://www.cise.ufl.edu/~mssz/CompOrg/CDA-proc.html