Browse papers
A

Section A: Long Answer Questions

Attempt any TWO questions.

3 questions·10 marks each
1long10 marks

What is cache memory? Explain the different cache mapping techniques (direct, associative and set-associative) with suitable diagrams.

Cache Memory

Cache memory is a small, fast (usually SRAM) memory placed between the CPU and main memory. It holds copies of the most frequently/recently used instructions and data so the CPU can access them faster than going to slower main memory. It works on the principle of locality of reference (temporal and spatial locality).

When the CPU needs a word, it first checks the cache:

  • Cache hit: word is present, served quickly.
  • Cache miss: word is fetched from main memory (a whole block/line) and stored in cache.

The mapping function decides where a main-memory block is placed in the cache.

1. Direct Mapping

Each main-memory block maps to exactly one fixed cache line:

cache line=(block number)mod(number of cache lines)\text{cache line} = (\text{block number}) \bmod (\text{number of cache lines})

The address is split into three fields:

TagLine (Index)Word (Offset)
  • Simple and cheap; only one tag comparison is needed.
  • Drawback: high conflict misses — two active blocks mapping to the same line keep evicting each other.

Diagram (in words): main memory blocks 0, m, 2m, … all map to cache line 0; blocks 1, m+1, … map to line 1, and so on (m = number of cache lines).

2. Associative (Fully Associative) Mapping

A block can be placed in any cache line. The address has only two fields:

TagWord (Offset)
  • On access, the tag is compared with all lines in parallel (using associative/CAM hardware).
  • Most flexible, lowest conflict misses; needs a replacement policy (LRU, FIFO, random).
  • Drawback: expensive — requires a comparator for every line.

3. Set-Associative Mapping

A compromise: the cache is divided into sets, each containing k lines (k-way set-associative). A block maps to one fixed set but can occupy any line within that set:

set number=(block number)mod(number of sets)\text{set number} = (\text{block number}) \bmod (\text{number of sets})

Address fields:

TagSet (Index)Word (Offset)
  • Only k tags (within the set) are compared in parallel.
  • Balances cost and performance; widely used (e.g. 4-way, 8-way).
  • Special cases: 1-way = direct mapping; k = total lines = fully associative.

Summary

TechniqueBlock placementComparatorsConflict misses
DirectOne fixed line1High
AssociativeAny lineAll linesLowest
Set-associativeAny line in one setk (set size)Moderate (low)
memorycache
2long10 marks

What is pipelining? Explain instruction pipelining with its stages. Discuss the different types of pipeline hazards and their solutions.

Pipelining

Pipelining is a technique of overlapping the execution of multiple instructions by dividing the processing into independent stages. Each stage works on a different instruction simultaneously, like an assembly line, increasing instruction throughput (instructions completed per unit time) without reducing the latency of a single instruction.

Instruction Pipeline Stages

A classic 5-stage RISC instruction pipeline:

  1. IF (Instruction Fetch) – fetch instruction from memory using PC.
  2. ID (Instruction Decode) – decode opcode and read source registers.
  3. EX (Execute) – ALU performs the operation / computes address.
  4. MEM (Memory Access) – read from or write to data memory.
  5. WB (Write Back) – write the result back into the register file.

In an ideal pipeline of k stages, after the pipeline fills, one instruction completes every clock cycle. For n instructions and k stages:

Total cycles=k+(n1),Speedupk  (ideal)\text{Total cycles} = k + (n-1), \qquad \text{Speedup} \approx k \;(\text{ideal})

Pipeline Hazards and Solutions

Hazards prevent the next instruction from executing in its designated cycle.

1. Structural Hazard

Two instructions need the same hardware resource in the same cycle (e.g. one memory for both fetch and data access).

  • Solutions: add resources (separate instruction and data memories/caches — Harvard organization), or stall.

2. Data Hazard

An instruction depends on the result of a previous instruction still in the pipeline (RAW – read after write):

ADD R1, R2, R3   ; produces R1
SUB R4, R1, R5   ; needs R1
  • Solutions: forwarding/bypassing (route ALU result directly to the next stage), pipeline stalls/bubbles, and compiler instruction reordering.

3. Control (Branch) Hazard

The outcome/target of a branch is not known until later stages, so the next fetch is uncertain.

  • Solutions: branch prediction (static/dynamic), delayed branch (fill delay slot), flushing mispredicted instructions, and computing the branch target early.

Conclusion

Pipelining boosts throughput but introduces hazards; forwarding, stalling, and branch prediction are used to keep the pipeline full and correct.

pipelining
3long10 marks

Explain the different modes of data transfer between CPU and I/O devices: programmed I/O, interrupt-driven I/O and DMA.

Modes of Data Transfer between CPU and I/O

Data transfer between the processor and I/O devices is handled in three main modes.

1. Programmed I/O

The CPU continuously controls the entire transfer. It executes a program that polls the device status flag in a loop:

LOOP: read status register
      if (flag == 0) goto LOOP   ; busy-wait / polling
      read/write data word
  • Every word passes through the CPU register.
  • Advantage: simple hardware.
  • Disadvantage: CPU is kept busy in a polling loop and wastes time — very inefficient for slow devices.

2. Interrupt-Driven I/O

The CPU issues the I/O command and then continues other work. When the device is ready, it raises an interrupt signal.

  • The CPU finishes the current instruction, saves its state (PC, registers), and runs the Interrupt Service Routine (ISR) to transfer the word, then resumes.
  • Removes wasteful polling, so the CPU is freed during the wait.
  • Disadvantage: each word still requires CPU intervention (interrupt + ISR overhead), costly for large/fast transfers.

3. Direct Memory Access (DMA)

A dedicated DMA controller transfers a whole block of data directly between memory and the I/O device without the CPU handling each word.

Steps:

  1. CPU initializes the DMA controller with the starting memory address, word count, and direction (read/write).
  2. DMA requests the system bus via HOLD/Bus Request; CPU grants control (HLDA/Bus Grant) and releases the bus (cycle stealing or burst mode).
  3. DMA transfers data block directly to/from memory.
  4. On completion, DMA interrupts the CPU to signal the transfer is done.
  • Advantage: fastest, very low CPU overhead — ideal for high-speed devices (disk, network).

Comparison

ModeCPU involvementSpeedUse
Programmed I/OFull (polling)SlowSimple/slow devices
Interrupt-drivenPer word (ISR)MediumModerate-speed devices
DMAOnly setup + endFastBlock transfer, high-speed
iointerrupt
B

Section B: Short Answer Questions

Attempt any EIGHT questions.

9 questions·5 marks each
4short5 marks

What is a system bus? Explain address bus, data bus and control bus.

System Bus

A system bus is a shared set of parallel conducting lines (wires) that interconnect the CPU, main memory, and I/O modules, allowing them to exchange data, addresses and control signals. It is broadly divided into three functional buses:

  • Address Bus – carries the address of the memory location or I/O port to be accessed. It is unidirectional (CPU → memory/I/O). Its width determines the addressable memory size; e.g. an n-bit address bus can address 2n2^n locations.

  • Data Bus – carries the actual data/instructions being transferred. It is bidirectional (data flows both to and from the CPU). Its width (e.g. 8, 16, 32, 64 bits) determines how many bits are transferred at once, affecting performance.

  • Control Bus – carries control and timing signals that coordinate the operations, such as Memory Read, Memory Write, I/O Read, I/O Write, clock, interrupt request, and bus grant. It is mostly bidirectional (individual lines unidirectional).

bus
5short5 marks

What is DMA? Explain how direct memory access transfers data without CPU intervention.

DMA (Direct Memory Access)

DMA is a data-transfer technique in which a dedicated DMA controller transfers a block of data directly between an I/O device and main memory without the CPU processing each word, greatly reducing CPU overhead for high-speed transfers.

How DMA Transfers Data

  1. Initialization: The CPU programs the DMA controller with the starting memory address, the number of words (count) to transfer, and the direction (read/write), then continues with other tasks.
  2. Bus request: When the I/O device is ready, the DMA controller requests the system bus by asserting HOLD / Bus Request.
  3. Bus grant: The CPU completes its current bus cycle, releases the address, data and control buses, and acknowledges with HLDA / Bus Grant. The CPU is now temporarily disconnected from the bus.
  4. Transfer: The DMA controller takes over the buses and transfers data directly between the device and memory, incrementing the address and decrementing the count for each word. This is done in cycle-stealing or burst mode.
  5. Completion: When the count reaches zero, the DMA controller releases the bus and sends an interrupt to inform the CPU that the transfer is complete.

Thus the CPU is involved only at the start (setup) and end (interrupt), making DMA much faster than programmed or interrupt-driven I/O for block transfers.

dma
6short5 marks

Differentiate between RISC and CISC architectures.

RISC vs CISC

FeatureRISCCISC
Instruction setSmall, simple instructionsLarge, complex instructions
Instruction lengthFixed lengthVariable length
Execution timeMostly single clock cycleMultiple clock cycles
Addressing modesFewMany
Memory accessOnly via load/store instructionsMany instructions access memory directly
RegistersLarge number of registersFewer registers
Control unitMostly hardwiredMostly microprogrammed
PipeliningEasy and efficientHarder due to varying instructions
Code sizeLarger (more instructions)Smaller (fewer, denser instructions)
ComplexityIn compiler/softwareIn hardware/microcode
ExamplesARM, MIPS, SPARC, RISC-VIntel x86, VAX

Summary: RISC emphasizes a small set of simple, fixed-length, single-cycle, register-based instructions that pipeline well, shifting complexity to the compiler. CISC provides powerful, variable-length instructions that reduce program size but complicate the hardware.

risc-cisc
7short5 marks

Explain register transfer language with examples of micro-operations.

Register Transfer Language (RTL)

Register Transfer Language (RTL) is a symbolic notation used to describe the micro-operations — the elementary operations performed on data stored in registers — and the flow of information between registers in a digital system. Registers are denoted by capital letters (e.g. R1, PC, MAR) and a transfer is shown with the replacement/assignment operator .

General form of a conditional transfer:

P:  R2R1P:\; R2 \leftarrow R1

meaning “if control condition P = 1, copy the contents of R1 into R2.”

Types of Micro-operations with Examples

1. Register-transfer micro-operations – move data between registers:

R2R1R2 \leftarrow R1

2. Arithmetic micro-operations – add, subtract, increment, decrement, complement:

R3R1+R2,R1R1+1  (increment)R3 \leftarrow R1 + R2, \qquad R1 \leftarrow R1 + 1 \;(\text{increment})

3. Logic micro-operations – bitwise AND, OR, XOR, NOT:

R1R1R2  (AND),R1R1  (complement)R1 \leftarrow R1 \land R2 \;(\text{AND}), \qquad R1 \leftarrow \overline{R1}\;(\text{complement})

4. Shift micro-operations – shift register contents left/right (logical, circular, arithmetic):

R1shlR1,R2shrR2R1 \leftarrow shl\,R1, \qquad R2 \leftarrow shr\,R2

RTL is hardware-independent and is used to specify the internal operations of a CPU precisely.

register-transfer
8short5 marks

Explain the algorithm for division of unsigned integers (restoring division) with an example.

Restoring Division (Unsigned Integers)

Restoring division divides an unsigned dividend by an unsigned divisor using repeated shift and subtract. It uses three registers: A (initially 0, holds partial remainder), Q (holds the dividend, finally the quotient) and M (the divisor). For n-bit numbers, repeat n times:

Algorithm

1. Initialize A = 0, Q = dividend, M = divisor, count = n
2. Repeat n times:
   a. Shift left (A, Q) by one bit
   b. A = A - M            (subtract divisor)
   c. If A < 0 (MSB = 1):  // failed
         Q0 = 0
         A = A + M         // RESTORE
      Else:                // succeeded
         Q0 = 1
3. After n steps: Q = quotient, A = remainder

Example: 7 ÷ 3 (4-bit, n = 4)

Dividend Q = 0111, Divisor M = 0011, A = 0000.

StepOperationAQQ0
Init00000111
1shift; A−M=0000−0011=1101 (<0) → restore000011100
2shift; A−M=0001−0011=1110 (<0) → restore000111000
3shift; A−M=0011−0011=0000 (≥0)000010011
4shift; A−M=0001−0011=1110 (<0) → restore000100110

Result: Quotient Q = 0010 = 2, Remainder A = 0001 = 1. Check: 3×2+1=73\times 2 + 1 = 7. ✓

arithmetic
9short5 marks

Explain the concept of microprogrammed control and microinstruction format.

Microprogrammed Control

A control unit generates the control signals that drive all micro-operations of the CPU. In a microprogrammed control unit, these control signals are not generated by fixed hardwired logic but are stored as microinstructions (a sequence of bits) in a special fast memory called the control memory (control store).

  • Each machine instruction corresponds to a microprogram — a sequence of microinstructions.
  • A Control Address Register (CAR) holds the address of the next microinstruction; the Control Data Register (CDR) holds the microinstruction currently being executed.
  • Executing a microinstruction means activating the control lines specified by its bits, then determining the address of the next microinstruction (via a sequencer).

Advantages: flexible, easy to design and modify (just change the microprogram). Disadvantage: slower than hardwired control.

Microinstruction Format

A microinstruction is typically divided into fields:

Control / Micro-operation fieldCondition (select) fieldBranch fieldNext-address field
  • Control / micro-operation field: bits that specify which control signals/micro-operations to activate (may be horizontal — one bit per signal, or vertical — encoded).
  • Condition field: selects the status flag/condition to test for branching.
  • Branch field: specifies the type of branch (jump, call, map, etc.).
  • Address field: gives the address of the next microinstruction in control memory.

Horizontal microinstructions have long, mostly-unencoded fields (fast, more memory); vertical ones use encoded fields (compact, slower due to decoding).

control-unitmicroprogram
10short5 marks

What is virtual memory? Explain the concept of paging.

Virtual Memory

Virtual memory is a memory-management technique that gives a program the illusion of a very large, contiguous main memory by using secondary storage (disk) as an extension of physical RAM. Programs use virtual (logical) addresses, which the system translates into physical addresses; only the actively needed portions of a program are kept in RAM while the rest stays on disk.

Benefits: programs larger than physical memory can run, better multiprogramming, and memory protection/relocation.

Paging

Paging is the most common implementation of virtual memory:

  • The virtual address space is divided into fixed-size blocks called pages.
  • The physical memory is divided into blocks of the same size called frames.
  • A page table maps each virtual page number to a physical frame number.

A virtual address is split as:

Page NumberOffset

The page number indexes the page table to obtain the frame number; the offset is added to locate the exact word.

  • If the referenced page is not in memory, a page fault occurs; the OS loads the page from disk into a free frame (replacing one using LRU/FIFO if needed) and updates the page table.
  • A TLB (Translation Lookaside Buffer) caches recent page-table entries to speed up address translation.

Paging eliminates external fragmentation and allows non-contiguous allocation of a program in physical memory.

memoryvirtual
11short5 marks

Explain the instruction format and the types of instructions based on the number of addresses.

Instruction Format

An instruction format defines the layout of bits in a machine instruction. A typical instruction is divided into fields:

OpcodeModeOperand / Address field(s)
  • Opcode – specifies the operation to be performed (e.g. ADD, LOAD).
  • Mode – specifies the addressing mode (how operands are interpreted).
  • Address/operand field(s) – specify the location(s) of operands or the data itself.

Types of Instructions Based on Number of Addresses

1. Three-address instruction – specifies two source operands and one destination:

ADD R1, R2, R3   ; R1 ← R2 + R3

Short programs, but long instructions.

2. Two-address instruction – one operand serves as both a source and the destination:

ADD R1, R2       ; R1 ← R1 + R2

Most common in commercial machines.

3. One-address instruction – uses an implied accumulator (AC); only one operand is specified:

LOAD A           ; AC ← M[A]
ADD  B           ; AC ← AC + M[B]
STORE C          ; M[C] ← AC

4. Zero-address instruction – operands are implied on a stack (used in stack machines):

PUSH A
PUSH B
ADD              ; TOS ← (TOS) + (TOS-1)
POP  C

Fewer addresses give shorter instructions but require more instructions to perform the same task.

instruction
12short5 marks

Explain set-associative mapping technique of cache memory with an example.

Set-Associative Mapping

In set-associative mapping, the cache is divided into a number of sets, and each set contains a fixed number of lines, k (this is called a k-way set-associative cache). A main-memory block maps to one specific set (determined by a modulo function) but can be placed in any line within that set. It combines the advantages of direct mapping (simple indexing) and associative mapping (flexible placement).

set number=(block number)mod(number of sets)\text{set number} = (\text{block number}) \bmod (\text{number of sets})

The memory address is divided into three fields:

TagSet (index)Word (offset)

On access, the set field selects a set, and the tag is compared in parallel with the tags of all k lines only in that set. A hit occurs if any matches.

Example

Suppose a cache has 8 lines organized as 2-way set-associative ⇒ number of sets =8/2=4= 8 / 2 = 4 sets. Block placement uses:

set=(block number)mod4\text{set} = (\text{block number}) \bmod 4
  • Block 0 → set 0, Block 1 → set 1, Block 2 → set 2, Block 3 → set 3
  • Block 4 → set 0, Block 5 → set 1, …

So blocks 0, 4, 8, 12, … all map to set 0, but since each set has 2 lines, two of these blocks can reside in the cache at the same time without conflict. A replacement policy (e.g. LRU) chooses which line to evict when the set is full.

Advantage: fewer conflict misses than direct mapping, and cheaper than fully associative (only k tag comparisons per access).

cachemapping

Frequently asked questions

Where can I find the BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) question paper 2075?
The full BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) 2075 (regular) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.
Does the Computer Architecture (BSc CSIT, CSC208) 2075 paper come with solutions?
Yes. Every question on this Computer Architecture (BSc CSIT, CSC208) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.
How many marks is the BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) 2075 paper?
The BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) 2075 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.
Is practising this Computer Architecture (BSc CSIT, CSC208) past paper free?
Yes — reading and attempting this Computer Architecture (BSc CSIT, CSC208) past paper on Kekkei is completely free.