Browse papers
A

Section A: Long Answer Questions

Attempt any TWO questions.

3 questions·10 marks each
1long10 marks

What is cache memory? Explain the different cache mapping techniques (direct, associative and set-associative) with suitable diagrams.

Cache Memory

Cache memory is a small, high-speed memory placed between the CPU and main memory (RAM). It stores copies of frequently accessed instructions and data so that the CPU can access them faster than fetching from slower main memory. It works on the principle of locality of reference (temporal and spatial locality).

When the CPU needs a word, it first checks the cache:

  • Cache hit — the word is found in cache (fast access).
  • Cache miss — the word is not in cache, so it is fetched from main memory and a block is copied into cache.

The ratio of hits to total references is the hit ratio, a key measure of cache performance.

Main memory is divided into blocks and cache into lines of the same size. A mapping function decides which main-memory block goes to which cache line.

Cache Mapping Techniques

1. Direct Mapping

Each main-memory block maps to exactly one fixed cache line, given by:

Cache line=(Block number)mod(Number of cache lines)\text{Cache line} = (\text{Block number}) \bmod (\text{Number of cache lines})

The physical address is split into three fields:

TagLine (Index)Word (Offset)
  • Word selects the byte/word within a block.
  • Line selects the cache line directly.
  • Tag is stored with the line and compared to verify the right block.

Diagram (in words): Each cache line has a tag field and a data block. The line field of the address indexes one line; its stored tag is compared with the address tag — equal means hit.

Advantage: Simple and cheap (one comparison). Disadvantage: High conflict misses — two blocks mapping to the same line repeatedly evict each other.

2. Associative (Fully Associative) Mapping

A main-memory block can be placed in any cache line. The address has only two fields:

TagWord

The tag must be compared with the tags of all cache lines simultaneously (using associative/content-addressable memory).

Diagram (in words): All cache-line tags are searched in parallel by comparators; any match is a hit.

Advantage: Most flexible, lowest conflict misses, best hit ratio. Disadvantage: Expensive hardware (many comparators) and needs a replacement algorithm (LRU/FIFO).

3. Set-Associative Mapping

A compromise between the two. The cache is divided into sets, each containing kk lines (a k-way set-associative cache). A block maps to one fixed set but can occupy any line within that set:

Set number=(Block number)mod(Number of sets)\text{Set number} = (\text{Block number}) \bmod (\text{Number of sets})

The address has three fields:

TagSetWord

Diagram (in words): The set field indexes one set; the tag is compared in parallel with the kk lines of that set only.

Advantage: Lower conflict misses than direct mapping and far cheaper than fully associative (kk comparators only). Most real caches use 2-way, 4-way or 8-way set-associative mapping.

Summary

Direct mapping is the special case k=1k=1; fully associative is the case where the whole cache is one set. Set-associative balances cost and performance.

memorycache
2long10 marks

What is pipelining? Explain instruction pipelining with its stages. Discuss the different types of pipeline hazards and their solutions.

Pipelining

Pipelining is a technique of implementing instruction-level parallelism in which multiple instructions are overlapped in execution. The processor is divided into stages, each performing a part of an instruction, like an assembly line. While one instruction is being decoded, another can be fetched, and so on, increasing throughput (instructions completed per unit time).

For a kk-stage pipeline executing nn instructions, ideal speedup approaches kk:

Speedup=n×kk+(n1)k for large n\text{Speedup} = \frac{n \times k}{k + (n-1)} \to k \text{ for large } n

Instruction Pipeline Stages

A classic RISC pipeline has five stages:

  1. IF (Instruction Fetch) — fetch the instruction from memory (using PC).
  2. ID (Instruction Decode) — decode the opcode and read source registers.
  3. EX (Execute) — perform the ALU operation or compute the effective address.
  4. MEM (Memory Access) — read from or write to data memory (for load/store).
  5. WB (Write Back) — write the result back to the register file.

In steady state one instruction finishes every clock cycle.

Pipeline Hazards and Solutions

Hazards prevent the next instruction from executing in its designated cycle.

1. Structural Hazard

Two instructions need the same hardware resource in the same cycle (e.g., a single memory port for IF and MEM).

  • Solution: Provide separate instruction and data memories/caches (Harvard organization) or duplicate resources; stall if unavoidable.

2. Data Hazard

An instruction depends on the result of a previous instruction still in the pipeline (RAW — read after write being the most common).

  • Solutions:
    • Operand forwarding (bypassing): route the ALU result directly to a dependent instruction before write-back.
    • Pipeline stalls / bubbles inserted by hardware.
    • Compiler instruction reordering to fill the gap.

3. Control (Branch) Hazard

Caused by branch/jump instructions — the next instruction's address is not known until the branch is resolved.

  • Solutions:
    • Branch prediction (static or dynamic) and speculative execution.
    • Delayed branch (fill the delay slot with a useful instruction).
    • Flushing wrongly-fetched instructions when a branch is taken.
    • Compute branch target early in the pipeline to reduce the penalty.

Conclusion

Pipelining greatly improves throughput, but hazards limit the ideal speedup; forwarding, stalling, branch prediction and resource duplication are used to mitigate them.

pipelining
3long10 marks

Explain the different modes of data transfer between CPU and I/O devices: programmed I/O, interrupt-driven I/O and DMA.

Modes of Data Transfer between CPU and I/O

Data transfer between the CPU/memory and I/O devices occurs in three modes: programmed I/O, interrupt-driven I/O, and DMA.

1. Programmed I/O

The CPU is fully responsible for the transfer. It continuously polls (checks) the status flag of the I/O interface in a loop until the device is ready, then transfers the data word.

Steps:

  1. CPU reads the device status register.
  2. If not ready, it loops back (busy-waiting).
  3. When ready, CPU reads/writes the data word through a CPU register.

Drawback: The CPU wastes most of its time busy-waiting; very inefficient. Suitable only for slow, simple systems.

2. Interrupt-Driven I/O

Instead of polling, the device interrupts the CPU when it is ready.

Steps:

  1. CPU issues an I/O command and continues other work.
  2. When the device is ready, it raises an interrupt request.
  3. CPU finishes the current instruction, saves state (PC, registers), and jumps to the Interrupt Service Routine (ISR).
  4. The ISR transfers the data word, then control returns to the interrupted program.

Advantage: CPU is free during the wait. Drawback: Each word still requires CPU intervention (one interrupt per word), so it is slow for high-speed bulk transfers.

3. Direct Memory Access (DMA)

A dedicated DMA controller transfers a whole block of data directly between memory and the I/O device without the CPU handling each word.

Steps:

  1. CPU programs the DMA controller with the starting memory address, word count, and direction (read/write).
  2. The DMA controller requests the system bus (bus request / HOLD); CPU grants it (bus grant / HLDA) and releases the bus.
  3. DMA transfers the block word-by-word directly to/from memory (cycle stealing or burst mode).
  4. On completion, the DMA controller interrupts the CPU to signal that the block transfer is done.

Advantage: Very fast, frees the CPU; ideal for disk and high-speed devices.

Comparison

FeatureProgrammed I/OInterrupt-driven I/ODMA
CPU involvementHighest (polling)Per word (ISR)Lowest (per block)
SpeedSlowModerateFast
Best forFew/slow devicesModerate-speed devicesBulk/high-speed transfers
iointerrupt
B

Section B: Short Answer Questions

Attempt any EIGHT questions.

9 questions·5 marks each
4short5 marks

What is a system bus? Explain address bus, data bus and control bus.

System Bus

A system bus is a set of parallel conducting wires (a shared communication pathway) that connects the CPU, main memory, and I/O devices, allowing them to transfer data, addresses, and control signals. It is logically divided into three functional groups:

1. Address Bus

  • Carries the address of the memory location or I/O port to be accessed.
  • Unidirectional (CPU → memory/I/O).
  • Its width determines the addressable memory size: an nn-line address bus can address 2n2^n locations.

2. Data Bus

  • Carries the actual data/instructions being transferred between CPU, memory, and I/O.
  • Bidirectional (data flows both ways).
  • Its width (e.g., 8, 16, 32, 64 bits) determines how many bits are transferred at once, affecting performance.

3. Control Bus

  • Carries control and timing signals that coordinate and manage operations, e.g., Memory Read, Memory Write, I/O Read, I/O Write, clock, interrupt, and bus-request/grant signals.
  • A mix of unidirectional lines that synchronise the devices.
bus
5short5 marks

What is DMA? Explain how direct memory access transfers data without CPU intervention.

Direct Memory Access (DMA)

DMA is a data-transfer technique in which a dedicated hardware unit called the DMA controller transfers a block of data directly between an I/O device and main memory without continuous CPU intervention. The CPU is only involved at the start and end of the transfer.

How DMA Transfers Data

  1. Setup: The CPU programs the DMA controller registers with the starting memory address, the word/byte count (block size), and the direction (read or write).
  2. Bus request: When the I/O device is ready, the DMA controller sends a bus request (HOLD) signal to the CPU.
  3. Bus grant: The CPU completes its current bus cycle, releases the system bus (address, data, control), and responds with a bus grant (HLDA). The CPU is now temporarily idle on the bus.
  4. Transfer: The DMA controller takes control of the buses and transfers data word-by-word directly between memory and the device, incrementing the address and decrementing the count after each word. Transfer can be by cycle stealing (one word at a time, returning the bus) or burst mode (entire block).
  5. Completion: When the count reaches zero, the DMA controller releases the bus and sends an interrupt to the CPU indicating the transfer is complete.

Because the CPU does not handle each individual word, DMA is much faster and far more efficient than programmed or interrupt-driven I/O for large/high-speed transfers (e.g., disk, network).

dma
6short5 marks

Differentiate between RISC and CISC architectures.

RISC vs CISC

FeatureRISC (Reduced Instruction Set Computer)CISC (Complex Instruction Set Computer)
Instruction setSmall, simple, fixed-length instructionsLarge, complex, variable-length instructions
Instructions per programMore instructions neededFewer instructions (one does more)
ExecutionMostly single-cycleMulti-cycle, varies per instruction
Addressing modesFew, simpleMany, complex
Memory accessOnly via load/store instructionsMemory operands allowed in many instructions
Control unitHardwired controlUsually microprogrammed control
RegistersLarge number of registersFewer registers
PipeliningEasy and efficientHarder due to varying instruction lengths
Code sizeLargerSmaller (compact)
ExamplesARM, MIPS, SPARC, RISC-Vx86 (Intel/AMD), VAX, IBM System/360

Summary: RISC emphasises simple instructions executed quickly with a hardwired control unit and heavy use of registers and pipelining, putting complexity in the compiler. CISC emphasises powerful, complex instructions that reduce program length, putting complexity in the hardware/microcode.

risc-cisc
7short5 marks

Explain register transfer language with examples of micro-operations.

Register Transfer Language (RTL)

Register Transfer Language is a symbolic notation used to describe the operations and information flow between the registers of a digital system. A micro-operation is an elementary operation performed on the data stored in registers during one clock pulse.

Notation

  • Registers are denoted by capital letters: R1, R2, AC, PC, MAR.
  • A transfer is shown with the replacement operator .
  • Conditional transfers use a control function with :.

Example of a conditional transfer (transfer R2 into R1 only when control signal P = 1):

P: R1 ← R2

Types of Micro-operations with Examples

1. Register transfer micro-operation — move data between registers:

R1 ← R2          ; copy contents of R2 into R1

2. Arithmetic micro-operations — add, subtract, increment, decrement:

R3 ← R1 + R2     ; addition
R3 ← R1 + R2' + 1 ; subtraction using 2's complement
R1 ← R1 + 1      ; increment

3. Logic micro-operations — bitwise AND, OR, XOR, complement:

R1 ← R1 ∧ R2     ; bitwise AND
R1 ← R1 ∨ R2     ; bitwise OR
R1 ← R1'         ; complement

4. Shift micro-operations — logical, circular, arithmetic shifts:

R1 ← shl R1      ; shift left
R1 ← shr R1      ; shift right

RTL provides a concise, hardware-independent way to specify the internal organisation and control sequences of a computer.

register-transfer
8short5 marks

Explain the algorithm for division of unsigned integers (restoring division) with an example.

Restoring Division of Unsigned Integers

Restoring division performs binary division of an nn-bit dividend by a divisor using repeated subtraction and shifting, with a register A (initially 0, the partial remainder), register Q (the dividend, ends holding the quotient), and register M (the divisor).

Algorithm

Repeat nn times (n = number of dividend bits):

  1. Shift left the combined register pair A,Q by one bit.
  2. Subtract: A ← A − M.
  3. Test sign of A:
    • If A ≥ 0 (sign bit 0): set the least-significant bit Q0 ← 1.
    • If A < 0 (sign bit 1): restore by A ← A + M and set Q0 ← 0.

After nn iterations: Q = quotient, A = remainder.

Example: 7 ÷ 3 (4-bit)

Dividend Q = 0111, Divisor M = 0011 (so −M = 1101 in 2's complement). Start A = 0000.

StepOperationAQ
Init00000111
1Shift left0000111_
A − M1101
A<0 → restore, Q0=000001110
2Shift left0001110_
A − M1110
A<0 → restore, Q0=000011100
3Shift left0011100_
A − M0000
A≥0 → Q0=100001001
4Shift left0001001_
A − M1110
A<0 → restore, Q0=000010010

Result: Quotient Q = 0010 = 2, Remainder A = 0001 = 1.

Check: 7=3×2+17 = 3 \times 2 + 1. ✓

arithmetic
9short5 marks

Explain the concept of microprogrammed control and microinstruction format.

Microprogrammed Control

In a microprogrammed control unit, the control signals required to execute each instruction are generated by reading microinstructions stored in a special memory called the control memory (control store). Each machine instruction corresponds to a microprogram — a sequence of microinstructions. This contrasts with hardwired control, where control signals are produced by fixed logic gates.

Key Components

  • Control Memory (ROM): stores all microprograms.
  • Control Address Register (CAR): holds the address of the next microinstruction.
  • Control Data Register (CDR): holds the microinstruction currently being executed.
  • Sequencer / Next-address generator: determines the address of the next microinstruction (sequential, branch, or based on instruction opcode).

Operation: CAR addresses control memory → the microinstruction is read into CDR → its control field activates the control signals → the sequencer updates CAR for the next microinstruction.

Microinstruction Format

A typical microinstruction contains the following fields:

FieldPurpose
Control / Micro-operation fieldBits specifying which control signals are activated (the micro-operations to perform)
Condition (CD) fieldSpecifies the status/condition to test for branching (e.g., zero, carry, unconditional)
Branch (BR) fieldType of branch/next-address selection (jump, call, return, map)
Address fieldThe address in control memory of the next microinstruction if a branch is taken

Advantages: flexible, easy to modify/design, supports complex instructions. Disadvantage: slower than hardwired control due to control-memory access.

control-unitmicroprogram
10short5 marks

What is virtual memory? Explain the concept of paging.

Virtual Memory

Virtual memory is a memory-management technique that gives a program the illusion of a very large, contiguous main memory by using a combination of RAM and secondary storage (disk). Only the parts of a program currently in use are kept in physical RAM; the rest reside on disk and are brought in on demand. This allows programs larger than physical memory to run and provides isolation between processes.

The addresses generated by the CPU are logical (virtual) addresses, which are translated to physical addresses by the Memory Management Unit (MMU).

Paging

Paging is a virtual-memory scheme that removes the need for contiguous allocation:

  • The virtual address space is divided into fixed-size blocks called pages.
  • Physical memory is divided into blocks of the same size called frames.
  • A page table maps each virtual page number to a physical frame number.

Address Translation

A virtual address is split into a page number and an offset:

Virtual address = | Page number | Offset |

The page number indexes the page table to get the frame number; the frame number combined with the offset gives the physical address.

  • Page hit: the page is in a frame → access proceeds.
  • Page fault: the page is on disk → the OS loads it into a free/replaced frame, updates the page table, and retries.

A TLB (Translation Lookaside Buffer) caches recent page-table entries to speed up translation. Paging eliminates external fragmentation but can cause internal fragmentation in the last page.

memoryvirtual
11short5 marks

Explain the instruction format and the types of instructions based on the number of addresses.

Instruction Format

An instruction format defines the layout of bits in a machine instruction. It specifies the fields that the control unit interprets. A typical instruction contains:

OpcodeOperand / Address field(s)Addressing mode
  • Opcode (operation code): specifies the operation to perform (ADD, LOAD, JMP, etc.).
  • Operand/Address field: specifies the data or the address of the operand(s).
  • Mode field: specifies how the operand address is interpreted (direct, indirect, indexed, etc.).

The number of address fields gives rise to the following instruction types.

Types of Instructions by Number of Addresses

1. Three-address instruction — two source operands and a destination.

ADD R1, R2, R3   ; R1 ← R2 + R3

For X = (A+B)*(C+D): ADD T1,A,B ADD T2,C,D MUL X,T1,T2. Shorter programs but long instructions.

2. Two-address instruction — one operand is both source and destination.

ADD R1, R2       ; R1 ← R1 + R2

Most common in real machines.

3. One-address instruction — uses an implied accumulator (AC).

LOAD A           ; AC ← A
ADD  B           ; AC ← AC + B
STORE X          ; X ← AC

4. Zero-address instruction — operands are implicit on a stack (stack-based machine).

PUSH A
PUSH B
ADD              ; pops A, B, pushes A+B
POP X

As the number of address fields decreases, instructions become shorter but more instructions are needed to perform the same task.

instruction
12short5 marks

Explain set-associative mapping technique of cache memory with an example.

Set-Associative Mapping

Set-associative mapping is a cache mapping technique that combines the features of direct and fully associative mapping. The cache is divided into a number of sets, and each set contains kk lines. This is called a k-way set-associative cache.

  • A main-memory block maps to one specific set (like direct mapping):
Set number=(Block number)mod(Number of sets)\text{Set number} = (\text{Block number}) \bmod (\text{Number of sets})
  • Within that set, the block may be placed in any of the kk lines (like associative mapping).

The physical address is divided into three fields:

TagSetWord (offset)

On access, the set field selects one set, and the tag is compared in parallel with the tags of all kk lines in that set only. A match is a hit.

Example

Suppose:

  • Cache = 8 lines, 2-way set-associative ⇒ number of sets = 8 / 2 = 4 sets.
  • Main memory has 256 blocks.

Then block number BB maps to set =Bmod4= B \bmod 4.

  • Block 0 → set 0, Block 1 → set 1, Block 4 → set 0, Block 5 → set 1, etc.

Block 4 can be stored in either of the 2 lines of set 0. So blocks 0 and 4 (both mapping to set 0) can reside in cache simultaneously — which would conflict in direct mapping. This reduces conflict misses.

Advantages: Fewer conflict misses than direct mapping; far cheaper than fully associative (only kk comparators). A replacement policy (e.g., LRU) selects the victim line within the set. Most modern caches are 2-, 4-, or 8-way set-associative.

cachemapping

Frequently asked questions

Where can I find the BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) question paper 2080?
The full BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) 2080 (regular) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.
Does the Computer Architecture (BSc CSIT, CSC208) 2080 paper come with solutions?
Yes. Every question on this Computer Architecture (BSc CSIT, CSC208) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.
How many marks is the BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) 2080 paper?
The BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) 2080 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.
Is practising this Computer Architecture (BSc CSIT, CSC208) past paper free?
Yes — reading and attempting this Computer Architecture (BSc CSIT, CSC208) past paper on Kekkei is completely free.