Browse papers
A

Section A: Long Answer Questions

Attempt any TWO questions.

3 questions·10 marks each
1long10 marks

Explain Flynn's classification of computer architectures (SISD, SIMD, MISD, MIMD) with examples.

Flynn's Classification of Computer Architectures

Michael Flynn (1966) classified computer architectures based on the number of concurrent instruction streams and data streams that can be processed. This gives four categories.

ClassInstruction StreamData StreamMeaning
SISDSingleSingleOne instruction on one data item
SIMDSingleMultipleOne instruction on many data items
MISDMultipleSingleMany instructions on one data item
MIMDMultipleMultipleMany instructions on many data items

1. SISD (Single Instruction, Single Data)

A single processing unit fetches and executes one instruction at a time, operating on a single data stream. It corresponds to the classical von Neumann sequential machine. Instruction-level parallelism (pipelining) may exist, but there is only one instruction and one data stream logically.

  • Example: Traditional uniprocessor machines such as early IBM PCs, the Intel 8085/8086, and any single-core CPU.

2. SIMD (Single Instruction, Multiple Data)

A single control unit broadcasts one instruction to many processing elements, each operating on its own data. Ideal for data-parallel problems (vector/matrix/image processing).

  • Example: Vector processors (Cray-1), array processors, GPUs, and CPU SIMD extensions like Intel MMX/SSE/AVX.

3. MISD (Multiple Instruction, Single Data)

Several processing units execute different instructions on the same data stream. It is mostly theoretical and rarely built.

  • Example: Fault-tolerant systems that run the same data through redundant pipelines (e.g., flight-control voting systems); systolic arrays are sometimes cited.

4. MIMD (Multiple Instruction, Multiple Data)

Multiple autonomous processors each execute their own instruction stream on their own data stream. This is the most general and widely used parallel model.

  • Shared-memory MIMD (multiprocessors) or distributed-memory MIMD (multicomputers/clusters).
  • Example: Multi-core CPUs, symmetric multiprocessors (SMP), and clusters/supercomputers.

Summary

SISD = serial computer, SIMD = data parallelism (one op on many data), MISD = many ops on one datum (rare), MIMD = full multiprocessing. Modern systems often combine these (e.g., a multi-core CPU = MIMD with SIMD units inside each core).

parallel
2long10 marks

What is an addressing mode? Explain different addressing modes with suitable examples.

Addressing Modes

An addressing mode is the method (specified in the instruction) used to determine the effective address (EA) of the operand, i.e., the way the operand of an instruction is located. Addressing modes give flexibility in accessing operands (registers, immediate values, memory) and support features like pointers, arrays, and program relocation.

Assume R1, R2 are registers, M[] denotes memory, and AC is an accumulator.

1. Implied / Implicit Mode

The operand is specified implicitly in the instruction itself; no address field is needed.

  • Example: CMA (complement accumulator), CLC (clear carry). Stack operations also use implied SP.

2. Immediate Mode

The operand (a constant) is given directly in the instruction's address field.

  • Example: MOV R1, #5R1 = 5.

3. Register Mode

The operand is in a CPU register; the address field names the register. EA = register.

  • Example: MOV R1, R2R1 = R2.

4. Register Indirect Mode

The register holds the address of the operand in memory. EA = contents of register.

  • Example: MOV R1, (R2)R1 = M[R2].

5. Direct (Absolute) Mode

The address field contains the memory address of the operand. EA = address field.

  • Example: MOV R1, 2000R1 = M[2000].

6. Indirect Mode

The address field points to a memory location that holds the address of the operand. EA = M[address field].

  • Example: if M[2000] = 3000, then MOV R1, @2000R1 = M[3000].

7. Indexed Mode

EA = base address + contents of an index register X. Useful for arrays.

  • Example: EA = Address + X; element A[i] accessed by varying X.

8. Base-Register Mode

EA = base register + offset in instruction. Used for relocation/segmentation.

  • Example: EA = BR + offset.

9. Relative Mode

EA = contents of Program Counter (PC) + signed offset. Used in branch instructions.

  • Example: EA = PC + offset → jump relative to current instruction.

10. Auto-Increment / Auto-Decrement Mode

Like register indirect, but the register is automatically incremented/decremented after/before access. Useful for stepping through arrays/stacks.

  • Example: MOV R1, (R2)+ reads M[R2] then increments R2.

Summary

Immediate and register modes are fastest (no memory access for operand); direct/indirect/indexed/relative trade speed for flexibility and support arrays, pointers, and relocatable code.

addressing
3long10 marks

Explain the design of a binary adder-subtractor circuit. Describe Booth's multiplication algorithm and multiply (-5) x (3) using it.

Binary Adder-Subtractor

A binary adder-subtractor performs both addition and subtraction using a single circuit built from full adders, exploiting the fact that subtraction is addition of the 2's complement.

Design

For two n-bit numbers AA and BB with a mode control line MM:

  • Each bit BiB_i is XORed with MM before entering full adder ii.
  • MM is also fed as the carry-in C0C_0 to the least-significant full adder.
Si=Ai(BiM)CiS_i = A_i \oplus (B_i \oplus M) \oplus C_i

Operation:

  • When M=0M = 0: Bi0=BiB_i \oplus 0 = B_i and C0=0C_0 = 0, so the circuit computes A+BA + B (addition).
  • When M=1M = 1: Bi1=BiB_i \oplus 1 = \overline{B_i} (1's complement) and C0=1C_0 = 1, so it computes A+B+1=ABA + \overline{B} + 1 = A - B (2's complement subtraction).

The overflow is detected by V=CnCn1V = C_n \oplus C_{n-1} (XOR of the last two carries) for signed numbers.

         M (mode)
          |
B3  B2  B1  B0
 |   |   |   |  (each XOR with M)
A3  A2  A1  A0
 FA<-FA<-FA<-FA <- C0 = M
 |   |   |   |
S3  S2  S1  S0

M=0 → A+B, M=1 → A−B.

Booth's Multiplication Algorithm

Booth's algorithm multiplies two signed 2's-complement numbers efficiently by examining pairs of bits. Let the multiplicand be M, multiplier be Q. Use an extra bit Q1Q_{-1} (initially 0), accumulator A (initially 0), and a counter for n bits.

Rule (examine Q0Q1Q_0 Q_{-1}):

  • 00 or 11 → arithmetic shift right only.
  • 01A = A + M, then arithmetic shift right.
  • 10A = A − M, then arithmetic shift right.

Repeat for n cycles; result is in A,Q.

Multiply (−5) × (3) using Booth's Algorithm

Use 4-bit numbers (n=4n = 4):

  • Multiplicand M = −5 = 1011, so −M = +5 = 0101.
  • Multiplier Q = +3 = 0011.
  • Initially A = 0000, Q = 0011, Q₋₁ = 0.
StepOperationAQQ₋₁
Init000000110
1Q₀Q₋₁=10 → A = A − M (A+0101)010100110
ASR001010011
2Q₀Q₋₁=11 → ASR000101001
3Q₀Q₋₁=01 → A = A + M (A+1011)110001001
ASR111000100
4Q₀Q₋₁=00 → ASR111100010

Final result A,Q = 1111 0001.

Interpreting 11110001 as an 8-bit 2's-complement number:

111100012=(000011112)=1511110001_2 = -(00001111_2) = -15

Verification: (5)×(3)=15(-5) \times (3) = -15. ✔

aluarithmetic
B

Section B: Short Answer Questions

Attempt any EIGHT questions.

9 questions·5 marks each
4short5 marks

What is virtual memory? Explain the concept of paging.

Virtual Memory

Virtual memory is a memory-management technique that gives a program the illusion of a very large, contiguous main memory by using a combination of RAM and secondary storage (disk). The total addressable (virtual) address space can be larger than the physical RAM. The OS, with hardware support (MMU), transparently moves data between disk and RAM as needed.

Benefits: programs larger than RAM can run, multiprogramming with isolation/protection, and efficient memory utilization.

Paging

Paging is the most common implementation of virtual memory:

  • The virtual address space is divided into fixed-size blocks called pages.
  • Physical memory is divided into equal-size blocks called frames (page size = frame size, e.g., 4 KB).
  • A page table (one per process) maps each virtual page number to a physical frame number.

Address translation: A virtual address is split into a page number and an offset:

Physical Address=(Frame No. from page table)×page size+offset\text{Physical Address} = (\text{Frame No. from page table}) \times \text{page size} + \text{offset}

When a referenced page is not in RAM, a page fault occurs; the OS loads the page from disk into a free frame (replacing one using LRU/FIFO if memory is full) and updates the page table. A TLB (Translation Lookaside Buffer) caches recent translations to speed this up.

Advantage: eliminates external fragmentation (fixed-size blocks). Paging is transparent to the programmer.

memoryvirtual
5short5 marks

Explain the instruction format and the types of instructions based on the number of addresses.

Instruction Format

An instruction format defines the layout of bits in a machine instruction. It typically consists of:

  • Opcode field – specifies the operation to be performed (e.g., ADD, LOAD).
  • Operand/Address field(s) – specify the operands or their addresses.
  • Mode field – (optional) specifies the addressing mode.
| Opcode | Mode | Address/Operand 1 | Address 2 | ... |

The number of address fields depends on the CPU organization.

Types of Instructions Based on Number of Addresses

1. Three-Address Instruction

Specifies two source operands and one destination. (General-register machines.)

  • Format: ADD R1, R2, R3R1 = R2 + R3.
  • Computes X = (A+B)*(C+D):
ADD T1, A, B
ADD T2, C, D
MUL X, T1, T2
  • Merit: short programs. Demerit: long instructions (many address bits).

2. Two-Address Instruction

Most common; one operand acts as both source and destination.

  • Format: ADD R1, R2R1 = R1 + R2.
MOV R1, A
ADD R1, B

3. One-Address Instruction

Uses an implied accumulator (AC) for one operand and result.

  • Format: ADD BAC = AC + B.
LOAD A
ADD B
STORE X

4. Zero-Address Instruction

Uses a stack; operands are implicitly on top of the stack (no address field for arithmetic ops). Found in stack machines using reverse Polish notation.

  • Format: ADD → pops two top elements, pushes their sum.
PUSH A
PUSH B
ADD
POP X

Trade-off

More addresses → fewer instructions but longer instructions; fewer addresses → shorter instructions but more of them per program.

instruction
6short5 marks

Explain set-associative mapping technique of cache memory with an example.

Set-Associative Mapping

Set-associative mapping is a cache mapping technique that combines the advantages of direct mapping and fully-associative mapping. The cache is divided into a number of sets, and each set contains k lines (called a k-way set-associative cache). A memory block maps to exactly one set (like direct mapping), but can be placed in any line within that set (like associative mapping).

Mapping Rule

Set number=(Block number)mod(Number of sets)\text{Set number} = (\text{Block number}) \bmod (\text{Number of sets})

The memory address is divided into three fields:

| Tag | Set (Index) | Word (Offset) |
  • Word/Offset: selects the byte/word within a block.
  • Set index: selects the set.
  • Tag: compared against the tags of all k lines in that set to detect a hit.

Example

Suppose:

  • Main memory = 4 K blocks (block numbers 0–4095).
  • Cache = 128 lines, organized as 2-way set associative ⇒ Number of sets = 128 / 2 = 64 sets.

Then block B maps to set B mod 64.

  • Block 0 → set 0, block 64 → set 0, block 128 → set 0, ... (all may reside in either of the 2 lines of set 0).
  • Block 5 → set 5, block 69 → set 5, etc.

When the CPU accesses an address, the set index selects one set; the tags of both lines in that set are compared in parallel with the address tag. A match = cache hit; otherwise a miss and a line is replaced (using LRU within the set).

Advantages

  • Fewer conflict misses than direct mapping (k choices per set).
  • Cheaper and faster than fully associative mapping (only k tag comparisons, not all lines).
  • Most modern caches are 2-way, 4-way, or 8-way set associative.
cachemapping
7short5 marks

Explain the IEEE 754 floating point representation for single precision numbers with an example.

IEEE 754 Single-Precision Floating-Point Representation

The IEEE 754 single-precision format uses 32 bits to represent a real number, divided into three fields:

| Sign (1 bit) | Exponent (8 bits) | Mantissa/Fraction (23 bits) |
  • Sign (S): 1 bit — 0 for positive, 1 for negative.
  • Exponent (E): 8 bits — stored in biased form with a bias of 127. So Estored=Eactual+127E_{stored} = E_{actual} + 127.
  • Mantissa (M): 23 bits — the fractional part of the normalized significand. A hidden/implicit leading 1 is assumed, so the actual significand is 1.M.

Value (for normalized numbers):

N=(1)S×(1.M)2×2(Estored127)N = (-1)^S \times (1.M)_2 \times 2^{(E_{stored} - 127)}

Example: Represent 12.625-12.625

Step 1 – Convert to binary:

12.62510=1100.101212.625_{10} = 1100.101_2

(12=110012 = 1100, 0.625=0.1010.625 = 0.101)

Step 2 – Normalize (scientific form):

1100.101=1.100101×231100.101 = 1.100101 \times 2^{3}

Step 3 – Determine fields:

  • Sign S=1S = 1 (number is negative).
  • Exponent: Estored=3+127=130=100000102E_{stored} = 3 + 127 = 130 = 10000010_2.
  • Mantissa: digits after the point = 100101, padded to 23 bits → 10010100000000000000000.

Step 4 – Assemble (32 bits):

1 10000010 10010100000000000000000

Grouping in hex: 1100 0001 0100 1010 0000 0000 0000 0000 = 0xC14A0000.

So 12.625-12.625 is stored as C1 4A 00 00 in IEEE 754 single precision.

Special values: E=0,M=0E=0, M=0 → ±0; E=255,M=0E=255, M=0 → ±∞; E=255,M0E=255, M\ne0 → NaN.

number-system
8short5 marks

Explain the memory hierarchy in a computer system with a suitable diagram.

Memory Hierarchy

The memory hierarchy organizes the various types of memory in a computer system into levels based on speed, cost per bit, and capacity. As we move up the hierarchy (toward the CPU), memory becomes faster, smaller, and more expensive; as we move down, it becomes slower, larger, and cheaper.

        ^  Faster, Smaller, Costlier
        |
   [ CPU Registers ]        <- fastest, smallest
   [ Cache (L1, L2, L3) ]
   [ Main Memory (RAM) ]
   [ Secondary Storage (SSD/HDD) ]
   [ Tertiary Storage (Tape/Optical) ]  <- slowest, largest
        |
        v  Slower, Larger, Cheaper

Levels (from fastest to slowest)

  1. Registers – inside the CPU; fastest, hold operands/results currently in use; capacity in bytes; accessed in 1 clock cycle.
  2. Cache memory (SRAM) – L1/L2/L3; holds frequently used data/instructions to bridge the CPU–RAM speed gap.
  3. Main memory (RAM, DRAM) – holds active programs and data; volatile; access in tens of nanoseconds.
  4. Secondary storage (SSD/HDD) – non-volatile, large capacity for permanent storage; access in microseconds/milliseconds.
  5. Tertiary/Backup storage (magnetic tape, optical disks) – very large, very slow, used for archival/backup.

Principle of Locality

The hierarchy works efficiently because of the principle of locality — programs tend to access a small portion of memory repeatedly (temporal locality) and nearby addresses (spatial locality). Frequently used data is kept in faster upper levels, so the system approaches the speed of the fastest memory at the cost close to the cheapest memory.

memoryhierarchy
9short5 marks

What is an interrupt? Explain the different types of interrupts.

Interrupt

An interrupt is a signal sent to the CPU (by hardware or software) that temporarily suspends the currently executing program, transfers control to a special routine called an Interrupt Service Routine (ISR), and resumes the original program afterward. Interrupts allow the CPU to respond promptly to events (e.g., I/O completion) without continuously polling, improving efficiency and concurrency.

Basic sequence: current instruction completes → CPU saves PC and status (PSW) onto the stack → loads the ISR address (via interrupt vector) → executes ISR → restores saved state → resumes the interrupted program.

Types of Interrupts

1. Hardware Interrupts

Generated by external hardware devices.

  • (a) Maskable interrupts: can be enabled/disabled (ignored) by the CPU using interrupt-enable flags (e.g., INTR line). Used by most peripherals.
  • (b) Non-maskable interrupts (NMI): cannot be disabled; reserved for critical events such as power failure or hardware error.

2. Software Interrupts

Generated by executing a special instruction within a program (e.g., INT n in x86, system calls/traps). Used to request OS services.

3. Internal Interrupts / Exceptions (Traps)

Generated internally by the processor due to exceptional conditions during execution, such as:

  • Division by zero
  • Invalid/illegal opcode
  • Arithmetic overflow
  • Page fault / memory-protection violation

Classification Summary

  • By source: internal (exceptions/traps) vs external (hardware) vs software.
  • By maskability: maskable vs non-maskable.
  • By vectoring: vectored (device supplies ISR address) vs non-vectored (fixed ISR address).
interrupt
10short5 marks

What is a system bus? Explain address bus, data bus and control bus.

System Bus

A system bus is a set of parallel conducting wires (communication pathway) that connects the major components of a computer — the CPU, main memory, and I/O devices — and carries data, addresses, and control signals among them. It is shared by all components, so only one device can drive the bus at a time (controlled by a bus arbiter). The system bus is logically divided into three functional groups:

1. Address Bus

  • Carries the memory or I/O address of the location the CPU wants to access.
  • It is unidirectional (from CPU to memory/I/O).
  • Its width determines the addressable memory size: an n-line address bus can address 2n2^n locations. (e.g., 32-bit address bus → 2322^{32} = 4 GB addressable space).

2. Data Bus

  • Carries the actual data being transferred between CPU, memory, and I/O.
  • It is bidirectional (data can flow both ways: read and write).
  • Its width determines how many bits are transferred at once (e.g., 32-bit or 64-bit), affecting performance.

3. Control Bus

  • Carries control and timing signals that coordinate and synchronize the operations of all components.
  • Typical signals: Memory Read, Memory Write, I/O Read, I/O Write, Clock, Interrupt Request (INTR), Bus Request/Grant, Reset.
  • It is generally bidirectional (different lines flow in different directions).

Summary

The address bus says where, the data bus says what, and the control bus says how/when a transfer happens. Together they enable communication across the system.

bus
11short5 marks

What is DMA? Explain how direct memory access transfers data without CPU intervention.

DMA (Direct Memory Access)

Direct Memory Access (DMA) is an I/O technique in which a dedicated controller — the DMA controller (DMAC) — transfers a block of data directly between an I/O device and main memory without involving the CPU for each word. This frees the CPU from the overhead of programmed I/O or interrupt-driven I/O, greatly increasing throughput for high-speed devices (disks, network cards).

How DMA Transfers Data Without CPU Intervention

  1. Setup (CPU involvement): The CPU programs the DMA controller by writing into its registers:

    • the memory address (where data should go/come from),
    • the word/byte count (number of words to transfer),
    • the direction (read from or write to memory), and a start command.
  2. Bus request: When the I/O device is ready, the DMAC requests control of the system bus by asserting the HOLD / Bus Request (BR) line to the CPU.

  3. Bus grant (cycle stealing / DMA): The CPU finishes its current bus cycle, releases (floats) the system bus, and acknowledges with HLDA / Bus Grant (BG). The CPU is temporarily disconnected from the bus.

  4. Data transfer: The DMAC now acts as bus master and transfers data directly between the I/O device and memory over the address/data buses, incrementing the address and decrementing the count after each word. No CPU instructions are executed for the transfer.

  5. Completion: When the word count reaches zero, the DMAC releases the bus (deasserts HOLD) and interrupts the CPU to signal that the transfer is complete. The CPU resumes normal operation.

Modes

  • Burst (block) mode: entire block transferred while DMAC holds the bus.
  • Cycle stealing: DMAC takes the bus for one word at a time, releasing it between words so the CPU can still work.
  • Transparent/hidden mode: transfers only when CPU is not using the bus.

Advantage: CPU is involved only at the start and end, so it is free to do other work during the bulk transfer — much faster than programmed/interrupt I/O for large blocks.

dma
12short5 marks

Differentiate between RISC and CISC architectures.

RISC vs CISC

RISC (Reduced Instruction Set Computer) uses a small set of simple, fixed-length instructions optimized for fast, pipelined execution. CISC (Complex Instruction Set Computer) provides a large set of powerful, variable-length instructions, some of which perform complex multi-step operations.

FeatureRISCCISC
Instruction set sizeSmall, simple instructionsLarge, complex instructions
Instruction lengthFixed lengthVariable length
Execution timeUsually 1 clock cycle per instructionMultiple cycles per instruction
Addressing modesFew (simple)Many (complex)
Memory accessOnly via LOAD/STORE (load-store architecture)Operands can be in memory directly
RegistersLarge number of general-purpose registersFewer registers
Control unitHardwired controlMicroprogrammed control
PipeliningSimple and highly efficientHarder to pipeline
Code sizeLarger (more instructions per program)Smaller (fewer, denser instructions)
ComplexityIn compiler (software)In hardware/microcode
ExamplesARM, MIPS, SPARC, RISC-V, PowerPCIntel x86, Motorola 68000, VAX

Summary

RISC pushes complexity to the compiler with many simple, uniform instructions that pipeline well, while CISC builds complex operations into hardware/microcode, reducing code size at the cost of slower, harder-to-pipeline execution. Modern x86 CPUs internally translate CISC instructions into RISC-like micro-operations, blending both philosophies.

risc-cisc

Frequently asked questions

Where can I find the BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) question paper 2078?
The full BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) 2078 (regular) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.
Does the Computer Architecture (BSc CSIT, CSC208) 2078 paper come with solutions?
Yes. Every question on this Computer Architecture (BSc CSIT, CSC208) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.
How many marks is the BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) 2078 paper?
The BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) 2078 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.
Is practising this Computer Architecture (BSc CSIT, CSC208) past paper free?
Yes — reading and attempting this Computer Architecture (BSc CSIT, CSC208) past paper on Kekkei is completely free.