BSc CSIT (TU) Science Computer Architecture (BSc CSIT, CSC208) Question Paper 2074 Nepal

Q: Where can I find the BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) question paper 2074?

The full BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) 2074 (Regular (annual)) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.

Q: Does the Computer Architecture (BSc CSIT, CSC208) 2074 paper come with solutions?

Yes. Every question on this Computer Architecture (BSc CSIT, CSC208) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.

Q: How many marks is the BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) 2074 paper?

The BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) 2074 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.

Q: Is practising this Computer Architecture (BSc CSIT, CSC208) past paper free?

Yes — reading and attempting this Computer Architecture (BSc CSIT, CSC208) past paper on Kekkei is completely free.

Question

1Long answer10 marks

Explain the design of a binary adder-subtractor circuit. Describe Booth's multiplication algorithm and multiply (-5) x (3) using it.

aluarithmetic

Answer 1

Binary Adder-Subtractor Circuit

A single circuit can perform both addition and subtraction on two $n$ -bit numbers by exploiting two's-complement arithmetic, since $A - B = A + (\overline{B} + 1)$ .

Design:

Use $n$ full adders (a ripple-carry adder) for the two $n$ -bit operands $A$ and $B$ .
Insert one XOR gate in each $B$ -bit path with a common control line $M$ (Mode). The other XOR input is $M$ .
Connect $M$ to the carry-in $C_0$ of the least-significant full adder.

Operation:

When $M = 0$ : each XOR passes $B$ unchanged ( $B \oplus 0 = B$ ) and $C_0 = 0$ , so the circuit computes $A + B$ (addition).
When $M = 1$ : each XOR inverts $B$ ( $B \oplus 1 = \overline{B}$ ) and $C_0 = 1$ , so the circuit computes $A + \overline{B} + 1 = A - B$ (subtraction).

The final carry-out is the carry/borrow, and overflow is detected by $V = C_n \oplus C_{n-1}$ .

      B3 M   B2 M   B1 M   B0 M
       \ /    \ /    \ /    \ /
       XOR    XOR    XOR    XOR
        |      |      |      |
A3 --> FA <-- A2 -> FA <- A1 -> FA <- A0 -> FA <-- C0 = M
C4<-----+------+------+------+
       S3     S2     S1     S0

Booth's Multiplication Algorithm

Booth's algorithm multiplies two signed (two's-complement) numbers by examining pairs of bits of the multiplier and performing add/subtract/shift operations. It treats runs of 1's efficiently.

Registers: $A$ (accumulator, initially 0), $Q$ (multiplier), $Q_{-1}$ (extra bit, initially 0), $M$ (multiplicand). Count $= n$ .

Rule — inspect $Q_0 Q_{-1}$ :

$00$ or $11$ : no operation (only shift)
$01$ : $A \leftarrow A + M$
$10$ : $A \leftarrow A - M$

Then perform an arithmetic right shift (ASR) on $A, Q, Q_{-1}$ together and decrement count. Repeat $n$ times.

Multiply $(-5) \times (3)$ using Booth's Algorithm

Use 4 bits. Multiplicand $M = +3 = 0011$ , $-M = 1101$ . Multiplier $Q = -5 = 1011$ . $n = 4$ .

Initial: $A = 0000$ , $Q = 1011$ , $Q_{-1} = 0$ .

Step	$Q_0 Q_{-1}$	Operation	$A$	$Q$	$Q_{-1}$
Init			0000	1011	0
1	1 0	$A = A - M$	1101	1011	0
		ASR	1110	1101	1
2	1 1	shift only (ASR)	1111	0110	1
3	0 1	$A = A + M$	0010	0110	1
		ASR	0001	0011	0
4	1 0	$A = A - M$	1110	0011	0
		ASR	1111	0001	1

Result $= A Q = 1111\,0001$ (8-bit two's complement).

$1111\,0001_2 = -(0000\,1111_2) = -15$ .

Final answer: $(-5) \times (3) = -15$ . ✓

Answer 2

Control Unit

The control unit (CU) is the part of the CPU that directs the operation of the processor. It generates the timing and control signals needed to fetch, decode and execute instructions, coordinating the ALU, registers, memory and I/O. It interprets each instruction's opcode and produces a sequence of micro-operations (e.g. register transfers, memory read/write, ALU operations) in the correct order.

There are two ways to implement a control unit: hardwired and microprogrammed.

Hardwired Control Unit

Control signals are generated by fixed combinational logic — gates, decoders, flip-flops and a sequence counter. The instruction opcode, status flags and timing signals are inputs to this logic, whose outputs are the control signals.

Advantages:

Very fast, since signals come directly from logic gates (no memory access).
Suited to RISC processors and simple, regular instruction sets.

Disadvantages:

Complex and hard to design for large instruction sets.
Difficult to modify or extend — adding/changing an instruction means redesigning the logic.

Microprogrammed Control Unit

Control signals are stored as microinstructions in a special memory called the control memory (control store). Each machine instruction maps to a microroutine; a microprogram sequencer fetches microinstructions whose bits directly form (or are decoded into) the control signals.

Advantages:

Flexible — modifying control just means changing the microprogram in control memory.
Simpler, systematic design; well suited to CISC with large, complex instruction sets.

Disadvantages:

Slower, because each step requires reading the control memory.
Requires extra control memory, adding cost and area.

Comparison Table

Feature	Hardwired	Microprogrammed
Implementation	Combinational logic	Microinstructions in control memory
Speed	Fast	Slower
Flexibility	Low (hard to modify)	High (change microprogram)
Design complexity	High for complex ISA	Systematic, easier
Cost	Lower (no control memory)	Higher (control store)
Best suited for	RISC / simple ISA	CISC / complex ISA

Answer 3

Cache Memory

Cache memory is a small, very fast (typically SRAM) memory placed between the CPU and main memory. It holds copies of frequently/recently used instructions and data so the CPU can access them quickly, reducing average memory-access time. It works on the principle of locality of reference (temporal and spatial). When the CPU needs data, the cache is checked first: a hit means the data is found in cache; a miss means it must be fetched from main memory (and a block is brought into the cache).

Main memory is divided into blocks; cache is divided into lines (frames) of the same size. Mapping decides which block goes into which line.

1. Direct Mapping

Each main-memory block maps to exactly one cache line, given by:

\text{line} = (\text{block number}) \bmod (\text{number of cache lines})

The address is split into Tag | Line (Index) | Word (Offset).

  Address:  | Tag | Line | Word |
  Block 0,8,16... -> Line 0
  Block 1,9,17... -> Line 1   (cache line stores Tag + data)

Advantage: simple and cheap; only one comparison needed.
Disadvantage: high conflict misses — two heavily used blocks mapping to the same line keep evicting each other.

2. Associative (Fully Associative) Mapping

A block can be placed in any cache line. The address is split into only Tag | Word. The tag of every line must be compared (in parallel via comparators / CAM).

  Address: | Tag | Word |
  Any block -> any free / replaceable line

Advantage: most flexible; lowest conflict misses; best hit ratio.
Disadvantage: expensive — needs many comparators and a replacement policy (LRU/FIFO).

3. Set-Associative Mapping

A compromise: cache is divided into sets, each containing $k$ lines (a $k$ -way set-associative cache). A block maps to one set (direct-mapped) and within that set it can go to any line (associative):

\text{set} = (\text{block number}) \bmod (\text{number of sets})

Address is split into Tag | Set | Word. Only the $k$ tags in the chosen set are compared.

  Address: | Tag | Set | Word |
  Set 0: [line0][line1]   <- 2-way set
  Set 1: [line0][line1]

Advantage: fewer conflict misses than direct mapping, cheaper than fully associative — most widely used (e.g. 2-way, 4-way).
Disadvantage: needs $k$ comparators and a replacement policy within a set.

Summary

Technique	Block placement	Comparisons	Cost	Hit ratio
Direct	one fixed line	1	low	lower
Associative	any line	all lines	high	highest
Set-associative	any line in one set	$k$	medium	good

Answer 4

IEEE 754 Single-Precision Floating Point

The IEEE 754 single-precision format uses 32 bits divided into three fields:

Field	Bits	Width
Sign ( $S$ )	31	1 bit
Exponent ( $E$ )	30–23	8 bits (biased, bias = 127)
Mantissa / Fraction ( $M$ )	22–0	23 bits

The value represented (for normalized numbers) is:

(-1)^{S} \times 1.M \times 2^{(E - 127)}

The leading 1 before the binary point is implicit (hidden bit), giving 24 bits of precision.

Example: represent $-6.75$ .

Sign: number is negative, so $S = 1$ .
Convert magnitude to binary: $6.75 = 110.11_2$ .
Normalize: $110.11 = 1.1011 \times 2^{2}$ , so the unbiased exponent is $2$ and mantissa $= 1011$ .
Biased exponent: $E = 2 + 127 = 129 = 10000001_2$ .
Mantissa (23 bits, drop hidden 1): $1011\,0000\,0000\,0000\,0000\,000$ .

Result:

1 10000001 10110000000000000000000

In hex this is 0xC0D80000.

Special values: $E = 0$ , $M = 0 \Rightarrow \pm 0$ ; $E = 255$ , $M = 0 \Rightarrow \pm\infty$ ; $E = 255$ , $M \neq 0 \Rightarrow$ NaN; $E = 0$ , $M \neq 0 \Rightarrow$ denormalized number.

Answer 5

Memory Hierarchy

The memory hierarchy arranges different memory types in levels according to speed, cost per bit, and capacity. As we move up the hierarchy memories get faster, smaller and more expensive per bit; as we move down they get slower, larger and cheaper. The goal is to give the CPU memory that is, on average, almost as fast as the fastest level but as large and cheap as the slowest, exploiting locality of reference.

            ^  faster, smaller, costlier per bit
            |   +------------------+
            |   |    Registers     |   (inside CPU, fastest)
            |   +------------------+
            |   |   Cache (L1/L2/L3)|   (SRAM)
            |   +------------------+
            |   |   Main Memory     |   (DRAM / RAM)
            |   +------------------+
            |   | Secondary Storage |   (SSD / HDD)
            v   +------------------+
                | Tertiary/Backup   |   (magnetic tape, optical)
      slower, larger, cheaper per bit

Levels (top to bottom):

Registers — small, fastest storage inside the CPU for operands.
Cache memory — fast SRAM (L1, L2, L3) holding frequently used data/instructions.
Main memory (RAM) — DRAM holding currently running programs and data.
Secondary storage — SSD/HDD for permanent, large-capacity, non-volatile storage.
Tertiary / backup — magnetic tapes, optical disks for archival.

Key terms: the CPU first looks at the higher (faster) levels; a hit is found there, a miss forces access to a lower level. Effective access time is kept low because most accesses hit the upper levels due to locality.

Answer 6

Interrupt

An interrupt is a signal to the processor that temporarily suspends the normal execution of the current program so that the CPU can attend to a higher-priority event. When an interrupt occurs, the CPU saves its current state (program counter, registers), transfers control to an Interrupt Service Routine (ISR) through the interrupt vector, services the event, and then restores the saved state to resume the interrupted program. Interrupts allow the CPU to respond to events efficiently instead of continuously polling.

Types of Interrupts

1. Hardware Interrupts — generated by external hardware devices.

Maskable interrupts: can be ignored/delayed by the CPU (disabled via interrupt flag), e.g. signals from keyboard, printer, timer.
Non-maskable interrupts (NMI): cannot be disabled; used for critical events such as power failure or memory parity error.

2. Software Interrupts — generated by execution of an instruction in the program (e.g. INT n / system call / trap). Used by programs to request operating-system services.

3. Internal Interrupts / Exceptions (Traps) — caused by error conditions during instruction execution, e.g. divide-by-zero, invalid opcode, overflow, page fault. They arise from within the CPU.

Other classification: interrupts may also be classified as vectored (the device supplies the ISR address) vs non-vectored (fixed ISR address), and by priority when several occur together.

Answer 7

System Bus

A system bus is a set of parallel conductors (wires) that connects the major components of a computer — CPU, main memory and I/O modules — and carries data, addresses and control signals among them. It is a shared communication pathway; only one device may transmit at a time (controlled by bus arbitration). The system bus is logically divided into three functional buses.

1. Address Bus

Carries the address of the memory location or I/O port that the CPU wants to access.
It is unidirectional (CPU → memory/I/O).
Its width determines the addressing capacity: an $n$ -bit address bus can address $2^n$ locations (e.g. 32-bit bus → $2^{32}$ = 4 GB).

2. Data Bus

Carries the actual data/instructions transferred between CPU, memory and I/O.
It is bidirectional (data can flow both into and out of the CPU).
Its width determines how many bits are transferred at once (e.g. an 8/16/32/64-bit data bus), affecting performance.

3. Control Bus

Carries control and timing signals that coordinate and manage operations on the address and data buses.
It is essentially bidirectional (individual lines may be one-directional).
Typical signals: Memory Read, Memory Write, I/O Read, I/O Write, clock, interrupt request, bus grant/request, reset.

      +-------+   Address Bus (uni)   +--------+
      |  CPU  |----------------------->| Memory |
      |       |<====Data Bus========>|  /I/O  |
      |       |----Control Bus-------->|        |
      +-------+                        +--------+

Answer 8

Direct Memory Access (DMA)

DMA is an I/O technique in which a special hardware controller — the DMA controller (DMAC) — transfers a block of data directly between an I/O device and main memory without the CPU moving each word. It is used for fast, high-volume transfers (disk, network) because programmed I/O and interrupt-driven I/O waste CPU time moving data word by word.

How DMA Transfers Data Without CPU Intervention

The CPU only sets up the transfer; the DMA controller then carries it out:

Initialization: The CPU programs the DMA controller with the memory start address, the word/byte count (block size), the direction (read/write) and the device involved, then continues with other work.
Bus request: When the I/O device is ready, the DMA controller sends a Bus Request (HOLD / DRQ) to the CPU asking for control of the system buses.
Bus grant: The CPU finishes its current bus cycle, releases the address, data and control buses (tri-states them), and replies with a Bus Grant (HLDA / DACK). The CPU is now temporarily detached from the buses.
Data transfer: The DMA controller becomes bus master. It places addresses on the address bus and issues memory/I/O read-write control signals, transferring data directly between the device and memory, incrementing the address and decrementing the count for each word.
Completion: When the count reaches zero, the DMA controller releases the buses and raises an interrupt to inform the CPU that the transfer is complete.

Transfer modes: Burst (block) mode — entire block transferred while CPU is held; Cycle stealing — DMAC takes one bus cycle at a time between CPU cycles; Transparent mode — transfers only when the CPU is not using the bus.

Advantage: frees the CPU during large transfers, greatly improving throughput; the CPU and DMA share the bus (cycle stealing).

Answer 9

RISC vs CISC

CISC (Complex Instruction Set Computer) uses a large set of complex, variable-length instructions, many of which can perform multi-step operations or access memory directly; it emphasizes doing more per instruction (e.g. x86, VAX). RISC (Reduced Instruction Set Computer) uses a small set of simple, fixed-length instructions that execute in (about) one clock cycle, emphasizing simple hardware and compiler optimization (e.g. ARM, MIPS, SPARC).

Feature	RISC	CISC
Instruction set	Small, simple	Large, complex
Instruction length	Fixed	Variable
Execution per instruction	Mostly 1 clock cycle	Multiple clock cycles
Memory access	Only via LOAD/STORE	Many instructions access memory
Addressing modes	Few	Many
Registers	Many general-purpose	Fewer
Control unit	Usually hardwired	Usually microprogrammed
Pipelining	Easy and efficient	Harder
Code size	Larger (more instructions)	Smaller (fewer instructions)
Complexity location	In the compiler/software	In the hardware/microcode
Examples	ARM, MIPS, SPARC	Intel x86, VAX

Answer 10

Register Transfer Language (RTL)

Register Transfer Language (RTL), also called Register Transfer Notation, is a symbolic notation used to describe the micro-operations that transfer information among the registers of a digital system. It precisely specifies, in a concise hardware-oriented form, what data is moved and what operation is performed during each clock cycle.

Basic notation:

Registers are denoted by capital letters: $R1, R2, AR, PC, MAR$ .
The transfer (replacement) operator is $\leftarrow$ . Example: $R2 \leftarrow R1$ means the contents of $R1$ are copied into $R2$ (R1 unchanged).
A control condition is written before the statement with a colon: $P:\; R2 \leftarrow R1$ means the transfer happens only when control signal $P = 1$ .
Two transfers in the same clock are separated by a comma: $T:\; R2 \leftarrow R1,\; R1 \leftarrow R2$ .

Types of Micro-operations (with examples)

1. Register transfer micro-operation — moves data between registers:

R2 \leftarrow R1

2. Arithmetic micro-operations — add, subtract, increment, decrement, complement:

R3 \leftarrow R1 + R2,\quad R2 \leftarrow R2 - 1,\quad R1 \leftarrow \overline{R1} + 1\ (\text{2's complement})

3. Logic micro-operations — bitwise AND, OR, XOR, NOT:

R1 \leftarrow R1 \wedge R2,\quad R1 \leftarrow R1 \vee R2,\quad R1 \leftarrow \overline{R1}

4. Shift micro-operations — logical, circular, arithmetic shifts:

R \leftarrow shl\,R,\quad R \leftarrow shr\,R,\quad R \leftarrow ashr\,R

Example of a memory micro-operation: $DR \leftarrow M[AR]$ (read) and $M[AR] \leftarrow DR$ (write).

Answer 11

Restoring Division of Unsigned Integers

Restoring division mimics manual long division in binary. For an $n$ -bit dividend $Q$ and divisor $M$ , use registers $A$ (accumulator, initially 0), $Q$ (dividend), and $M$ (divisor). Repeat $n$ times.

Algorithm (per step):

A, Q <- arithmetic/logical left shift of (A,Q)   // shift left by 1
A <- A - M                                       // subtract divisor
if A < 0 (sign bit = 1):                         // restore
    Q0 <- 0
    A <- A + M                                   // restore A
else:
    Q0 <- 1

After $n$ iterations, $Q$ holds the quotient and $A$ holds the remainder.

Example: Divide $7 \div 3$ (4-bit)

$Q = 0111$ (dividend = 7), $M = 0011$ (divisor = 3), $A = 0000$ , $n = 4$ .

Step	Operation	$A$	$Q$
Init		0000	0111
1	Shift left	0000	111_
	$A - M$	1101	111_
	$A<0$ → restore, $Q_0{=}0$	0000	1110
2	Shift left	0001	110_
	$A - M$	1110	110_
	$A<0$ → restore, $Q_0{=}0$	0001	1100
3	Shift left	0011	100_
	$A - M$	0000	100_
	$A\ge 0$ → $Q_0{=}1$	0000	1001
4	Shift left	0001	001_
	$A - M$	1110	001_
	$A<0$ → restore, $Q_0{=}0$	0001	0010

Result: Quotient $Q = 0010 = 2$ , Remainder $A = 0001 = 1$ .

Check: $3 \times 2 + 1 = 7$ . ✓

Answer 12

Microprogrammed Control

In a microprogrammed control unit, the control signals required to execute each machine instruction are not generated by fixed logic but are stored as microinstructions in a special read-only memory called the control memory (control store). The collection of microinstructions that implements one machine instruction is a microroutine, and the whole set of microroutines is the microprogram.

Working:

The opcode of the fetched instruction is mapped to the starting address of its microroutine in control memory.
The Control Address Register (CAR) holds the address of the next microinstruction.
The microinstruction is read into the Control Buffer/Data Register; its bits (directly or after decoding) become the control signals.
The microprogram sequencer computes the next CAR value (increment, branch, or map) and the cycle repeats until the instruction is done.

Key components: Control Memory, CAR (sequencer), Control Data Register, and next-address logic.

Microinstruction Format

A microinstruction is a binary word divided into fields. A typical format has three parts:

+----------------+----------------+-----------------+
|  Control field |  Condition /   |  Branch / Next  |
| (micro-ops)    |  Branch field  |  Address field  |
+----------------+----------------+-----------------+

Control / micro-operation field: specifies the control signals / micro-operations to perform this cycle. It may be horizontal (one bit per control line — fast, wide) or vertical (encoded fields decoded into signals — compact, narrower).
Condition (branch) field: selects the status/condition (e.g. zero, carry, unconditional) that decides whether a branch is taken.
Address (next-address) field: gives the address of the next microinstruction in control memory when a branch is taken.

Horizontal vs Vertical: horizontal microinstructions are wider, allow more parallelism and are faster but use more memory; vertical microinstructions are encoded, shorter and need extra decoders.

Level	BSc CSIT (TU)
Stream	Science
Subject	Computer Architecture (BSc CSIT, CSC208)
Year	2074 BS
Exam session	Regular (annual)
Full marks	60
Time allowed	180 minutes
Questions	12, all with step-by-step solutions

BSc CSIT (TU) Science Computer Architecture (BSc CSIT, CSC208) Question Paper 2074 Nepal

Section A: Long Answer Questions

Binary Adder-Subtractor Circuit

Booth's Multiplication Algorithm

Multiply $(-5) \times (3)$ using Booth's Algorithm

Control Unit

Hardwired Control Unit

Microprogrammed Control Unit

Comparison Table

Cache Memory

1. Direct Mapping

2. Associative (Fully Associative) Mapping

3. Set-Associative Mapping

Summary

Section B: Short Answer Questions

IEEE 754 Single-Precision Floating Point

Memory Hierarchy

Interrupt

Types of Interrupts

System Bus

1. Address Bus

2. Data Bus

3. Control Bus

Direct Memory Access (DMA)

How DMA Transfers Data Without CPU Intervention

RISC vs CISC

Register Transfer Language (RTL)

Types of Micro-operations (with examples)

Restoring Division of Unsigned Integers

Example: Divide $7 \div 3$ (4-bit)

Microprogrammed Control

Microinstruction Format

Frequently asked questions

Section A: Long Answer Questions

Binary Adder-Subtractor Circuit

Booth's Multiplication Algorithm

Multiply (−5)×(3)(-5) \times (3)(−5)×(3) using Booth's Algorithm

Control Unit

Hardwired Control Unit

Microprogrammed Control Unit

Comparison Table

Cache Memory

1. Direct Mapping

2. Associative (Fully Associative) Mapping

3. Set-Associative Mapping

Summary

Section B: Short Answer Questions

IEEE 754 Single-Precision Floating Point

Memory Hierarchy

Interrupt

Types of Interrupts

System Bus

1. Address Bus

2. Data Bus

3. Control Bus

Direct Memory Access (DMA)

How DMA Transfers Data Without CPU Intervention

RISC vs CISC

Register Transfer Language (RTL)

Types of Micro-operations (with examples)

Restoring Division of Unsigned Integers

Example: Divide 7÷37 \div 37÷3 (4-bit)

Microprogrammed Control

Microinstruction Format

Frequently asked questions

Multiply $(-5) \times (3)$ using Booth's Algorithm

Example: Divide $7 \div 3$ (4-bit)