BSc CSIT (TU) Science Computer Architecture (BSc CSIT, CSC208) Question Paper 2077 Nepal

Q: Where can I find the BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) question paper 2077?

The full BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) 2077 (Regular (annual)) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.

Q: Does the Computer Architecture (BSc CSIT, CSC208) 2077 paper come with solutions?

Yes. Every question on this Computer Architecture (BSc CSIT, CSC208) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.

Q: How many marks is the BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) 2077 paper?

The BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) 2077 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.

Q: Is practising this Computer Architecture (BSc CSIT, CSC208) past paper free?

Yes — reading and attempting this Computer Architecture (BSc CSIT, CSC208) past paper on Kekkei is completely free.

Question

1. Long answer (10 marks)

Explain the different modes of data transfer between CPU and I/O devices: programmed I/O, interrupt-driven I/O and DMA.

Answer 1

Modes of Data Transfer between CPU and I/O

Data transfer between the CPU/memory and I/O devices can be carried out in three modes.

1. Programmed I/O

The transfer is fully controlled by the CPU under program control. The CPU continuously polls (tests in a loop) a status flag in the interface until the device is ready, then transfers one word.

Loop: read status register
      if flag = 0 goto Loop   ; busy-wait
      transfer data word

Advantage: simple hardware.
Disadvantage: CPU is kept busy in the polling loop (busy-waiting), wasting CPU time. Suitable only for slow devices or small data.
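
The cost of busy-waiting can be illustrated with a small simulation. This is only a sketch: `SimulatedDevice`, its `delay` parameter and `programmed_io_read` are hypothetical names invented for the example, and "time" is modelled simply as the number of status reads.

```python
class SimulatedDevice:
    """Hypothetical device: each word becomes ready after `delay` status reads."""

    def __init__(self, data, delay):
        self.data = list(data)       # words the device will deliver
        self.delay = delay           # status reads needed before a word is ready
        self._countdown = delay

    def read_status(self):
        """Return 1 when a word is ready, else 0 (and advance simulated time)."""
        if self._countdown > 0:
            self._countdown -= 1
            return 0
        return 1

    def read_data(self):
        """Deliver one word and start preparing the next one."""
        self._countdown = self.delay
        return self.data.pop(0)


def programmed_io_read(device, n_words):
    """Read n_words by polling the status flag; count the wasted polls."""
    received, wasted_polls = [], 0
    for _ in range(n_words):
        while device.read_status() == 0:   # busy-wait loop
            wasted_polls += 1
        received.append(device.read_data())
    return received, wasted_polls


words, wasted = programmed_io_read(SimulatedDevice([10, 20, 30], delay=5), 3)
print(words, wasted)   # [10, 20, 30] 15
```

In this toy model the CPU spends 15 status reads doing nothing useful in order to receive 3 words, which is the overhead that interrupt-driven I/O and DMA try to avoid.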

2. Interrupt-Driven I/O

The CPU does not poll. When the device is ready it raises an interrupt request. The CPU finishes the current instruction, saves its state (PC, registers), and executes an Interrupt Service Routine (ISR) that performs the transfer, then returns to the interrupted program.

The CPU is free to do other work while the device is busy.
Still, the CPU executes the actual word-by-word transfer, so for high-speed bulk transfer the interrupt overhead becomes high.

3. Direct Memory Access (DMA)

A dedicated DMA controller transfers a whole block of data directly between memory and the device without CPU intervention for each word.

Steps:

CPU initializes the DMA controller with the memory address, word count, and direction (read/write), then continues other work.
The DMA controller requests the bus (bus request, BR); the CPU grants it (bus grant, BG) and relinquishes the system bus, either for one bus cycle at a time (cycle stealing) or for the whole block (burst mode).
The DMA controller transfers data word-by-word directly to/from memory.
When the count reaches zero, the DMA controller interrupts the CPU to signal completion.

Advantage: fast, low CPU overhead; ideal for disks and high-speed devices.
Disadvantage: extra hardware; bus contention (CPU stalls when DMA holds the bus).
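
The DMA steps above can be sketched as a toy model. The class and method names (`DMAController`, `program`, `run`) are assumptions made for illustration, bus arbitration is not modelled, and the `done` flag stands in for the completion interrupt.

```python
class DMAController:
    """Toy DMA controller with an address register and a word counter."""

    def __init__(self, memory):
        self.memory = memory
        self.address = 0
        self.count = 0
        self.to_memory = True
        self.done = False             # stands in for the completion interrupt

    def program(self, address, count, to_memory=True):
        """CPU initialization: start address, word count and direction."""
        self.address, self.count, self.to_memory = address, count, to_memory
        self.done = False

    def run(self, device_buffer):
        """With the bus granted, move one word at a time until count is zero."""
        while self.count > 0:
            if self.to_memory:
                self.memory[self.address] = device_buffer.pop(0)
            else:
                device_buffer.append(self.memory[self.address])
            self.address += 1
            self.count -= 1
        self.done = True              # signal completion to the CPU


memory = [0] * 16
dma = DMAController(memory)
dma.program(address=4, count=3, to_memory=True)
dma.run([7, 8, 9])
print(memory[4:7], dma.done)   # [7, 8, 9] True
```

The CPU appears only in `program`; the per-word work happens inside `run`, which is the point of DMA.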

Comparison

Mode	CPU involvement	Speed	Use
Programmed I/O	High (polling)	Low	Slow/simple devices
Interrupt-driven	Moderate (per word, via ISR)	Medium	Moderate-speed devices
DMA	Low (per block)	High	High-speed/bulk transfer

Answer 2

Basic Organization of a CPU

The CPU is the component that fetches, decodes and executes instructions. Its three major parts are:

Arithmetic Logic Unit (ALU): performs arithmetic (add, subtract) and logic (AND, OR, NOT, shift) operations.
Control Unit (CU): generates control signals that sequence the operations of the ALU, registers and buses; it decodes instructions and directs data flow.
Register set: a small, fast set of storage locations used during execution.

Diagram (described): The ALU and the register set are connected by an internal common bus. The Control Unit issues control lines to all units. The CPU connects to memory via the address bus, data bus and control bus.

        +-----------+      +-----------+
        | Registers |<====>|    ALU    |
        +-----------+      +-----------+
              ^   internal bus   ^
              |                  |
          +------------------------+
          |     Control Unit       | --> control signals
          +------------------------+

Function of Registers

Register	Function
PC (Program Counter)	Holds the address of the next instruction to fetch
IR (Instruction Register)	Holds the current instruction being decoded/executed
MAR (Memory Address Register)	Holds the address sent to memory
MDR/MBR (Memory Data/Buffer Register)	Holds data read from / written to memory
AC (Accumulator)	Holds an operand and the result of ALU operations
SP (Stack Pointer)	Points to the top of the stack
General-purpose registers	Temporary storage of operands and results
Status/Flag register (PSW)	Holds condition flags (Z, C, S, V)

Instruction Cycle

Each instruction passes through the following phases:

Fetch: $MAR \leftarrow PC$ ; read memory; $IR \leftarrow MDR$ ; $PC \leftarrow PC + 1$ .
Decode: the CU interprets the opcode in IR and identifies operands.
Operand fetch / address calculation: compute the effective address and read operands (for memory-reference instructions).
Execute: the ALU performs the operation; the result is stored in a register or memory.
Interrupt check: if an interrupt is pending, save state and branch to the ISR.

The cycle then repeats with the next instruction, giving the fetch–decode–execute loop.
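
As a rough illustration of this loop, the sketch below simulates a one-address accumulator machine. The instruction encoding as `(opcode, address)` tuples and the opcode names are assumptions chosen for readability, and the interrupt check is left out.

```python
def run(memory):
    """Run a toy accumulator machine until HALT; return the final AC."""
    pc, ac = 0, 0
    while True:
        mar = pc                  # Fetch:  MAR <- PC
        mdr = memory[mar]         #         read memory
        ir = mdr                  #         IR <- MDR
        pc += 1                   #         PC <- PC + 1
        opcode, address = ir      # Decode: split opcode and address fields
        if opcode == "HALT":
            return ac
        if opcode == "LOAD":      # Operand fetch and execute
            ac = memory[address]
        elif opcode == "ADD":
            ac = ac + memory[address]
        elif opcode == "STORE":
            memory[address] = ac
        else:
            raise ValueError(f"unknown opcode {opcode!r}")


program = [
    ("LOAD", 4),     # AC <- M[4]
    ("ADD", 5),      # AC <- AC + M[5]
    ("STORE", 6),    # M[6] <- AC
    ("HALT", 0),
    7, 3, 0,         # data stored at addresses 4, 5 and 6
]
result = run(program)
print(result, program[6])   # 10 10
```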

Answer 3

Flynn's Classification

Michael Flynn (1966) classified computer architectures according to the number of concurrent instruction streams and data streams.

1. SISD — Single Instruction, Single Data

A single processor executes a single instruction stream operating on a single data stream. This is the classical von Neumann (uniprocessor) machine; instructions execute sequentially (possibly pipelined).

Examples: traditional single-core PCs, early IBM mainframes, Intel 8086.

2. SIMD — Single Instruction, Multiple Data

One instruction is broadcast to many processing elements, each operating on a different data element simultaneously. Ideal for data-parallel/vector operations.

Examples: vector/array processors, GPUs, Intel SSE/AVX, the old ILLIAC IV.

3. MISD — Multiple Instruction, Single Data

Multiple instruction streams operate on the same single data stream. This is largely theoretical and rarely built.

Examples: systolic arrays and fault-tolerant redundant systems (e.g., the Space Shuttle flight-control computers) are sometimes cited.

4. MIMD — Multiple Instruction, Multiple Data

Multiple autonomous processors each execute their own instruction stream on their own data stream. The most general and common parallel architecture.

Examples: multi-core CPUs, multiprocessor servers, clusters, distributed systems.

Summary

Class	Instruction streams	Data streams	Example
SISD	1	1	Uniprocessor PC
SIMD	1	many	GPU, vector processor
MISD	many	1	Systolic array (rare)
MIMD	many	many	Multi-core / cluster

Answer 4

Register Transfer Language (RTL)

Register Transfer Language is a symbolic notation used to describe the micro-operations performed on the data stored in registers of a digital system. A statement

$$R2 \leftarrow R1$$

means the contents of register $R1$ are transferred (copied) into register $R2$ ; $R1$ is unchanged. Transfers are usually conditional on a control signal:

$$P:\ R2 \leftarrow R1 \quad \text{(transfer occurs only when control signal } P = 1\text{)}$$

Micro-operations

A micro-operation is an elementary operation performed on data in registers in one clock pulse. The four categories are:

Register transfer: $R1 \leftarrow R2$
Arithmetic: $R3 \leftarrow R1 + R2$ , $R2 \leftarrow \overline{R2}+1$ (2's complement), $R1 \leftarrow R1 + 1$ (increment)
Logic: $R3 \leftarrow R1 \wedge R2$ (AND), $R3 \leftarrow R1 \vee R2$ (OR), $R1 \leftarrow \overline{R1}$ (complement)
Shift: $R1 \leftarrow shl\ R1$ (shift left), $R1 \leftarrow shr\ R1$ (shift right)

Memory transfers use $M$ with the address register $AR$ :

Read: $DR \leftarrow M[AR]$
Write: $M[AR] \leftarrow R1$

A comma separates simultaneous transfers, e.g. $T1: R1 \leftarrow R2,\ R3 \leftarrow R4$ .
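
The micro-operations listed above can be mimicked on integers. This is a minimal sketch that assumes 8-bit registers (the `MASK` constant) and logical shifts that fill with 0; the function names are invented for the example.

```python
MASK = 0xFF  # assumed register width: 8 bits


def transfer(src):            # R2 <- R1
    return src & MASK


def add(a, b):                # R3 <- R1 + R2 (carry out of bit 7 is discarded)
    return (a + b) & MASK


def twos_complement(a):       # R2 <- complement(R2) + 1
    return ((~a) + 1) & MASK


def shl(a):                   # R1 <- shl R1 (0 shifted into the LSB)
    return (a << 1) & MASK


def shr(a):                   # R1 <- shr R1 (0 shifted into the MSB)
    return (a & MASK) >> 1


r1, r2 = 0b00000101, 0b00000011
print(add(r1, r2), twos_complement(r2), r1 & r2, r1 | r2, shl(r1), shr(r1))
# 8 253 1 7 10 2
```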

Answer 5

Restoring Division of Unsigned Integers

Division is performed by repeated shift and subtract. Let the dividend be in register $Q$ , the divisor in $M$ , and $A$ a register initialised to 0. For $n$ -bit numbers, repeat the following $n$ times:

Shift the pair $(A, Q)$ left by one bit.
Subtract the divisor: $A \leftarrow A - M$ .
If $A < 0$ (sign bit = 1): set the new $Q_0 = 0$ and restore by adding the divisor back: $A \leftarrow A + M$ . Else (if $A \ge 0$ ): set $Q_0 = 1$ (no restore).

After $n$ iterations, $Q$ holds the quotient and $A$ holds the remainder.

Example: $7 \div 3$ (4-bit)

$M = 0011$ (=3), $Q = 0111$ (=7), $A = 0000$ , $n = 4$ .

Iter	Action	A	Q	$Q_0$
1	Shift left $(A,Q)$	0000	1110	—
	$A-M = 0000-0011$ → negative, restore	0000	1110	0
2	Shift left	0001	1100	—
	$A-M = 0001-0011$ → negative, restore	0001	1100	0
3	Shift left	0011	1000	—
	$A-M = 0011-0011 = 0$ → ≥0, set $Q_0=1$	0000	1001	1
4	Shift left	0001	0010	—
	$A-M = 0001-0011$ → negative, restore	0001	0010	0

Result: Quotient $Q = 0010 = 2$ , Remainder $A = 0001 = 1$ .

Check: $7 = 3 \times 2 + 1$ ✓. So $7 \div 3 \Rightarrow$ quotient = 2, remainder = 1.
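
The same procedure can be written as a short function. This is a sketch: the name `restoring_divide` is an assumption, and because Python integers are unbounded the test "sign bit = 1" is written simply as `a < 0`.

```python
def restoring_divide(dividend, divisor, n):
    """Divide two unsigned n-bit integers; return (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    a, q, m = 0, dividend, divisor
    for _ in range(n):
        # Shift (A, Q) left by one bit: the MSB of Q moves into the LSB of A.
        a = (a << 1) | ((q >> (n - 1)) & 1)
        q = (q << 1) & ((1 << n) - 1)
        a -= m                      # A <- A - M
        if a < 0:
            a += m                  # restore A; Q0 stays 0
        else:
            q |= 1                  # Q0 <- 1
    return q, a


print(restoring_divide(7, 3, 4))   # (2, 1)
```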

Answer 6

Microprogrammed Control

In a microprogrammed control unit, the control signals required to execute each instruction are stored as words (microinstructions) in a special memory called the control memory (control store), usually a ROM. This contrasts with a hardwired control unit, where control logic is built from fixed gates and flip-flops.

How it works

Each machine instruction corresponds to a sequence of microinstructions (a microprogram / microroutine).
A Control Address Register (CAR) holds the address of the next microinstruction in control memory.
The microinstruction read out is placed in the Control Data/Buffer Register, and its control fields directly drive the control lines of the datapath.
A sequencer (next-address generator) computes the address of the next microinstruction (sequential, branch, or mapped from the opcode).

Advantages: flexible, easy to design and modify (just change the control store), good for complex instruction sets (CISC).
Disadvantages: slower than hardwired control because of control-memory access.

Microinstruction Format

A microinstruction is divided into fields:

| Control fields (micro-operations) | Condition | Branch field (next address) |

Micro-operation / control field: specifies the control signals (e.g., ALU op, register load, bus select). May be horizontal (one bit per control signal, fast, wide) or vertical (encoded fields, narrow, needs decoders).
Condition field: selects the status bit to test (carry, zero, sign, unconditional).
Branch / next-address field: gives the address of the next microinstruction when a branch is taken.
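
A toy sequencer makes the three fields concrete. Everything here is hypothetical: the control-signal names, the six-word control store and the condition codes (`None` for no branch, `"U"` for unconditional, `"Z"` for branch on zero) were chosen for the example, and mapping from an opcode is not modelled.

```python
# Each control word: (control signals, condition, branch address).
CONTROL_MEMORY = [
    (["PC_to_MAR"], None, 0),               # 0: start of a fetch routine
    (["read_memory", "inc_PC"], None, 0),   # 1
    (["MDR_to_IR"], "Z", 5),                # 2: branch to 5 when Z = 1
    (["alu_add"], None, 0),                 # 3
    (["load_AC"], "U", 0),                  # 4: return to address 0
    (["clear_AC"], "U", 0),                 # 5: return to address 0
]


def run_microprogram(flags, max_steps):
    """Step the sequencer; return the control signals issued, in order."""
    car, issued = 0, []
    for _ in range(max_steps):
        signals, condition, branch = CONTROL_MEMORY[car]  # read word at CAR
        issued.extend(signals)                            # drive control lines
        taken = condition == "U" or (condition == "Z" and flags["Z"])
        car = branch if taken else car + 1                # next-address logic
    return issued


print(run_microprogram({"Z": True}, 4))
# ['PC_to_MAR', 'read_memory', 'inc_PC', 'MDR_to_IR', 'clear_AC']
```

Changing the behaviour means editing `CONTROL_MEMORY` only, which mirrors the flexibility argument made for microprogrammed control.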

Answer 7

Virtual Memory

Virtual memory is a memory-management technique that gives a program the illusion of a very large, contiguous main memory by using a combination of main memory (RAM) and secondary storage (disk). Only the portions of a program currently in use are kept in physical memory; the rest reside on disk and are brought in on demand. This allows programs larger than physical RAM to run and provides isolation between processes.

The CPU generates virtual (logical) addresses, which the Memory Management Unit (MMU) translates to physical addresses.
If a referenced item is not in main memory, a page fault occurs and the OS loads the required page from disk.

Paging

Paging divides the virtual address space into fixed-size blocks called pages, and physical memory into equal-size blocks called frames (e.g., 4 KB each). Any page can be placed in any free frame, eliminating external fragmentation.

A page table maps each virtual page number to a physical frame number plus control bits (valid, dirty, reference).
A virtual address is split as:

$$\text{Virtual address} = (\text{Page number},\ \text{Offset})$$

The page number indexes the page table to get the frame number; the offset is then appended to the frame number to form the physical address. A TLB (Translation Lookaside Buffer) caches recent translations to speed up address translation.

Advantages: no external fragmentation, easy allocation, supports virtual memory.
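
Address translation can be sketched in a few lines, assuming 4 KB pages as in the example above. The dictionary page table, the `PageFault` exception and the sample mapping are illustrative assumptions; a real MMU does this in hardware and the OS handles the fault.

```python
PAGE_SIZE = 4096                             # 4 KB pages
OFFSET_BITS = PAGE_SIZE.bit_length() - 1     # 12 offset bits


class PageFault(Exception):
    """Raised when the referenced page is not in main memory."""


def translate(virtual_address, page_table):
    """Map a virtual address to a physical one via a {page: frame} table."""
    page = virtual_address >> OFFSET_BITS
    offset = virtual_address & (PAGE_SIZE - 1)
    if page not in page_table:               # valid bit clear: not resident
        raise PageFault(f"page {page} is not in main memory")
    frame = page_table[page]
    return (frame << OFFSET_BITS) | offset   # frame number followed by offset


page_table = {0: 5, 1: 2}                    # hypothetical mapping
print(hex(translate(0x1ABC, page_table)))    # 0x2abc
```

Virtual address `0x1ABC` lies in page 1 at offset `0xABC`; page 1 is in frame 2, so the physical address is `0x2ABC`.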

Answer 8

Instruction Format

An instruction format defines the layout of the bits of a machine instruction into fields. The basic fields are:

Opcode field: specifies the operation to be performed (ADD, LOAD, JMP …).
Operand / address field(s): specify the operands or the addresses of operands.
Mode field: specifies the addressing mode (how the operand address is interpreted).

| Opcode | Mode | Address / Operand |

The instruction length depends on the number of addresses, the word size, and the addressing modes supported.

Types Based on Number of Addresses

1. Three-address instructions

Format: OP A, B, C meaning $A \leftarrow B\ \text{op}\ C$ .

Example: ADD R1, R2, R3 → $R1 \leftarrow R2 + R3$ .
Short programs but long instructions.

2. Two-address instructions

Format: OP A, B meaning $A \leftarrow A\ \text{op}\ B$ (one operand is also the destination).

Example: ADD R1, R2 → $R1 \leftarrow R1 + R2$ . Most common.

3. One-address instructions

Use an implied accumulator (AC). Format: OP A meaning $AC \leftarrow AC\ \text{op}\ A$ .

Example: LOAD A; ADD B; STORE C computes $C \leftarrow A + B$ through the accumulator.

4. Zero-address instructions

Use a stack; operands are implicitly the top of the stack.

Example: PUSH A, PUSH B, ADD (pops two, pushes sum).

Example — compute $X = (A+B)\times(C+D)$ : can be coded in any of the four styles; zero-address uses postfix A B + C D + *.
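
The zero-address version can be checked with a small stack-machine sketch. The instruction strings, the `run_stack_program` helper and the sample values are assumptions for illustration; note that PUSH and POP still name a memory operand, while ADD and MUL take none.

```python
def run_stack_program(program, memory):
    """Execute stack-machine instructions; return the final stack."""
    stack = []
    for instruction in program:
        op, *operand = instruction.split()
        if op == "PUSH":
            stack.append(memory[operand[0]])
        elif op == "POP":
            memory[operand[0]] = stack.pop()
        elif op in ("ADD", "MUL"):
            right, left = stack.pop(), stack.pop()   # pop two operands
            stack.append(left + right if op == "ADD" else left * right)
        else:
            raise ValueError(f"unknown instruction {instruction!r}")
    return stack


memory = {"A": 1, "B": 2, "C": 3, "D": 4}
# Postfix A B + C D + * followed by a store into X.
program = ["PUSH A", "PUSH B", "ADD", "PUSH C", "PUSH D", "ADD", "MUL", "POP X"]
run_stack_program(program, memory)
print(memory["X"])   # 21
```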

Answer 9

Set-Associative Mapping

Set-associative mapping is a compromise between direct mapping and fully associative mapping. The cache is divided into a number of sets, and each set contains $k$ lines (ways) — this is called a k-way set-associative cache.

A memory block maps to exactly one set (like direct mapping) but can be placed in any of the $k$ lines within that set (like associative mapping).
Set number is computed as: $\text{set} = (\text{block number}) \bmod (\text{number of sets})$ .

Address breakdown

The physical address is divided into three fields:

$$\text{Address} = (\text{Tag} \,|\, \text{Set index} \,|\, \text{Word/Block offset})$$

Set index selects the set; Tag is compared (in parallel) against the tags of all $k$ lines in that set; offset selects the word in the line.

Example

Suppose: main memory = 4 KB, cache = 256 bytes, block size = 16 bytes, and a 2-way set-associative cache.

Number of cache lines $= 256/16 = 16$ lines.
Lines per set $k = 2$ , so number of sets $= 16/2 = 8$ sets → 3 set-index bits.
Block offset = $\log_2 16 = 4$ bits.
Address bits = $\log_2 4096 = 12$ ; Tag = $12 - 3 - 4 = 5$ bits.

So the 12-bit address is split as Tag(5) | Set(3) | Offset(4). A block goes to set $=(\text{block no}) \bmod 8$ and may occupy either of the 2 lines in that set. On a miss, a replacement policy (e.g., LRU) chooses which line to evict.

Advantage: fewer conflict misses than direct mapping, and cheaper/faster lookup than fully associative.
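
The Tag(5) | Set(3) | Offset(4) split and LRU replacement from the example can be sketched as follows. The class `TwoWayCache` and its list-based LRU bookkeeping are assumptions made for the illustration; only tags are stored, not data.

```python
OFFSET_BITS, SET_BITS = 4, 3     # 16-byte blocks and 8 sets, as in the example


def split_address(address):
    """Return (tag, set_index, offset) for a 12-bit address."""
    offset = address & ((1 << OFFSET_BITS) - 1)
    set_index = (address >> OFFSET_BITS) & ((1 << SET_BITS) - 1)
    tag = address >> (OFFSET_BITS + SET_BITS)
    return tag, set_index, offset


class TwoWayCache:
    def __init__(self, ways=2):
        self.ways = ways
        # Each set is a list of tags, least recently used first.
        self.sets = [[] for _ in range(1 << SET_BITS)]

    def access(self, address):
        """Return True on a hit, False on a miss (the block is then loaded)."""
        tag, set_index, _ = split_address(address)
        lines = self.sets[set_index]
        hit = tag in lines
        if hit:
            lines.remove(tag)
        elif len(lines) == self.ways:
            lines.pop(0)             # evict the least recently used line
        lines.append(tag)            # most recently used goes last
        return hit


print(split_address(0xABC))          # (21, 3, 12)
cache = TwoWayCache()
print([cache.access(a) for a in (0x000, 0x080, 0x000, 0x100, 0x000)])
# [False, False, True, False, True]
```

Addresses `0x000`, `0x080` and `0x100` all map to set 0 with different tags; a direct-mapped cache would keep evicting them, while the 2-way set keeps two of the three resident.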

Answer 10

IEEE 754 Single-Precision Floating Point

IEEE 754 single precision uses 32 bits divided into three fields:

| Sign (1 bit) | Exponent (8 bits, biased) | Mantissa/Fraction (23 bits) |

Sign (S): 0 = positive, 1 = negative.
Exponent (E): 8-bit value stored with a bias of 127 (so stored $E$ = actual exponent + 127). Range of actual exponent: $-126$ to $+127$ .
Mantissa (M): 23-bit fraction; for normalized numbers there is an implicit leading 1 (the hidden bit).

The value of a normalized number is:

$$(-1)^S \times 1.M \times 2^{(E-127)}$$

Special cases: $E=0$ → zero (M=0) or denormalized numbers (M≠0); $E=255$ → infinity (M=0) or NaN (M≠0).

Example: Represent $-6.75$

Sign $S = 1$ (negative).
Convert to binary: $6.75 = 110.11_2$ .
Normalize: $110.11 = 1.1011 \times 2^{2}$ . So actual exponent = 2.
Biased exponent $= 2 + 127 = 129 = 10000001_2$ .
Mantissa = fractional part after the leading 1 = $1011$ , padded to 23 bits: 10110000000000000000000.

Result:

1 10000001 10110000000000000000000

In hex this is 0xC0D80000.
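
The result can be cross-checked in Python. `struct.pack(">f", x)` produces the IEEE 754 single-precision bytes; `encode_by_hand` is a hypothetical helper that repeats the manual steps above and only handles normalized, non-zero values (it truncates the fraction instead of rounding).

```python
import struct


def float_to_fields(x):
    """Return (sign, biased exponent, fraction) of x in single precision."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def encode_by_hand(x):
    """Build the 32-bit pattern for a normalized, non-zero x step by step."""
    sign = 1 if x < 0 else 0
    magnitude, exponent = abs(x), 0
    while magnitude >= 2:              # normalize to 1.f x 2^exponent
        magnitude /= 2
        exponent += 1
    while magnitude < 1:
        magnitude *= 2
        exponent -= 1
    fraction = int((magnitude - 1) * (1 << 23))   # truncates; exact for -6.75
    return (sign << 31) | ((exponent + 127) << 23) | fraction


print(float_to_fields(-6.75))          # (1, 129, 5767168)
print(hex(encode_by_hand(-6.75)))      # 0xc0d80000
```

The fraction value 5767168 is `1011` followed by nineteen zeros read as a 23-bit integer, matching the mantissa derived above.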

Answer 11

Memory Hierarchy

The memory hierarchy organizes computer storage into levels that trade off speed, cost and capacity. As we move down the hierarchy, speed and cost-per-bit decrease while capacity increases. The goal is to give the illusion of a memory that is as large as the cheapest level but nearly as fast as the fastest level, exploiting the principle of locality (temporal and spatial).

Levels (fastest/smallest at top)

            ^  faster, smaller, costlier
            |
   +---------------------+
   |     CPU Registers   |   (fastest, ns, bytes)
   +---------------------+
   |   Cache (L1/L2/L3)  |   (SRAM, very fast, KB-MB)
   +---------------------+
   |  Main Memory (RAM)  |   (DRAM, tens of ns, GB)
   +---------------------+
   | Secondary Storage   |   (SSD / HDD, ms, TB)
   +---------------------+
   | Tertiary / Backup   |   (magnetic tape, optical)
   +---------------------+
            |
            v  slower, larger, cheaper

Level	Technology	Access time	Capacity	Cost/bit
Registers	Flip-flops	< 1 ns	bytes	highest
Cache	SRAM	1–10 ns	KB–MB	high
Main memory	DRAM	50–100 ns	GB	medium
Disk (SSD/HDD)	Flash/magnetic	µs–ms	TB	low
Tape	Magnetic	seconds	TB+	lowest

Inboard memory (registers, cache, main memory) is directly accessed by the CPU; outboard/offline levels (disk, tape) need I/O. Registers are allocated by the compiler or programmer, cache is managed by hardware, and the movement of data between main memory and disk is managed by the OS (via virtual memory).

Answer 12

Interrupt

An interrupt is a signal to the processor that an event needs immediate attention, causing the CPU to suspend its current program, save its state, and transfer control to an Interrupt Service Routine (ISR). After the ISR completes, the CPU restores the saved state and resumes the interrupted program. Interrupts allow the CPU to respond to asynchronous events without continuous polling, improving efficiency.

Types of Interrupts

1. Hardware Interrupts

Generated by external hardware devices (I/O, timer, keyboard). Sub-types:

Maskable interrupts: can be enabled/disabled by the CPU (via the interrupt-enable flag). Example: most device I/O interrupts.
Non-maskable interrupts (NMI): cannot be disabled; used for critical events such as power failure or hardware error.

2. Software Interrupts

Generated by an instruction in the program (e.g., INT n / system calls / trap instructions). Used by programs to request OS services.

3. Internal Interrupts (Traps / Exceptions)

Generated inside the CPU due to an error or special condition during execution, such as divide-by-zero, overflow, invalid opcode, or page fault. They are synchronous with the instruction that caused them.

Other classifications

Vectored vs non-vectored: in vectored interrupts the device supplies the ISR address (vector); in non-vectored a fixed address is used.
By priority: when several interrupts occur, a priority scheme (e.g., daisy chaining or a priority interrupt controller) decides which is serviced first.

Level	BSc CSIT (TU)
Stream	Science
Subject	Computer Architecture (BSc CSIT, CSC208)
Year	2077 BS
Exam session	Regular (annual)
Full marks	60
Time allowed	180 minutes
Questions	12, all with step-by-step solutions

Section A: Long Answer Questions