BSc CSIT (TU) Science Computer Architecture (BSc CSIT, CSC208) Question Paper 2080 Nepal
This is the official BSc CSIT (TU) (Science stream) Computer Architecture (BSc CSIT, CSC208) question paper for 2080, as set in the regular annual examination. It carries 60 full marks and a time allowance of 180 minutes, across 12 questions. On Kekkei you can attempt this Computer Architecture (BSc CSIT, CSC208) past paper online with a timer, get instant AI feedback and step-by-step solutions, and track the topics where you lose marks — completely free. Whether you are revising for your BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) exam or solving previous years' question papers, this 2080 paper is a great way to practise under real exam conditions.
Section A: Long Answer Questions
Attempt any TWO questions.
What is cache memory? Explain the different cache mapping techniques (direct, associative and set-associative) with suitable diagrams.
Cache Memory
Cache memory is a small, high-speed memory placed between the CPU and main memory (RAM). It stores copies of frequently accessed instructions and data so that the CPU can access them faster than fetching from slower main memory. It works on the principle of locality of reference (temporal and spatial locality).
When the CPU needs a word, it first checks the cache:
- Cache hit — the word is found in cache (fast access).
- Cache miss — the word is not in cache, so it is fetched from main memory and a block is copied into cache.
The ratio of hits to total references is the hit ratio, a key measure of cache performance.
Main memory is divided into blocks and cache into lines of the same size. A mapping function decides which main-memory block goes to which cache line.
Cache Mapping Techniques
1. Direct Mapping
Each main-memory block maps to exactly one fixed cache line, given by:
The physical address is split into three fields:
| Tag | Line (Index) | Word (Offset) |
|---|
- Word selects the byte/word within a block.
- Line selects the cache line directly.
- Tag is stored with the line and compared to verify the right block.
Diagram (in words): Each cache line has a tag field and a data block. The line field of the address indexes one line; its stored tag is compared with the address tag — equal means hit.
Advantage: Simple and cheap (one comparison). Disadvantage: High conflict misses — two blocks mapping to the same line repeatedly evict each other.
2. Associative (Fully Associative) Mapping
A main-memory block can be placed in any cache line. The address has only two fields:
| Tag | Word |
|---|
The tag must be compared with the tags of all cache lines simultaneously (using associative/content-addressable memory).
Diagram (in words): All cache-line tags are searched in parallel by comparators; any match is a hit.
Advantage: Most flexible, lowest conflict misses, best hit ratio. Disadvantage: Expensive hardware (many comparators) and needs a replacement algorithm (LRU/FIFO).
3. Set-Associative Mapping
A compromise between the two. The cache is divided into sets, each containing lines (a k-way set-associative cache). A block maps to one fixed set but can occupy any line within that set:
The address has three fields:
| Tag | Set | Word |
|---|
Diagram (in words): The set field indexes one set; the tag is compared in parallel with the lines of that set only.
Advantage: Lower conflict misses than direct mapping and far cheaper than fully associative ( comparators only). Most real caches use 2-way, 4-way or 8-way set-associative mapping.
Summary
Direct mapping is the special case ; fully associative is the case where the whole cache is one set. Set-associative balances cost and performance.
What is pipelining? Explain instruction pipelining with its stages. Discuss the different types of pipeline hazards and their solutions.
Pipelining
Pipelining is a technique of implementing instruction-level parallelism in which multiple instructions are overlapped in execution. The processor is divided into stages, each performing a part of an instruction, like an assembly line. While one instruction is being decoded, another can be fetched, and so on, increasing throughput (instructions completed per unit time).
For a -stage pipeline executing instructions, ideal speedup approaches :
Instruction Pipeline Stages
A classic RISC pipeline has five stages:
- IF (Instruction Fetch) — fetch the instruction from memory (using PC).
- ID (Instruction Decode) — decode the opcode and read source registers.
- EX (Execute) — perform the ALU operation or compute the effective address.
- MEM (Memory Access) — read from or write to data memory (for load/store).
- WB (Write Back) — write the result back to the register file.
In steady state one instruction finishes every clock cycle.
Pipeline Hazards and Solutions
Hazards prevent the next instruction from executing in its designated cycle.
1. Structural Hazard
Two instructions need the same hardware resource in the same cycle (e.g., a single memory port for IF and MEM).
- Solution: Provide separate instruction and data memories/caches (Harvard organization) or duplicate resources; stall if unavoidable.
2. Data Hazard
An instruction depends on the result of a previous instruction still in the pipeline (RAW — read after write being the most common).
- Solutions:
- Operand forwarding (bypassing): route the ALU result directly to a dependent instruction before write-back.
- Pipeline stalls / bubbles inserted by hardware.
- Compiler instruction reordering to fill the gap.
3. Control (Branch) Hazard
Caused by branch/jump instructions — the next instruction's address is not known until the branch is resolved.
- Solutions:
- Branch prediction (static or dynamic) and speculative execution.
- Delayed branch (fill the delay slot with a useful instruction).
- Flushing wrongly-fetched instructions when a branch is taken.
- Compute branch target early in the pipeline to reduce the penalty.
Conclusion
Pipelining greatly improves throughput, but hazards limit the ideal speedup; forwarding, stalling, branch prediction and resource duplication are used to mitigate them.
Explain the different modes of data transfer between CPU and I/O devices: programmed I/O, interrupt-driven I/O and DMA.
Modes of Data Transfer between CPU and I/O
Data transfer between the CPU/memory and I/O devices occurs in three modes: programmed I/O, interrupt-driven I/O, and DMA.
1. Programmed I/O
The CPU is fully responsible for the transfer. It continuously polls (checks) the status flag of the I/O interface in a loop until the device is ready, then transfers the data word.
Steps:
- CPU reads the device status register.
- If not ready, it loops back (busy-waiting).
- When ready, CPU reads/writes the data word through a CPU register.
Drawback: The CPU wastes most of its time busy-waiting; very inefficient. Suitable only for slow, simple systems.
2. Interrupt-Driven I/O
Instead of polling, the device interrupts the CPU when it is ready.
Steps:
- CPU issues an I/O command and continues other work.
- When the device is ready, it raises an interrupt request.
- CPU finishes the current instruction, saves state (PC, registers), and jumps to the Interrupt Service Routine (ISR).
- The ISR transfers the data word, then control returns to the interrupted program.
Advantage: CPU is free during the wait. Drawback: Each word still requires CPU intervention (one interrupt per word), so it is slow for high-speed bulk transfers.
3. Direct Memory Access (DMA)
A dedicated DMA controller transfers a whole block of data directly between memory and the I/O device without the CPU handling each word.
Steps:
- CPU programs the DMA controller with the starting memory address, word count, and direction (read/write).
- The DMA controller requests the system bus (bus request / HOLD); CPU grants it (bus grant / HLDA) and releases the bus.
- DMA transfers the block word-by-word directly to/from memory (cycle stealing or burst mode).
- On completion, the DMA controller interrupts the CPU to signal that the block transfer is done.
Advantage: Very fast, frees the CPU; ideal for disk and high-speed devices.
Comparison
| Feature | Programmed I/O | Interrupt-driven I/O | DMA |
|---|---|---|---|
| CPU involvement | Highest (polling) | Per word (ISR) | Lowest (per block) |
| Speed | Slow | Moderate | Fast |
| Best for | Few/slow devices | Moderate-speed devices | Bulk/high-speed transfers |
Section B: Short Answer Questions
Attempt any EIGHT questions.
What is a system bus? Explain address bus, data bus and control bus.
System Bus
A system bus is a set of parallel conducting wires (a shared communication pathway) that connects the CPU, main memory, and I/O devices, allowing them to transfer data, addresses, and control signals. It is logically divided into three functional groups:
1. Address Bus
- Carries the address of the memory location or I/O port to be accessed.
- Unidirectional (CPU → memory/I/O).
- Its width determines the addressable memory size: an -line address bus can address locations.
2. Data Bus
- Carries the actual data/instructions being transferred between CPU, memory, and I/O.
- Bidirectional (data flows both ways).
- Its width (e.g., 8, 16, 32, 64 bits) determines how many bits are transferred at once, affecting performance.
3. Control Bus
- Carries control and timing signals that coordinate and manage operations, e.g., Memory Read, Memory Write, I/O Read, I/O Write, clock, interrupt, and bus-request/grant signals.
- A mix of unidirectional lines that synchronise the devices.
What is DMA? Explain how direct memory access transfers data without CPU intervention.
Direct Memory Access (DMA)
DMA is a data-transfer technique in which a dedicated hardware unit called the DMA controller transfers a block of data directly between an I/O device and main memory without continuous CPU intervention. The CPU is only involved at the start and end of the transfer.
How DMA Transfers Data
- Setup: The CPU programs the DMA controller registers with the starting memory address, the word/byte count (block size), and the direction (read or write).
- Bus request: When the I/O device is ready, the DMA controller sends a bus request (HOLD) signal to the CPU.
- Bus grant: The CPU completes its current bus cycle, releases the system bus (address, data, control), and responds with a bus grant (HLDA). The CPU is now temporarily idle on the bus.
- Transfer: The DMA controller takes control of the buses and transfers data word-by-word directly between memory and the device, incrementing the address and decrementing the count after each word. Transfer can be by cycle stealing (one word at a time, returning the bus) or burst mode (entire block).
- Completion: When the count reaches zero, the DMA controller releases the bus and sends an interrupt to the CPU indicating the transfer is complete.
Because the CPU does not handle each individual word, DMA is much faster and far more efficient than programmed or interrupt-driven I/O for large/high-speed transfers (e.g., disk, network).
Differentiate between RISC and CISC architectures.
RISC vs CISC
| Feature | RISC (Reduced Instruction Set Computer) | CISC (Complex Instruction Set Computer) |
|---|---|---|
| Instruction set | Small, simple, fixed-length instructions | Large, complex, variable-length instructions |
| Instructions per program | More instructions needed | Fewer instructions (one does more) |
| Execution | Mostly single-cycle | Multi-cycle, varies per instruction |
| Addressing modes | Few, simple | Many, complex |
| Memory access | Only via load/store instructions | Memory operands allowed in many instructions |
| Control unit | Hardwired control | Usually microprogrammed control |
| Registers | Large number of registers | Fewer registers |
| Pipelining | Easy and efficient | Harder due to varying instruction lengths |
| Code size | Larger | Smaller (compact) |
| Examples | ARM, MIPS, SPARC, RISC-V | x86 (Intel/AMD), VAX, IBM System/360 |
Summary: RISC emphasises simple instructions executed quickly with a hardwired control unit and heavy use of registers and pipelining, putting complexity in the compiler. CISC emphasises powerful, complex instructions that reduce program length, putting complexity in the hardware/microcode.
Explain register transfer language with examples of micro-operations.
Register Transfer Language (RTL)
Register Transfer Language is a symbolic notation used to describe the operations and information flow between the registers of a digital system. A micro-operation is an elementary operation performed on the data stored in registers during one clock pulse.
Notation
- Registers are denoted by capital letters:
R1,R2,AC,PC,MAR. - A transfer is shown with the replacement operator
←. - Conditional transfers use a control function with
:.
Example of a conditional transfer (transfer R2 into R1 only when control signal P = 1):
P: R1 ← R2
Types of Micro-operations with Examples
1. Register transfer micro-operation — move data between registers:
R1 ← R2 ; copy contents of R2 into R1
2. Arithmetic micro-operations — add, subtract, increment, decrement:
R3 ← R1 + R2 ; addition
R3 ← R1 + R2' + 1 ; subtraction using 2's complement
R1 ← R1 + 1 ; increment
3. Logic micro-operations — bitwise AND, OR, XOR, complement:
R1 ← R1 ∧ R2 ; bitwise AND
R1 ← R1 ∨ R2 ; bitwise OR
R1 ← R1' ; complement
4. Shift micro-operations — logical, circular, arithmetic shifts:
R1 ← shl R1 ; shift left
R1 ← shr R1 ; shift right
RTL provides a concise, hardware-independent way to specify the internal organisation and control sequences of a computer.
Explain the algorithm for division of unsigned integers (restoring division) with an example.
Restoring Division of Unsigned Integers
Restoring division performs binary division of an -bit dividend by a divisor using repeated subtraction and shifting, with a register A (initially 0, the partial remainder), register Q (the dividend, ends holding the quotient), and register M (the divisor).
Algorithm
Repeat times (n = number of dividend bits):
- Shift left the combined register pair
A,Qby one bit. - Subtract:
A ← A − M. - Test sign of A:
- If
A ≥ 0(sign bit 0): set the least-significant bitQ0 ← 1. - If
A < 0(sign bit 1): restore byA ← A + Mand setQ0 ← 0.
- If
After iterations: Q = quotient, A = remainder.
Example: 7 ÷ 3 (4-bit)
Dividend Q = 0111, Divisor M = 0011 (so −M = 1101 in 2's complement). Start A = 0000.
| Step | Operation | A | Q |
|---|---|---|---|
| Init | 0000 | 0111 | |
| 1 | Shift left | 0000 | 111_ |
| A − M | 1101 | — | |
| A<0 → restore, Q0=0 | 0000 | 1110 | |
| 2 | Shift left | 0001 | 110_ |
| A − M | 1110 | — | |
| A<0 → restore, Q0=0 | 0001 | 1100 | |
| 3 | Shift left | 0011 | 100_ |
| A − M | 0000 | — | |
| A≥0 → Q0=1 | 0000 | 1001 | |
| 4 | Shift left | 0001 | 001_ |
| A − M | 1110 | — | |
| A<0 → restore, Q0=0 | 0001 | 0010 |
Result: Quotient Q = 0010 = 2, Remainder A = 0001 = 1.
Check: . ✓
Explain the concept of microprogrammed control and microinstruction format.
Microprogrammed Control
In a microprogrammed control unit, the control signals required to execute each instruction are generated by reading microinstructions stored in a special memory called the control memory (control store). Each machine instruction corresponds to a microprogram — a sequence of microinstructions. This contrasts with hardwired control, where control signals are produced by fixed logic gates.
Key Components
- Control Memory (ROM): stores all microprograms.
- Control Address Register (CAR): holds the address of the next microinstruction.
- Control Data Register (CDR): holds the microinstruction currently being executed.
- Sequencer / Next-address generator: determines the address of the next microinstruction (sequential, branch, or based on instruction opcode).
Operation: CAR addresses control memory → the microinstruction is read into CDR → its control field activates the control signals → the sequencer updates CAR for the next microinstruction.
Microinstruction Format
A typical microinstruction contains the following fields:
| Field | Purpose |
|---|---|
| Control / Micro-operation field | Bits specifying which control signals are activated (the micro-operations to perform) |
| Condition (CD) field | Specifies the status/condition to test for branching (e.g., zero, carry, unconditional) |
| Branch (BR) field | Type of branch/next-address selection (jump, call, return, map) |
| Address field | The address in control memory of the next microinstruction if a branch is taken |
Advantages: flexible, easy to modify/design, supports complex instructions. Disadvantage: slower than hardwired control due to control-memory access.
What is virtual memory? Explain the concept of paging.
Virtual Memory
Virtual memory is a memory-management technique that gives a program the illusion of a very large, contiguous main memory by using a combination of RAM and secondary storage (disk). Only the parts of a program currently in use are kept in physical RAM; the rest reside on disk and are brought in on demand. This allows programs larger than physical memory to run and provides isolation between processes.
The addresses generated by the CPU are logical (virtual) addresses, which are translated to physical addresses by the Memory Management Unit (MMU).
Paging
Paging is a virtual-memory scheme that removes the need for contiguous allocation:
- The virtual address space is divided into fixed-size blocks called pages.
- Physical memory is divided into blocks of the same size called frames.
- A page table maps each virtual page number to a physical frame number.
Address Translation
A virtual address is split into a page number and an offset:
Virtual address = | Page number | Offset |
The page number indexes the page table to get the frame number; the frame number combined with the offset gives the physical address.
- Page hit: the page is in a frame → access proceeds.
- Page fault: the page is on disk → the OS loads it into a free/replaced frame, updates the page table, and retries.
A TLB (Translation Lookaside Buffer) caches recent page-table entries to speed up translation. Paging eliminates external fragmentation but can cause internal fragmentation in the last page.
Explain the instruction format and the types of instructions based on the number of addresses.
Instruction Format
An instruction format defines the layout of bits in a machine instruction. It specifies the fields that the control unit interprets. A typical instruction contains:
| Opcode | Operand / Address field(s) | Addressing mode |
|---|
- Opcode (operation code): specifies the operation to perform (ADD, LOAD, JMP, etc.).
- Operand/Address field: specifies the data or the address of the operand(s).
- Mode field: specifies how the operand address is interpreted (direct, indirect, indexed, etc.).
The number of address fields gives rise to the following instruction types.
Types of Instructions by Number of Addresses
1. Three-address instruction — two source operands and a destination.
ADD R1, R2, R3 ; R1 ← R2 + R3
For X = (A+B)*(C+D): ADD T1,A,B ADD T2,C,D MUL X,T1,T2. Shorter programs but long instructions.
2. Two-address instruction — one operand is both source and destination.
ADD R1, R2 ; R1 ← R1 + R2
Most common in real machines.
3. One-address instruction — uses an implied accumulator (AC).
LOAD A ; AC ← A
ADD B ; AC ← AC + B
STORE X ; X ← AC
4. Zero-address instruction — operands are implicit on a stack (stack-based machine).
PUSH A
PUSH B
ADD ; pops A, B, pushes A+B
POP X
As the number of address fields decreases, instructions become shorter but more instructions are needed to perform the same task.
Explain set-associative mapping technique of cache memory with an example.
Set-Associative Mapping
Set-associative mapping is a cache mapping technique that combines the features of direct and fully associative mapping. The cache is divided into a number of sets, and each set contains lines. This is called a k-way set-associative cache.
- A main-memory block maps to one specific set (like direct mapping):
- Within that set, the block may be placed in any of the lines (like associative mapping).
The physical address is divided into three fields:
| Tag | Set | Word (offset) |
|---|
On access, the set field selects one set, and the tag is compared in parallel with the tags of all lines in that set only. A match is a hit.
Example
Suppose:
- Cache = 8 lines, 2-way set-associative ⇒ number of sets = 8 / 2 = 4 sets.
- Main memory has 256 blocks.
Then block number maps to set .
- Block 0 → set 0, Block 1 → set 1, Block 4 → set 0, Block 5 → set 1, etc.
Block 4 can be stored in either of the 2 lines of set 0. So blocks 0 and 4 (both mapping to set 0) can reside in cache simultaneously — which would conflict in direct mapping. This reduces conflict misses.
Advantages: Fewer conflict misses than direct mapping; far cheaper than fully associative (only comparators). A replacement policy (e.g., LRU) selects the victim line within the set. Most modern caches are 2-, 4-, or 8-way set-associative.
Frequently asked questions
- Where can I find the BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) question paper 2080?
- The full BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) 2080 (regular) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.
- Does the Computer Architecture (BSc CSIT, CSC208) 2080 paper come with solutions?
- Yes. Every question on this Computer Architecture (BSc CSIT, CSC208) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.
- How many marks is the BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) 2080 paper?
- The BSc CSIT (TU) Computer Architecture (BSc CSIT, CSC208) 2080 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.
- Is practising this Computer Architecture (BSc CSIT, CSC208) past paper free?
- Yes — reading and attempting this Computer Architecture (BSc CSIT, CSC208) past paper on Kekkei is completely free.