BSc CSIT (TU) Science Distributed System (BSc CSIT, CSC462) Question Paper 2074 Nepal
This is the official BSc CSIT (TU) (Science stream) Distributed System (BSc CSIT, CSC462) question paper for 2074, as set in the regular annual examination. It carries 60 full marks and a time allowance of 180 minutes, across 12 questions. On Kekkei you can attempt this Distributed System (BSc CSIT, CSC462) past paper online with a timer, get instant AI feedback and step-by-step solutions, and track the topics where you lose marks — completely free. Whether you are revising for your BSc CSIT (TU) Distributed System (BSc CSIT, CSC462) exam or solving previous years' question papers, this 2074 paper is a great way to practise under real exam conditions.
Section A: Long Answer Questions
Attempt any TWO questions.
Explain clock synchronization in distributed systems. Discuss Lamport's logical clocks and vector clocks with examples, and describe how the happened-before relation orders events.
Clock Synchronization in Distributed Systems
In a distributed system there is no global clock; each node has its own physical clock that drifts at a slightly different rate. Many tasks (ordering events, make, distributed transactions, log timestamps) need a consistent notion of time. Clock synchronization is the problem of making the clocks of all nodes agree, either on real (physical) time or on a consistent order of events.
We distinguish:
- Physical clock synchronization — bringing clock values numerically close to one another / to UTC (e.g. Cristian's algorithm, Berkeley algorithm, NTP).
- Logical clock synchronization — we do not care about real time, only about agreeing on the order of events. This is enough for most coordination problems and is captured by the happened-before relation.
The Happened-Before Relation ()
Defined by Lamport, ('a happened before b') is the smallest relation satisfying:
- If and are events in the same process and occurs before , then .
- If is the sending of a message and is the receipt of that same message, then .
- Transitivity: if and then .
If neither nor , the events are concurrent () — they are causally independent. Happened-before is a partial order, because concurrent events are unordered.
Lamport's Logical Clocks
Each process keeps an integer counter . Rules:
- Before each event, increments: .
- A message carries the sender's timestamp .
- On receiving , the receiver sets .
This guarantees the clock condition: if then .
Example. Processes .
P1: a(1) ----m1---->
P2: b receives m1
stamps event with and sends with . , whose clock was at , sets for the receive event . Hence , consistent with .
Limitation: the converse does NOT hold — does not imply . Lamport clocks cannot detect concurrency or causal dependence.
Vector Clocks
Vector clocks fix this limitation. Each process keeps a vector (one entry per process), initially all zeros. Rules:
- Before an internal/send event: .
- A message carries the whole vector .
- On receipt, sets for all , then .
Ordering: iff for all and . Then
and iff neither nor . Thus vector clocks exactly characterise causality.
Example (3 processes). 's send event has and sends it. receives it, giving the receive event . Meanwhile does a local event with . Comparing and : neither dominates, so those two events are concurrent — correctly detected, whereas Lamport scalars could not tell.
Summary
Lamport clocks give a total ordering consistent with causality but lose concurrency information; vector clocks capture the full happened-before partial order at the cost of space per timestamp.
What is distributed mutual exclusion? Explain the centralized, token-ring and Ricart-Agrawala (distributed) algorithms, comparing them in terms of message complexity and fault tolerance.
Distributed Mutual Exclusion
Mutual exclusion ensures that when several processes in a distributed system contend for a shared resource (the critical section, CS), only one process is in the CS at a time. Because there is no shared memory and no global clock, classical OS mechanisms (semaphores, monitors) cannot be used directly; instead processes coordinate by message passing. A correct algorithm must guarantee:
- Safety / Mutual exclusion — at most one process in CS.
- Liveness / No starvation — every request eventually granted.
- Fairness / Ordering — requests honoured in happened-before order.
1. Centralized Algorithm
A single coordinator process manages access:
- To enter CS, a process sends a
REQUESTto the coordinator. - If CS is free, the coordinator replies
GRANT; otherwise it queues the request (no reply /DENY). - On exiting, the process sends
RELEASE; the coordinator grants the next queued request.
- Messages per CS entry: 3 (request, grant, release).
- Pros: simple, fair (FIFO), easy to implement.
- Cons: the coordinator is a single point of failure and a performance bottleneck.
2. Token-Ring Algorithm
Processes are logically arranged in a ring. A single token circulates around the ring.
- A process may enter the CS only when it holds the token.
- After leaving the CS (or if it does not need it) it passes the token to its successor.
- Messages per CS entry: between 1 and — when load is low the token may circulate many times before anyone needs it; throughput is bounded by ring traversal time.
- Pros: no starvation (bounded wait = one ring traversal), simple.
- Cons: lost token must be detected and regenerated; token holder crash breaks the ring and requires reconstruction.
3. Ricart–Agrawala (Distributed) Algorithm
A fully distributed algorithm using Lamport timestamps and an all-to-all scheme. To enter CS, :
- Sends a timestamped
REQUEST(ts, i)to all other processes. - Enters CS only after receiving
OK/REPLYfrom all of them.
On receiving a request, process decides:
- If is not interested in CS → reply
OKimmediately. - If is in CS → queue the request, reply later.
- If also wants CS → compare timestamps; the lower timestamp wins (ties broken by process id). The loser replies
OK; the winner queues the request and replies after exiting.
On exit, sends OK to every queued request.
- Messages per CS entry: — requests + replies.
- Pros: no single point of failure; ordering by Lamport timestamps ensures fairness.
- Cons: points of failure (any crash blocks progress unless detected); high message overhead; every process must know the full membership.
Comparison
| Algorithm | Messages / CS entry | Single point of failure | Fault tolerance |
|---|---|---|---|
| Centralized | 3 | Yes (coordinator) | Poor — coordinator crash halts all |
| Token-Ring | 1 … ∞ | Token / any node | Moderate — must handle lost token & node crash |
| Ricart–Agrawala | nodes | Poor without failure detection — any crash blocks |
Conclusion: Centralized is cheapest in messages but least robust; the token ring bounds waiting but wastes messages under low load and is fragile to token loss; Ricart–Agrawala is genuinely distributed and fair but costly ( messages) and equally sensitive to any single crash.
Explain inter-process communication in distributed systems. Discuss the Remote Procedure Call (RPC) mechanism in detail, including parameter passing, binding and marshalling.
Inter-Process Communication (IPC) in Distributed Systems
In a distributed system, processes run on different machines with no shared memory, so they cooperate by exchanging messages over a network. IPC mechanisms include:
- Message passing —
send/receiveprimitives, which may be synchronous (blocking) or asynchronous (non-blocking). - Sockets — endpoints over TCP (reliable, connection-oriented) or UDP (unreliable, datagram).
- Remote Procedure Call (RPC) and Remote Method Invocation (RMI) — higher-level abstractions that hide message passing.
- Message-Oriented Middleware (MOM) — persistent, often queue-based, asynchronous messaging.
Remote Procedure Call (RPC)
RPC (Birrell & Nelson, 1984) lets a program call a procedure on a remote machine as if it were local, hiding the underlying message passing. It provides access and location transparency.
Working / Control Flow
Client ── call ──> Client Stub ── marshal ──> [network] ── unmarshal ──> Server Stub ── call ──> Server procedure
Client <─ return <─ Client Stub <─ unmarshal <─ [network] <─ marshal <─ Server Stub <─ return <─ Server procedure
- The client calls a local client stub (proxy) as an ordinary procedure.
- The stub marshals the parameters into a message and traps to the OS, which sends it.
- The message travels to the server; the server stub unmarshals the parameters.
- The server stub calls the actual server procedure; the result returns by the reverse path.
Parameter Passing
- Call-by-value is straightforward: the value is copied into the message.
- Call-by-reference is a problem — pointers/addresses are meaningless on a remote machine. Solutions: copy/restore (copy the referenced data in, copy modified data back) or flatten structures.
- Data heterogeneity: different machines use different byte order (big- vs little-endian) and representations. A common format such as XDR (External Data Representation), CDR, or ASN.1 is used so both sides agree on encoding.
Marshalling
Marshalling is packing the procedure name and parameters into a flat, machine-independent message; unmarshalling is the reverse on receipt. The stubs do this automatically, usually generated from an IDL (Interface Definition Language) specification.
Binding
Binding is how a client locates a suitable server. Typically a binder / directory service (e.g. portmapper, RPC registry) is used:
- The server registers (exports) its interface, name and address with the binder.
- The client queries (imports) the binder to obtain the server's address.
- The client then sends RPCs directly to that server. Binding may be static (fixed at compile time) or dynamic (resolved at run time via the binder).
Failure Semantics
Because networks/servers can fail, RPC defines call semantics:
- At-least-once (retry; safe only for idempotent operations),
- At-most-once (duplicate filtering; no double execution),
- Exactly-once (ideal but hard), and maybe (no guarantee).
Advantages / Limitations
- Advantages: simple, familiar procedure-call abstraction; transparency; supports heterogeneity.
- Limitations: call-by-reference and global variables are hard; partial failures and slower-than-local calls must be handled by the programmer.
Section B: Short Answer Questions
Attempt any EIGHT questions.
What is a distributed system? Explain the goals and characteristics of a distributed system.
Distributed System
A distributed system is a collection of independent (autonomous) computers connected by a network that appears to its users as a single coherent system, cooperating through message passing to achieve a common goal (Tanenbaum). There is no shared memory and no global clock.
Goals
- Resource sharing — share hardware, data, files, printers, and services.
- Transparency — hide the distribution (access, location, replication, etc.) from users.
- Openness — use standard, well-defined interfaces/protocols so components from different vendors interoperate and the system is extensible.
- Scalability — work efficiently as the number of users, resources, and geographic spread grows.
- Reliability / Fault tolerance — keep working despite partial failures (via replication and redundancy).
- High performance & availability through parallelism and load distribution.
Characteristics
- Concurrency — many components execute in parallel and share resources.
- No global clock — coordination relies on message passing and logical/physical clock synchronization.
- Independent / partial failures — components fail independently; the system must tolerate this without total failure.
- Heterogeneity — different hardware, OS, networks, and languages, bridged by middleware.
- Resource sharing and scalability as above.
Explain Lamport's logical clock with an example.
Lamport's Logical Clock
Lamport's logical clock assigns a monotonically increasing integer timestamp to events so that the timestamps respect the happened-before relation (). It does not measure real time; it only orders events causally. Each process keeps a counter .
Rules:
- Before each event in : .
- When sends a message , it attaches the timestamp .
- When receives : .
This enforces the clock condition: if then .
Example. Two processes, both starting at 0.
P1: e1=1 e2=2 ──m(ts=2)──┐
P2: e3 receive ──> C2 = max(0,2)+1 = 3
- : event gets , event gets , then sends with .
- (clock 0) receives : sets , so the receive event is stamped . Since (send before receive), indeed , satisfying the clock condition.
Note: does not imply , so Lamport clocks cannot by themselves detect concurrent events (vector clocks are needed for that).
What is a Remote Procedure Call (RPC)? Explain its working with a diagram.
Remote Procedure Call (RPC)
RPC is a communication mechanism that lets a program invoke a procedure on a remote machine as if it were a local call, hiding all message passing. It provides access and location transparency (Birrell & Nelson, 1984).
Working (with diagram)
CLIENT MACHINE SERVER MACHINE
┌──────────────┐ ┌──────────────┐
│ Client │ │ Server │
│ (1) call │ │ (5) execute │
│ ▼ │ │ ▲ │
│ Client Stub │ │ Server Stub │
│ (2) marshal │ │ (4) unmarshal│
└──────┬───────┘ └──────▲───────┘
│ (3) request msg ───────────────► │
▲ ◄─────────────── (6) reply msg │
[ Network ]
Steps:
- The client makes an ordinary procedure call to the client stub.
- The client stub marshals the parameters into a message.
- The message is sent over the network to the server.
- The server stub receives and unmarshals the parameters.
- The server stub calls the actual server procedure, which executes.
- The result is marshalled, sent back, unmarshalled by the client stub, and returned to the client as the call's return value.
The client and server are blocked (synchronous call) until the result returns. A binding step (via a directory/binder) lets the client locate the server, and call semantics (at-least-once / at-most-once) define behaviour under failures.
Differentiate between centralized and distributed mutual exclusion algorithms.
Centralized vs Distributed Mutual Exclusion
Both ensure only one process is in the critical section at a time, but differ in coordination.
| Aspect | Centralized algorithm | Distributed (e.g. Ricart–Agrawala) algorithm |
|---|---|---|
| Control | A single coordinator grants/denies access | All processes participate; decision is collective |
| Decision | Coordinator alone decides (queues requests) | Based on Lamport timestamps; lowest timestamp wins |
| Messages per CS entry | 3 (request, grant, release) | (request + reply to/from all others) |
| Single point of failure | Yes — coordinator crash halts the system | No single coordinator, but any node crash can block progress ( points of failure) |
| Implementation | Simple, easy | Complex; each node needs full membership and a request queue |
| Bottleneck | Coordinator is a performance bottleneck | Load spread across nodes |
| Fairness | FIFO order by coordinator | Total order by timestamps (ties by process id) |
Summary: the centralized scheme is simpler and uses fewer messages but is fragile and can bottleneck; the distributed scheme removes the single coordinator and is fairer, at the cost of far more messages and greater complexity, while still requiring failure detection.
Explain distributed deadlock detection. What is a wait-for graph?
Distributed Deadlock Detection
A deadlock occurs when a set of processes are each blocked waiting for a resource held by another process in the set, so none can proceed. In a distributed system the processes and resources are spread across multiple sites, so no single node has the complete picture — this makes detection harder than on a single machine.
Wait-For Graph (WFG)
A wait-for graph is a directed graph in which:
- Each node represents a process.
- A directed edge means "process is waiting for a resource currently held by ."
A cycle in the WFG indicates a deadlock. In a distributed system the graph is split into per-site local WFGs; a deadlock may form a cycle spanning several sites (a global cycle) that no single local graph reveals.
Detection approaches
- Centralized: a coordinator collects local WFGs into a global WFG and searches for cycles (can give false deadlocks if updates arrive out of order).
- Distributed: sites cooperate, e.g. the Chandy–Misra–Haas edge-chasing algorithm sends special probe messages along waiting edges; if a probe returns to its initiator, a cycle (deadlock) exists.
- Hierarchical: sites organised in a tree, each handling deadlocks in its subtree.
Once a cycle is found, the deadlock is resolved by aborting / rolling back a victim process and releasing its resources.
Explain the Bully algorithm for electing a coordinator.
Bully Algorithm (Garcia-Molina)
The Bully algorithm elects a coordinator among processes after the current one fails. It assumes each process has a unique id, processes can communicate reliably, and the process with the highest id should become coordinator. Any process that notices the coordinator is not responding starts an election.
Steps
When process detects the coordinator has failed, holds an election:
- sends an ELECTION message to all processes with a higher id than itself.
- If no higher process replies (within a timeout), wins: it becomes the coordinator and sends a COORDINATOR message to all processes announcing itself.
- If a higher process replies with an OK message, that process takes over the election; drops out and waits.
- Each higher process that receives an ELECTION replies OK and then starts its own election among still-higher processes.
- Eventually the highest-id live process wins and broadcasts COORDINATOR to everyone.
If a previously crashed process with a higher id recovers, it starts a fresh election and 'bullies' the current coordinator out — hence the name.
Example
Processes , was coordinator and crashes. notices and sends ELECTION to . and reply OK; stops. and run their own elections; gets no reply from (crashed), so becomes coordinator and broadcasts COORDINATOR to .
Message complexity: in the worst case.
What are the different kinds of transparency in a distributed system?
Kinds of Transparency in a Distributed System
Transparency means hiding the fact that a system's resources are distributed, so that it appears to users and applications as a single coherent system. The ISO/ANSA standard identifies the following kinds:
- Access transparency — hide differences in data representation and how a resource is accessed; local and remote resources are accessed in the same way.
- Location transparency — hide where a resource is physically located (users use a name, not an address).
- Migration (mobility) transparency — hide that a resource may move to another location; its name stays valid.
- Relocation transparency — hide that a resource may move while in use.
- Replication transparency — hide that several copies of a resource exist; the user sees one logical resource.
- Concurrency transparency — hide that a resource may be shared by several users at the same time (no interference).
- Failure transparency — hide the failure and recovery of a resource.
- Persistence transparency — hide whether a resource is in memory or on disk.
Full transparency is not always desirable — hiding everything can hurt performance, so designers balance transparency against efficiency.
What is middleware? Explain its role in a distributed system.
Middleware
Middleware is a software layer that sits between the operating system / network and the distributed applications, on top of the OS of each machine. It hides the heterogeneity and complexity of the underlying network, OS, hardware and programming languages, and offers a uniform, high-level programming model so that a distributed application appears as a single coherent system.
┌──────────────────────────────────────┐
│ Distributed Applications │
├──────────────────────────────────────┤
│ MIDDLEWARE │ ← provides distribution transparency
├──────────────┬───────────────────────┤
│ OS / Network │ OS / Network │ ... │
└──────────────┴───────────────────────┘
Role in a Distributed System
- Provides transparency — hides location, access, replication, failure, and migration differences from the application.
- Masks heterogeneity of hardware, OS, networks and languages, enabling interoperability.
- Offers high-level communication abstractions — RPC, RMI, message queues / MOM, publish-subscribe.
- Provides common services — naming/directory, security/authentication, transactions, concurrency control, persistence, replication and fault tolerance.
- Eases development by giving programmers a single uniform API instead of low-level socket programming.
Examples: CORBA, Java RMI, Microsoft DCOM/.NET Remoting, gRPC, and message brokers such as RabbitMQ/JMS.
Explain Cristian's algorithm for physical clock synchronization.
Cristian's Algorithm (Physical Clock Synchronization)
Cristian's algorithm (1989) synchronizes a client's clock to a time server that has access to an accurate reference time source (e.g. UTC/an atomic clock). It is suitable when round-trip times are small relative to the required accuracy.
Procedure
- The client records its local send time and sends a request to the time server.
- The time server replies with its current time (often denoted ).
- The client records the reply-arrival time .
- The client estimates the message round-trip delay and sets its clock to:
The term assumes the request and reply legs take equal time, so half the round-trip is added to compensate for the propagation delay of the reply.
Improving accuracy
If the server's interrupt-handling time is known, a better estimate is:
The error is bounded by , where is the minimum possible one-way transit time. Several requests can be made and the one with the smallest round-trip time chosen for best accuracy.
Limitations
- Assumes symmetric (equal) network delays, which may not hold.
- The single time server is a single point of failure and can be a bottleneck — addressed by the Berkeley algorithm / NTP.
- A clock must never be set backward; instead it is slowed or sped up gradually to converge.
Example
If , reply arrives at , and the server time in the reply is , then
Frequently asked questions
- Where can I find the BSc CSIT (TU) Distributed System (BSc CSIT, CSC462) question paper 2074?
- The full BSc CSIT (TU) Distributed System (BSc CSIT, CSC462) 2074 (regular) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.
- Does the Distributed System (BSc CSIT, CSC462) 2074 paper come with solutions?
- Yes. Every question on this Distributed System (BSc CSIT, CSC462) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.
- How many marks is the BSc CSIT (TU) Distributed System (BSc CSIT, CSC462) 2074 paper?
- The BSc CSIT (TU) Distributed System (BSc CSIT, CSC462) 2074 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.
- Is practising this Distributed System (BSc CSIT, CSC462) past paper free?
- Yes — reading and attempting this Distributed System (BSc CSIT, CSC462) past paper on Kekkei is completely free.