BSc CSIT (TU) Science Distributed System (BSc CSIT, CSC462) Question Paper 2074 Nepal

Q: Where can I find the BSc CSIT (TU) Distributed System (BSc CSIT, CSC462) question paper 2074?

The full BSc CSIT (TU) Distributed System (BSc CSIT, CSC462) 2074 (Regular (annual)) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.

Q: Does the Distributed System (BSc CSIT, CSC462) 2074 paper come with solutions?

Yes. Every question on this Distributed System (BSc CSIT, CSC462) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.

Q: How many marks is the BSc CSIT (TU) Distributed System (BSc CSIT, CSC462) 2074 paper?

The BSc CSIT (TU) Distributed System (BSc CSIT, CSC462) 2074 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.

Q: Is practising this Distributed System (BSc CSIT, CSC462) past paper free?

Yes — reading and attempting this Distributed System (BSc CSIT, CSC462) past paper on Kekkei is completely free.

Question

1Long answer10 marks

Explain clock synchronization in distributed systems. Discuss Lamport's logical clocks and vector clocks with examples, and describe how the happened-before relation orders events.

clock-synchronizationlogical-clock

Answer 1

Clock Synchronization in Distributed Systems

In a distributed system there is no global clock; each node has its own physical clock that drifts at a slightly different rate. Many tasks (ordering events, make, distributed transactions, log timestamps) need a consistent notion of time. Clock synchronization is the problem of making the clocks of all nodes agree, either on real (physical) time or on a consistent order of events.

We distinguish:

Physical clock synchronization — bringing clock values numerically close to one another / to UTC (e.g. Cristian's algorithm, Berkeley algorithm, NTP).
Logical clock synchronization — we do not care about real time, only about agreeing on the order of events. This is enough for most coordination problems and is captured by the happened-before relation.

The Happened-Before Relation ( $\rightarrow$ )

Defined by Lamport, $a \rightarrow b$ ('a happened before b') is the smallest relation satisfying:

If $a$ and $b$ are events in the same process and $a$ occurs before $b$ , then $a \rightarrow b$ .
If $a$ is the sending of a message and $b$ is the receipt of that same message, then $a \rightarrow b$ .
Transitivity: if $a \rightarrow b$ and $b \rightarrow c$ then $a \rightarrow c$ .

If neither $a \rightarrow b$ nor $b \rightarrow a$ , the events are concurrent ( $a \parallel b$ ) — they are causally independent. Happened-before is a partial order, because concurrent events are unordered.

Lamport's Logical Clocks

Each process $P_i$ keeps an integer counter $C_i$ . Rules:

Before each event, $P_i$ increments: $C_i := C_i + 1$ .
A message $m$ carries the sender's timestamp $ts(m) = C_i$ .
On receiving $m$ , the receiver sets $C_j := \max(C_j, ts(m)) + 1$ .

This guarantees the clock condition: if $a \rightarrow b$ then $C(a) < C(b)$ .

Example. Processes $P_1, P_2$ .

P1:  a(1) ----m1----> 
P2:        b receives m1

$P_1$ stamps event $a$ with $1$ and sends $m_1$ with $ts=1$ . $P_2$ , whose clock was at $0$ , sets $C_2 = \max(0,1)+1 = 2$ for the receive event $b$ . Hence $C(a)=1 < C(b)=2$ , consistent with $a \rightarrow b$ .

Limitation: the converse does NOT hold — $C(a) < C(b)$ does not imply $a \rightarrow b$ . Lamport clocks cannot detect concurrency or causal dependence.

Vector Clocks

Vector clocks fix this limitation. Each process $P_i$ keeps a vector $V_i[1..n]$ (one entry per process), initially all zeros. Rules:

Before an internal/send event: $V_i[i] := V_i[i] + 1$ .
A message carries the whole vector $V_i$ .
On receipt, $P_j$ sets $V_j[k] := \max(V_j[k], V_{msg}[k])$ for all $k$ , then $V_j[j] := V_j[j] + 1$ .

Ordering: $V(a) < V(b)$ iff $V(a)[k] \le V(b)[k]$ for all $k$ and $V(a) \ne V(b)$ . Then

a \rightarrow b \iff V(a) < V(b),

and $a \parallel b$ iff neither $V(a) < V(b)$ nor $V(b) < V(a)$ . Thus vector clocks exactly characterise causality.

Example (3 processes). $P_1$ 's send event has $V=(1,0,0)$ and sends it. $P_2$ receives it, giving the receive event $V=(1,1,0)$ . Meanwhile $P_3$ does a local event with $V=(0,0,1)$ . Comparing $(1,1,0)$ and $(0,0,1)$ : neither dominates, so those two events are concurrent — correctly detected, whereas Lamport scalars could not tell.

Summary

Lamport clocks give a total ordering consistent with causality but lose concurrency information; vector clocks capture the full happened-before partial order at the cost of $O(n)$ space per timestamp.

Answer 2

Distributed Mutual Exclusion

Mutual exclusion ensures that when several processes in a distributed system contend for a shared resource (the critical section, CS), only one process is in the CS at a time. Because there is no shared memory and no global clock, classical OS mechanisms (semaphores, monitors) cannot be used directly; instead processes coordinate by message passing. A correct algorithm must guarantee:

Safety / Mutual exclusion — at most one process in CS.
Liveness / No starvation — every request eventually granted.
Fairness / Ordering — requests honoured in happened-before order.

1. Centralized Algorithm

A single coordinator process manages access:

To enter CS, a process sends a REQUEST to the coordinator.
If CS is free, the coordinator replies GRANT; otherwise it queues the request (no reply / DENY).
On exiting, the process sends RELEASE; the coordinator grants the next queued request.

Messages per CS entry: 3 (request, grant, release).
Pros: simple, fair (FIFO), easy to implement.
Cons: the coordinator is a single point of failure and a performance bottleneck.

2. Token-Ring Algorithm

Processes are logically arranged in a ring. A single token circulates around the ring.

A process may enter the CS only when it holds the token.
After leaving the CS (or if it does not need it) it passes the token to its successor.

Messages per CS entry: between 1 and $\infty$ — when load is low the token may circulate many times before anyone needs it; throughput is bounded by ring traversal time.
Pros: no starvation (bounded wait = one ring traversal), simple.
Cons: lost token must be detected and regenerated; token holder crash breaks the ring and requires reconstruction.

3. Ricart–Agrawala (Distributed) Algorithm

A fully distributed algorithm using Lamport timestamps and an all-to-all scheme. To enter CS, $P_i$ :

Sends a timestamped REQUEST(ts, i) to all other $n-1$ processes.
Enters CS only after receiving OK/REPLY from all of them.

On receiving a request, process $P_j$ decides:

If $P_j$ is not interested in CS → reply OK immediately.
If $P_j$ is in CS → queue the request, reply later.
If $P_j$ also wants CS → compare timestamps; the lower timestamp wins (ties broken by process id). The loser replies OK; the winner queues the request and replies after exiting.

On exit, $P_i$ sends OK to every queued request.

Messages per CS entry: $2(n-1)$ — $(n-1)$ requests + $(n-1)$ replies.
Pros: no single point of failure; ordering by Lamport timestamps ensures fairness.
Cons: $n$ points of failure (any crash blocks progress unless detected); high message overhead; every process must know the full membership.

Comparison

Algorithm	Messages / CS entry	Single point of failure	Fault tolerance
Centralized	3	Yes (coordinator)	Poor — coordinator crash halts all
Token-Ring	1 … ∞	Token / any node	Moderate — must handle lost token & node crash
Ricart–Agrawala	$2(n-1)$	$n$ nodes	Poor without failure detection — any crash blocks

Conclusion: Centralized is cheapest in messages but least robust; the token ring bounds waiting but wastes messages under low load and is fragile to token loss; Ricart–Agrawala is genuinely distributed and fair but costly ( $O(n)$ messages) and equally sensitive to any single crash.

Answer 3

Inter-Process Communication (IPC) in Distributed Systems

In a distributed system, processes run on different machines with no shared memory, so they cooperate by exchanging messages over a network. IPC mechanisms include:

Message passing — send/receive primitives, which may be synchronous (blocking) or asynchronous (non-blocking).
Sockets — endpoints over TCP (reliable, connection-oriented) or UDP (unreliable, datagram).
Remote Procedure Call (RPC) and Remote Method Invocation (RMI) — higher-level abstractions that hide message passing.
Message-Oriented Middleware (MOM) — persistent, often queue-based, asynchronous messaging.

Remote Procedure Call (RPC)

RPC (Birrell & Nelson, 1984) lets a program call a procedure on a remote machine as if it were local, hiding the underlying message passing. It provides access and location transparency.

Working / Control Flow

Client ── call ──> Client Stub ── marshal ──> [network] ── unmarshal ──> Server Stub ── call ──> Server procedure
Client <─ return <─ Client Stub <─ unmarshal <─ [network] <─ marshal <─ Server Stub <─ return <─ Server procedure

The client calls a local client stub (proxy) as an ordinary procedure.
The stub marshals the parameters into a message and traps to the OS, which sends it.
The message travels to the server; the server stub unmarshals the parameters.
The server stub calls the actual server procedure; the result returns by the reverse path.

Parameter Passing

Call-by-value is straightforward: the value is copied into the message.
Call-by-reference is a problem — pointers/addresses are meaningless on a remote machine. Solutions: copy/restore (copy the referenced data in, copy modified data back) or flatten structures.
Data heterogeneity: different machines use different byte order (big- vs little-endian) and representations. A common format such as XDR (External Data Representation), CDR, or ASN.1 is used so both sides agree on encoding.

Marshalling

Marshalling is packing the procedure name and parameters into a flat, machine-independent message; unmarshalling is the reverse on receipt. The stubs do this automatically, usually generated from an IDL (Interface Definition Language) specification.

Binding

Binding is how a client locates a suitable server. Typically a binder / directory service (e.g. portmapper, RPC registry) is used:

The server registers (exports) its interface, name and address with the binder.
The client queries (imports) the binder to obtain the server's address.
The client then sends RPCs directly to that server. Binding may be static (fixed at compile time) or dynamic (resolved at run time via the binder).

Failure Semantics

Because networks/servers can fail, RPC defines call semantics:

At-least-once (retry; safe only for idempotent operations),
At-most-once (duplicate filtering; no double execution),
Exactly-once (ideal but hard), and maybe (no guarantee).

Advantages / Limitations

Advantages: simple, familiar procedure-call abstraction; transparency; supports heterogeneity.
Limitations: call-by-reference and global variables are hard; partial failures and slower-than-local calls must be handled by the programmer.

Answer 4

Distributed System

A distributed system is a collection of independent (autonomous) computers connected by a network that appears to its users as a single coherent system, cooperating through message passing to achieve a common goal (Tanenbaum). There is no shared memory and no global clock.

Goals

Resource sharing — share hardware, data, files, printers, and services.
Transparency — hide the distribution (access, location, replication, etc.) from users.
Openness — use standard, well-defined interfaces/protocols so components from different vendors interoperate and the system is extensible.
Scalability — work efficiently as the number of users, resources, and geographic spread grows.
Reliability / Fault tolerance — keep working despite partial failures (via replication and redundancy).
High performance & availability through parallelism and load distribution.

Characteristics

Concurrency — many components execute in parallel and share resources.
No global clock — coordination relies on message passing and logical/physical clock synchronization.
Independent / partial failures — components fail independently; the system must tolerate this without total failure.
Heterogeneity — different hardware, OS, networks, and languages, bridged by middleware.
Resource sharing and scalability as above.

Answer 5

Lamport's Logical Clock

Lamport's logical clock assigns a monotonically increasing integer timestamp to events so that the timestamps respect the happened-before relation ( $\rightarrow$ ). It does not measure real time; it only orders events causally. Each process $P_i$ keeps a counter $C_i$ .

Rules:

Before each event in $P_i$ : $C_i := C_i + 1$ .
When $P_i$ sends a message $m$ , it attaches the timestamp $ts(m) = C_i$ .
When $P_j$ receives $m$ : $C_j := \max(C_j, ts(m)) + 1$ .

This enforces the clock condition: if $a \rightarrow b$ then $C(a) < C(b)$ .

Example. Two processes, both starting at 0.

P1:  e1=1   e2=2 ──m(ts=2)──┐
P2:                 e3 receive ──> C2 = max(0,2)+1 = 3

$P_1$ : event $e_1$ gets $1$ , event $e_2$ gets $2$ , then sends $m$ with $ts=2$ .
$P_2$ (clock 0) receives $m$ : sets $C_2 = \max(0,2)+1 = 3$ , so the receive event is stamped $3$ . Since $e_2 \rightarrow e_3$ (send before receive), indeed $C(e_2)=2 < C(e_3)=3$ , satisfying the clock condition.

Note: $C(a)<C(b)$ does not imply $a \rightarrow b$ , so Lamport clocks cannot by themselves detect concurrent events (vector clocks are needed for that).

Answer 6

Remote Procedure Call (RPC)

RPC is a communication mechanism that lets a program invoke a procedure on a remote machine as if it were a local call, hiding all message passing. It provides access and location transparency (Birrell & Nelson, 1984).

Working (with diagram)

  CLIENT MACHINE                          SERVER MACHINE
 ┌──────────────┐                        ┌──────────────┐
 │   Client     │                        │   Server     │
 │  (1) call    │                        │  (5) execute │
 │      ▼       │                        │      ▲       │
 │ Client Stub  │                        │ Server Stub  │
 │ (2) marshal  │                        │ (4) unmarshal│
 └──────┬───────┘                        └──────▲───────┘
        │ (3) request msg  ───────────────►      │
        ▲      ◄─────────────── (6) reply msg    │
                       [ Network ]

Steps:

The client makes an ordinary procedure call to the client stub.
The client stub marshals the parameters into a message.
The message is sent over the network to the server.
The server stub receives and unmarshals the parameters.
The server stub calls the actual server procedure, which executes.
The result is marshalled, sent back, unmarshalled by the client stub, and returned to the client as the call's return value.

The client and server are blocked (synchronous call) until the result returns. A binding step (via a directory/binder) lets the client locate the server, and call semantics (at-least-once / at-most-once) define behaviour under failures.

Answer 7

Centralized vs Distributed Mutual Exclusion

Both ensure only one process is in the critical section at a time, but differ in coordination.

Aspect	Centralized algorithm	Distributed (e.g. Ricart–Agrawala) algorithm
Control	A single coordinator grants/denies access	All processes participate; decision is collective
Decision	Coordinator alone decides (queues requests)	Based on Lamport timestamps; lowest timestamp wins
Messages per CS entry	3 (request, grant, release)	$2(n-1)$ (request + reply to/from all others)
Single point of failure	Yes — coordinator crash halts the system	No single coordinator, but any node crash can block progress ( $n$ points of failure)
Implementation	Simple, easy	Complex; each node needs full membership and a request queue
Bottleneck	Coordinator is a performance bottleneck	Load spread across nodes
Fairness	FIFO order by coordinator	Total order by timestamps (ties by process id)

Summary: the centralized scheme is simpler and uses fewer messages but is fragile and can bottleneck; the distributed scheme removes the single coordinator and is fairer, at the cost of far more messages and greater complexity, while still requiring failure detection.

Answer 8

Distributed Deadlock Detection

A deadlock occurs when a set of processes are each blocked waiting for a resource held by another process in the set, so none can proceed. In a distributed system the processes and resources are spread across multiple sites, so no single node has the complete picture — this makes detection harder than on a single machine.

Wait-For Graph (WFG)

A wait-for graph is a directed graph in which:

Each node represents a process.
A directed edge $P_i \rightarrow P_j$ means "process $P_i$ is waiting for a resource currently held by $P_j$ ."

A cycle in the WFG indicates a deadlock. In a distributed system the graph is split into per-site local WFGs; a deadlock may form a cycle spanning several sites (a global cycle) that no single local graph reveals.

Detection approaches

Centralized: a coordinator collects local WFGs into a global WFG and searches for cycles (can give false deadlocks if updates arrive out of order).
Distributed: sites cooperate, e.g. the Chandy–Misra–Haas edge-chasing algorithm sends special probe messages along waiting edges; if a probe returns to its initiator, a cycle (deadlock) exists.
Hierarchical: sites organised in a tree, each handling deadlocks in its subtree.

Once a cycle is found, the deadlock is resolved by aborting / rolling back a victim process and releasing its resources.

Answer 9

Bully Algorithm (Garcia-Molina)

The Bully algorithm elects a coordinator among processes after the current one fails. It assumes each process has a unique id, processes can communicate reliably, and the process with the highest id should become coordinator. Any process that notices the coordinator is not responding starts an election.

Steps

When process $P$ detects the coordinator has failed, $P$ holds an election:

$P$ sends an ELECTION message to all processes with a higher id than itself.
If no higher process replies (within a timeout), $P$ wins: it becomes the coordinator and sends a COORDINATOR message to all processes announcing itself.
If a higher process replies with an OK message, that process takes over the election; $P$ drops out and waits.
Each higher process that receives an ELECTION replies OK and then starts its own election among still-higher processes.
Eventually the highest-id live process wins and broadcasts COORDINATOR to everyone.

If a previously crashed process with a higher id recovers, it starts a fresh election and 'bullies' the current coordinator out — hence the name.

Example

Processes $1..6$ , $6$ was coordinator and crashes. $P_3$ notices and sends ELECTION to $4,5,6$ . $4$ and $5$ reply OK; $P_3$ stops. $4$ and $5$ run their own elections; $5$ gets no reply from $6$ (crashed), so $P_5$ becomes coordinator and broadcasts COORDINATOR to $1..4$ .

Message complexity: $O(n^2)$ in the worst case.

Answer 10

Kinds of Transparency in a Distributed System

Transparency means hiding the fact that a system's resources are distributed, so that it appears to users and applications as a single coherent system. The ISO/ANSA standard identifies the following kinds:

Access transparency — hide differences in data representation and how a resource is accessed; local and remote resources are accessed in the same way.
Location transparency — hide where a resource is physically located (users use a name, not an address).
Migration (mobility) transparency — hide that a resource may move to another location; its name stays valid.
Relocation transparency — hide that a resource may move while in use.
Replication transparency — hide that several copies of a resource exist; the user sees one logical resource.
Concurrency transparency — hide that a resource may be shared by several users at the same time (no interference).
Failure transparency — hide the failure and recovery of a resource.
Persistence transparency — hide whether a resource is in memory or on disk.

Full transparency is not always desirable — hiding everything can hurt performance, so designers balance transparency against efficiency.

Answer 11

Middleware

Middleware is a software layer that sits between the operating system / network and the distributed applications, on top of the OS of each machine. It hides the heterogeneity and complexity of the underlying network, OS, hardware and programming languages, and offers a uniform, high-level programming model so that a distributed application appears as a single coherent system.

┌──────────────────────────────────────┐
│        Distributed Applications        │
├──────────────────────────────────────┤
│              MIDDLEWARE                │  ← provides distribution transparency
├──────────────┬───────────────────────┤
│  OS / Network │  OS / Network  │ ...   │
└──────────────┴───────────────────────┘

Role in a Distributed System

Provides transparency — hides location, access, replication, failure, and migration differences from the application.
Masks heterogeneity of hardware, OS, networks and languages, enabling interoperability.
Offers high-level communication abstractions — RPC, RMI, message queues / MOM, publish-subscribe.
Provides common services — naming/directory, security/authentication, transactions, concurrency control, persistence, replication and fault tolerance.
Eases development by giving programmers a single uniform API instead of low-level socket programming.

Examples: CORBA, Java RMI, Microsoft DCOM/.NET Remoting, gRPC, and message brokers such as RabbitMQ/JMS.

Answer 12

Cristian's Algorithm (Physical Clock Synchronization)

Cristian's algorithm (1989) synchronizes a client's clock to a time server that has access to an accurate reference time source (e.g. UTC/an atomic clock). It is suitable when round-trip times are small relative to the required accuracy.

Procedure

The client $P$ records its local send time $T_0$ and sends a request to the time server.
The time server replies with its current time $T_{server}$ (often denoted $C_{UTC}$ ).
The client records the reply-arrival time $T_1$ .
The client estimates the message round-trip delay and sets its clock to:

T_{client} = T_{server} + \frac{T_1 - T_0}{2}

The term $\dfrac{T_1 - T_0}{2}$ assumes the request and reply legs take equal time, so half the round-trip is added to compensate for the propagation delay of the reply.

Improving accuracy

If the server's interrupt-handling time $I$ is known, a better estimate is:

T_{client} = T_{server} + \frac{(T_1 - T_0) - I}{2}.

The error is bounded by $\pm\left(\dfrac{T_1 - T_0}{2} - T_{min}\right)$ , where $T_{min}$ is the minimum possible one-way transit time. Several requests can be made and the one with the smallest round-trip time chosen for best accuracy.

Limitations

Assumes symmetric (equal) network delays, which may not hold.
The single time server is a single point of failure and can be a bottleneck — addressed by the Berkeley algorithm / NTP.
A clock must never be set backward; instead it is slowed or sped up gradually to converge.

Example

If $T_0 = 100$ , reply arrives at $T_1 = 120$ , and the server time in the reply is $T_{server} = 500$ , then

T_{client} = 500 + \frac{120 - 100}{2} = 500 + 10 = 510.

Level	BSc CSIT (TU)
Stream	Science
Subject	Distributed System (BSc CSIT, CSC462)
Year	2074 BS
Exam session	Regular (annual)
Full marks	60
Time allowed	180 minutes
Questions	12, all with step-by-step solutions

Section A: Long Answer Questions

Clock Synchronization in Distributed Systems

The Happened-Before Relation (→\rightarrow→)

Lamport's Logical Clocks

Vector Clocks

Summary

Distributed Mutual Exclusion

1. Centralized Algorithm

2. Token-Ring Algorithm

3. Ricart–Agrawala (Distributed) Algorithm

Comparison

Inter-Process Communication (IPC) in Distributed Systems

Remote Procedure Call (RPC)

Working / Control Flow

Parameter Passing

Marshalling

Binding

Failure Semantics

Advantages / Limitations

Section B: Short Answer Questions

Distributed System

Goals

Characteristics

Lamport's Logical Clock

Remote Procedure Call (RPC)

Working (with diagram)

Centralized vs Distributed Mutual Exclusion

Distributed Deadlock Detection

Wait-For Graph (WFG)

Detection approaches

Bully Algorithm (Garcia-Molina)

Steps

Example

Kinds of Transparency in a Distributed System

Middleware

Role in a Distributed System

Cristian's Algorithm (Physical Clock Synchronization)

Procedure

Improving accuracy

Limitations

Example

Frequently asked questions

The Happened-Before Relation ( $\rightarrow$ )