BSc CSIT (TU) Science Distributed System (BSc CSIT, CSC462) Question Paper 2081 Nepal

Q: Where can I find the BSc CSIT (TU) Distributed System (BSc CSIT, CSC462) question paper 2081?

The full BSc CSIT (TU) Distributed System (BSc CSIT, CSC462) 2081 (Regular (annual)) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.

Q: Does the Distributed System (BSc CSIT, CSC462) 2081 paper come with solutions?

Yes. Every question on this Distributed System (BSc CSIT, CSC462) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.

Q: How many marks is the BSc CSIT (TU) Distributed System (BSc CSIT, CSC462) 2081 paper?

The BSc CSIT (TU) Distributed System (BSc CSIT, CSC462) 2081 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.

Q: Is practising this Distributed System (BSc CSIT, CSC462) past paper free?

Yes — reading and attempting this Distributed System (BSc CSIT, CSC462) past paper on Kekkei is completely free.

Question

1Long answer10 marks

What is data replication in distributed systems? Explain different consistency models and replica management techniques used to maintain consistency.

replicationconsistency

Answer 1

Data Replication in Distributed Systems

Data replication is the technique of maintaining multiple copies (replicas) of the same data on different machines (nodes) in a distributed system. The main reasons for replication are:

Reliability/availability – if one replica fails, others can serve requests.
Performance – clients access a nearby replica, reducing latency, and load is shared among replicas (scalability).

The key challenge is consistency: when one replica is updated, all copies must eventually reflect the change. This creates a trade-off between consistency, availability and performance.

Consistency Models

A consistency model is a contract between the data store and processes about how reads and writes behave.

1. Data-centric models

Strict consistency – any read returns the result of the most recent write (requires global time; impossible in practice).
Sequential consistency – the result of any execution is the same as if all operations were executed in some sequential order that respects each process's program order.
Causal consistency – writes that are causally related must be seen in the same order by all processes; concurrent writes may be seen in different orders.
FIFO (PRAM) consistency – writes from a single process are seen by others in the order issued.
Weak / release / entry consistency – use synchronization variables (acquire/release) so consistency is enforced only at synchronization points.

2. Client-centric models (for eventual-consistency stores):

Eventual consistency – if no new updates occur, all replicas eventually converge.
Monotonic reads, monotonic writes, read-your-writes, writes-follow-reads – guarantees for a single client across replicas.

Replica Management Techniques

Replica placement

Permanent replicas (e.g., mirror servers).
Server-initiated replicas (created dynamically near demand, e.g., CDNs).
Client-initiated replicas (caches).

Update propagation

Push (server-based) vs Pull (client-based), or a lease-based hybrid.
Propagate the update operation vs propagate the modified data vs invalidate.

Consistency protocols

Primary-based protocols – a primary copy serializes all writes (remote-write or local-write).
Replicated-write protocols – active replication (write sent to all replicas, often with totally ordered multicast) or quorum-based protocols where a read quorum $N_R$ and write quorum $N_W$ must satisfy $N_R + N_W > N$ and $N_W > N/2$ to prevent read–write and write–write conflicts.

Conclusion

Replication improves availability and performance but requires a chosen consistency model and an appropriate replica-management/consistency protocol to keep copies coherent.

Answer 2

Distributed File System (DFS) Architecture

A distributed file system allows files stored on remote servers to be accessed by clients over a network as if they were local, providing access, location and migration transparency.

General architecture components:

Client module – intercepts file system calls (often via a Virtual File System / VFS layer) and forwards remote operations to servers.
File service / file server – stores files and serves read/write/lookup requests.
Directory (name) service – maps human-readable path names to file identifiers.
Caching – client-side caches reduce network traffic and latency; require a cache-consistency mechanism.

Services are usually built on RPC, are typically stateless or stateful, and aim for transparency, scalability and fault tolerance.

Network File System (NFS)

NFS (by Sun) is the classic DFS model.

Architecture: Client and server communicate via RPC. The VFS layer on the client distinguishes local from remote files; an NFS client maps VFS operations to NFS RPC calls handled by the NFS server.
Mounting: A remote directory is mounted into the client's local namespace; after mounting it is accessed transparently.
Statelessness (classic NFSv2/v3): The server keeps no per-client open-file state. Every request (e.g., read, write) is self-contained and carries a file handle plus offset. This makes crash recovery simple — after a server reboot, clients just retry. Operations are made idempotent so retries are safe.
Caching: Clients cache data and attributes; consistency is approximate (validated using timestamps), so NFS provides only near one-copy semantics.
Access model: Remote access model (operations executed on the server).

Andrew File System (AFS) — alternative

Design goal: scalability for thousands of clients.
Whole-file caching: When a file is opened, the entire file is copied to the client's local disk; reads/writes happen locally; on close, the file is written back to the server (the upload/download model).
Callbacks: The server keeps state and issues a callback promise; if another client modifies the file, the server sends a callback invalidating the cached copy. This gives stronger consistency than NFS while minimizing server load.
Session semantics: Changes become visible to others only after the file is closed.

Comparison

Feature	NFS (v3)	AFS
Caching unit	Blocks	Whole file (on local disk)
Server state	Stateless	Stateful (callbacks)
Consistency	Validation by timestamps	Callback + session semantics
Scalability	Moderate	High

Answer 3

Fault Tolerance

Fault tolerance is the ability of a distributed system to continue providing correct service (possibly degraded) even when some of its components fail. A system is dependable if it is available, reliable, safe and maintainable. The basic terms: a fault (defect) may cause an error (incorrect state) which may lead to a failure (system deviates from correct service).

Failure Models

Failures are classified by how a process/server can behave incorrectly:

Crash (fail-stop) failure – a process halts and stays halted; it was working correctly until then.
Omission failure – a server fails to respond (receive omission: it doesn't receive incoming messages; send omission: it doesn't send replies).
Timing failure – the response is correct but arrives outside the specified time interval (too late/early).
Response failure – the response is incorrect (value failure: wrong value; state-transition failure: wrong control flow).
Arbitrary / Byzantine failure – the most severe: a process may produce arbitrary, even malicious or contradictory, outputs at arbitrary times.

Redundancy

Fault tolerance is achieved primarily through redundancy (masking failures):

Information redundancy – extra bits, e.g., error-correcting codes, checksums.
Time redundancy – repeat an operation (retransmission, re-execution), useful for transient faults.
Physical (hardware/software) redundancy – replicate components/processes. Example: Triple Modular Redundancy (TMR) uses three units feeding voters that mask one faulty unit. To mask up to $k$ crash faults, $k+1$ replicas suffice; to mask $k$ Byzantine faults, $2k+1$ replicas are needed.

Agreement in Faulty Systems: Byzantine Generals Problem

Processes must reach agreement (consensus) on a value despite some processes being faulty/traitorous. In the Byzantine Generals Problem, several generals must agree on a common plan (attack/retreat), but some generals are traitors who send conflicting messages.

Lamport's result: Agreement among $N$ processes tolerating $m$ Byzantine (arbitrary) faulty processes is possible only if

N \ge 3m + 1.

Thus with $m=1$ traitor at least $4$ generals are required; with 3 generals and 1 traitor, no solution exists. With reliable signed messages (authentication) this bound can be relaxed. Reaching agreement also requires enough rounds of message exchange.

Conclusion

Fault tolerance combines a clear failure model, redundancy to mask faults, and agreement protocols (subject to the $3m+1$ bound) to keep distributed systems dependable.

Answer 4

A distributed system is a collection of independent computers (nodes) that appears to its users as a single coherent system, with the nodes coordinating their actions by passing messages over a network (no shared memory or global clock).

Goals:

Resource sharing – share hardware, software and data among users.
Transparency – hide distribution (access, location, replication, etc.) so the system looks like one machine.
Openness – use standard interfaces/protocols, enabling interoperability and portability.
Scalability – handle growth in users, data and geographic size without major redesign.

Characteristics:

No global clock – processes coordinate using message passing and logical clocks.
Concurrency – many components execute in parallel and share resources.
Independent failures – components can fail independently; the rest should keep working (fault tolerance).
Heterogeneity – different hardware, OS and networks, hidden by middleware.
Autonomy of nodes – each node manages its own resources.

Answer 5

Lamport's Logical Clock

Because distributed systems lack a global physical clock, Lamport's logical clock assigns numeric timestamps to events so that the happened-before relation ( $\rightarrow$ ) is captured, giving a consistent ordering of events.

Happened-before rules:

If $a$ and $b$ are in the same process and $a$ occurs before $b$ , then $a \rightarrow b$ .
If $a$ is the sending of a message and $b$ is its receipt, then $a \rightarrow b$ .
Transitivity: if $a \rightarrow b$ and $b \rightarrow c$ then $a \rightarrow c$ .

Clock update rules (each process keeps a counter $C$ ):

Before each event, increment the local clock: $C := C + 1$ .
A message carries the sender's timestamp $t$ .
On receipt, the receiver sets $C := \max(C, t) + 1$ .

This guarantees the clock condition: if $a \rightarrow b$ then $C(a) < C(b)$ .

Example

Processes $P_1$ and $P_2$ . $P_1$ sends message $m$ at its event with timestamp.

$P_1$ : event $a$ (send $m$ ), $C_1: 0 \to 1$ , so $C(a)=1$ .
$P_2$ had local clock $C_2 = 4$ when it does an internal event ( $\to 5$ ), then receives $m$ (timestamp $1$ ): $C_2 := \max(5, 1) + 1 = 6$ .

Since the send (1) happened-before the receive (6), $1 < 6$ — the ordering is preserved.

Note: Lamport clocks give $a \rightarrow b \Rightarrow C(a) < C(b)$ , but not the converse; vector clocks are needed to detect concurrency.

Answer 6

Remote Procedure Call (RPC)

RPC is a communication mechanism that allows a program to call a procedure that executes on a different (remote) machine, while making the call look like an ordinary local procedure call. It hides the underlying message passing, providing access transparency.

Working (steps)

Client machine                         Server machine
  client                                  server
     | 1. call                                |
  client stub  -- 2. marshal & send -->  server stub
     |                                       | 3. unmarshal, call
 RPC runtime  ----- network ------>   RPC runtime
     |                                       | 4. server executes
  client stub <-- 6. send reply -----  server stub <- 5. result
     | 7. unmarshal                           |
  return value

The client calls the client stub as if it were a normal local procedure.
The client stub marshals (packs) the parameters into a message and asks the local OS/RPC runtime to send it.
The message travels over the network to the server; the server stub unmarshals the parameters.
The server stub calls the actual server procedure, which executes and produces a result.
The result is returned to the server stub, which marshals it into a reply message.
The reply is sent back to the client machine.
The client stub unmarshals the result and returns it to the client, which resumes as if the call returned locally.

Key points

Stubs hide marshalling/networking; generated from an IDL interface definition.
Parameter passing uses call by value (binding by reference is hard across address spaces).
Binding locates the server (e.g., via a binder/port-mapper).
Failure semantics: at-least-once, at-most-once, or exactly-once (ideal but hard); idempotent operations ease retries.

Answer 7

Centralized vs Distributed Mutual Exclusion

Mutual exclusion ensures that only one process at a time enters a critical section (CS) accessing a shared resource.

Centralized algorithm: One process acts as a coordinator. A process sends a REQUEST to the coordinator, which grants permission (sends OK) if the CS is free, otherwise queues the request; the process sends RELEASE on exit, and the coordinator grants the next queued request.

Distributed algorithm (e.g., Ricart–Agrawala): A process wishing to enter the CS sends a timestamped REQUEST to all other processes and enters only after receiving OK/REPLY from all of them. Decisions are made collectively using logical-clock timestamps to order requests.

Aspect	Centralized	Distributed
Decision making	Single coordinator decides	All processes participate
Messages per CS entry	3 (request, grant, release)	2(N−1) (request + reply to/from others)
Single point of failure	Yes (coordinator)	No single coordinator; but failure of any node can block (needs failure handling)
Bottleneck	Coordinator can be a bottleneck	Load distributed
Implementation	Simple, easy	Complex; needs synchronized logical clocks
Fairness/ordering	FIFO by coordinator	By timestamp ordering
Number of failure points	1	N

Summary: The centralized scheme is simple and message-efficient but has a single point of failure and bottleneck; the distributed scheme removes the single coordinator but costs more messages and is more complex and sensitive to individual node failures.

Answer 8

Distributed Deadlock Detection

A deadlock occurs when a set of processes are each blocked waiting for resources held by others in the set, so none can proceed. In a distributed system, processes and resources are spread across nodes, so no single site has the complete picture, making detection harder.

Wait-For Graph (WFG)

A wait-for graph is a directed graph used to detect deadlock:

Nodes = processes.
An edge $P_i \rightarrow P_j$ means " $P_i$ is waiting for a resource currently held by $P_j$ ."

A deadlock exists if and only if the WFG contains a cycle (with single-instance resources). Example: $P_1 \rightarrow P_2 \rightarrow P_3 \rightarrow P_1$ is a cycle, so $P_1, P_2, P_3$ are deadlocked.

In a distributed system the global WFG is the union of local WFGs at each site connected by edges crossing sites.

Detection approaches

Centralized detection – a coordinator maintains the global WFG; sites report local edges. Simple but a single point of failure and can detect phantom (false) deadlocks due to message delays.
Distributed detection – each site participates; e.g., edge-chasing / probe-based algorithms (Chandy–Misra–Haas): a blocked process sends a probe message along outgoing wait-for edges. If a probe returns to its initiator, a cycle (deadlock) is detected.
Hierarchical detection – sites organized in a tree; a parent detects deadlocks among its children.

Resolution: break the cycle by aborting/rolling back a victim process and releasing its resources.

Answer 9

Bully Algorithm (Coordinator Election)

The Bully algorithm (Garcia-Molina) elects the process with the highest ID as the new coordinator after the current coordinator fails. It assumes each process has a unique ID, processes know each other's IDs, and message delivery is reliable.

Algorithm

When a process $P$ notices the coordinator is not responding, it starts an election:

$P$ sends an ELECTION message to all processes with a higher ID than itself.
If no higher process responds, $P$ wins and becomes the coordinator; it sends a COORDINATOR message to all other processes announcing itself.
If a higher process responds (with an OK/ALIVE message), that process takes over the election; $P$ steps down and waits.
Each higher process that received an ELECTION then repeats the procedure (holds its own election among still-higher processes).
Eventually the highest-ID alive process becomes coordinator and broadcasts the COORDINATOR message.

If a previously crashed process with a higher ID recovers, it starts a new election and "bullies" the current coordinator out — hence the name.

Example

Processes $\{1,2,3,4,5,6,7\}$ ; coordinator $7$ crashes. Process $4$ detects this and sends ELECTION to $5,6,7$ . Processes $5$ and $6$ reply OK and start their own elections; $7$ does not reply. $6$ sends ELECTION to $7$ , gets no reply, so $6$ becomes the new coordinator and broadcasts COORDINATOR to all.

Message cost: $O(N^2)$ in the worst case.

Answer 10

Kinds of Transparency in a Distributed System

Transparency means hiding the fact that resources are distributed across multiple computers, so the system appears as a single coherent system to users and applications. The main kinds (ISO/ANSA classification) are:

Access transparency – hides differences in data representation and in how a resource is accessed; local and remote resources are accessed using identical operations.
Location transparency – hides where a resource is physically located (names do not reveal location).
Migration transparency – hides that a resource may move to another location; its name stays the same after moving.
Relocation transparency – hides that a resource may be moved while in use.
Replication transparency – hides that multiple copies of a resource exist; the user sees a single logical resource.
Concurrency transparency – hides that a resource may be shared by several competing users; access appears exclusive/consistent.
Failure transparency – hides the failure and recovery of a resource; the user is unaware of faults.
Persistence transparency – hides whether a resource is in memory or on disk.

Full transparency is not always desirable (e.g., it can hurt performance or conflict with location-awareness), so it is balanced against other goals.

Answer 11

Middleware

Middleware is a software layer that sits between the operating system/network and the distributed applications. It provides a common programming abstraction across a heterogeneous collection of computers, networks and operating systems, hiding their differences from application developers.

   Application 1     Application 2     Application 3
  ---------------------------------------------------
        Middleware (common services & abstractions)
  ---------------------------------------------------
     OS/Network        OS/Network        OS/Network

Role / Functions in a Distributed System

Provides transparency – masks heterogeneity of hardware, OS and networks, and offers access/location/replication transparency so distributed resources look local.
Communication abstractions – offers high-level mechanisms such as RPC, Remote Method Invocation (RMI), message-oriented middleware (MOM) and publish/subscribe, instead of raw sockets.
Common services – naming/directory, security/authentication, transactions, concurrency control, persistence, and replication services.
Interoperability & openness – standard interfaces (e.g., CORBA, RMI, web services, gRPC) let components written in different languages/platforms work together.
Eases development – application programmers focus on business logic, not low-level networking and synchronization.

Examples: CORBA, Java RMI, Microsoft DCOM/.NET Remoting, gRPC, message brokers (RabbitMQ/Kafka).

Answer 12

Cristian's Algorithm (Physical Clock Synchronization)

Cristian's algorithm synchronizes a client's clock with a time server that has access to an accurate (UTC) time source. It is suitable when round-trip times are short compared to the required accuracy.

Procedure

The client process $P$ sends a request to the time server at local time $T_0$ asking for the current time.
The server, on receipt, replies with its time $T_{\text{server}}$ .
The client receives the reply at local time $T_1$ .
The client estimates the message round-trip delay as $T_{\text{round}} = T_1 - T_0$ and assumes the reply took half of it, so it sets its clock to:

T_{\text{client}} = T_{\text{server}} + \frac{T_1 - T_0}{2}.

If the (minimum) one-way message time $I$ (interrupt-handling/processing time at server) is known, a better estimate is:

T_{\text{client}} = T_{\text{server}} + \frac{(T_1 - T_0) - I}{2}.

Accuracy bound

The time value lies within the interval $[T_{\text{server}} + I,\; T_{\text{server}} + (T_1 - T_0) - I]$ , so the error is at most

\pm\left(\frac{T_1 - T_0}{2} - I\right).

Example

If $T_0 = 100$ ms, reply received at $T_1 = 120$ ms, and server time $= 500$ , then

T_{\text{client}} = 500 + \frac{120 - 100}{2} = 500 + 10 = 510.

Limitation: Relies on a single time server (single point of failure) and assumes symmetric, small network delays; clocks must never be set backward (adjust gradually instead).

Level	BSc CSIT (TU)
Stream	Science
Subject	Distributed System (BSc CSIT, CSC462)
Year	2081 BS
Exam session	Regular (annual)
Full marks	60
Time allowed	180 minutes
Questions	12, all with step-by-step solutions

BSc CSIT (TU) Science Distributed System (BSc CSIT, CSC462) Question Paper 2081 Nepal

Section A: Long Answer Questions

Data Replication in Distributed Systems

Consistency Models

Replica Management Techniques

Conclusion

Distributed File System (DFS) Architecture

Network File System (NFS)

Andrew File System (AFS) — alternative

Comparison

Fault Tolerance

Failure Models

Redundancy

Agreement in Faulty Systems: Byzantine Generals Problem

Conclusion

Section B: Short Answer Questions

Lamport's Logical Clock

Example

Remote Procedure Call (RPC)

Working (steps)

Key points

Centralized vs Distributed Mutual Exclusion

Distributed Deadlock Detection

Wait-For Graph (WFG)

Detection approaches

Bully Algorithm (Coordinator Election)

Algorithm

Example

Kinds of Transparency in a Distributed System

Middleware

Role / Functions in a Distributed System

Cristian's Algorithm (Physical Clock Synchronization)

Procedure

Accuracy bound

Example

Frequently asked questions