BE Computer Engineering (IOE, TU) Database Management System (IOE, CT 652) Question Paper 2078 Nepal
This is the official BE Computer Engineering (IOE, TU) Database Management System (IOE, CT 652) question paper for 2078, as set in the regular annual examination. It carries 80 full marks and a time allowance of 180 minutes, across 12 questions. On Kekkei you can attempt this Database Management System (IOE, CT 652) past paper online with a timer, get instant AI feedback and step-by-step solutions, and track the topics where you lose marks — completely free. Whether you are revising for your BE Computer Engineering (IOE, TU) Database Management System (IOE, CT 652) exam or solving previous years' question papers, this 2078 paper is a great way to practise under real exam conditions.
Section A: Long Answer Questions
Attempt all / any as specified.
A university wants to computerize its academic records. The system must track students (with roll number, name, semester and program), courses (with course code, title and credit hours), teachers (with employee id and name) and departments. A student enrolls in many courses and obtains a grade in each; a course is taught by one or more teachers; every teacher and every course belongs to exactly one department.
(a) Draw a complete Entity-Relationship (ER) diagram for the above system, clearly showing entity sets, attributes, primary keys, relationship sets and the cardinality/participation constraints. [8]
(b) Convert your ER diagram into a set of relational schemas, underlining the primary keys and indicating the foreign keys. [4]
(a) Entity-Relationship Diagram (8 marks)
Since an ER diagram cannot be drawn here, it is described precisely below; the components map directly to standard ER notation (rectangles = entity sets, ovals = attributes, diamonds = relationship sets, underlined attribute = primary key).
Entity sets and attributes (primary key underlined):
- STUDENT( roll_no, name, semester, program )
- COURSE( course_code, title, credit_hours )
- TEACHER( emp_id, name )
- DEPARTMENT( dept_id, dept_name )
Relationship sets and constraints:
| Relationship | Connects | Cardinality | Descriptive attribute |
|---|---|---|---|
| ENROLLS | STUDENT — COURSE | many-to-many (M:N) | grade |
| TEACHES | TEACHER — COURSE | many-to-many (M:N) | — |
| OFFERS (course belongs to dept) | DEPARTMENT — COURSE | one-to-many (1:N) | — |
| WORKS_IN (teacher belongs to dept) | DEPARTMENT — TEACHER | one-to-many (1:N) | — |
Cardinality / participation:
- A student enrolls in many courses and a course is taken by many students, with
gradeas an attribute on the ENROLLS relationship (drawn as an oval attached to the diamond). - A course is taught by one or more teachers and a teacher may teach many courses ⇒ M:N TEACHES.
- Every course belongs to exactly one department and every teacher belongs to exactly one department ⇒ the COURSE-side and TEACHER-side of OFFERS/WORKS_IN have total participation (double line) with cardinality 1 on the DEPARTMENT side and many on the course/teacher side.
[STUDENT]==(ENROLLS:grade)==[COURSE]==(TEACHES)==[TEACHER]
| |
(OFFERS 1:N) (WORKS_IN 1:N)
| |
+====[DEPARTMENT]=====+
(b) Relational schemas (4 marks)
Entity sets become tables; the M:N relationships ENROLLS and TEACHES become separate tables; the 1:N relationships are captured by a foreign key on the many side.
- STUDENT( roll_no, name, semester, program )
- DEPARTMENT( dept_id, dept_name )
- COURSE( course_code, title, credit_hours, dept_id ) —
dept_idFK → DEPARTMENT - TEACHER( emp_id, name, dept_id ) —
dept_idFK → DEPARTMENT - ENROLLS( roll_no, course_code, grade ) —
roll_noFK → STUDENT,course_codeFK → COURSE - TEACHES( emp_id, course_code ) —
emp_idFK → TEACHER,course_codeFK → COURSE
(Underlined = primary key; italic / noted = foreign key.)
Consider the relation R(A, B, C, D, E) with the set of functional dependencies
(a) Compute the closure and and determine all the candidate keys of R. [5]
(b) Explain the difference between 3NF and BCNF. Determine the highest normal form satisfied by R and justify your answer. [4]
(c) If R is not in BCNF, decompose it into BCNF relations and state whether your decomposition is lossless and dependency-preserving. [3]
(a) Closures and candidate keys (5 marks)
Given .
Closure :
- Start . By : add .
- By : add .
- By : present, add .
so A is a (super)key.
Closure :
- Start . By : add .
- No other FD’s left side is contained in ( needs ).
so B is not a key.
Finding all candidate keys. Note the chain , , so also determines everything: . Check the others:
- ✔, ✔.
- : gives , then gives , then gives ✔ so BC is a key.
- : , , ✔ so CD is a key.
So the candidate keys are: , , , . (Each is minimal — no proper subset is a key.)
Prime attributes = — every attribute appears in some candidate key, so all attributes are prime.
(b) 3NF vs BCNF and highest normal form of R (4 marks)
3NF: for every non-trivial FD , either is a superkey or every attribute in is a prime attribute.
BCNF: for every non-trivial FD , must be a superkey (stricter — the prime-attribute exception is removed).
Highest normal form of R. Check each FD:
- : is a key ✔
- : is a key ✔
- : is a key ✔
- : is not a superkey. So BCNF is violated.
- But is a prime attribute (in candidate key ), so the 3NF condition holds.
Hence R is in 3NF but not in BCNF; the violating dependency is .
(c) BCNF decomposition (3 marks)
The BCNF-violating FD is . Compute . Decompose on it:
- with FD — key , in BCNF.
- = minus — its FDs (projected): , , and (since determines ). Candidate keys of are ; every FD’s LHS is a key ⇒ is in BCNF.
Lossless? Yes — the common attribute is a key of (), satisfying the lossless-join condition .
Dependency-preserving? preserves ; preserves , . The FD has its LHS split across the two relations and is not directly preserved, so this BCNF decomposition is lossless but not (fully) dependency-preserving — the classic trade-off, since not every schema admits a decomposition that is simultaneously BCNF and dependency-preserving.
(a) State and explain the ACID properties of a transaction with a suitable example for each. [4]
(b) Given the schedule below for transactions and , draw the precedence (serialization) graph and determine whether the schedule is conflict-serializable. If it is, give an equivalent serial schedule.
| Time | ||
|---|---|---|
| 1 | R(A) | |
| 2 | R(A) | |
| 3 | W(A) | |
| 4 | W(A) | |
| 5 | R(B) | |
| 6 | W(B) |
[4]
(c) Explain how two-phase locking (2PL) guarantees serializability and discuss why basic 2PL can still lead to deadlock and cascading rollback. [4]
(a) ACID properties (4 marks)
| Property | Meaning | Example |
|---|---|---|
| Atomicity | A transaction is all-or-nothing; either every operation commits or none does. | A funds transfer that debits A and credits B must do both or neither — if the credit fails, the debit is rolled back. |
| Consistency | A transaction takes the database from one valid state to another, preserving all integrity constraints. | After the transfer, the total money A+B is unchanged. |
| Isolation | Concurrent transactions do not interfere; the result is as if they ran serially. | Two transfers running at once do not read each other’s uncommitted balances. |
| Durability | Once committed, changes survive crashes (persisted via logs/redo). | After the transfer commits, a power failure cannot lose it. |
(b) Conflict-serializability of the schedule (4 marks)
Operations on A: :R(A) @1, :R(A) @2, :W(A) @3, :W(A) @4. Operations on B: only .
Conflicting pairs (different transactions, same data item, at least one write):
- :R(A)@2 before :W(A)@3 ⇒ edge (read-before-write).
- :R(A)@1 before :W(A)@4 ⇒ edge .
- :W(A)@3 before :W(A)@4 ⇒ edge .
Precedence graph: nodes with edges and .
T1 <==> T2 (cycle)
The graph contains a cycle . Therefore the schedule is NOT conflict-serializable, and no equivalent serial schedule exists.
(c) 2PL, serializability, deadlock and cascading rollback (4 marks)
How 2PL guarantees serializability. In two-phase locking each transaction has a growing phase (it may acquire locks but not release any) followed by a shrinking phase (it may release locks but not acquire any). Because every transaction obtains all its locks before releasing any, the order of lock points defines a serialization order; one can prove the resulting schedule is conflict-serializable (its precedence graph is acyclic).
Why deadlock still occurs. 2PL controls the order of locking but not the requests. Two transactions can each hold a lock the other needs: holds and waits for , while holds and waits for — a circular wait ⇒ deadlock. 2PL must be combined with deadlock detection (wait-for graph) or prevention (wait-die/wound-wait).
Why cascading rollback occurs. Basic 2PL may release locks during the shrinking phase before commit. Another transaction can then read this uncommitted (dirty) data; if the first transaction later aborts, every transaction that read its data must also abort — a cascading rollback. This is fixed by strict 2PL, which holds all exclusive (write) locks until commit/abort, ensuring recoverable, cascadeless schedules.
Consider the following relational schema for a banking database (primary keys underlined):
Customer(cust_id, name, city)Account(acc_no, branch, balance)Depositor(cust_id, acc_no)
Write queries for the following:
(a) In relational algebra, find the names of all customers who live in 'Kathmandu' and hold an account with a balance greater than 50000. [3]
(b) In SQL, list each branch with the total balance of all accounts in that branch, showing only branches whose total balance exceeds 1,000,000. [3]
(c) In SQL, find the names of customers who hold more than one account. [4]
(a) Relational algebra — Kathmandu customers with balance > 50000 (3 marks)
Join the three relations, select on city and balance, then project the name.
(The natural joins match on cust_id between Customer and Depositor, and on acc_no between Depositor and Account.)
(b) SQL — branches whose total balance exceeds 1,000,000 (3 marks)
SELECT branch, SUM(balance) AS total_balance
FROM Account
GROUP BY branch
HAVING SUM(balance) > 1000000;
The GROUP BY aggregates per branch and HAVING filters groups after aggregation (a WHERE clause cannot be used on an aggregate).
(c) SQL — customers holding more than one account (4 marks)
SELECT c.name
FROM Customer c
JOIN Depositor d ON c.cust_id = d.cust_id
GROUP BY c.cust_id, c.name
HAVING COUNT(d.acc_no) > 1;
We group the depositor rows by customer and keep only those whose account count exceeds one. (Counting acc_no per cust_id and testing > 1 is the key idea.)
Section B: Short Answer Questions
Attempt all / any as specified.
Define super key, candidate key, primary key and foreign key. Using a suitable example, explain how a foreign key enforces referential integrity, and describe what happens on a DELETE with the ON DELETE CASCADE option.
Definitions
- Super key: any set of one or more attributes that uniquely identifies each tuple in a relation. (May contain extra attributes.) E.g. in
STUDENT(roll_no, name, …),{roll_no}and{roll_no, name}are both super keys. - Candidate key: a minimal super key — no proper subset is itself a super key. E.g.
{roll_no}. - Primary key: the one candidate key chosen by the designer to identify tuples; it cannot be NULL and must be unique.
- Foreign key: an attribute (or set) in one relation that references the primary key of another (or the same) relation, linking the two tables.
Referential integrity (example)
Given ACCOUNT(acc_no, branch) and DEPOSITOR(cust_id, acc_no) where DEPOSITOR.acc_no is a foreign key referencing ACCOUNT.acc_no:
Referential integrity requires that every value of
DEPOSITOR.acc_nomust already exist as anacc_noinACCOUNT(or be NULL). This prevents a depositor from pointing to a non-existent account — i.e. no “dangling” reference.
ON DELETE CASCADE
When a referenced row is deleted, the action on the child rows is governed by the referential action. With ON DELETE CASCADE, deleting a parent row automatically deletes all child rows that reference it. For example, deleting an ACCOUNT row also deletes every DEPOSITOR row holding that acc_no, keeping the database referentially consistent. (Other options are ON DELETE SET NULL and ON DELETE RESTRICT/NO ACTION, which block the delete.)
(a) Differentiate between a primary (clustered) index and a secondary (non-clustered) index. [3]
(b) Explain why a B+ tree is preferred over a B tree for database indexing, and state one advantage of B+ tree indexing over a simple ordered index file. [3]
(a) Primary (clustered) vs Secondary (non-clustered) index (3 marks)
| Aspect | Primary / Clustered index | Secondary / Non-clustered index |
|---|---|---|
| Ordering | Built on the attribute by which the data file is physically sorted (usually the key). | Built on a non-ordering field; data file is not sorted on it. |
| Number per table | At most one (file can be sorted on only one key). | Many allowed. |
| Density | Usually sparse (one entry per data block). | Must be dense (one entry per record/search-key value). |
| Access | Fast range and equality search; records with adjacent keys are physically together. | Equality search via pointers; range scans are slower (records scattered). |
(b) B+ tree vs B tree, and vs ordered index file (3 marks)
Why B+ tree is preferred over B tree:
- In a B+ tree all data pointers reside in the leaf nodes, while internal nodes store only keys; in a B tree, records/pointers may sit at internal nodes too. Keeping internal nodes key-only gives a higher fan-out, hence a shorter, shallower tree and fewer disk I/Os.
- B+ tree leaves are linked sequentially, so range queries and ordered traversal are efficient (follow leaf pointers); a B tree must do in-order traversal up and down the tree.
- Search cost is uniform — every key search reaches a leaf — giving predictable performance.
Advantage of B+ tree over a simple ordered (sorted) index file: a B+ tree is a dynamic, self-balancing structure; insertions and deletions are handled by node split/merge in time, so performance does not degrade as the file grows. An ordered index file requires costly reorganization (shifting/overflow blocks) on every insert, degrading over time.
Explain static hashing and discuss the problem of bucket overflow. How does dynamic (extendible) hashing overcome the limitations of static hashing? Illustrate with the role of the directory and the global/local depth.
Static hashing
In static hashing a fixed number of buckets is allocated, and a record with search-key is placed in bucket using a hash function . Lookup, insert and delete on the key require computing the hash and accessing a single bucket — ideally O(1) disk access. The number of buckets is decided in advance and does not change as the file grows or shrinks.
Bucket overflow problem
Because is fixed, problems arise as the data set changes:
- Bucket overflow occurs when more records hash to a bucket than it can hold. Causes: insufficient buckets, an uneven (skewed) hash distribution, or steady growth of the file.
- Overflow is handled by overflow chaining (linking extra overflow buckets to the full bucket). Long overflow chains turn the O(1) lookup into a costly chain scan, degrading performance.
- If the file shrinks, many buckets become nearly empty, wasting space. The only remedy is a full rehash/reorganization, which is expensive.
Dynamic (extendible) hashing
Extendible hashing lets the bucket structure grow and shrink on demand, eliminating periodic full reorganization.
- It uses a directory: an array of pointers, where is the global depth. The first bits of the hash value index into the directory, which points to the bucket holding the record.
- Each bucket has a local depth , the number of leading hash bits that all records in that bucket share.
- On overflow of a bucket with local depth :
- If : split the bucket, redistribute its records by their -th bit, set the two new buckets’ local depth to , and update the affected directory pointers. No directory growth needed.
- If : double the directory (increment global depth ), then split the bucket as above.
This way only the overflowing bucket is split — most of the file is untouched — so growth is incremental and graceful, avoiding both the bucket-overflow chains and the wasteful full rehashing of static hashing.
(a) Explain the log-based recovery technique using deferred and immediate database modification. [3]
(b) What is a checkpoint and how does it reduce the work done during recovery after a system crash? [3]
(a) Log-based recovery: deferred vs immediate modification (3 marks)
A log is a sequence of records on stable storage describing every update, e.g. , written before the change is applied (write-ahead logging).
Deferred database modification:
- All writes of a transaction are recorded only in the log; the actual database is not modified until the transaction commits (after is logged).
- On recovery: redo all transactions that have both start and commit records; transactions without a commit need no undo (database was never changed). Only a redo is required — simpler.
Immediate database modification:
- Updates are written to the database as they happen (after logging, but possibly before commit).
- On recovery: undo transactions that started but did not commit (using ), and redo committed transactions (using ). Both undo and redo lists are needed.
(b) Checkpoint (3 marks)
A checkpoint is a point at which the DBMS periodically writes a record to the log after: (1) flushing all log records in main memory to stable storage, and (2) flushing all modified (dirty) buffer blocks to disk, where is the list of transactions active at that moment.
How it reduces recovery work: without checkpoints, recovery would have to scan the entire log from the beginning. With a checkpoint, the recovery manager need only consider transactions that were active at or started after the last checkpoint — everything committed and flushed before the checkpoint is already safely on disk and need not be redone. So recovery scans the log backward only up to the most recent checkpoint, drastically cutting the redo/undo workload and recovery time after a crash.
(a) What is a deadlock? State the conditions necessary for a deadlock to occur. [2]
(b) Compare deadlock prevention schemes wait-die and wound-wait, clearly explaining the action taken when an older transaction requests a lock held by a younger one and vice versa. [4]
(a) Deadlock and necessary conditions (2 marks)
A deadlock is a situation in which two or more transactions are each waiting indefinitely for a data item locked by another, so none can proceed (a circular wait of resources).
The four necessary (Coffman) conditions are:
- Mutual exclusion — a resource (lock) is held in a non-shareable mode.
- Hold and wait — a transaction holds at least one resource while waiting for others.
- No preemption — a held lock cannot be forcibly taken away; it is released only voluntarily.
- Circular wait — a cycle of transactions exists, each waiting for a resource held by the next.
(b) Wait-die vs Wound-wait (4 marks)
Both are deadlock-prevention schemes using transaction timestamps; an older transaction has a smaller timestamp. Suppose requests a lock held by .
| Scheme | older (smaller TS) requests lock held by younger | younger requests lock held by older |
|---|---|---|
| Wait-die (non-preemptive) | is allowed to wait. | dies (rolls back) and restarts later with its same timestamp. |
| Wound-wait (preemptive) | wounds — is rolled back (preempted), and gets the lock. | is allowed to wait. |
Key idea: Both ensure that waits only go in one direction of timestamp order, so no cycle (circular wait) can form ⇒ no deadlock. In wait-die an older transaction waits and a younger one is killed; in wound-wait an older transaction preempts (wounds) the younger and the younger one waits. In both, rolled-back transactions keep their original timestamp so they are not starved — eventually they become the oldest and complete.
(a) Explain the different levels at which database security must be enforced. [3]
(b) Describe the use of the SQL GRANT and REVOKE statements for authorization, and explain what the WITH GRANT OPTION clause does. [3]
(a) Levels of database security (3 marks)
Security must be enforced at several levels — a weakness at any one level compromises the whole system:
- Database / DBMS level (authorization): privileges on tables, views, rows — GRANT/REVOKE, roles, views to restrict visible data.
- Operating-system level: OS file permissions and accounts must protect the database files; an OS breach bypasses DBMS controls.
- Network level: data in transit must be protected (encryption, SSL/TLS) against eavesdropping and tampering.
- Physical level: the servers/disks must be in a secured location to prevent theft or destruction.
- Human level: users must be properly authenticated and trained to prevent social-engineering / careless disclosure of credentials.
(b) GRANT, REVOKE and WITH GRANT OPTION (3 marks)
GRANT gives privileges (SELECT, INSERT, UPDATE, DELETE, etc.) on database objects to users or roles:
GRANT SELECT, UPDATE ON Account TO clerk;
REVOKE withdraws previously granted privileges:
REVOKE UPDATE ON Account FROM clerk;
WITH GRANT OPTION lets the recipient further grant the same privilege to other users:
GRANT SELECT ON Account TO manager WITH GRANT OPTION;
Here manager can not only read Account but also pass SELECT on Account to others. If the original grant is later revoked, the privileges that were propagated from it are normally revoked in cascade as well (to remove the whole authorization chain).
Explain the JOIN operation in relational algebra. Differentiate between a natural join, a theta join and an outer join with a small example for each.
JOIN in relational algebra
A join combines tuples from two relations based on a condition relating their attributes; it is essentially a Cartesian product followed by a selection (and possibly a projection). Let Employee(eid, name, dept_id) and Department(dept_id, dname).
Theta join ()
Combines tuples that satisfy an arbitrary condition (using ). Common attributes are not merged.
Example: pairs each employee with its matching department row; the result keeps both dept_id columns. (When uses only =, it is an equi-join.)
Natural join ()
A special equi-join on all attributes with the same name, automatically keeping just one copy of each common attribute.
Joins on the common attribute dept_id and outputs (eid, name, dept_id, dname) with a single dept_id. Only employees whose dept_id matches a department appear (unmatched tuples are dropped).
Outer join
Extends the join to preserve unmatched tuples, padding missing attributes with NULL:
- Left outer join (): keeps all left-relation tuples; e.g. an employee with no department still appears with
dname = NULL. - Right outer join (): keeps all right-relation tuples; e.g. a department with no employees appears with employee fields NULL.
- Full outer join (): keeps unmatched tuples from both sides.
Summary: theta join = condition-based, keeps duplicate cols; natural join = auto-match same-named cols, one copy; outer join = like a join but retains non-matching rows with NULLs.
Explain the following extended-ER concepts with suitable examples: (a) weak entity set and its identifying relationship, (b) specialization and generalization, and (c) the aggregation abstraction.
(a) Weak entity set and identifying relationship (2 marks)
A weak entity set is one that does not have a primary key of its own; it cannot be uniquely identified by its own attributes alone. It depends on a strong (owner) entity for identification. Its only partial key is a discriminator (partial key), shown with a dashed underline; the full key = owner’s primary key + discriminator.
The relationship that links the weak entity to its owner is the identifying relationship (drawn as a double diamond), and the weak entity has total participation in it.
Example: DEPENDENT(name, relationship) of an EMPLOYEE. A dependent’s name is not unique across all employees; it is identified by (employee.emp_id, dependent.name). The relationship HAS_DEPENDENT is the identifying relationship.
(b) Specialization and generalization (2 marks)
- Specialization (top-down): taking a higher-level (super) entity set and defining sub-groups (subclasses) that have additional, distinguishing attributes. Example:
EMPLOYEEis specialized intoSECRETARY,ENGINEER,MANAGER, each with extra attributes. Shown with an ISA triangle pointing to the superclass. - Generalization (bottom-up): the reverse — recognizing common attributes among several entity sets and combining them into a higher-level entity set. Example:
CARandTRUCKare generalized intoVEHICLEwith shared attributes (reg_no,model).
Subclasses inherit the attributes and relationships of the superclass. Constraints include disjoint vs overlapping and total vs partial participation.
(c) Aggregation (2 marks)
Aggregation is an abstraction that treats a relationship (together with its participating entities) as a single higher-level entity, so that this relationship can itself participate in another relationship. It overcomes the ER-model limitation that a relationship cannot directly link to another relationship.
Example: the relationship WORKS_ON between EMPLOYEE and PROJECT is aggregated into one unit; a MANAGER entity then has a MANAGES relationship to this aggregated WORKS_ON, expressing that a manager supervises the assignment of an employee to a project (not just an employee or a project separately). It is drawn by enclosing the WORKS_ON relationship and its entities in a box and connecting MANAGES to that box.
Frequently asked questions
- Where can I find the BE Computer Engineering (IOE, TU) Database Management System (IOE, CT 652) question paper 2078?
- The full BE Computer Engineering (IOE, TU) Database Management System (IOE, CT 652) 2078 (regular) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.
- Does the Database Management System (IOE, CT 652) 2078 paper come with solutions?
- Yes. Every question on this Database Management System (IOE, CT 652) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.
- How many marks is the BE Computer Engineering (IOE, TU) Database Management System (IOE, CT 652) 2078 paper?
- The BE Computer Engineering (IOE, TU) Database Management System (IOE, CT 652) 2078 paper carries 80 full marks and is meant to be completed in 180 minutes, across 12 questions.
- Is practising this Database Management System (IOE, CT 652) past paper free?
- Yes — reading and attempting this Database Management System (IOE, CT 652) past paper on Kekkei is completely free.