Browse papers
A

Section A: Long Answer Questions

Attempt all / any as specified.

4 questions
1long12 marks

A university wants to computerize its academic records. The system must track students (with roll number, name, semester and program), courses (with course code, title and credit hours), teachers (with employee id and name) and departments. A student enrolls in many courses and obtains a grade in each; a course is taught by one or more teachers; every teacher and every course belongs to exactly one department.

(a) Draw a complete Entity-Relationship (ER) diagram for the above system, clearly showing entity sets, attributes, primary keys, relationship sets and the cardinality/participation constraints. [8]

(b) Convert your ER diagram into a set of relational schemas, underlining the primary keys and indicating the foreign keys. [4]

(a) Entity-Relationship Diagram (8 marks)

Since an ER diagram cannot be drawn here, it is described precisely below; the components map directly to standard ER notation (rectangles = entity sets, ovals = attributes, diamonds = relationship sets, underlined attribute = primary key).

Entity sets and attributes (primary key underlined):

  • STUDENT( roll_no, name, semester, program )
  • COURSE( course_code, title, credit_hours )
  • TEACHER( emp_id, name )
  • DEPARTMENT( dept_id, dept_name )

Relationship sets and constraints:

RelationshipConnectsCardinalityDescriptive attribute
ENROLLSSTUDENT — COURSEmany-to-many (M:N)grade
TEACHESTEACHER — COURSEmany-to-many (M:N)
OFFERS (course belongs to dept)DEPARTMENT — COURSEone-to-many (1:N)
WORKS_IN (teacher belongs to dept)DEPARTMENT — TEACHERone-to-many (1:N)

Cardinality / participation:

  • A student enrolls in many courses and a course is taken by many students, with grade as an attribute on the ENROLLS relationship (drawn as an oval attached to the diamond).
  • A course is taught by one or more teachers and a teacher may teach many courses ⇒ M:N TEACHES.
  • Every course belongs to exactly one department and every teacher belongs to exactly one department ⇒ the COURSE-side and TEACHER-side of OFFERS/WORKS_IN have total participation (double line) with cardinality 1 on the DEPARTMENT side and many on the course/teacher side.
[STUDENT]==(ENROLLS:grade)==[COURSE]==(TEACHES)==[TEACHER]
                                |                    |
                            (OFFERS 1:N)        (WORKS_IN 1:N)
                                |                    |
                                +====[DEPARTMENT]=====+

(b) Relational schemas (4 marks)

Entity sets become tables; the M:N relationships ENROLLS and TEACHES become separate tables; the 1:N relationships are captured by a foreign key on the many side.

  • STUDENT( roll_no, name, semester, program )
  • DEPARTMENT( dept_id, dept_name )
  • COURSE( course_code, title, credit_hours, dept_id )  — dept_id FK → DEPARTMENT
  • TEACHER( emp_id, name, dept_id )  — dept_id FK → DEPARTMENT
  • ENROLLS( roll_no, course_code, grade )  — roll_no FK → STUDENT, course_code FK → COURSE
  • TEACHES( emp_id, course_code )  — emp_id FK → TEACHER, course_code FK → COURSE

(Underlined = primary key; italic / noted = foreign key.)

er-modelingdatabase-design
2long12 marks

Consider the relation R(A, B, C, D, E) with the set of functional dependencies

F={ABC,  CDE,  BD,  EA}F = \{\,A \rightarrow BC,\; CD \rightarrow E,\; B \rightarrow D,\; E \rightarrow A\,\}

(a) Compute the closure (A)+(A)^+ and (B)+(B)^+ and determine all the candidate keys of R. [5]

(b) Explain the difference between 3NF and BCNF. Determine the highest normal form satisfied by R and justify your answer. [4]

(c) If R is not in BCNF, decompose it into BCNF relations and state whether your decomposition is lossless and dependency-preserving. [3]

(a) Closures and candidate keys (5 marks)

Given F={ABC,  CDE,  BD,  EA}F=\{A\rightarrow BC,\; CD\rightarrow E,\; B\rightarrow D,\; E\rightarrow A\}.

Closure (A)+(A)^+:

  • Start {A}\{A\}. By ABCA\rightarrow BC: add B,C{A,B,C}B,C \Rightarrow \{A,B,C\}.
  • By BDB\rightarrow D: add D{A,B,C,D}D \Rightarrow \{A,B,C,D\}.
  • By CDECD\rightarrow E: C,DC,D present, add E{A,B,C,D,E}E \Rightarrow \{A,B,C,D,E\}.
(A)+={A,B,C,D,E}=R\boxed{(A)^+ = \{A,B,C,D,E\} = R}

so A is a (super)key.

Closure (B)+(B)^+:

  • Start {B}\{B\}. By BDB\rightarrow D: add D{B,D}D \Rightarrow \{B,D\}.
  • No other FD’s left side is contained in {B,D}\{B,D\} (CDCD needs CC).
(B)+={B,D}\boxed{(B)^+ = \{B,D\}}

so B is not a key.

Finding all candidate keys. Note the chain ABCA\rightarrow BC, EAE\rightarrow A, so EE also determines everything: (E)+=EABCDE=R(E)^+ = E\rightarrow A\rightarrow BCD\rightarrow E = R. Check the others:

  • A+=RA^+ = R ✔, E+=RE^+ = R ✔.
  • (BC)+(BC)^+: BDB\rightarrow D gives {B,C,D}\{B,C,D\}, then CDECD\rightarrow E gives EE, then EAE\rightarrow A gives ARA \Rightarrow R ✔ so BC is a key.
  • (CD)+(CD)^+: CDECD\rightarrow E, EAE\rightarrow A, ABCRA\rightarrow BC \Rightarrow R ✔ so CD is a key.

So the candidate keys are: AA, EE, BCBC, CDCD. (Each is minimal — no proper subset is a key.)

Prime attributes = {A,B,C,D,E}\{A, B, C, D, E\} — every attribute appears in some candidate key, so all attributes are prime.

(b) 3NF vs BCNF and highest normal form of R (4 marks)

3NF: for every non-trivial FD XYX\rightarrow Y, either XX is a superkey or every attribute in YY is a prime attribute.

BCNF: for every non-trivial FD XYX\rightarrow Y, XX must be a superkey (stricter — the prime-attribute exception is removed).

Highest normal form of R. Check each FD:

  • ABCA\rightarrow BC: AA is a key ✔
  • CDECD\rightarrow E: CDCD is a key ✔
  • EAE\rightarrow A: EE is a key ✔
  • BDB\rightarrow D: BB is not a superkey. So BCNF is violated.
    • But DD is a prime attribute (in candidate key CDCD), so the 3NF condition holds.

Hence R is in 3NF but not in BCNF; the violating dependency is BDB\rightarrow D.

(c) BCNF decomposition (3 marks)

The BCNF-violating FD is BDB\rightarrow D. Compute (B)+={B,D}(B)^+ = \{B,D\}. Decompose on it:

  • R1(B,D)R_1(\underline{B}, D) with FD BDB\rightarrow D — key BB, in BCNF.
  • R2(A,B,C,E)R_2(A,B,C,E) = RR minus DD — its FDs (projected): ABCA\rightarrow BC, EAE\rightarrow A, and BCEBC\rightarrow E (since BCBC determines EE). Candidate keys of R2R_2 are A,E,BCA, E, BC; every FD’s LHS is a key ⇒ R2R_2 is in BCNF.

Lossless? Yes — the common attribute BB is a key of R1R_1 (BBDB\rightarrow BD), satisfying the lossless-join condition R1R2R1R_1\cap R_2 \rightarrow R_1.

Dependency-preserving? R1R_1 preserves BDB\rightarrow D; R2R_2 preserves ABCA\rightarrow BC, EAE\rightarrow A. The FD CDECD\rightarrow E has its LHS split across the two relations and is not directly preserved, so this BCNF decomposition is lossless but not (fully) dependency-preserving — the classic trade-off, since not every schema admits a decomposition that is simultaneously BCNF and dependency-preserving.

normalizationfunctional-dependenciesbcnf
3long12 marks

(a) State and explain the ACID properties of a transaction with a suitable example for each. [4]

(b) Given the schedule below for transactions T1T_1 and T2T_2, draw the precedence (serialization) graph and determine whether the schedule is conflict-serializable. If it is, give an equivalent serial schedule.

TimeT1T_1T2T_2
1R(A)
2R(A)
3W(A)
4W(A)
5R(B)
6W(B)

[4]

(c) Explain how two-phase locking (2PL) guarantees serializability and discuss why basic 2PL can still lead to deadlock and cascading rollback. [4]

(a) ACID properties (4 marks)

PropertyMeaningExample
AtomicityA transaction is all-or-nothing; either every operation commits or none does.A funds transfer that debits A and credits B must do both or neither — if the credit fails, the debit is rolled back.
ConsistencyA transaction takes the database from one valid state to another, preserving all integrity constraints.After the transfer, the total money A+B is unchanged.
IsolationConcurrent transactions do not interfere; the result is as if they ran serially.Two transfers running at once do not read each other’s uncommitted balances.
DurabilityOnce committed, changes survive crashes (persisted via logs/redo).After the transfer commits, a power failure cannot lose it.

(b) Conflict-serializability of the schedule (4 marks)

Operations on A: T1T_1:R(A) @1, T2T_2:R(A) @2, T1T_1:W(A) @3, T2T_2:W(A) @4. Operations on B: only T1T_1.

Conflicting pairs (different transactions, same data item, at least one write):

  • T2T_2:R(A)@2 before T1T_1:W(A)@3 ⇒ edge T2T1T_2 \rightarrow T_1 (read-before-write).
  • T1T_1:R(A)@1 before T2T_2:W(A)@4 ⇒ edge T1T2T_1 \rightarrow T_2.
  • T1T_1:W(A)@3 before T2T_2:W(A)@4 ⇒ edge T1T2T_1 \rightarrow T_2.

Precedence graph: nodes {T1,T2}\{T_1,T_2\} with edges T1T2T_1\rightarrow T_2 and T2T1T_2\rightarrow T_1.

T1  <==>  T2     (cycle)

The graph contains a cycle T1T2T1T_1\rightarrow T_2\rightarrow T_1. Therefore the schedule is NOT conflict-serializable, and no equivalent serial schedule exists.

(c) 2PL, serializability, deadlock and cascading rollback (4 marks)

How 2PL guarantees serializability. In two-phase locking each transaction has a growing phase (it may acquire locks but not release any) followed by a shrinking phase (it may release locks but not acquire any). Because every transaction obtains all its locks before releasing any, the order of lock points defines a serialization order; one can prove the resulting schedule is conflict-serializable (its precedence graph is acyclic).

Why deadlock still occurs. 2PL controls the order of locking but not the requests. Two transactions can each hold a lock the other needs: T1T_1 holds XX and waits for YY, while T2T_2 holds YY and waits for XX — a circular wait ⇒ deadlock. 2PL must be combined with deadlock detection (wait-for graph) or prevention (wait-die/wound-wait).

Why cascading rollback occurs. Basic 2PL may release locks during the shrinking phase before commit. Another transaction can then read this uncommitted (dirty) data; if the first transaction later aborts, every transaction that read its data must also abort — a cascading rollback. This is fixed by strict 2PL, which holds all exclusive (write) locks until commit/abort, ensuring recoverable, cascadeless schedules.

transaction-managementconcurrency-controlserializability
4long10 marks

Consider the following relational schema for a banking database (primary keys underlined):

  • Customer(cust_id, name, city)
  • Account(acc_no, branch, balance)
  • Depositor(cust_id, acc_no)

Write queries for the following:

(a) In relational algebra, find the names of all customers who live in 'Kathmandu' and hold an account with a balance greater than 50000. [3]

(b) In SQL, list each branch with the total balance of all accounts in that branch, showing only branches whose total balance exceeds 1,000,000. [3]

(c) In SQL, find the names of customers who hold more than one account. [4]

(a) Relational algebra — Kathmandu customers with balance > 50000 (3 marks)

Join the three relations, select on city and balance, then project the name.

πname(  σcity=Kathmandubalance>50000(  CustomerDepositorAccount  ))\pi_{name}\Big(\;\sigma_{city='Kathmandu'\,\wedge\,balance>50000}\big(\;Customer \bowtie Depositor \bowtie Account\;\big)\Big)

(The natural joins match on cust_id between Customer and Depositor, and on acc_no between Depositor and Account.)

(b) SQL — branches whose total balance exceeds 1,000,000 (3 marks)

SELECT branch, SUM(balance) AS total_balance
FROM   Account
GROUP BY branch
HAVING SUM(balance) > 1000000;

The GROUP BY aggregates per branch and HAVING filters groups after aggregation (a WHERE clause cannot be used on an aggregate).

(c) SQL — customers holding more than one account (4 marks)

SELECT c.name
FROM   Customer c
JOIN   Depositor d ON c.cust_id = d.cust_id
GROUP BY c.cust_id, c.name
HAVING COUNT(d.acc_no) > 1;

We group the depositor rows by customer and keep only those whose account count exceeds one. (Counting acc_no per cust_id and testing > 1 is the key idea.)

sql-queriesrelational-algebra
B

Section B: Short Answer Questions

Attempt all / any as specified.

8 questions
5short6 marks

Define super key, candidate key, primary key and foreign key. Using a suitable example, explain how a foreign key enforces referential integrity, and describe what happens on a DELETE with the ON DELETE CASCADE option.

Definitions

  • Super key: any set of one or more attributes that uniquely identifies each tuple in a relation. (May contain extra attributes.) E.g. in STUDENT(roll_no, name, …), {roll_no} and {roll_no, name} are both super keys.
  • Candidate key: a minimal super key — no proper subset is itself a super key. E.g. {roll_no}.
  • Primary key: the one candidate key chosen by the designer to identify tuples; it cannot be NULL and must be unique.
  • Foreign key: an attribute (or set) in one relation that references the primary key of another (or the same) relation, linking the two tables.

Referential integrity (example)

Given ACCOUNT(acc_no, branch) and DEPOSITOR(cust_id, acc_no) where DEPOSITOR.acc_no is a foreign key referencing ACCOUNT.acc_no:

Referential integrity requires that every value of DEPOSITOR.acc_no must already exist as an acc_no in ACCOUNT (or be NULL). This prevents a depositor from pointing to a non-existent account — i.e. no “dangling” reference.

ON DELETE CASCADE

When a referenced row is deleted, the action on the child rows is governed by the referential action. With ON DELETE CASCADE, deleting a parent row automatically deletes all child rows that reference it. For example, deleting an ACCOUNT row also deletes every DEPOSITOR row holding that acc_no, keeping the database referentially consistent. (Other options are ON DELETE SET NULL and ON DELETE RESTRICT/NO ACTION, which block the delete.)

relational-modelkeys
6short6 marks

(a) Differentiate between a primary (clustered) index and a secondary (non-clustered) index. [3]

(b) Explain why a B+ tree is preferred over a B tree for database indexing, and state one advantage of B+ tree indexing over a simple ordered index file. [3]

(a) Primary (clustered) vs Secondary (non-clustered) index (3 marks)

AspectPrimary / Clustered indexSecondary / Non-clustered index
OrderingBuilt on the attribute by which the data file is physically sorted (usually the key).Built on a non-ordering field; data file is not sorted on it.
Number per tableAt most one (file can be sorted on only one key).Many allowed.
DensityUsually sparse (one entry per data block).Must be dense (one entry per record/search-key value).
AccessFast range and equality search; records with adjacent keys are physically together.Equality search via pointers; range scans are slower (records scattered).

(b) B+ tree vs B tree, and vs ordered index file (3 marks)

Why B+ tree is preferred over B tree:

  • In a B+ tree all data pointers reside in the leaf nodes, while internal nodes store only keys; in a B tree, records/pointers may sit at internal nodes too. Keeping internal nodes key-only gives a higher fan-out, hence a shorter, shallower tree and fewer disk I/Os.
  • B+ tree leaves are linked sequentially, so range queries and ordered traversal are efficient (follow leaf pointers); a B tree must do in-order traversal up and down the tree.
  • Search cost is uniform — every key search reaches a leaf — giving predictable performance.

Advantage of B+ tree over a simple ordered (sorted) index file: a B+ tree is a dynamic, self-balancing structure; insertions and deletions are handled by node split/merge in O(logn)O(\log n) time, so performance does not degrade as the file grows. An ordered index file requires costly reorganization (shifting/overflow blocks) on every insert, degrading over time.

indexingb-treefile-organization
7short6 marks

Explain static hashing and discuss the problem of bucket overflow. How does dynamic (extendible) hashing overcome the limitations of static hashing? Illustrate with the role of the directory and the global/local depth.

Static hashing

In static hashing a fixed number of buckets BB is allocated, and a record with search-key KK is placed in bucket h(K)modBh(K) \bmod B using a hash function hh. Lookup, insert and delete on the key require computing the hash and accessing a single bucket — ideally O(1) disk access. The number of buckets is decided in advance and does not change as the file grows or shrinks.

Bucket overflow problem

Because BB is fixed, problems arise as the data set changes:

  • Bucket overflow occurs when more records hash to a bucket than it can hold. Causes: insufficient buckets, an uneven (skewed) hash distribution, or steady growth of the file.
  • Overflow is handled by overflow chaining (linking extra overflow buckets to the full bucket). Long overflow chains turn the O(1) lookup into a costly chain scan, degrading performance.
  • If the file shrinks, many buckets become nearly empty, wasting space. The only remedy is a full rehash/reorganization, which is expensive.

Dynamic (extendible) hashing

Extendible hashing lets the bucket structure grow and shrink on demand, eliminating periodic full reorganization.

  • It uses a directory: an array of 2d2^{d} pointers, where dd is the global depth. The first dd bits of the hash value h(K)h(K) index into the directory, which points to the bucket holding the record.
  • Each bucket has a local depth ddd'\le d, the number of leading hash bits that all records in that bucket share.
  • On overflow of a bucket with local depth dd':
    • If d<dd' < d: split the bucket, redistribute its records by their (d+1)(d'+1)-th bit, set the two new buckets’ local depth to d+1d'+1, and update the affected directory pointers. No directory growth needed.
    • If d=dd' = d: double the directory (increment global depth dd+1d\to d+1), then split the bucket as above.

This way only the overflowing bucket is split — most of the file is untouched — so growth is incremental and graceful, avoiding both the bucket-overflow chains and the wasteful full rehashing of static hashing.

hashingindexing
8short6 marks

(a) Explain the log-based recovery technique using deferred and immediate database modification. [3]

(b) What is a checkpoint and how does it reduce the work done during recovery after a system crash? [3]

(a) Log-based recovery: deferred vs immediate modification (3 marks)

A log is a sequence of records on stable storage describing every update, e.g. Ti,X,Vold,Vnew\langle T_i, X, V_{old}, V_{new}\rangle, written before the change is applied (write-ahead logging).

Deferred database modification:

  • All writes of a transaction are recorded only in the log; the actual database is not modified until the transaction commits (after Ti,commit\langle T_i, commit\rangle is logged).
  • On recovery: redo all transactions that have both start and commit records; transactions without a commit need no undo (database was never changed). Only a redo is required — simpler.

Immediate database modification:

  • Updates are written to the database as they happen (after logging, but possibly before commit).
  • On recovery: undo transactions that started but did not commit (using VoldV_{old}), and redo committed transactions (using VnewV_{new}). Both undo and redo lists are needed.

(b) Checkpoint (3 marks)

A checkpoint is a point at which the DBMS periodically writes a checkpoint,L\langle checkpoint, L\rangle record to the log after: (1) flushing all log records in main memory to stable storage, and (2) flushing all modified (dirty) buffer blocks to disk, where LL is the list of transactions active at that moment.

How it reduces recovery work: without checkpoints, recovery would have to scan the entire log from the beginning. With a checkpoint, the recovery manager need only consider transactions that were active at or started after the last checkpoint — everything committed and flushed before the checkpoint is already safely on disk and need not be redone. So recovery scans the log backward only up to the most recent checkpoint, drastically cutting the redo/undo workload and recovery time after a crash.

recoveryloggingcheckpointing
9short6 marks

(a) What is a deadlock? State the conditions necessary for a deadlock to occur. [2]

(b) Compare deadlock prevention schemes wait-die and wound-wait, clearly explaining the action taken when an older transaction requests a lock held by a younger one and vice versa. [4]

(a) Deadlock and necessary conditions (2 marks)

A deadlock is a situation in which two or more transactions are each waiting indefinitely for a data item locked by another, so none can proceed (a circular wait of resources).

The four necessary (Coffman) conditions are:

  1. Mutual exclusion — a resource (lock) is held in a non-shareable mode.
  2. Hold and wait — a transaction holds at least one resource while waiting for others.
  3. No preemption — a held lock cannot be forcibly taken away; it is released only voluntarily.
  4. Circular wait — a cycle of transactions exists, each waiting for a resource held by the next.

(b) Wait-die vs Wound-wait (4 marks)

Both are deadlock-prevention schemes using transaction timestamps; an older transaction has a smaller timestamp. Suppose TiT_i requests a lock held by TjT_j.

SchemeTiT_i older (smaller TS) requests lock held by younger TjT_jTiT_i younger requests lock held by older TjT_j
Wait-die (non-preemptive)TiT_i is allowed to wait.TiT_i dies (rolls back) and restarts later with its same timestamp.
Wound-wait (preemptive)TiT_i wounds TjT_jTjT_j is rolled back (preempted), and TiT_i gets the lock.TiT_i is allowed to wait.

Key idea: Both ensure that waits only go in one direction of timestamp order, so no cycle (circular wait) can form ⇒ no deadlock. In wait-die an older transaction waits and a younger one is killed; in wound-wait an older transaction preempts (wounds) the younger and the younger one waits. In both, rolled-back transactions keep their original timestamp so they are not starved — eventually they become the oldest and complete.

concurrency-controldeadlock
10short6 marks

(a) Explain the different levels at which database security must be enforced. [3]

(b) Describe the use of the SQL GRANT and REVOKE statements for authorization, and explain what the WITH GRANT OPTION clause does. [3]

(a) Levels of database security (3 marks)

Security must be enforced at several levels — a weakness at any one level compromises the whole system:

  1. Database / DBMS level (authorization): privileges on tables, views, rows — GRANT/REVOKE, roles, views to restrict visible data.
  2. Operating-system level: OS file permissions and accounts must protect the database files; an OS breach bypasses DBMS controls.
  3. Network level: data in transit must be protected (encryption, SSL/TLS) against eavesdropping and tampering.
  4. Physical level: the servers/disks must be in a secured location to prevent theft or destruction.
  5. Human level: users must be properly authenticated and trained to prevent social-engineering / careless disclosure of credentials.

(b) GRANT, REVOKE and WITH GRANT OPTION (3 marks)

GRANT gives privileges (SELECT, INSERT, UPDATE, DELETE, etc.) on database objects to users or roles:

GRANT SELECT, UPDATE ON Account TO clerk;

REVOKE withdraws previously granted privileges:

REVOKE UPDATE ON Account FROM clerk;

WITH GRANT OPTION lets the recipient further grant the same privilege to other users:

GRANT SELECT ON Account TO manager WITH GRANT OPTION;

Here manager can not only read Account but also pass SELECT on Account to others. If the original grant is later revoked, the privileges that were propagated from it are normally revoked in cascade as well (to remove the whole authorization chain).

database-securityauthorization
11short4 marks

Explain the JOIN operation in relational algebra. Differentiate between a natural join, a theta join and an outer join with a small example for each.

JOIN in relational algebra

A join combines tuples from two relations based on a condition relating their attributes; it is essentially a Cartesian product followed by a selection (and possibly a projection). Let Employee(eid, name, dept_id) and Department(dept_id, dname).

Theta join (θ\bowtie_\theta)

Combines tuples that satisfy an arbitrary condition θ\theta (using =,<,>,,,=,<,>,\le,\ge,\ne). Common attributes are not merged.

EmployeeEmployee.dept_id=Department.dept_idDepartmentEmployee \bowtie_{Employee.dept\_id = Department.dept\_id} Department

Example: pairs each employee with its matching department row; the result keeps both dept_id columns. (When θ\theta uses only =, it is an equi-join.)

Natural join (\bowtie)

A special equi-join on all attributes with the same name, automatically keeping just one copy of each common attribute.

EmployeeDepartmentEmployee \bowtie Department

Joins on the common attribute dept_id and outputs (eid, name, dept_id, dname) with a single dept_id. Only employees whose dept_id matches a department appear (unmatched tuples are dropped).

Outer join

Extends the join to preserve unmatched tuples, padding missing attributes with NULL:

  • Left outer join (): keeps all left-relation tuples; e.g. an employee with no department still appears with dname = NULL.
  • Right outer join (): keeps all right-relation tuples; e.g. a department with no employees appears with employee fields NULL.
  • Full outer join (): keeps unmatched tuples from both sides.

Summary: theta join = condition-based, keeps duplicate cols; natural join = auto-match same-named cols, one copy; outer join = like a join but retains non-matching rows with NULLs.

relational-algebra
12short6 marks

Explain the following extended-ER concepts with suitable examples: (a) weak entity set and its identifying relationship, (b) specialization and generalization, and (c) the aggregation abstraction.

(a) Weak entity set and identifying relationship (2 marks)

A weak entity set is one that does not have a primary key of its own; it cannot be uniquely identified by its own attributes alone. It depends on a strong (owner) entity for identification. Its only partial key is a discriminator (partial key), shown with a dashed underline; the full key = owner’s primary key + discriminator.

The relationship that links the weak entity to its owner is the identifying relationship (drawn as a double diamond), and the weak entity has total participation in it.

Example: DEPENDENT(name, relationship) of an EMPLOYEE. A dependent’s name is not unique across all employees; it is identified by (employee.emp_id, dependent.name). The relationship HAS_DEPENDENT is the identifying relationship.

(b) Specialization and generalization (2 marks)

  • Specialization (top-down): taking a higher-level (super) entity set and defining sub-groups (subclasses) that have additional, distinguishing attributes. Example: EMPLOYEE is specialized into SECRETARY, ENGINEER, MANAGER, each with extra attributes. Shown with an ISA triangle pointing to the superclass.
  • Generalization (bottom-up): the reverse — recognizing common attributes among several entity sets and combining them into a higher-level entity set. Example: CAR and TRUCK are generalized into VEHICLE with shared attributes (reg_no, model).

Subclasses inherit the attributes and relationships of the superclass. Constraints include disjoint vs overlapping and total vs partial participation.

(c) Aggregation (2 marks)

Aggregation is an abstraction that treats a relationship (together with its participating entities) as a single higher-level entity, so that this relationship can itself participate in another relationship. It overcomes the ER-model limitation that a relationship cannot directly link to another relationship.

Example: the relationship WORKS_ON between EMPLOYEE and PROJECT is aggregated into one unit; a MANAGER entity then has a MANAGES relationship to this aggregated WORKS_ON, expressing that a manager supervises the assignment of an employee to a project (not just an employee or a project separately). It is drawn by enclosing the WORKS_ON relationship and its entities in a box and connecting MANAGES to that box.

er-modelingspecialization

Frequently asked questions

Where can I find the BE Computer Engineering (IOE, TU) Database Management System (IOE, CT 652) question paper 2078?
The full BE Computer Engineering (IOE, TU) Database Management System (IOE, CT 652) 2078 (regular) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.
Does the Database Management System (IOE, CT 652) 2078 paper come with solutions?
Yes. Every question on this Database Management System (IOE, CT 652) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.
How many marks is the BE Computer Engineering (IOE, TU) Database Management System (IOE, CT 652) 2078 paper?
The BE Computer Engineering (IOE, TU) Database Management System (IOE, CT 652) 2078 paper carries 80 full marks and is meant to be completed in 180 minutes, across 12 questions.
Is practising this Database Management System (IOE, CT 652) past paper free?
Yes — reading and attempting this Database Management System (IOE, CT 652) past paper on Kekkei is completely free.