Browse papers
A

Section A: Long Answer Questions

Attempt any TWO questions.

3 questions·10 marks each
1long10 marks

Define the complexity classes P and NP. Explain NP-completeness and the concept of polynomial-time reduction with the example of the Satisfiability (SAT) problem.

Complexity Classes P and NP

Class P (Polynomial time): P is the set of all decision problems (languages) that can be decided by a deterministic Turing machine in time bounded by a polynomial in the input size nn. Formally, P=k0TIME(nk)P = \bigcup_{k\ge 0} \text{TIME}(n^k). These are the problems regarded as efficiently solvable. Examples: shortest path, sorting, primality testing, 2-SAT.

Class NP (Nondeterministic Polynomial time): NP is the set of decision problems that can be decided by a nondeterministic Turing machine in polynomial time, or equivalently the set of problems whose YES-instances have a certificate (proof) that can be verified by a deterministic TM in polynomial time. Examples: SAT, Hamiltonian cycle, vertex cover, subset-sum.

Clearly PNPP \subseteq NP (a solver can ignore the certificate). Whether P=NPP = NP is the famous open problem.

Polynomial-Time Reduction

A language AA is polynomial-time (many-one / Karp) reducible to BB, written ApBA \le_p B, if there is a function ff computable in polynomial time such that for every string ww:

wA    f(w)B.w \in A \iff f(w) \in B.

If ApBA \le_p B and BPB \in P, then APA \in P. Reductions let us prove a new problem is hard by transforming a known hard problem into it.

NP-Completeness

A language BB is NP-complete if:

  1. BNPB \in NP, and
  2. NP-hardness: every language ANPA \in NP satisfies ApBA \le_p B.

NP-complete problems are the hardest problems in NP: if any one of them has a polynomial-time algorithm, then P=NPP = NP. To prove a new problem CC is NP-complete we show CNPC \in NP and reduce a known NP-complete problem to CC (knownpC\text{known} \le_p C).

SAT and the Cook–Levin Theorem

Boolean Satisfiability (SAT): Given a Boolean formula ϕ\phi over variables x1,,xnx_1,\dots,x_n, decide whether there is a truth assignment making ϕ\phi true.

  • SAT \in NP: a satisfying assignment is a certificate; substituting it and evaluating ϕ\phi takes polynomial time.
  • Cook–Levin Theorem (1971): SAT is NP-complete. The proof shows that for any ANPA \in NP decided by a polynomial-time NTM MM, the entire accepting computation (a tableau of configurations) can be encoded by a Boolean formula ϕw\phi_w of polynomial size that is satisfiable iff MM accepts ww. Hence ApSATA \le_p \text{SAT} for all ANPA \in NP.

Thus SAT was the first proven NP-complete problem; all other NP-completeness proofs (3-SAT, clique, vertex cover, etc.) follow by chains of polynomial reductions starting from SAT.

complexitynp-completeness
2long10 marks

Minimize the given DFA using the partitioning (table-filling) method and explain each step of the minimization algorithm with a suitable example.

DFA Minimization by the Table-Filling (Partitioning) Method

Goal: Given a DFA M=(Q,Σ,δ,q0,F)M = (Q, \Sigma, \delta, q_0, F), produce an equivalent DFA with the minimum number of states by merging indistinguishable states.

Two states p,qp, q are distinguishable if there exists a string ww such that exactly one of δ^(p,w), δ^(q,w)\hat\delta(p,w),\ \hat\delta(q,w) is final.

Algorithm (steps)

  1. Remove unreachable states from q0q_0.
  2. Build a table of all unordered pairs {p,q}\{p,q\}.
  3. Basis: Mark {p,q}\{p,q\} as distinguishable if one is final and the other is non-final.
  4. Induction: For every unmarked pair {p,q}\{p,q\} and each symbol aa, if {δ(p,a),δ(q,a)}\{\delta(p,a),\delta(q,a)\} is already marked, mark {p,q}\{p,q\}.
  5. Repeat step 4 until no new pair gets marked.
  6. Merge every pair left unmarked — these are equivalent states — into a single state. Define transitions on the merged blocks; q0q_0's block is the start, blocks containing final states are final.

Worked Example

Consider DFA with states A,B,C,D,EA,B,C,D,E, alphabet {0,1}\{0,1\}, start AA, final {E}\{E\}:

State01
→ABC
BBD
CBC
DBE
EBC

Basis: EE is final, all others non-final, so every pair containing EE is marked: {A,E},{B,E},{C,E},{D,E}\{A,E\},\{B,E\},\{C,E\},\{D,E\}.

Induction: Check remaining pairs.

  • {A,C}\{A,C\}: 0(B,B)0\to(B,B) same; 1(C,C)1\to(C,C) same → not marked.
  • {B,D}\{B,D\}: 1(D,E)1\to(D,E) and {D,E}\{D,E\} is marked → mark {B,D}\{B,D\}.
  • {A,B}\{A,B\}: 1(C,D)1\to(C,D); {C,D}\{C,D\}? 1(C,E)1\to(C,E) marked ⇒ {C,D}\{C,D\} marked ⇒ {A,B}\{A,B\} marked.
  • Continuing, the only pair that stays unmarked is {A,C}\{A,C\}.

Merge: AA and CC are equivalent → merge into state ACAC. Minimized DFA states: {AC,B,D,E}\{AC, B, D, E\}:

State01
→ACBAC
BBD
DBE
*EBAC

This 4-state DFA is the minimal DFA equivalent to the original 5-state DFA.

Note: By the Myhill–Nerode theorem, the minimal DFA is unique up to renaming of states.

dfa-minimizationfinite-automata
3long10 marks

State and prove the Pumping Lemma for context-free languages. Use it to show that L = { a^i b^j c^k | i = j = k } is not context-free.

Pumping Lemma for Context-Free Languages

Statement: If LL is a context-free language, then there exists a constant p1p \ge 1 (the pumping length) such that every string zLz \in L with zp|z| \ge p can be written as

z=uvwxyz = uvwxy

satisfying:

  1. vwxp|vwx| \le p,
  2. vx1|vx| \ge 1 (i.e. vv and xx are not both empty), and
  3. for all i0i \ge 0,  uviwxiyL\ uv^iwx^iy \in L.

Proof Sketch

Let L=L(G)L = L(G) for a CFG GG in Chomsky Normal Form with mm variables. Take p=2mp = 2^{m}. Any string zLz\in L with zp|z|\ge p has a parse tree of height >m> m. Along the longest root-to-leaf path there are more than mm variable nodes, so by the pigeonhole principle some variable AA repeats. Let the upper AA derive vwxvwx and the lower AA derive ww, with AvAxvwxA \Rightarrow^* vAx \Rightarrow^* vwx. Then:

  • Replacing the lower subtree by the upper one ii times gives AviwxiA \Rightarrow^* v^i w x^i, hence zi=uviwxiyLz_i = uv^iwx^iy \in L for all i0i\ge0 (condition 3).
  • Because CNF rules branch into two, the repetition can be found within the bottom portion so vwxp|vwx|\le p (condition 1).
  • The rule AvAxA \Rightarrow vAx uses a binary production, so vxvx produces at least one symbol, giving vx1|vx|\ge 1 (condition 2).

Showing L={aibjcki=j=k}L = \{a^i b^j c^k \mid i = j = k\} is NOT Context-Free

Assume for contradiction LL is CFL with pumping length pp. Choose

z=apbpcpL,z=3pp.z = a^p b^p c^p \in L,\quad |z| = 3p \ge p.

Write z=uvwxyz = uvwxy with vwxp|vwx|\le p and vx1|vx|\ge 1.

Since vwxp|vwx| \le p, the substring vwxvwx spans at most two of the three symbol blocks (it cannot contain aa's and cc's simultaneously, because they are separated by pp b's).

Pump with i=2i = 2, giving z2=uv2wx2yz_2 = uv^2wx^2y:

  • If vxvx contains only aa's and bb's (or any mix avoiding cc), then z2z_2 has more aa's and/or bb's but the same number of cc's, so the counts are no longer equal.
  • Symmetrically, if vxvx avoids aa's, the number of aa's stays pp while bb/cc counts increase.

In every case at least one symbol's count differs from the others, so z2Lz_2 \notin L. This contradicts the pumping lemma.

Conclusion: No such pp exists, therefore L={aibjcki=j=k}L = \{a^ib^jc^k \mid i=j=k\} is not context-free. \blacksquare

pumping-lemmacontext-free-languages
B

Section B: Short Answer Questions

Attempt any EIGHT questions.

9 questions·5 marks each
4short5 marks

Construct a DFA that accepts all binary strings ending with '00'.

DFA Accepting Binary Strings Ending in '00'

Let M=(Q,Σ,δ,q0,F)M = (Q,\Sigma,\delta,q_0,F) with Σ={0,1}\Sigma=\{0,1\}.

  • q0q_0: no useful suffix yet (last symbol not a 0 toward goal),
  • q1q_1: the string so far ends in a single '0',
  • q2q_2: the string ends in '00' (accepting).

Transition table:

State01
q0q_0q1q_1q0q_0
q1q_1q2q_2q0q_0
*q2q_2q2q_2q0q_0

Start state q0q_0; accepting state F={q2}F=\{q_2\}.

Idea: On a '1' the suffix is broken, so go back to q0q_0. On a '0' advance (q0q1q2q_0\to q_1\to q_2); once in q2q_2, a further '0' keeps it in q2q_2 (suffix still '00').

Check: 100q0q0q1q2q_0\to q_0\to q_1\to q_2 (accept). 1001 → ends in q0q_0 (reject). Correct.

dfa
5short5 marks

Explain epsilon-closure with a suitable example.

ε-closure

In an ε-NFA, the ε-closure of a state qq, written ε-CLOSURE(q)\varepsilon\text{-CLOSURE}(q), is the set of all states reachable from qq using only ε (empty) transitions, including qq itself.

Formally it is the smallest set SS such that qSq \in S and if rSr \in S then every state in δ(r,ε)\delta(r,\varepsilon) is also in SS. For a set of states TT, ε-CLOSURE(T)=qTε-CLOSURE(q)\varepsilon\text{-CLOSURE}(T)=\bigcup_{q\in T}\varepsilon\text{-CLOSURE}(q).

It is used in subset (NFA-to-DFA) construction and in defining the extended transition δ^\hat\delta for ε-NFAs.

Example

Consider states with ε-moves: q0εq1q_0 \xrightarrow{\varepsilon} q_1, q1εq2q_1 \xrightarrow{\varepsilon} q_2, and q2q_2 has no ε-move.

  • ε-CLOSURE(q0)={q0,q1,q2}\varepsilon\text{-CLOSURE}(q_0) = \{q_0, q_1, q_2\} (reach q1q_1 then q2q_2 via ε).
  • ε-CLOSURE(q1)={q1,q2}\varepsilon\text{-CLOSURE}(q_1) = \{q_1, q_2\}.
  • ε-CLOSURE(q2)={q2}\varepsilon\text{-CLOSURE}(q_2) = \{q_2\} (no outgoing ε-edge).

Thus from q0q_0 the machine can be in any of {q0,q1,q2}\{q_0,q_1,q_2\} without consuming input.

epsilon-nfa
6short5 marks

Write a regular expression for strings over {0,1} that contain at least one '0' and at least one '1'.

Regular Expression: at least one 0 AND at least one 1

Over Σ={0,1}\Sigma=\{0,1\}, a string must contain a 0 somewhere and a 1 somewhere. A 0 and a 1 must appear in some order, so:

(0+1)0(0+1)1(0+1)   +   (0+1)1(0+1)0(0+1)(0+1)^*\,0\,(0+1)^*\,1\,(0+1)^*\ \;+\;\ (0+1)^*\,1\,(0+1)^*\,0\,(0+1)^*
  • First term: a 00 appears before a 11.
  • Second term: a 11 appears before a 00.

Together they cover every string having at least one 0 and at least one 1.

Equivalent compact form (complement of "all 0's or all 1's"):

R=Σ0Σ1Σ+Σ1Σ0Σ,Σ=(0+1).R = \Sigma^*0\Sigma^*1\Sigma^* + \Sigma^*1\Sigma^*0\Sigma^*,\quad \Sigma=(0+1).

Examples accepted: 01, 10, 1100, 0110. Rejected: 000, 111, \varepsilon.

regular-expression
7short5 marks

State Arden's theorem and explain its use in obtaining a regular expression from a finite automaton.

Arden's Theorem

Statement: Let PP and QQ be regular expressions over an alphabet, where PP does not contain the empty string ε\varepsilon (i.e. εL(P)\varepsilon \notin L(P)). Then the equation

R=Q+RPR = Q + RP

has the unique solution

R=QP.R = QP^*.

(The condition εL(P)\varepsilon\notin L(P) guarantees uniqueness.)

Use: Finding a Regular Expression from a Finite Automaton

For a DFA/NFA, write one equation per state qiq_i describing the set of strings that lead from the start state into qiq_i (or, in the alternative formulation, from qiq_i to a final state):

qi=jqjaji  (+  ε if qi=q0).q_i = \sum_j q_j \,a_{ji} \;(+\;\varepsilon \text{ if } q_i = q_0).

Wherever an equation has the self-referential form q=qP+Qq = q P + Q, apply Arden's theorem to replace it by q=QPq = QP^*. Substituting back and repeatedly eliminating variables, the equation for the final state(s) yields a regular expression equal to \sum of the languages, i.e. L(M)L(M).

Short Example

FA: start AA with AaAA\xrightarrow{a}A, AbBA\xrightarrow{b}B, BB final. Equations: A=ε+AaA = \varepsilon + Aa, B=AbB = Ab. Apply Arden's to A=ε+AaA = \varepsilon + Aa (here P=aP=a, Q=εQ=\varepsilon): A=εa=aA = \varepsilon\,a^* = a^*. Then B=abB = a^*b. So L(M)=abL(M) = a^*b.

arden-theoremregular-expression
8short5 marks

What is an ambiguous grammar? Show with an example that a grammar is ambiguous.

Ambiguous Grammar

A context-free grammar GG is ambiguous if there exists at least one string wL(G)w \in L(G) that has two or more distinct parse trees (equivalently, two distinct leftmost — or two distinct rightmost — derivations). Ambiguity is a property of the grammar, not the language.

Example

Consider the expression grammar:

EE+EEEidE \rightarrow E + E \mid E * E \mid id

Take the string w=id+ididw = id + id * id.

Parse tree 1 (treats ++ as outer operator — corresponds to id+(idid)id + (id * id)):

        E
      / | \
     E  +  E
     |    /|\
    id   E * E
         |   |
        id  id

Leftmost derivation: EE+Eid+Eid+EEid+idEid+ididE \Rightarrow E+E \Rightarrow id+E \Rightarrow id+E*E \Rightarrow id+id*E \Rightarrow id+id*id.

Parse tree 2 (treats * as outer operator — corresponds to (id+id)id(id + id) * id):

        E
      / | \
     E  *  E
    /|\    |
   E + E  id
   |   |
  id  id

Leftmost derivation: EEEE+EEid+EEid+idEid+ididE \Rightarrow E*E \Rightarrow E+E*E \Rightarrow id+E*E \Rightarrow id+id*E \Rightarrow id+id*id.

The same string id+ididid+id*id has two different parse trees / leftmost derivations, so the grammar is ambiguous. (Ambiguity here can be removed by introducing precedence/associativity via separate non-terminals for term and factor.)

ambiguitycfg
9short5 marks

Eliminate left recursion from the grammar A -> Aa | b.

Eliminating Left Recursion from AAabA \rightarrow Aa \mid b

The production is immediately left-recursive because the right side AaAa begins with the non-terminal AA itself.

General rule: For AAαβA \rightarrow A\alpha \mid \beta (where β\beta does not start with AA), rewrite as

AβAAαAε.A \rightarrow \beta A' \qquad A' \rightarrow \alpha A' \mid \varepsilon.

Here α=a\alpha = a and β=b\beta = b. Introducing a new non-terminal AA':

AbAA \rightarrow b\,A' AaAεA' \rightarrow a\,A' \mid \varepsilon

Verification: The original generates bab a^* (a bb followed by zero or more aa's). The new grammar: AbAbaAbanA \Rightarrow bA' \Rightarrow b\,a\,A' \Rightarrow \dots \Rightarrow b a^n, also generating {bann0}\{ba^n \mid n\ge 0\}. The languages match and the new grammar is right-recursive (no left recursion), suitable for top-down (LL) parsing.

cfgleft-recursion
10short5 marks

Explain the conversion of a CFG into Greibach Normal Form (GNF).

Greibach Normal Form (GNF)

A CFG is in Greibach Normal Form if every production has the form

AaαA \rightarrow a\,\alpha

where aa is a single terminal and α\alpha is a (possibly empty) string of non-terminals (αV\alpha \in V^*). Thus each production starts with exactly one terminal followed by zero or more variables. (For a language containing ε\varepsilon, SεS\to\varepsilon is allowed with SS not on any RHS.)

GNF guarantees each derivation step generates exactly one terminal symbol, which is useful for showing CFL = NPDA and for top-down parsing.

Conversion Procedure

  1. Convert to Chomsky Normal Form (CNF) first: remove ε-productions, unit productions, and useless symbols, and ensure productions are ABCA\to BC or AaA\to a.
  2. Order the non-terminals A1,A2,,AnA_1, A_2, \dots, A_n.
  3. Remove left recursion / substitute so that every production AiAjγA_i \to A_j \gamma has j>ij > i: for each ii, for each j<ij<i, substitute the productions of AjA_j into AiAjγA_i \to A_j\gamma. Then eliminate any immediate left recursion on AiA_i by introducing a new variable BiB_i (AiAiαA_i \to A_i\alpha becomes AiβBiA_i\to\beta B_i, BiαBiαB_i\to\alpha B_i \mid \alpha).
  4. After this, AnA_n's productions already start with a terminal. Back-substitute: working downward (An1,,A1A_{n-1}, \dots, A_1 and the BiB_i), replace a leading non-terminal at the start of any RHS by its productions, so every production begins with a terminal.

Small Example

SAB, Aa, BbS \to AB,\ A\to a,\ B\to b. Substitute AaA\to a into SABS\to AB: SaBS \to aB. Now all productions (SaB, Aa, BbS\to aB,\ A\to a,\ B\to b) begin with a terminal — GNF achieved.

greibach-normal-formcfg
11short5 marks

Define instantaneous description (ID) of a PDA and explain acceptance by final state and by empty stack.

Instantaneous Description (ID) of a PDA

A pushdown automaton is P=(Q,Σ,Γ,δ,q0,Z0,F)P = (Q,\Sigma,\Gamma,\delta,q_0,Z_0,F). An instantaneous description (ID) is a triple that captures the complete configuration of the PDA at a moment in time:

(q, w, γ)(q,\ w,\ \gamma)

where

  • qQq \in Q is the current state,
  • wΣw \in \Sigma^* is the remaining (unread) input, and
  • γΓ\gamma \in \Gamma^* is the current stack contents (top written leftmost).

A move between IDs is written with the turnstile \vdash:

(q,aw,Zβ)(p,w,αβ)if (p,α)δ(q,a,Z).(q, a w, Z\beta) \vdash (p, w, \alpha\beta) \quad\text{if } (p,\alpha)\in\delta(q,a,Z).

\vdash^* denotes zero or more moves. The initial ID is (q0,w,Z0)(q_0, w, Z_0).

Acceptance by Final State

The language accepted by final state is

L(P)={w(q0,w,Z0)(qf,ε,γ), qfF}.L(P) = \{\, w \mid (q_0, w, Z_0) \vdash^* (q_f, \varepsilon, \gamma),\ q_f \in F \,\}.

The input is accepted if, after consuming the whole input, the PDA is in some final state — the stack contents are irrelevant.

Acceptance by Empty Stack

The language accepted by empty stack is

N(P)={w(q0,w,Z0)(q,ε,ε), qQ}.N(P) = \{\, w \mid (q_0, w, Z_0) \vdash^* (q, \varepsilon, \varepsilon),\ q\in Q \,\}.

The input is accepted if, after consuming the whole input, the stack becomes empty — the final state is irrelevant (FF is typically taken as \emptyset).

Equivalence: The two acceptance modes accept exactly the same class of languages (the context-free languages); a PDA of one type can be converted to one of the other.

pda
12short5 marks

Explain the working of a multi-tape Turing Machine.

Multi-Tape Turing Machine

A multi-tape Turing machine is a TM equipped with k1k \ge 1 semi-infinite (or two-way infinite) tapes, each with its own independent read/write head. It is a convenient model that is equivalent in power to the standard single-tape TM.

Components / Working

  • It has a finite control (set of states), and kk tapes; tape 1 initially holds the input, the other tapes are blank.
  • In one move, depending on the current state and the kk symbols currently scanned (one per tape), the machine:
    1. Writes a symbol on each of the kk tapes (independently),
    2. Moves each of the kk heads independently Left, Right, or Stays (L/R/S), and
    3. Enters a new state.
  • The transition function has the form
δ:Q×ΓkQ×Γk×{L,R,S}k.\delta: Q \times \Gamma^k \rightarrow Q \times \Gamma^k \times \{L,R,S\}^k.
  • Acceptance is by entering an accept (halt) state, as usual.

Advantage

Separating data onto different tapes makes algorithm design much simpler (e.g. one tape for input, one as a counter/scratchpad). It can speed computation up by a polynomial factor.

Equivalence with Single-Tape TM

Any kk-tape TM MM can be simulated by a single-tape TM SS: SS stores the kk tape contents on one tape using 2k2k tracks — kk tracks for the symbols and kk tracks marking each head position. To simulate one move of MM, SS sweeps across to read all kk marked symbols, then sweeps back updating symbols and head markers. Hence multi-tape TMs recognize exactly the recursively enumerable languages — the same class as single-tape TMs — though a single-tape simulation of tt steps may take O(t2)O(t^2) time.

turing-machine

Frequently asked questions

Where can I find the BSc CSIT (TU) Theory of Computation (BSc CSIT, CSC257) question paper 2079?
The full BSc CSIT (TU) Theory of Computation (BSc CSIT, CSC257) 2079 (regular) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.
Does the Theory of Computation (BSc CSIT, CSC257) 2079 paper come with solutions?
Yes. Every question on this Theory of Computation (BSc CSIT, CSC257) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.
How many marks is the BSc CSIT (TU) Theory of Computation (BSc CSIT, CSC257) 2079 paper?
The BSc CSIT (TU) Theory of Computation (BSc CSIT, CSC257) 2079 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.
Is practising this Theory of Computation (BSc CSIT, CSC257) past paper free?
Yes — reading and attempting this Theory of Computation (BSc CSIT, CSC257) past paper on Kekkei is completely free.