Browse papers
A

Section A: Long Answer Questions

Attempt any TWO questions.

3 questions·10 marks each
1long10 marks

What is a Non-deterministic Finite Automaton (NFA)? Construct an NFA that accepts the set of strings over {0, 1} ending with '01' and convert it into an equivalent DFA using the subset construction method.

Non-deterministic Finite Automaton (NFA)

An NFA is a finite automaton defined by the 5-tuple M=(Q,Σ,δ,q0,F)M = (Q, \Sigma, \delta, q_0, F) where:

  • QQ = finite set of states
  • Σ\Sigma = input alphabet
  • δ:Q×(Σ{ϵ})2Q\delta : Q \times (\Sigma \cup \{\epsilon\}) \rightarrow 2^Q is the transition function (it maps to a set of states)
  • q0Qq_0 \in Q = start state
  • FQF \subseteq Q = set of final (accepting) states

Unlike a DFA, from a given state on a given symbol an NFA may move to zero, one, or several states. A string is accepted if at least one computation path ends in a final state.

NFA for strings over {0,1} ending with '01'

States: q0q_0 (start), q1q_1 (seen 0), q2q_2 (seen 01, final).

Stateon 0on 1
q0\to q_0{q0,q1}\{q_0, q_1\}{q0}\{q_0\}
q1q_1\emptyset{q2}\{q_2\}
q2*q_2\emptyset\emptyset

The self-loop on q0q_0 (for both 0 and 1) lets the automaton guess where the suffix 01 begins; q00q11q2q_0 \xrightarrow{0} q_1 \xrightarrow{1} q_2 recognises the final 01. q2q_2 is the only accepting state.

Conversion to DFA (Subset Construction)

We build DFA states as subsets of NFA states. Start = {q0}\{q_0\}.

DFA stateon 0on 1
A={q0}A=\{q_0\}{q0,q1}=B\{q_0,q_1\}=B{q0}=A\{q_0\}=A
B={q0,q1}B=\{q_0,q_1\}{q0,q1}=B\{q_0,q_1\}=B{q0,q2}=C\{q_0,q_2\}=C
C={q0,q2}C=\{q_0,q_2\}{q0,q1}=B\{q_0,q_1\}=B{q0}=A\{q_0\}=A

Resulting DFA:

  • States: A,B,CA, B, C
  • Start state: AA
  • Accepting state: CC (since CC contains the NFA final state q2q_2)
  • Transitions as in the table above.

This DFA accepts exactly the strings over {0,1}\{0,1\} ending in 01.

nfadfasubset-construction
2long10 marks

State and prove the Pumping Lemma for regular languages. Using it, prove that the language L = { a^n b^n | n >= 0 } is not regular.

Pumping Lemma for Regular Languages

Statement. If LL is a regular language, then there exists a constant pp (the pumping length) such that every string wLw \in L with wp|w| \ge p can be written as w=xyzw = xyz satisfying:

  1. y1|y| \ge 1 (the middle part is non-empty),
  2. xyp|xy| \le p,
  3. xyizLxy^iz \in L for every i0i \ge 0.

Proof

Since LL is regular, some DFA MM with pp states accepts LL. Take any wLw \in L with wp|w| \ge p. Reading ww visits w+1p+1|w|+1 \ge p+1 states. By the pigeonhole principle, among the first p+1p+1 visited states some state qq repeats while reading the first pp symbols.

Let the first visit to qq be after reading xx and the second after reading xyxy (so the loop reads yy, y1|y|\ge 1, and xyp|xy| \le p). The remaining input is zz. Because yy takes MM from qq back to qq, the loop can be traversed any number of times. Hence xyizxy^iz is also accepted for all i0i \ge 0, i.e. xyizLxy^iz \in L. \blacksquare

Proving L={anbnn0}L = \{a^n b^n \mid n \ge 0\} is not regular

Assume, for contradiction, that LL is regular with pumping length pp.

Choose w=apbpLw = a^p b^p \in L, and w=2pp|w| = 2p \ge p. By the lemma w=xyzw = xyz with xyp|xy| \le p and y1|y| \ge 1.

Since xyp|xy| \le p, both xx and yy lie entirely within the leading aa's, so y=aky = a^k for some k1k \ge 1.

Now pump with i=2i = 2:

xy2z=ap+kbp.xy^2z = a^{p+k} b^p.

Here the number of aa's (p+kp+k) exceeds the number of bb's (pp), so xy2zLxy^2z \notin L.

This contradicts the pumping lemma. Therefore L={anbnn0}L = \{a^n b^n \mid n \ge 0\} is not regular. \blacksquare

pumping-lemmaregular-languages
3long10 marks

Define context-free grammar (CFG). Convert the following CFG into an equivalent grammar in Chomsky Normal Form (CNF): S -> ASA | aB, A -> B | S, B -> b | epsilon.

Context-Free Grammar (CFG)

A CFG is a 4-tuple G=(V,T,P,S)G = (V, T, P, S) where VV is a finite set of variables (non-terminals), TT is a finite set of terminals (VT=V \cap T = \emptyset), SVS \in V is the start symbol, and PP is a finite set of productions of the form AαA \to \alpha with AVA \in V and α(VT)\alpha \in (V \cup T)^*. The language generated is the set of terminal strings derivable from SS.

Chomsky Normal Form (CNF): every production is ABCA \to BC (two variables) or AaA \to a (single terminal); SϵS \to \epsilon is allowed only if ϵL\epsilon \in L.

Converting the given CFG to CNF

Grammar:

SASAaB,ABS,BbϵS \to ASA \mid aB, \quad A \to B \mid S, \quad B \to b \mid \epsilon

Step 1: Add a new start symbol

S0S,SASAaB,ABS,BbϵS_0 \to S, \quad S \to ASA \mid aB, \quad A \to B \mid S, \quad B \to b \mid \epsilon

Step 2: Remove ϵ\epsilon-productions

Nullable variable: BB (since BϵB \to \epsilon), and hence AA (since ABA \to B). Remove BϵB \to \epsilon and add versions of rules with nullable symbols omitted.

  • SASAS \to ASA: dropping nullable AA's gives SSAASSS \to SA \mid AS \mid S. Drop the unit SSS\to S. So SASASAASS \to ASA \mid SA \mid AS.
  • SaBS \to aB: gives SaBaS \to aB \mid a.
  • ABA \to B stays; AϵA \to \epsilon would appear (A nullable) but we drop it.

Result:

S0SS_0 \to S SASASAASaBaS \to ASA \mid SA \mid AS \mid aB \mid a ABSA \to B \mid S BbB \to b

Step 3: Remove unit productions

Unit pairs: S0SS_0 \to S, ABA \to B, ASA \to S.

  • S0ASASAASaBaS_0 \to ASA \mid SA \mid AS \mid aB \mid a
  • AbA \to b (from ABbA\to B\to b), and AASASAASaBaA \to ASA \mid SA \mid AS \mid aB \mid a (from ASA\to S).

Result:

S0ASASAASaBaS_0 \to ASA \mid SA \mid AS \mid aB \mid a SASASAASaBaS \to ASA \mid SA \mid AS \mid aB \mid a AASASAASaBabA \to ASA \mid SA \mid AS \mid aB \mid a \mid b BbB \to b

Step 4: Replace terminals in long rules

Introduce XaaX_a \to a. Replace a in aBaB by XaX_a: XaB\,\to X_aB.

Step 5: Break rules longer than two symbols

For each ASAASA introduce YSAY \to SA, so ASAAYASA \to A Y with YSAY \to SA.

Final CNF Grammar

S0AYSAASXaBaS_0 \to AY \mid SA \mid AS \mid X_aB \mid a SAYSAASXaBaS \to AY \mid SA \mid AS \mid X_aB \mid a AAYSAASXaBabA \to AY \mid SA \mid AS \mid X_aB \mid a \mid b BbB \to b XaaX_a \to a YSAY \to SA

Every production now has the form ABCA \to BC or AaA \to a, so the grammar is in Chomsky Normal Form.

cfgchomsky-normal-form
B

Section B: Short Answer Questions

Attempt any EIGHT questions.

9 questions·5 marks each
4short5 marks

Explain the closure properties of regular languages.

Closure Properties of Regular Languages

The class of regular languages is closed under the following operations: if L1L_1 and L2L_2 are regular, then so are the results below.

  • Union: L1L2L_1 \cup L_2 is regular (combine the two FAs with a new start state via ϵ\epsilon-moves, or take the union of their regular expressions r1+r2r_1 + r_2).
  • Concatenation: L1L2L_1 L_2 is regular (connect final states of M1M_1 to start of M2M_2 by ϵ\epsilon; regex r1r2r_1 r_2).
  • Kleene Star: L1L_1^* is regular (add ϵ\epsilon-loop back from finals to start).
  • Intersection: L1L2L_1 \cap L_2 is regular (product/cross-product DFA construction).
  • Complement: L1\overline{L_1} is regular (swap accepting and non-accepting states of a complete DFA).
  • Difference: L1L2=L1L2L_1 - L_2 = L_1 \cap \overline{L_2} is regular.
  • Reversal: L1RL_1^R is regular (reverse all transitions, swap start/final).
  • Homomorphism and inverse homomorphism also preserve regularity.

These closure properties are useful for proving languages regular/non-regular and for combining automata in lexical analysis.

closure-propertiesregular-languages
5short5 marks

What is the membership problem? Explain the CYK algorithm in brief.

Membership Problem

The membership problem asks: given a grammar (or automaton) GG and a string ww, decide whether wL(G)w \in L(G). For context-free languages this is decidable — the CYK algorithm answers it efficiently.

CYK Algorithm (Cocke–Younger–Kasami)

CYK is a dynamic-programming, bottom-up parsing algorithm that decides membership for a CFG in Chomsky Normal Form. For a string w=a1a2anw = a_1a_2\dots a_n:

  1. Build a triangular table V[i,j]V[i,j] where V[i,j]V[i,j] holds the set of variables that derive the substring of length jj starting at position ii.
  2. Base case (length 1): V[i,1]={AAaiP}V[i,1] = \{A \mid A \to a_i \in P\}.
  3. Recursive case: for length jj from 2 to nn, for each split point kk, set
AV[i,j] if ABC,  BV[i,k],  CV[i+k,jk].A \in V[i,j] \text{ if } A \to BC, \; B \in V[i,k], \; C \in V[i+k, j-k].
  1. Accept if the start symbol SV[1,n]S \in V[1,n].

It runs in O(n3G)O(n^3 \cdot |G|) time, confirming CFG membership is decidable.

cykdecidability
6short5 marks

Construct an NFA for the regular expression (0+1)*1.

NFA for (0+1)1(0+1)^*1

This regular expression denotes all strings over {0,1}\{0,1\} that end with 1.

States: q0q_0 (start), q1q_1 (final).

Stateon 0on 1
q0\to q_0{q0}\{q_0\}{q0,q1}\{q_0, q_1\}
q1*q_1\emptyset\emptyset

Description: q0q_0 has self-loops on both 0 and 1, modelling the (0+1)(0+1)^* part. On reading a 1, the NFA may also move to the accepting state q1q_1, modelling the final 1. A string is accepted iff some path lands in q1q_1 after the last symbol — which happens exactly when the last input symbol is 1.

  • Q={q0,q1}Q = \{q_0, q_1\}, start =q0= q_0, F={q1}F = \{q_1\}, Σ={0,1}\Sigma=\{0,1\}.
nfaregular-expression
7short5 marks

Explain Rice's theorem in brief.

Rice's Theorem

Statement. Every non-trivial semantic property of the language recognised by a Turing machine is undecidable.

More precisely, let PP be a property of recursively enumerable languages (i.e. a property of the language L(M)L(M), not of the machine's internal structure). PP is non-trivial if some TM-recognisable language has the property and some other does not. Then the problem "Does L(M)L(M) have property PP?" is undecidable.

Key points:

  • The property must be about the language (behaviour), not about syntax/states (e.g. "M has 5 states" is not covered).
  • Non-trivial means PP is neither always true nor always false over all RE languages.

Examples of undecidable problems (by Rice): Is L(M)=L(M) = \emptyset? Is L(M)L(M) regular? Is L(M)L(M) finite? Does MM accept a particular string ww? All are undecidable.

Rice's theorem is a powerful generalisation of the halting problem, showing that essentially nothing interesting about a TM's language can be decided algorithmically.

rices-theoremdecidability
8short5 marks

Define alphabet, string, and language with examples.

Alphabet, String, and Language

Alphabet (Σ\Sigma): a finite, non-empty set of symbols. Example: Σ={0,1}\Sigma = \{0, 1\} (binary alphabet), or Σ={a,b,c}\Sigma = \{a, b, c\}.

String (word): a finite sequence of symbols drawn from an alphabet. The empty string is denoted ϵ\epsilon (length 0). The length of string ww is written w|w|. Example: over Σ={0,1}\Sigma=\{0,1\}, 0110 is a string with 0110=4|0110| = 4; ϵ\epsilon is also a string.

Language: a set of strings over an alphabet, i.e. any subset of Σ\Sigma^* (where Σ\Sigma^* is the set of all strings over Σ\Sigma). A language may be finite or infinite. Example: over Σ={0,1}\Sigma=\{0,1\}:

  • L1={0,01,011}L_1 = \{0, 01, 011\} — a finite language.
  • L2={ww ends with 1}L_2 = \{w \mid w \text{ ends with } 1\} — an infinite language.
  • L3={anbnn0}L_3 = \{a^nb^n \mid n \ge 0\} over {a,b}\{a,b\}.

Thus: symbols form an alphabet, sequences of symbols form strings, and sets of strings form languages.

formal-languages
9short5 marks

Differentiate between DFA and NFA.

Difference between DFA and NFA

FeatureDFA (Deterministic)NFA (Non-deterministic)
Transition functionδ:Q×ΣQ\delta: Q \times \Sigma \to Q (single next state)δ:Q×(Σ{ϵ})2Q\delta: Q \times (\Sigma\cup\{\epsilon\}) \to 2^Q (set of states)
Next moveExactly one state per (state, symbol)Zero, one, or many states
ϵ\epsilon-transitionsNot allowedAllowed
AcceptanceUnique path; accept if it ends in a final stateAccept if some path ends in a final state
ConstructionHarder to designEasier / more intuitive to design
Number of statesMay need more statesUsually fewer states
BacktrackingNot requiredMay need to explore multiple paths
PowerEquivalent — both recognise exactly the regular languagesEquivalent (convertible to DFA via subset construction)

Key point: Although NFAs appear more powerful, every NFA can be converted to an equivalent DFA, so both accept exactly the same class of languages (regular languages).

dfanfa
10short5 marks

Construct a DFA that accepts all binary strings ending with '00'.

DFA Accepting Binary Strings Ending with '00'

We track how many trailing 0s have been seen.

States: q0q_0 (no useful suffix / last symbol not contributing), q1q_1 (string ends in a single 0), q2q_2 (string ends in 00, accepting).

Stateon 0on 1
q0\to q_0q1q_1q0q_0
q1q_1q2q_2q0q_0
q2*q_2q2q_2q0q_0
  • Q={q0,q1,q2}Q = \{q_0, q_1, q_2\}, start =q0= q_0, F={q2}F = \{q_2\}, Σ={0,1}\Sigma = \{0,1\}.

Logic:

  • Reading 1 always resets to q0q_0 (no trailing zeros).
  • Reading 0 from q0q_0 goes to q1q_1 (one trailing zero); from q1q_1 to q2q_2 (two or more); from q2q_2 stays in q2q_2 (still ends in 00).

The DFA accepts iff it ends in q2q_2, i.e. the string ends with 00. Example: 100 is accepted; 010 is rejected.

dfa
11short5 marks

Explain epsilon-closure with a suitable example.

Epsilon-Closure

In an NFA with ϵ\epsilon-transitions, the ϵ\epsilon-closure of a state qq, written ECLOSE(q)ECLOSE(q) or ϵ-closure(q)\epsilon\text{-}closure(q), is the set of all states reachable from qq using zero or more ϵ\epsilon-transitions (including qq itself).

Formally:

  • qϵ-closure(q)q \in \epsilon\text{-}closure(q).
  • If pϵ-closure(q)p \in \epsilon\text{-}closure(q) and there is an ϵ\epsilon-transition prp \to r, then rϵ-closure(q)r \in \epsilon\text{-}closure(q).

It is used when simulating ϵ\epsilon-NFAs and when converting an ϵ\epsilon-NFA to a DFA.

Example

Consider an ϵ\epsilon-NFA with transitions:

q0ϵq1,q1ϵq2,q1aq3.q_0 \xrightarrow{\epsilon} q_1, \quad q_1 \xrightarrow{\epsilon} q_2, \quad q_1 \xrightarrow{a} q_3.

Then:

  • ϵ-closure(q0)={q0,q1,q2}\epsilon\text{-}closure(q_0) = \{q_0, q_1, q_2\} (reach q1q_1 via one ϵ\epsilon, then q2q_2 via another).
  • ϵ-closure(q1)={q1,q2}\epsilon\text{-}closure(q_1) = \{q_1, q_2\}.
  • ϵ-closure(q3)={q3}\epsilon\text{-}closure(q_3) = \{q_3\} (no outgoing ϵ\epsilon-moves).
epsilon-nfa
12short5 marks

Write a regular expression for strings over {0,1} that contain at least one '0' and at least one '1'.

Regular Expression: at least one 0 and at least one 1

For strings over {0,1}\{0,1\} containing at least one 0 AND at least one 1, account for the two possible orderings (the first 0 may come before or after the first 1):

(0+1)0(0+1)1(0+1)  +  (0+1)1(0+1)0(0+1)(0+1)^*\,0\,(0+1)^*\,1\,(0+1)^* \;+\; (0+1)^*\,1\,(0+1)^*\,0\,(0+1)^*

Explanation:

  • The first term (0+1)0(0+1)1(0+1)(0+1)^* 0 (0+1)^* 1 (0+1)^* matches strings where a 0 appears somewhere before a 1.
  • The second term covers the case where a 1 appears before a 0.
  • Their union guarantees the string has both at least one 0 and at least one 1, in either order.

Examples accepted: 01, 10, 0011, 1010. Rejected: 000, 111, \epsilon (each lacks one of the symbols).

regular-expression

Frequently asked questions

Where can I find the BSc CSIT (TU) Theory of Computation (BSc CSIT, CSC257) question paper 2080?
The full BSc CSIT (TU) Theory of Computation (BSc CSIT, CSC257) 2080 (regular) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.
Does the Theory of Computation (BSc CSIT, CSC257) 2080 paper come with solutions?
Yes. Every question on this Theory of Computation (BSc CSIT, CSC257) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.
How many marks is the BSc CSIT (TU) Theory of Computation (BSc CSIT, CSC257) 2080 paper?
The BSc CSIT (TU) Theory of Computation (BSc CSIT, CSC257) 2080 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.
Is practising this Theory of Computation (BSc CSIT, CSC257) past paper free?
Yes — reading and attempting this Theory of Computation (BSc CSIT, CSC257) past paper on Kekkei is completely free.