Browse papers
A

Section A: Long Answer Questions

Attempt any TWO questions.

3 questions·10 marks each
1long10 marks

Explain the different phases of a compiler with a suitable diagram, showing the output of each phase for the statement position = initial + rate * 60.

Phases of a Compiler

A compiler translates a source program into an equivalent target program through a sequence of phases, grouped into the analysis (front end) and synthesis (back end). Each phase consults the symbol table and an error handler.

Source program
     |
[1] Lexical Analyzer (Scanner)
     |   -> tokens
[2] Syntax Analyzer (Parser)
     |   -> syntax (parse) tree
[3] Semantic Analyzer
     |   -> annotated tree
[4] Intermediate Code Generator
     |   -> three-address code
[5] Code Optimizer
     |   -> optimized IR
[6] Target Code Generator
     |
Target program

(Symbol Table and Error Handler interact with all phases.)

Description of each phase

  1. Lexical analysis – groups characters into tokens (identifiers, operators, constants) and removes whitespace/comments.
  2. Syntax analysis – checks that tokens form valid sentences per the grammar and builds a parse/syntax tree.
  3. Semantic analysis – performs type checking, scope resolution and inserts coercions.
  4. Intermediate code generation – produces a machine-independent IR such as three-address code.
  5. Code optimization – improves the IR (removes redundancy, folds constants) without changing meaning.
  6. Code generation – produces target/assembly code, assigns registers.

Output of each phase for position = initial + rate * 60

Lexical analysis (tokens):

id1 = id2 + id3 * 60

where id1=position, id2=initial, id3=rate, and =, +, * are operators, 60 is a number.

Syntax analysis (syntax tree):

        =
       / \
     id1   +
          /  \
        id2    *
              /  \
            id3   60

Semantic analysis60 (integer) is converted to real to match the real operands:

        =
       / \
     id1   +
          /  \
        id2    *
              /  \
            id3   inttoreal
                     |
                     60

Intermediate code (three-address code):

t1 = inttoreal(60)
t2 = id3 * t1
t3 = id2 + t2
id1 = t3

Code optimization:

t1 = id3 * 60.0
id1 = id2 + t1

Code generation (assembly):

MOVF  id3, R2
MULF  #60.0, R2
MOVF  id2, R1
ADDF  R2, R1
MOVF  R1, id1
phasescompiler
2long10 marks

Construct the LR(1) parsing table for the grammar S -> CC, C -> cC | d and parse the input string cdd.

LR(1) / Canonical LR Parsing Table

Augmented grammar:

0. S' -> S
1. S  -> C C
2. C  -> c C
3. C  -> d

FIRST sets

FIRST(S)=FIRST(C)={c,d}FIRST(S)=FIRST(C)=\{c,d\}.

Canonical collection of LR(1) item sets

I0: S' -> .S, $ ; S -> .CC, $ ; C -> .cC, c/d ; C -> .d, c/d

I1 = goto(I0,S): S' -> S., $ (accept)

I2 = goto(I0,C): S -> C.C, $ ; C -> .cC, $ ; C -> .d, $

I3 = goto(I0,c): C -> c.C, c/d ; C -> .cC, c/d ; C -> .d, c/d

I4 = goto(I0,d): C -> d., c/d

I5 = goto(I2,C): S -> CC., $

I6 = goto(I2,c): C -> c.C, $ ; C -> .cC, $ ; C -> .d, $

I7 = goto(I2,d): C -> d., $

I8 = goto(I3,C): C -> cC., c/d

I9 = goto(I6,C): C -> cC., $

Gotos: goto(I3,c)=I3, goto(I3,d)=I4; goto(I6,c)=I6, goto(I6,d)=I7.

Parsing table

(sN=shift to N, rN=reduce by rule N, acc=accept)

Statecd$SC
0s3s412
1acc
2s6s75
3s3s48
4r3r3
5r1
6s6s79
7r3
8r2r2
9r2

Parsing the string cdd$

StackInputAction
0c d d $shift s3
0 c 3d d $shift s4
0 c 3 d 4d $reduce C->d (r3), goto(3,C)=8
0 c 3 C 8d $reduce C->cC (r2), goto(0,C)=2
0 C 2d $shift s7
0 C 2 d 7$reduce C->d (r3), goto(2,C)=5
0 C 2 C 5$reduce S->CC (r1), goto(0,S)=1
0 S 1$accept

The string cdd is accepted (valid).

lr-parsingparsing
3long10 marks

What is a directed acyclic graph (DAG)? Represent the expression a + a * (b - c) + (b - c) * d using three-address code, quadruples and triples.

Directed Acyclic Graph (DAG)

A DAG is a directed graph with no directed cycles, used to represent the structure of basic blocks/expressions. Unlike a syntax tree, common sub-expressions are shared (a node is created only once and reused), so the DAG exposes redundancy and helps in optimization. Interior nodes are operators; leaves are identifiers/constants.

Expression: a + a * (b - c) + (b - c) * d

The common sub-expression (b - c) is computed once and shared.

DAG (described): a leaf b and c feed a - node giving t1. This t1 is used by both a * with a (t2 = a * t1) and a * with d (t4 = t1 * d). Then t3 = a + t2, and finally t5 = t3 + t4.

Three-Address Code

t1 = b - c
t2 = a * t1
t3 = a + t2
t4 = t1 * d
t5 = t3 + t4

Quadruples

(op, arg1, arg2, result)

#oparg1arg2result
0-bct1
1*at1t2
2+at2t3
3*t1dt4
4+t3t4t5

Triples

(temporaries referenced by instruction index (n))

#oparg1arg2
(0)-bc
(1)*a(0)
(2)+a(1)
(3)*(0)d
(4)+(2)(3)
dagintermediate-code
B

Section B: Short Answer Questions

Attempt any EIGHT questions.

9 questions·5 marks each
4short5 marks

Differentiate between a compiler and an interpreter.

Compiler vs Interpreter

CompilerInterpreter
Translates the whole source program into machine/target code at once.Translates and executes the program line by line / statement by statement.
Produces a separate object/target program.Does not produce object code; executes directly.
Execution is faster (translation done once).Execution is slower (re-translates each run).
Reports all errors after scanning the entire program.Reports the first error and stops, easing debugging.
Needs more memory for object code.Generally needs less memory.
Examples: C, C++ compilers.Examples: Python, BASIC interpreters.
compiler
5short5 marks

Eliminate left recursion and perform left factoring on the grammar A -> Aab | Ac | bd | f.

Grammar: A -> Aab | Ac | bd | f

Step 1: Eliminate left recursion

The rule has the form A -> A\alpha | \beta with α{ab,c}\alpha \in \{ab, c\} and β{bd,f}\beta \in \{bd, f\}.

Using the standard transformation A -> \beta A', A' -> \alpha A' | \epsilon:

A  -> bd A' | f A'
A' -> ab A' | c A' | ε

Step 2: Left factoring

In the A productions, both alternatives bd A' and f A' begin with different symbols (b vs f), so there is no common prefix — no left factoring is needed for A.

For A', the alternatives ab A', c A', ε also have distinct first symbols (a, c), so no common prefix exists.

Final grammar (left-recursion-free, already left-factored)

A  -> bd A' | f A'
A' -> ab A' | c A' | ε
grammarparsing
6short5 marks

What is a symbol table? Explain its role in a compiler.

Symbol Table

A symbol table is a data structure maintained by the compiler that stores information about each identifier (variables, constants, functions, types) appearing in the source program. Typical attributes stored are the name, type, scope, storage location/offset, size, and parameter details.

Role in a compiler

  • Lexical analysis – enters identifiers into the table as tokens are recognized.
  • Syntax & semantic analysis – used for type checking, scope resolution and detecting undeclared/duplicate identifiers.
  • Intermediate & code generation – provides addresses/offsets and type/size information for generating correct code.
  • Optimization – supplies attribute information used during analysis.

It supports two main operations: insert (add a new entry) and lookup (retrieve an entry). It is commonly implemented using a hash table for fast access, or lists/trees. Thus the symbol table acts as a central repository accessed by almost all phases of the compiler.

symbol-table
7short5 marks

Differentiate between a parse tree and a syntax tree.

Parse Tree vs Syntax Tree

Parse Tree (Concrete Syntax Tree)Syntax Tree (Abstract Syntax Tree)
Shows the complete derivation according to the grammar.A condensed form of the parse tree.
Every grammar symbol, including non-terminals and intermediate productions, appears as a node.Only operators and operands are kept; redundant nodes are removed.
Leaves are terminals; interior nodes are non-terminals.Interior nodes are operators; leaves are operands.
Larger, contains chains for single productions.Smaller, more efficient for later phases.

Example: for a + b * c

Parse tree expands through E -> E + T, T -> T * F, F -> id, etc., showing all E, T, F nodes.

Syntax tree:

      +
     / \
    a   *
       / \
      b   c
parsing
8short5 marks

What is an ambiguous grammar? Show with an example.

Ambiguous Grammar

A grammar is ambiguous if there exists at least one string in its language that has more than one distinct parse tree (or equivalently, more than one leftmost/rightmost derivation).

Example

Grammar: E -> E + E | E * E | id

For the string id + id * id, two different parse trees exist:

Parse tree 1 (multiplication first — correct precedence):

      +
     / \
   id   *
       / \
     id   id

Parse tree 2 (addition first):

        *
       / \
      +   id
     / \
   id   id

Since the same string yields two parse trees, the grammar is ambiguous. Ambiguity is undesirable because it makes parsing non-deterministic; it can be removed by introducing precedence/associativity (e.g. using E -> E + T | T, T -> T * F | F, F -> id).

grammar
9short5 marks

Explain the role of the lexical analyzer in a compiler.

Role of the Lexical Analyzer (Scanner)

The lexical analyzer is the first phase of a compiler. It reads the source program as a stream of characters and groups them into meaningful units called tokens (e.g. identifiers, keywords, constants, operators, punctuation).

Main roles / functions

  • Tokenization – scans input and produces <token-name, attribute-value> pairs for the parser.
  • Removing whitespace and comments – discards blanks, tabs, newlines and comments.
  • Symbol table interaction – enters identifiers/constants into the symbol table.
  • Error reporting – detects illegal characters/tokens and reports their line number.
  • Macro/preprocessing & line counting in some implementations.

It works on demand: the parser calls getNextToken(), and the scanner returns the next token. It typically uses regular expressions specified by the language and is implemented as a finite automaton (DFA), often generated by tools like LEX/Flex.

lexical-analysis
10short5 marks

Differentiate between LL(1) and LR parsers.

LL(1) vs LR Parsers

LL(1) ParserLR Parser
Top-down parsing.Bottom-up parsing.
First L=left-to-right scan, second L=Leftmost derivation.L=left-to-right scan, R=Rightmost derivation (in reverse).
Builds the parse tree from root to leaves.Builds the parse tree from leaves to root.
1 means one lookahead symbol.Uses item sets/states; LR(0), SLR(1), LALR(1), CLR(1).
Grammar must have no left recursion and must be left-factored.Handles a larger class of grammars (incl. left recursion).
Uses predictive/recursive-descent parsing with a stack.Uses a shift-reduce parser driven by a parsing table.
Less powerful, simpler to implement by hand.More powerful; usually generated by tools (YACC).
parsing
11short5 marks

What are regular expressions? Write a regular expression for identifiers.

Regular Expressions

A regular expression (RE) is a notation for describing regular languages — the set of strings recognized by finite automata. They are used in lexical analysis to specify the patterns of tokens.

Operations

Given REs r and s:

  • Union: r | s
  • Concatenation: r s
  • Kleene closure (zero or more): r*
  • Positive closure (one or more): r+

Regular expression for identifiers

An identifier begins with a letter or underscore, followed by zero or more letters, digits, or underscores:

letter -> A | B | ... | Z | a | b | ... | z
digit  -> 0 | 1 | ... | 9

identifier -> ( letter | _ ) ( letter | digit | _ )*

Compactly: [A-Za-z_][A-Za-z0-9_]*

regular-expressionlexical-analysis
12short5 marks

Explain the handle and handle pruning in shift-reduce parsing.

Handle and Handle Pruning in Shift-Reduce Parsing

Handle

A handle of a right-sentential form γ\gamma is a substring that matches the right side of a production and whose reduction to the left-side non-terminal represents one step along the reverse of a rightmost derivation. Formally, if SrmαAwrmαβwS \Rightarrow_{rm}^{*} \alpha A w \Rightarrow_{rm} \alpha \beta w, then AβA \rightarrow \beta in the position following α\alpha is a handle.

Handle Pruning

Handle pruning is the process of repeatedly finding a handle and reducing it (replacing it by the corresponding non-terminal) until the start symbol is reached. It produces a rightmost derivation in reverse and is the basis of bottom-up (shift-reduce) parsing.

Example

Grammar: E -> E + E | E * E | id, input id + id * id:

Right-sentential formHandleReducing production
id + id * idid (first)E -> id
E + id * ididE -> id
E + E * ididE -> id
E + E * EE * EE -> E * E
E + EE + EE -> E + E
E(start symbol reached)

In a shift-reduce parser, symbols are shifted onto the stack until a handle appears on top, which is then reduced (pruned).

parsing

Frequently asked questions

Where can I find the BSc CSIT (TU) Compiler Design and Construction (BSc CSIT, CSC365) question paper 2074?
The full BSc CSIT (TU) Compiler Design and Construction (BSc CSIT, CSC365) 2074 (regular) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.
Does the Compiler Design and Construction (BSc CSIT, CSC365) 2074 paper come with solutions?
Yes. Every question on this Compiler Design and Construction (BSc CSIT, CSC365) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.
How many marks is the BSc CSIT (TU) Compiler Design and Construction (BSc CSIT, CSC365) 2074 paper?
The BSc CSIT (TU) Compiler Design and Construction (BSc CSIT, CSC365) 2074 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.
Is practising this Compiler Design and Construction (BSc CSIT, CSC365) past paper free?
Yes — reading and attempting this Compiler Design and Construction (BSc CSIT, CSC365) past paper on Kekkei is completely free.