Browse papers
A

Section A: Long Answer Questions

Attempt any TWO questions.

3 questions·10 marks each
1long10 marks

Explain semantic analysis and type checking. Discuss type expressions, type systems and type conversion.

Semantic Analysis and Type Checking

Semantic analysis is the phase of a compiler that checks the source program for semantic consistency with the language definition, using the syntax tree and symbol table produced by the parser. It enforces rules that cannot be captured by context-free grammar, e.g. every variable is declared before use, operands of an operator have compatible types, function calls have the correct number and type of arguments, and break appears only inside a loop.

Type checking is the central activity of semantic analysis. It verifies that each operation in the program receives operands of permissible types, reporting a type error otherwise. It works by computing a type for every expression and comparing it against the type expected by its context.

Type Expressions

A type expression denotes the type of a language construct. It is either:

  • A basic type such as int, real, bool, char, plus special types void (no value) and type_error (used during error recovery).
  • A type name (a name standing for a type expression).
  • A type formed by applying a type constructor to type expressions:
    • array: array(I, T) — an array of elements of type T with index set I.
    • product: T1 × T2 — used for argument lists and tuples.
    • record: record((name1 × T1) × (name2 × T2) × …).
    • pointer: pointer(T) — a pointer to an object of type T.
    • function: D → R — a function mapping a domain type D to a range type R.

Example: int f(char, char) has type expression char × char → int, and int a[10] has type array(0..9, int).

Type Systems

A type system is a collection of rules for assigning type expressions to the parts of a program. It is implemented by the type checker, often expressed as syntax-directed definitions.

  • A language is statically typed if type checking is done at compile time (C, Java).
  • It is dynamically typed if checks are done at run time (Python, Lisp).
  • A type system is sound if it never lets a type error escape to run time; a sound static type checker eliminates the need for run-time checks.

Example type-checking rule for an addition expression:

E -> E1 + E2   { E.type = if (E1.type == int && E2.type == int) int
                          else type_error }

Type Conversion

Type conversion (coercion) occurs when an operator is applied to operands of different types, e.g. x + i where x is real and i is int. The checker converts one operand so both have a common type.

  • Implicit conversion (coercion) is inserted automatically by the compiler, usually widening (int → real) to avoid loss of information.
  • Explicit conversion (casting) is requested by the programmer, e.g. (int) x, and may narrow a type.

Example rule:

E -> E1 + E2  { if (E1.type==int && E2.type==real)
                  generate code: t = inttoreal E1; E.type = real }

Conclusion: Semantic analysis, driven by the type system and type expressions, ensures the program is meaningful, and type conversion reconciles mixed-type operations before code generation.

semantic-analysistype-checking
2long10 marks

Explain the different phases of a compiler with a suitable diagram, showing the output of each phase for the statement position = initial + rate * 60.

Phases of a Compiler

A compiler is divided into two parts — the analysis (front end) and the synthesis (back end) — comprising six main phases, supported by the symbol table and an error handler that interact with every phase.

source program
      |
[1] Lexical Analyzer (Scanner)
      |  tokens
[2] Syntax Analyzer (Parser)
      |  syntax tree
[3] Semantic Analyzer
      |  annotated tree
[4] Intermediate Code Generator
      |  three-address code
[5] Code Optimizer
      |  optimized IR
[6] Target Code Generator
      |
 target program

(Symbol Table and Error Handler sit beside the column and connect to all six phases.)

The six phases

  1. Lexical Analysis – groups the character stream into tokens and removes whitespace/comments.
  2. Syntax Analysis – builds a parse/syntax tree checking grammatical structure.
  3. Semantic Analysis – checks meaning (type checking, scope) and annotates the tree.
  4. Intermediate Code Generation – produces a machine-independent IR (e.g. three-address code).
  5. Code Optimization – improves the IR for speed/space.
  6. Code Generation – emits target machine/assembly code.

Output of each phase for position = initial + rate * 60

1. Lexical Analyzer — token stream (and symbol-table entries for identifiers):

id1 = id2 + id3 * 60

where id1=position, id2=initial, id3=rate, and 60 is a number constant.

2. Syntax Analyzer — syntax tree (* binds tighter than +):

        =
       / \
     id1   +
          / \
        id2   *
             / \
           id3   60

3. Semantic Analyzer — same tree, with type checking; since rate is real and 60 is int, an inttoreal coercion is inserted:

        =
       / \
     id1   +
          / \
        id2   *
             / \
           id3  inttoreal
                   |
                  60

4. Intermediate Code Generator — three-address code:

t1 = inttoreal(60)
t2 = id3 * t1
t3 = id2 + t2
id1 = t3

5. Code Optimizerinttoreal(60) is folded to the constant 60.0, and t3 is removed by copy propagation:

t1 = id3 * 60.0
id1 = id2 + t1

6. Code Generator — target (assembly) code (F = floating registers):

LDF  R2, id3
MULF R2, R2, #60.0
LDF  R1, id2
ADDF R1, R1, R2
STF  id1, R1
phasescompiler
3long10 marks

Construct the LR(1) parsing table for the grammar S -> CC, C -> cC | d and parse the input string cdd.

LR(1) / Canonical CLR Parsing Table for S → CC, C → cC | d

Augmented grammar (with numbered productions):

0. S' → S
1. S  → C C
2. C  → c C
3. C  → d

Canonical collection of LR(1) item sets

Using CLOSURE and GOTO (item form A → α·β , a):

  • I0: S'→·S, $ ; S→·CC, $ ; C→·cC, c/d ; C→·d, c/d
  • I1: GOTO(I0,S) = S'→S·, $
  • I2: GOTO(I0,C) = S→C·C, $ ; C→·cC, $ ; C→·d, $
  • I3: GOTO(I0,c) = C→c·C, c/d ; C→·cC, c/d ; C→·d, c/d
  • I4: GOTO(I0,d) = C→d·, c/d
  • I5: GOTO(I2,C) = S→CC·, $
  • I6: GOTO(I2,c) = C→c·C, $ ; C→·cC, $ ; C→·d, $
  • I7: GOTO(I2,d) = C→d·, $
  • I8: GOTO(I3,C) = C→cC·, c/d
  • I9: GOTO(I6,C) = C→cC·, $

GOTOs: I3 on c→I3, on d→I4; I6 on c→I6, on d→I7.

LR(1) parsing table

(sN = shift to N, rN = reduce by production N, acc = accept)

Statecd$SC
0s3s412
1acc
2s6s75
3s3s48
4r3r3
5r1
6s6s79
7r3
8r2r2
9r2

Parsing the input cdd (cdd$)

StackInputAction
0c d d $shift s3
0 c 3d d $shift s4
0 c 3 d 4d $reduce C→d (r3), GOTO(3,C)=8
0 c 3 C 8d $reduce C→cC (r2), GOTO(0,C)=2
0 C 2d $shift s7
0 C 2 d 7$reduce C→d (r3), GOTO(2,C)=5
0 C 2 C 5$reduce S→CC (r1), GOTO(0,S)=1
0 S 1$accept

The string cdd is accepted, with derivation S ⇒ CC ⇒ (cC)C ⇒ (cd)C ⇒ cd·d.

lr-parsingparsing
B

Section B: Short Answer Questions

Attempt any EIGHT questions.

9 questions·5 marks each
4short5 marks

What is three-address code? Write the three-address code for x = (a + b) * (c + d).

Three-Address Code

Three-address code (TAC) is a linearized intermediate representation in which each instruction has at most three addresses (operands) — typically two source operands and one result — and at most one operator on the right-hand side. The general form is x = y op z. Complex expressions are broken down using temporary variables (t1, t2, …), which makes the code easy to optimize and to translate to machine code. It can be implemented using quadruples, triples, or indirect triples.

TAC for x = (a + b) * (c + d)

t1 = a + b
t2 = c + d
t3 = t1 * t2
x  = t3
intermediate-code
5short5 marks

Explain peephole optimization with an example.

Peephole Optimization

Peephole optimization is a simple, machine-dependent local optimization technique applied to the target code (or intermediate code). The optimizer slides a small window — the peephole — over a short sequence of consecutive instructions and replaces it with a shorter or faster equivalent sequence. Several passes may be needed because one improvement can expose another.

Common transformations

  • Redundant load/store elimination – remove a load that immediately follows a store of the same location.
  • Constant folding – evaluate constant expressions at compile time.
  • Algebraic simplification – remove x = x + 0, x = x * 1.
  • Strength reduction – replace expensive operators with cheaper ones, e.g. x * 2 → x << 1.
  • Dead-code elimination – remove instructions whose results are never used.
  • Flow-of-control / jump optimization – eliminate jumps to jumps.

Example (redundant store/load)

MOV R0, a      MOV R0, a
MOV a, R0  =>  (second instruction removed)

Here the store MOV a, R0 followed by reload of a into R0 is redundant and can be deleted within the peephole.

code-optimization
6short5 marks

Differentiate between synthesized and inherited attributes.

Synthesized vs Inherited Attributes

In syntax-directed definitions, each grammar symbol has attributes whose values are computed by semantic rules.

AspectSynthesized AttributeInherited Attribute
DefinitionValue computed from the attributes of the node's children (and itself).Value computed from the attributes of the node's parent and/or siblings.
Direction of flowBottom-up in the parse tree.Top-down / sideways in the parse tree.
For production A → XYZAn attribute of A is computed from attributes of X, Y, Z.An attribute of X, Y, or Z is computed from attributes of A and the other symbols.
TerminalsTerminals can have synthesized attributes (from the lexer).Terminals do not have inherited attributes.
Grammar classCan be evaluated by an S-attributed definition (only synthesized).An L-attributed definition allows inherited attributes (with left-to-right restriction).
EvaluationSuits bottom-up (LR) parsing.Suits top-down (LL) parsing.
ExampleE → E1 + T { E.val = E1.val + T.val }D → T L { L.type = T.type } (type passed down to the list L).
syntax-directed-translation
7short5 marks

What is bootstrapping in compiler construction?

Bootstrapping

Bootstrapping is the technique of writing a compiler (or part of it) in the same source language that it is intended to compile, and then using a simpler/existing compiler to build it, eventually making the compiler compile itself.

Compilers are described by T-diagrams, characterized by three languages: the source language (S) it translates, the target language (T) it produces, and the implementation language (I) it is written in — written as S–I–T.

Idea

To build a compiler for language L that runs on machine M:

  1. Write a small/simple compiler for a subset of L in an existing language (or write it in L itself and compile it with another available compiler).
  2. Rewrite the full compiler in L.
  3. Use the first compiler to compile the second, producing a compiler for L written in L, running on M.

Significance

  • The language becomes self-hosting — it can compile itself.
  • It reduces dependence on other languages and tests the compiler on a large real program (itself).
  • Cross-compilation: a compiler running on machine A can be made to generate code for machine B, then ported.
bootstrapping
8short5 marks

Construct an NFA for the regular expression (a|b)*abb.

NFA for (a|b)*abb

Using Thompson's construction, we build an NFA with ε-transitions. Let states be numbered 0..10; state 0 is the start and state 10 is the (single) accepting state.

Transitions

The sub-expression (a|b)* is a loop over a choice between a and b, followed by the fixed suffix a b b.

0  --ε--> 1, 7              (enter the (a|b)* loop or skip it)
1  --ε--> 2, 4             (choose a or b)
2  --a--> 3
4  --b--> 5
3  --ε--> 6 ;  5 --ε--> 6
6  --ε--> 1, 7             (repeat the star, or exit)
7  --a--> 8                (suffix: a)
8  --b--> 9                (suffix: b)
9  --b--> 10               (suffix: b)  -- accepting

Description

  • States 1–6 form the Kleene-star block for (a|b): a choice (states 2/4) between an a-edge and a b-edge, wrapped in a loop that can be taken zero or more times via the ε-edges 0→7 and 6→7.
  • After leaving the star, the machine reads the literal sequence a (7→8), b (8→9), b (9→10).
  • Start state: 0. Accepting state: 10.

This NFA accepts exactly the strings over {a, b} that end in abb.

finite-automatalexical-analysis
9short5 marks

What is operator precedence parsing?

Operator Precedence Parsing

Operator precedence parsing is a bottom-up (shift-reduce) parsing technique that works on a restricted class of grammars called operator grammars — grammars having no ε-productions and no two adjacent non-terminals on the right-hand side of any production.

Key idea

Three precedence relations are defined between pairs of terminals to decide the handle:

  • a ⋖ b : a yields precedence to b.
  • a ⋗ b : a takes precedence over b.
  • a ≐ b : a and b have equal precedence.

These relations are stored in an operator-precedence table. During parsing, the relation between the terminal on top of the stack and the current input terminal decides whether to shift ( or ) or to reduce (); the handle is the substring bounded by ⋖ … ⋗.

Advantages and limitations

  • Advantages: simple, easy to construct by hand, fast.
  • Limitations: applies only to a small class of grammars, cannot handle unary minus easily, and is not suitable for full programming-language grammars (mainly used for expression grammars).
parsing
10short5 marks

Explain activation records and their role in run-time storage management.

Activation Records

An activation record (frame) is a contiguous block of memory on the run-time stack that holds all the information needed for a single execution (activation) of a procedure. Each procedure call pushes a new activation record; the record is popped when the call returns. This supports recursion, since each invocation gets its own copy of locals.

Typical fields (top to bottom)

+-----------------------------+
| Returned value              |
+-----------------------------+
| Actual parameters           |
+-----------------------------+
| Control link (dynamic link) | -> caller's activation record
+-----------------------------+
| Access link (static link)   | -> enclosing scope (for nested procs)
+-----------------------------+
| Saved machine status        | (return address, saved registers)
+-----------------------------+
| Local data (locals)         |
+-----------------------------+
| Temporaries                 |
+-----------------------------+

Role in run-time storage management

  • Manages memory in a stack (LIFO) discipline that mirrors the nesting of calls and returns.
  • Allocates space for parameters, local variables, temporaries, return value and return address on each call and reclaims it automatically on return.
  • The control link restores the caller on return; the access link gives non-local variable access for nested procedures.
  • Variables are addressed as fixed offsets from a frame pointer, making access fast and recursion correct.
runtime-environment
11short5 marks

What is a basic block and a flow graph?

Basic Block and Flow Graph

Basic Block

A basic block is a maximal sequence of consecutive intermediate-code (three-address) statements in which:

  • control enters only at the first statement (no jump into the middle), and
  • control leaves only at the last statement (no halt or branch except possibly at the end).

Thus the statements in a basic block always execute as a unit, in order. They are identified by finding leaders: the first statement, any target of a jump, and any statement immediately following a jump.

Flow Graph

A flow graph is a directed graph in which:

  • each node is a basic block, and
  • there is a directed edge from block B1 to block B2 if control can pass from the end of B1 to the start of B2 (either B2 immediately follows B1, or B1 ends with a jump to B2's leader).

It has a designated entry (initial) block. The flow graph captures the control-flow structure of the program and is the basis for code optimization (loop detection, data-flow analysis, etc.).

Example

(1) i = 1            -> B1: {1}
(2) t1 = i * 4       -> B2 leader (2)
(3) ...
(4) if i <= 10 goto (2)

Block B2 has an edge back to itself (a loop) and the program forms a flow graph of these blocks.

code-optimization
12short5 marks

Differentiate between top-down and bottom-up parsing.

Top-Down vs Bottom-Up Parsing

AspectTop-Down ParsingBottom-Up Parsing
Construction of treeBuilds the parse tree from the root (start symbol) down to the leaves.Builds the parse tree from the leaves (input) up to the root.
Derivation tracedLeftmost derivation.Reverse of a rightmost derivation.
Basic operationPredict & expand non-terminals (apply productions).Shift–reduce: shift tokens, reduce handles to non-terminals.
Typical parsersRecursive-descent, LL(1) predictive parsing.LR(0), SLR(1), LALR(1), CLR(1), operator-precedence.
Grammar restrictionsCannot handle left recursion; needs left factoring; less powerful.Handles left recursion; accepts a larger class of grammars; more powerful.
BacktrackingNaive top-down may need backtracking; LL(1) avoids it with lookahead.No backtracking; uses a parsing table and stack.
Ease vs powerSimpler, often hand-written.More complex (table-driven), usually tool-generated (yacc/bison).
parsing

Frequently asked questions

Where can I find the BSc CSIT (TU) Compiler Design and Construction (BSc CSIT, CSC365) question paper 2078?
The full BSc CSIT (TU) Compiler Design and Construction (BSc CSIT, CSC365) 2078 (regular) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.
Does the Compiler Design and Construction (BSc CSIT, CSC365) 2078 paper come with solutions?
Yes. Every question on this Compiler Design and Construction (BSc CSIT, CSC365) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.
How many marks is the BSc CSIT (TU) Compiler Design and Construction (BSc CSIT, CSC365) 2078 paper?
The BSc CSIT (TU) Compiler Design and Construction (BSc CSIT, CSC365) 2078 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.
Is practising this Compiler Design and Construction (BSc CSIT, CSC365) past paper free?
Yes — reading and attempting this Compiler Design and Construction (BSc CSIT, CSC365) past paper on Kekkei is completely free.