BSc CSIT (TU) Science Compiler Design and Construction (BSc CSIT, CSC365) Question Paper 2078 Nepal
This is the official BSc CSIT (TU) (Science stream) Compiler Design and Construction (BSc CSIT, CSC365) question paper for 2078, as set in the regular annual examination. It carries 60 full marks and a time allowance of 180 minutes, across 12 questions. On Kekkei you can attempt this Compiler Design and Construction (BSc CSIT, CSC365) past paper online with a timer, get instant AI feedback and step-by-step solutions, and track the topics where you lose marks — completely free. Whether you are revising for your BSc CSIT (TU) Compiler Design and Construction (BSc CSIT, CSC365) exam or solving previous years' question papers, this 2078 paper is a great way to practise under real exam conditions.
Section A: Long Answer Questions
Attempt any TWO questions.
Explain semantic analysis and type checking. Discuss type expressions, type systems and type conversion.
Semantic Analysis and Type Checking
Semantic analysis is the phase of a compiler that checks the source program for semantic consistency with the language definition, using the syntax tree and symbol table produced by the parser. It enforces rules that cannot be captured by context-free grammar, e.g. every variable is declared before use, operands of an operator have compatible types, function calls have the correct number and type of arguments, and break appears only inside a loop.
Type checking is the central activity of semantic analysis. It verifies that each operation in the program receives operands of permissible types, reporting a type error otherwise. It works by computing a type for every expression and comparing it against the type expected by its context.
Type Expressions
A type expression denotes the type of a language construct. It is either:
- A basic type such as
int,real,bool,char, plus special typesvoid(no value) andtype_error(used during error recovery). - A type name (a name standing for a type expression).
- A type formed by applying a type constructor to type expressions:
- array:
array(I, T)— an array of elements of typeTwith index setI. - product:
T1 × T2— used for argument lists and tuples. - record:
record((name1 × T1) × (name2 × T2) × …). - pointer:
pointer(T)— a pointer to an object of typeT. - function:
D → R— a function mapping a domain typeDto a range typeR.
- array:
Example: int f(char, char) has type expression char × char → int, and int a[10] has type array(0..9, int).
Type Systems
A type system is a collection of rules for assigning type expressions to the parts of a program. It is implemented by the type checker, often expressed as syntax-directed definitions.
- A language is statically typed if type checking is done at compile time (C, Java).
- It is dynamically typed if checks are done at run time (Python, Lisp).
- A type system is sound if it never lets a type error escape to run time; a sound static type checker eliminates the need for run-time checks.
Example type-checking rule for an addition expression:
E -> E1 + E2 { E.type = if (E1.type == int && E2.type == int) int
else type_error }
Type Conversion
Type conversion (coercion) occurs when an operator is applied to operands of different types, e.g. x + i where x is real and i is int. The checker converts one operand so both have a common type.
- Implicit conversion (coercion) is inserted automatically by the compiler, usually widening (
int → real) to avoid loss of information. - Explicit conversion (casting) is requested by the programmer, e.g.
(int) x, and may narrow a type.
Example rule:
E -> E1 + E2 { if (E1.type==int && E2.type==real)
generate code: t = inttoreal E1; E.type = real }
Conclusion: Semantic analysis, driven by the type system and type expressions, ensures the program is meaningful, and type conversion reconciles mixed-type operations before code generation.
Explain the different phases of a compiler with a suitable diagram, showing the output of each phase for the statement position = initial + rate * 60.
Phases of a Compiler
A compiler is divided into two parts — the analysis (front end) and the synthesis (back end) — comprising six main phases, supported by the symbol table and an error handler that interact with every phase.
source program
|
[1] Lexical Analyzer (Scanner)
| tokens
[2] Syntax Analyzer (Parser)
| syntax tree
[3] Semantic Analyzer
| annotated tree
[4] Intermediate Code Generator
| three-address code
[5] Code Optimizer
| optimized IR
[6] Target Code Generator
|
target program
(Symbol Table and Error Handler sit beside the column and connect to all six phases.)
The six phases
- Lexical Analysis – groups the character stream into tokens and removes whitespace/comments.
- Syntax Analysis – builds a parse/syntax tree checking grammatical structure.
- Semantic Analysis – checks meaning (type checking, scope) and annotates the tree.
- Intermediate Code Generation – produces a machine-independent IR (e.g. three-address code).
- Code Optimization – improves the IR for speed/space.
- Code Generation – emits target machine/assembly code.
Output of each phase for position = initial + rate * 60
1. Lexical Analyzer — token stream (and symbol-table entries for identifiers):
id1 = id2 + id3 * 60
where id1=position, id2=initial, id3=rate, and 60 is a number constant.
2. Syntax Analyzer — syntax tree (* binds tighter than +):
=
/ \
id1 +
/ \
id2 *
/ \
id3 60
3. Semantic Analyzer — same tree, with type checking; since rate is real and 60 is int, an inttoreal coercion is inserted:
=
/ \
id1 +
/ \
id2 *
/ \
id3 inttoreal
|
60
4. Intermediate Code Generator — three-address code:
t1 = inttoreal(60)
t2 = id3 * t1
t3 = id2 + t2
id1 = t3
5. Code Optimizer — inttoreal(60) is folded to the constant 60.0, and t3 is removed by copy propagation:
t1 = id3 * 60.0
id1 = id2 + t1
6. Code Generator — target (assembly) code (F = floating registers):
LDF R2, id3
MULF R2, R2, #60.0
LDF R1, id2
ADDF R1, R1, R2
STF id1, R1
Construct the LR(1) parsing table for the grammar S -> CC, C -> cC | d and parse the input string cdd.
LR(1) / Canonical CLR Parsing Table for S → CC, C → cC | d
Augmented grammar (with numbered productions):
0. S' → S
1. S → C C
2. C → c C
3. C → d
Canonical collection of LR(1) item sets
Using CLOSURE and GOTO (item form A → α·β , a):
- I0:
S'→·S, $;S→·CC, $;C→·cC, c/d;C→·d, c/d - I1: GOTO(I0,S) =
S'→S·, $ - I2: GOTO(I0,C) =
S→C·C, $;C→·cC, $;C→·d, $ - I3: GOTO(I0,c) =
C→c·C, c/d;C→·cC, c/d;C→·d, c/d - I4: GOTO(I0,d) =
C→d·, c/d - I5: GOTO(I2,C) =
S→CC·, $ - I6: GOTO(I2,c) =
C→c·C, $;C→·cC, $;C→·d, $ - I7: GOTO(I2,d) =
C→d·, $ - I8: GOTO(I3,C) =
C→cC·, c/d - I9: GOTO(I6,C) =
C→cC·, $
GOTOs: I3 on c→I3, on d→I4; I6 on c→I6, on d→I7.
LR(1) parsing table
(sN = shift to N, rN = reduce by production N, acc = accept)
| State | c | d | $ | S | C |
|---|---|---|---|---|---|
| 0 | s3 | s4 | 1 | 2 | |
| 1 | acc | ||||
| 2 | s6 | s7 | 5 | ||
| 3 | s3 | s4 | 8 | ||
| 4 | r3 | r3 | |||
| 5 | r1 | ||||
| 6 | s6 | s7 | 9 | ||
| 7 | r3 | ||||
| 8 | r2 | r2 | |||
| 9 | r2 |
Parsing the input cdd (cdd$)
| Stack | Input | Action |
|---|---|---|
| 0 | c d d $ | shift s3 |
| 0 c 3 | d d $ | shift s4 |
| 0 c 3 d 4 | d $ | reduce C→d (r3), GOTO(3,C)=8 |
| 0 c 3 C 8 | d $ | reduce C→cC (r2), GOTO(0,C)=2 |
| 0 C 2 | d $ | shift s7 |
| 0 C 2 d 7 | $ | reduce C→d (r3), GOTO(2,C)=5 |
| 0 C 2 C 5 | $ | reduce S→CC (r1), GOTO(0,S)=1 |
| 0 S 1 | $ | accept |
The string cdd is accepted, with derivation S ⇒ CC ⇒ (cC)C ⇒ (cd)C ⇒ cd·d.
Section B: Short Answer Questions
Attempt any EIGHT questions.
What is three-address code? Write the three-address code for x = (a + b) * (c + d).
Three-Address Code
Three-address code (TAC) is a linearized intermediate representation in which each instruction has at most three addresses (operands) — typically two source operands and one result — and at most one operator on the right-hand side. The general form is x = y op z. Complex expressions are broken down using temporary variables (t1, t2, …), which makes the code easy to optimize and to translate to machine code. It can be implemented using quadruples, triples, or indirect triples.
TAC for x = (a + b) * (c + d)
t1 = a + b
t2 = c + d
t3 = t1 * t2
x = t3
Explain peephole optimization with an example.
Peephole Optimization
Peephole optimization is a simple, machine-dependent local optimization technique applied to the target code (or intermediate code). The optimizer slides a small window — the peephole — over a short sequence of consecutive instructions and replaces it with a shorter or faster equivalent sequence. Several passes may be needed because one improvement can expose another.
Common transformations
- Redundant load/store elimination – remove a load that immediately follows a store of the same location.
- Constant folding – evaluate constant expressions at compile time.
- Algebraic simplification – remove
x = x + 0,x = x * 1. - Strength reduction – replace expensive operators with cheaper ones, e.g.
x * 2 → x << 1. - Dead-code elimination – remove instructions whose results are never used.
- Flow-of-control / jump optimization – eliminate jumps to jumps.
Example (redundant store/load)
MOV R0, a MOV R0, a
MOV a, R0 => (second instruction removed)
Here the store MOV a, R0 followed by reload of a into R0 is redundant and can be deleted within the peephole.
Differentiate between synthesized and inherited attributes.
Synthesized vs Inherited Attributes
In syntax-directed definitions, each grammar symbol has attributes whose values are computed by semantic rules.
| Aspect | Synthesized Attribute | Inherited Attribute |
|---|---|---|
| Definition | Value computed from the attributes of the node's children (and itself). | Value computed from the attributes of the node's parent and/or siblings. |
| Direction of flow | Bottom-up in the parse tree. | Top-down / sideways in the parse tree. |
For production A → XYZ | An attribute of A is computed from attributes of X, Y, Z. | An attribute of X, Y, or Z is computed from attributes of A and the other symbols. |
| Terminals | Terminals can have synthesized attributes (from the lexer). | Terminals do not have inherited attributes. |
| Grammar class | Can be evaluated by an S-attributed definition (only synthesized). | An L-attributed definition allows inherited attributes (with left-to-right restriction). |
| Evaluation | Suits bottom-up (LR) parsing. | Suits top-down (LL) parsing. |
| Example | E → E1 + T { E.val = E1.val + T.val } | D → T L { L.type = T.type } (type passed down to the list L). |
What is bootstrapping in compiler construction?
Bootstrapping
Bootstrapping is the technique of writing a compiler (or part of it) in the same source language that it is intended to compile, and then using a simpler/existing compiler to build it, eventually making the compiler compile itself.
Compilers are described by T-diagrams, characterized by three languages: the source language (S) it translates, the target language (T) it produces, and the implementation language (I) it is written in — written as S–I–T.
Idea
To build a compiler for language L that runs on machine M:
- Write a small/simple compiler for a subset of
Lin an existing language (or write it inLitself and compile it with another available compiler). - Rewrite the full compiler in
L. - Use the first compiler to compile the second, producing a compiler for
Lwritten inL, running onM.
Significance
- The language becomes self-hosting — it can compile itself.
- It reduces dependence on other languages and tests the compiler on a large real program (itself).
- Cross-compilation: a compiler running on machine A can be made to generate code for machine B, then ported.
Construct an NFA for the regular expression (a|b)*abb.
NFA for (a|b)*abb
Using Thompson's construction, we build an NFA with ε-transitions. Let states be numbered 0..10; state 0 is the start and state 10 is the (single) accepting state.
Transitions
The sub-expression (a|b)* is a loop over a choice between a and b, followed by the fixed suffix a b b.
0 --ε--> 1, 7 (enter the (a|b)* loop or skip it)
1 --ε--> 2, 4 (choose a or b)
2 --a--> 3
4 --b--> 5
3 --ε--> 6 ; 5 --ε--> 6
6 --ε--> 1, 7 (repeat the star, or exit)
7 --a--> 8 (suffix: a)
8 --b--> 9 (suffix: b)
9 --b--> 10 (suffix: b) -- accepting
Description
- States
1–6form the Kleene-star block for(a|b): a choice (states 2/4) between ana-edge and ab-edge, wrapped in a loop that can be taken zero or more times via the ε-edges0→7and6→7. - After leaving the star, the machine reads the literal sequence
a(7→8),b(8→9),b(9→10). - Start state: 0. Accepting state: 10.
This NFA accepts exactly the strings over {a, b} that end in abb.
What is operator precedence parsing?
Operator Precedence Parsing
Operator precedence parsing is a bottom-up (shift-reduce) parsing technique that works on a restricted class of grammars called operator grammars — grammars having no ε-productions and no two adjacent non-terminals on the right-hand side of any production.
Key idea
Three precedence relations are defined between pairs of terminals to decide the handle:
a ⋖ b:ayields precedence tob.a ⋗ b:atakes precedence overb.a ≐ b:aandbhave equal precedence.
These relations are stored in an operator-precedence table. During parsing, the relation between the terminal on top of the stack and the current input terminal decides whether to shift (⋖ or ≐) or to reduce (⋗); the handle is the substring bounded by ⋖ … ⋗.
Advantages and limitations
- Advantages: simple, easy to construct by hand, fast.
- Limitations: applies only to a small class of grammars, cannot handle unary minus easily, and is not suitable for full programming-language grammars (mainly used for expression grammars).
Explain activation records and their role in run-time storage management.
Activation Records
An activation record (frame) is a contiguous block of memory on the run-time stack that holds all the information needed for a single execution (activation) of a procedure. Each procedure call pushes a new activation record; the record is popped when the call returns. This supports recursion, since each invocation gets its own copy of locals.
Typical fields (top to bottom)
+-----------------------------+
| Returned value |
+-----------------------------+
| Actual parameters |
+-----------------------------+
| Control link (dynamic link) | -> caller's activation record
+-----------------------------+
| Access link (static link) | -> enclosing scope (for nested procs)
+-----------------------------+
| Saved machine status | (return address, saved registers)
+-----------------------------+
| Local data (locals) |
+-----------------------------+
| Temporaries |
+-----------------------------+
Role in run-time storage management
- Manages memory in a stack (LIFO) discipline that mirrors the nesting of calls and returns.
- Allocates space for parameters, local variables, temporaries, return value and return address on each call and reclaims it automatically on return.
- The control link restores the caller on return; the access link gives non-local variable access for nested procedures.
- Variables are addressed as fixed offsets from a frame pointer, making access fast and recursion correct.
What is a basic block and a flow graph?
Basic Block and Flow Graph
Basic Block
A basic block is a maximal sequence of consecutive intermediate-code (three-address) statements in which:
- control enters only at the first statement (no jump into the middle), and
- control leaves only at the last statement (no halt or branch except possibly at the end).
Thus the statements in a basic block always execute as a unit, in order. They are identified by finding leaders: the first statement, any target of a jump, and any statement immediately following a jump.
Flow Graph
A flow graph is a directed graph in which:
- each node is a basic block, and
- there is a directed edge from block
B1to blockB2if control can pass from the end ofB1to the start ofB2(eitherB2immediately followsB1, orB1ends with a jump toB2's leader).
It has a designated entry (initial) block. The flow graph captures the control-flow structure of the program and is the basis for code optimization (loop detection, data-flow analysis, etc.).
Example
(1) i = 1 -> B1: {1}
(2) t1 = i * 4 -> B2 leader (2)
(3) ...
(4) if i <= 10 goto (2)
Block B2 has an edge back to itself (a loop) and the program forms a flow graph of these blocks.
Differentiate between top-down and bottom-up parsing.
Top-Down vs Bottom-Up Parsing
| Aspect | Top-Down Parsing | Bottom-Up Parsing |
|---|---|---|
| Construction of tree | Builds the parse tree from the root (start symbol) down to the leaves. | Builds the parse tree from the leaves (input) up to the root. |
| Derivation traced | Leftmost derivation. | Reverse of a rightmost derivation. |
| Basic operation | Predict & expand non-terminals (apply productions). | Shift–reduce: shift tokens, reduce handles to non-terminals. |
| Typical parsers | Recursive-descent, LL(1) predictive parsing. | LR(0), SLR(1), LALR(1), CLR(1), operator-precedence. |
| Grammar restrictions | Cannot handle left recursion; needs left factoring; less powerful. | Handles left recursion; accepts a larger class of grammars; more powerful. |
| Backtracking | Naive top-down may need backtracking; LL(1) avoids it with lookahead. | No backtracking; uses a parsing table and stack. |
| Ease vs power | Simpler, often hand-written. | More complex (table-driven), usually tool-generated (yacc/bison). |
Frequently asked questions
- Where can I find the BSc CSIT (TU) Compiler Design and Construction (BSc CSIT, CSC365) question paper 2078?
- The full BSc CSIT (TU) Compiler Design and Construction (BSc CSIT, CSC365) 2078 (regular) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.
- Does the Compiler Design and Construction (BSc CSIT, CSC365) 2078 paper come with solutions?
- Yes. Every question on this Compiler Design and Construction (BSc CSIT, CSC365) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.
- How many marks is the BSc CSIT (TU) Compiler Design and Construction (BSc CSIT, CSC365) 2078 paper?
- The BSc CSIT (TU) Compiler Design and Construction (BSc CSIT, CSC365) 2078 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.
- Is practising this Compiler Design and Construction (BSc CSIT, CSC365) past paper free?
- Yes — reading and attempting this Compiler Design and Construction (BSc CSIT, CSC365) past paper on Kekkei is completely free.