BSc CSIT (TU) Science Compiler Design and Construction (BSc CSIT, CSC365) Question Paper 2075 Nepal
This is the official BSc CSIT (TU) (Science stream) Compiler Design and Construction (BSc CSIT, CSC365) question paper for 2075, as set in the regular annual examination. It carries 60 full marks and a time allowance of 180 minutes, across 12 questions. On Kekkei you can attempt this Compiler Design and Construction (BSc CSIT, CSC365) past paper online with a timer, get instant AI feedback and step-by-step solutions, and track the topics where you lose marks — completely free. Whether you are revising for your BSc CSIT (TU) Compiler Design and Construction (BSc CSIT, CSC365) exam or solving previous years' question papers, this 2075 paper is a great way to practise under real exam conditions.
Section A: Long Answer Questions
Attempt any TWO questions.
Explain the general structure of an LR parser. Construct the LR(0) item sets for the grammar S -> AA, A -> aA | b.
General Structure of an LR Parser
An LR parser is a bottom-up, shift-reduce parser that scans input Left-to-right and produces a Rightmost derivation in reverse. It consists of:
- Input buffer — holds the remaining input string followed by the end marker
$. - Stack — stores a sequence of states (and grammar symbols). The top state summarises all the information needed to make parsing decisions.
- Parsing program (driver) — a fixed routine, the same for every LR grammar.
- Parsing table — divided into two parts:
- ACTION[state, terminal] →
shift,reduce,accept, orerror. - GOTO[state, nonterminal] → next state after a reduction.
- ACTION[state, terminal] →
Working: Using the state on top of the stack and the current input symbol, the driver consults ACTION. On shift it pushes the symbol/state; on reduce A → β it pops 2·|β| entries, then pushes GOTO[top, A]; on accept parsing succeeds.
LR(0) Item Sets for the Grammar
Augmented grammar:
Using CLOSURE and GOTO:
- I0 = CLOSURE({S'→·S}):
S'→·S,S→·AA,A→·aA,A→·b - I1 = GOTO(I0,S):
S'→S· - I2 = GOTO(I0,A):
S→A·A,A→·aA,A→·b - I3 = GOTO(I0,a):
A→a·A,A→·aA,A→·b - I4 = GOTO(I0,b):
A→b· - I5 = GOTO(I2,A):
S→AA· - I6 = GOTO(I3,A):
A→aA· - GOTO(I2,a)=I3, GOTO(I2,b)=I4, GOTO(I3,a)=I3, GOTO(I3,b)=I4
There are 7 LR(0) states (I0–I6). The DFA of items recognises viable prefixes; states I1, I4, I5, I6 contain reducing/accepting items.
What is syntax-directed translation? Explain syntax-directed definitions with an example of a translation scheme.
Syntax-Directed Translation (SDT)
Syntax-directed translation is a method of translating a source program in which semantic rules (actions) are attached to the productions of a context-free grammar. As the parser recognises grammar productions, the associated semantic rules are evaluated to compute attribute values, build the syntax tree, generate intermediate code, etc. Translation is thus driven by the syntax of the language.
Syntax-Directed Definitions (SDD)
An SDD is a CFG together with attributes associated with grammar symbols and semantic rules associated with productions. Attributes are of two kinds:
- Synthesized attribute — computed from the attributes of the children (bottom-up).
- Inherited attribute — computed from the parent and/or sibling attributes (top-down).
An SDD specifies what to compute without specifying when/order; the dependency graph determines evaluation order.
Translation Scheme (Example)
A translation scheme is an SDD in which semantic actions are enclosed in { } and embedded at specific positions within the right side of productions, indicating the order of evaluation.
Example — evaluating an arithmetic expression (synthesized attribute val):
L → E { print(E.val) }
E → E1 + T { E.val = E1.val + T.val }
E → T { E.val = T.val }
T → T1 * F { T.val = T1.val * F.val }
T → F { T.val = F.val }
F → ( E ) { F.val = E.val }
F → digit { F.val = digit.lexval }
For input 3 + 4 * 5, the actions compute T.val = 20, then E.val = 3 + 20 = 23, and L prints 23. The actions fire as each production is reduced, demonstrating bottom-up syntax-directed translation.
Discuss the specification and recognition of tokens. Explain the role of finite automata and input buffering in lexical analysis.
Specification of Tokens
Tokens are specified using regular expressions (and regular definitions). A token is a category (e.g. id, num, relop), a pattern is the rule describing its form, and a lexeme is the actual string matched. Example regular definitions:
digit → [0-9]
letter → [A-Za-z]
id → letter (letter | digit)*
num → digit+ ( . digit+ )? ( E [+-]? digit+ )?
Recognition of Tokens
Regular expressions are converted to a transition diagram / finite automaton that the scanner simulates to recognise lexemes. On reaching an accepting state the longest matching lexeme is reported as a token; ties are broken by the longest-match and rule-priority (first listed) conventions.
Role of Finite Automata
The construction pipeline is:
- An NFA allows ε-moves and is easy to build from an RE but is non-deterministic.
- A DFA has exactly one move per (state, input) and is what the lexer actually executes — fast,
O(n)in input length.
Tools like LEX/Flex automatically build a DFA from the supplied regular expressions.
Role of Input Buffering
Reading the source one character at a time from disk is slow, and a lexer often needs look-ahead (e.g. to tell = from ==). To make this efficient, the lexer uses a two-buffer scheme with two halves of size N. Two pointers are maintained:
- lexemeBegin — start of the current lexeme.
- forward — scans ahead to find the end of the lexeme.
When forward crosses the end of one half, the other half is reloaded. The sentinel (eof) technique places an eof marker at the end of each half so that a single test detects both end-of-buffer and end-of-input, reducing comparisons per character.
Section B: Short Answer Questions
Attempt any EIGHT questions.
What is three-address code? Write the three-address code for x = (a + b) * (c + d).
Three-address code (TAC) is an intermediate representation in which each instruction has at most three addresses (operands) — typically of the form x = y op z, using at most one operator on the right-hand side. Complex expressions are broken into a sequence of such instructions using temporary variables, which makes it easy to rearrange, optimise, and generate target code.
Three-address code for x = (a + b) * (c + d):
t1 = a + b
t2 = c + d
t3 = t1 * t2
x = t3
Here t1, t2, t3 are compiler-generated temporaries.
Explain peephole optimization with an example.
Peephole Optimization
Peephole optimization is a simple, local code-improvement technique that examines a short sliding window (the peephole) of a few consecutive target/intermediate instructions and replaces them with a shorter or faster equivalent sequence. It is applied repeatedly until no further improvement is possible.
Common transformations:
- Redundant load/store elimination — remove
LD R, aimmediately afterST a, R. - Constant folding — replace
x = 2 * 3withx = 6. - Algebraic simplification — remove
x = x + 0orx = x * 1. - Strength reduction — replace
x = y * 2withx = y + yor a shift. - Dead-code / unreachable-code elimination.
- Flow-of-control optimization — remove jumps to jumps.
Example (redundant store/load):
MOV R0, a MOV R0, a
MOV a, R0 → ... (second MOV removed)
The instruction MOV a, R0 is unnecessary because a already holds the value just loaded into R0, so it is deleted, producing faster code.
Differentiate between synthesized and inherited attributes.
Synthesized vs Inherited Attributes
| Aspect | Synthesized Attribute | Inherited Attribute |
|---|---|---|
| Definition | Value computed from attributes of the children of the node | Value computed from attributes of the parent and/or siblings |
| Direction of flow | Bottom-up (leaves → root) | Top-down / sideways (root, siblings → node) |
| Production form | For A → X Y Z, A.s depends on X,Y,Z | For A → X Y Z, X.i (or Y.i, Z.i) depends on A and other RHS symbols |
| Parsing fit | Naturally evaluated during bottom-up (LR) parsing | Naturally suited to top-down parsing / passing context down |
| Grammar class | An SDD using only synthesized attributes is S-attributed | A grammar with restricted inherited attributes is L-attributed |
| Example | E → E1 + T { E.val = E1.val + T.val } | D → T L { L.type = T.type } (type passed down to the list) |
In short, synthesized attributes carry information up the parse tree, whereas inherited attributes carry context down (or across) the tree.
What is bootstrapping in compiler construction?
Bootstrapping
Bootstrapping is the technique of writing a compiler (or part of it) for a language in the same language it is meant to compile, and then using a smaller existing compiler to build a full self-hosting compiler in stages.
The idea is described with T-diagrams showing source language S, target/implementation language, and the implementing language. A typical bootstrapping process:
- Write a small/simple compiler for a subset of language L in an existing language (e.g. assembly or C). Call it .
- Rewrite the compiler for L in L itself, and compile this source using to obtain a working compiler .
- Use to compile its own source again, optionally adding optimisations, producing an improved, fully self-hosting compiler .
Advantages: the compiler becomes self-hosting, it serves as a large test program for the language, and improvements to the compiler automatically benefit later versions.
Example: an early C compiler was first written in assembly, then rewritten in C and compiled by the earlier version.
Construct an NFA for the regular expression (a|b)*abb.
NFA for (a|b)*abb
Using Thompson's construction, an NFA (with ε-moves) for the regular expression (a|b)*abb accepts any string of a's and b's that ends in the suffix abb.
A compact equivalent NFA (after collapsing the (a|b)* loop into a single self-looping start state) has states 0–3:
| State | on a | on b |
|---|---|---|
| →0 (start) | {0, 1} | {0} |
| 1 | ∅ | {2} |
| 2 | ∅ | {3} |
| *3 (accept) | ∅ | ∅ |
Description of the transition diagram:
- State 0 has a self-loop on both
aandb(this realises(a|b)*). - On
a, state 0 can also move to state 1 (begin matchingabb). - State 1 --
b--> State 2, State 2 --b--> State 3. - State 3 is the single accepting state.
The nondeterminism at state 0 (staying in the loop vs. starting abb) is exactly what an NFA permits. For example, the string aabb is accepted: 0 --a--> 0 --a--> 1 --b--> 2 --b--> 3.
What is operator precedence parsing?
Operator Precedence Parsing
Operator-precedence parsing is a bottom-up (shift-reduce) parsing technique applicable to operator grammars — grammars in which no production has two adjacent nonterminals on the right side and no ε-production. It is mainly used for parsing arithmetic and boolean expressions.
The parser defines three precedence relations between pairs of terminals:
- :
ayields precedence tob(b has higher precedence / starts a handle) - :
aandbhave equal precedence (same handle) - :
atakes precedence overb(end of a handle — reduce)
These relations are stored in an operator-precedence table. During parsing, the relation between the top-of-stack terminal and the current input terminal decides whether to shift ( or ) or reduce () the handle delimited by ⋖ … ⋗.
Advantages: simple and easy to construct by hand; Disadvantages: works only for a small class of grammars, cannot handle the unary minus / token ambiguity well, and may accept some erroneous inputs.
Explain activation records and their role in run-time storage management.
Activation Records and Run-Time Storage Management
An activation record (or stack frame) is a contiguous block of storage that holds all the information needed for a single execution (activation) of a procedure. Each time a procedure is called, a new activation record is pushed onto the run-time control stack, and it is popped when the procedure returns. This naturally supports nested and recursive calls.
Typical fields of an activation record (from top to bottom):
- Return value — space for the value returned to the caller.
- Actual parameters — arguments supplied by the caller.
- Control link (dynamic link) — pointer to the caller's activation record.
- Access link (static link) — pointer used to reach non-local data in enclosing scopes.
- Saved machine status — return address and saved registers.
- Local data — local variables of the procedure.
- Temporaries — values generated during expression evaluation.
Role in storage management: activation records implement stack-based allocation. The stack pointer / frame pointer is adjusted on call and return, giving automatic, efficient allocation and deallocation of locals, parameters, and bookkeeping data, and enabling recursion since each call gets its own private frame.
What is a basic block and a flow graph?
Basic Block and Flow Graph
Basic block: A basic block is a maximal sequence of consecutive intermediate-code (three-address) instructions such that:
- Control enters only at the first instruction (the leader), and
- Control leaves only at the last instruction,
with no jumps into or out of the middle. Thus the whole block executes as a straight-line unit.
Leaders are identified as: (1) the first instruction; (2) any target of a (conditional/unconditional) jump; (3) any instruction immediately following a jump. A basic block runs from a leader up to (but not including) the next leader.
Flow graph: A flow graph (control-flow graph) is a directed graph whose nodes are basic blocks and whose edges represent the flow of control. There is an edge from block to block if control can pass from the last statement of to the first statement of (i.e. is the jump target, or immediately follows and does not end in an unconditional jump). An entry and sometimes an exit node are added.
Flow graphs are the basis for control-flow and data-flow analysis used in code optimization (e.g. loop detection, dead-code elimination).
Differentiate between top-down and bottom-up parsing.
Top-Down vs Bottom-Up Parsing
| Aspect | Top-Down Parsing | Bottom-Up Parsing |
|---|---|---|
| Direction | Builds the parse tree from the root (start symbol) down to the leaves | Builds the parse tree from the leaves (input) up to the root |
| Derivation | Produces a leftmost derivation | Produces a rightmost derivation in reverse |
| Basic operation | Expand a nonterminal by predicting/matching productions | Shift symbols and reduce handles |
| Also called | Predictive / recursive-descent parsing | Shift-reduce parsing |
| Grammar restrictions | Cannot handle left recursion; needs left-factoring | Handles left recursion; more powerful |
| Examples | Recursive-descent, LL(1) | LR(0), SLR, LALR, CLR, operator-precedence |
| Power | Accepts a smaller class of grammars | Accepts a larger class of grammars |
In short, top-down parsing starts from the start symbol and tries to derive the input (predictive), whereas bottom-up parsing starts from the input tokens and reduces them back to the start symbol (shift-reduce).
Frequently asked questions
- Where can I find the BSc CSIT (TU) Compiler Design and Construction (BSc CSIT, CSC365) question paper 2075?
- The full BSc CSIT (TU) Compiler Design and Construction (BSc CSIT, CSC365) 2075 (regular) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.
- Does the Compiler Design and Construction (BSc CSIT, CSC365) 2075 paper come with solutions?
- Yes. Every question on this Compiler Design and Construction (BSc CSIT, CSC365) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.
- How many marks is the BSc CSIT (TU) Compiler Design and Construction (BSc CSIT, CSC365) 2075 paper?
- The BSc CSIT (TU) Compiler Design and Construction (BSc CSIT, CSC365) 2075 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.
- Is practising this Compiler Design and Construction (BSc CSIT, CSC365) past paper free?
- Yes — reading and attempting this Compiler Design and Construction (BSc CSIT, CSC365) past paper on Kekkei is completely free.