BSc CSIT (TU) Science Compiler Design and Construction (BSc CSIT, CSC365) Question Paper 2075 Nepal

Q: Where can I find the BSc CSIT (TU) Compiler Design and Construction (BSc CSIT, CSC365) question paper 2075?

The full BSc CSIT (TU) Compiler Design and Construction (BSc CSIT, CSC365) 2075 (Regular (annual)) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.

Q: Does the Compiler Design and Construction (BSc CSIT, CSC365) 2075 paper come with solutions?

Yes. Every question on this Compiler Design and Construction (BSc CSIT, CSC365) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.

Q: How many marks is the BSc CSIT (TU) Compiler Design and Construction (BSc CSIT, CSC365) 2075 paper?

The BSc CSIT (TU) Compiler Design and Construction (BSc CSIT, CSC365) 2075 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.

Q: Is practising this Compiler Design and Construction (BSc CSIT, CSC365) past paper free?

Yes — reading and attempting this Compiler Design and Construction (BSc CSIT, CSC365) past paper on Kekkei is completely free.

Question

1Long answer10 marks

Explain the general structure of an LR parser. Construct the LR(0) item sets for the grammar S -> AA, A -> aA | b.

lr-parsing

Answer 1

General Structure of an LR Parser

An LR parser is a bottom-up, shift-reduce parser that scans input Left-to-right and produces a Rightmost derivation in reverse. It consists of:

Input buffer — holds the remaining input string followed by the end marker $.
Stack — stores a sequence of states (and grammar symbols). The top state summarises all the information needed to make parsing decisions.
Parsing program (driver) — a fixed routine, the same for every LR grammar.
Parsing table — divided into two parts:
- ACTION[state, terminal] → shift, reduce, accept, or error.
- GOTO[state, nonterminal] → next state after a reduction.

Working: Using the state on top of the stack and the current input symbol, the driver consults ACTION. On shift it pushes the symbol/state; on reduce A → β it pops 2·|β| entries, then pushes GOTO[top, A]; on accept parsing succeeds.

LR(0) Item Sets for the Grammar

Augmented grammar:

S' \to S,\quad S \to AA,\quad A \to aA \mid b

Using CLOSURE and GOTO:

I0 = CLOSURE({S'→·S}): S'→·S, S→·AA, A→·aA, A→·b
I1 = GOTO(I0,S): S'→S·
I2 = GOTO(I0,A): S→A·A, A→·aA, A→·b
I3 = GOTO(I0,a): A→a·A, A→·aA, A→·b
I4 = GOTO(I0,b): A→b·
I5 = GOTO(I2,A): S→AA·
I6 = GOTO(I3,A): A→aA·
GOTO(I2,a)=I3, GOTO(I2,b)=I4, GOTO(I3,a)=I3, GOTO(I3,b)=I4

There are 7 LR(0) states (I0–I6). The DFA of items recognises viable prefixes; states I1, I4, I5, I6 contain reducing/accepting items.

Answer 2

Syntax-Directed Translation (SDT)

Syntax-directed translation is a method of translating a source program in which semantic rules (actions) are attached to the productions of a context-free grammar. As the parser recognises grammar productions, the associated semantic rules are evaluated to compute attribute values, build the syntax tree, generate intermediate code, etc. Translation is thus driven by the syntax of the language.

Syntax-Directed Definitions (SDD)

An SDD is a CFG together with attributes associated with grammar symbols and semantic rules associated with productions. Attributes are of two kinds:

Synthesized attribute — computed from the attributes of the children (bottom-up).
Inherited attribute — computed from the parent and/or sibling attributes (top-down).

An SDD specifies what to compute without specifying when/order; the dependency graph determines evaluation order.

Translation Scheme (Example)

A translation scheme is an SDD in which semantic actions are enclosed in { } and embedded at specific positions within the right side of productions, indicating the order of evaluation.

Example — evaluating an arithmetic expression (synthesized attribute val):

L → E         { print(E.val) }
E → E1 + T    { E.val = E1.val + T.val }
E → T         { E.val = T.val }
T → T1 * F    { T.val = T1.val * F.val }
T → F         { T.val = F.val }
F → ( E )     { F.val = E.val }
F → digit     { F.val = digit.lexval }

For input 3 + 4 * 5, the actions compute T.val = 20, then E.val = 3 + 20 = 23, and L prints 23. The actions fire as each production is reduced, demonstrating bottom-up syntax-directed translation.

Answer 3

Specification of Tokens

Tokens are specified using regular expressions (and regular definitions). A token is a category (e.g. id, num, relop), a pattern is the rule describing its form, and a lexeme is the actual string matched. Example regular definitions:

digit  → [0-9]
letter → [A-Za-z]
id     → letter (letter | digit)*
num    → digit+ ( . digit+ )? ( E [+-]? digit+ )?

Recognition of Tokens

Regular expressions are converted to a transition diagram / finite automaton that the scanner simulates to recognise lexemes. On reaching an accepting state the longest matching lexeme is reported as a token; ties are broken by the longest-match and rule-priority (first listed) conventions.

Role of Finite Automata

The construction pipeline is:

\text{RE} \;\to\; \text{NFA (Thompson)} \;\to\; \text{DFA (subset construction)} \;\to\; \text{minimised DFA}

An NFA allows ε-moves and is easy to build from an RE but is non-deterministic.
A DFA has exactly one move per (state, input) and is what the lexer actually executes — fast, O(n) in input length.

Tools like LEX/Flex automatically build a DFA from the supplied regular expressions.

Role of Input Buffering

Reading the source one character at a time from disk is slow, and a lexer often needs look-ahead (e.g. to tell = from ==). To make this efficient, the lexer uses a two-buffer scheme with two halves of size N. Two pointers are maintained:

lexemeBegin — start of the current lexeme.
forward — scans ahead to find the end of the lexeme.

When forward crosses the end of one half, the other half is reloaded. The sentinel (eof) technique places an eof marker at the end of each half so that a single test detects both end-of-buffer and end-of-input, reducing comparisons per character.

Answer 4

Three-address code (TAC) is an intermediate representation in which each instruction has at most three addresses (operands) — typically of the form x = y op z, using at most one operator on the right-hand side. Complex expressions are broken into a sequence of such instructions using temporary variables, which makes it easy to rearrange, optimise, and generate target code.

Three-address code for x = (a + b) * (c + d):

t1 = a + b
t2 = c + d
t3 = t1 * t2
x  = t3

Here t1, t2, t3 are compiler-generated temporaries.

Answer 5

Peephole Optimization

Peephole optimization is a simple, local code-improvement technique that examines a short sliding window (the peephole) of a few consecutive target/intermediate instructions and replaces them with a shorter or faster equivalent sequence. It is applied repeatedly until no further improvement is possible.

Common transformations:

Redundant load/store elimination — remove LD R, a immediately after ST a, R.
Constant folding — replace x = 2 * 3 with x = 6.
Algebraic simplification — remove x = x + 0 or x = x * 1.
Strength reduction — replace x = y * 2 with x = y + y or a shift.
Dead-code / unreachable-code elimination.
Flow-of-control optimization — remove jumps to jumps.

Example (redundant store/load):

MOV R0, a      MOV R0, a
MOV a,  R0  →  ... (second MOV removed)

The instruction MOV a, R0 is unnecessary because a already holds the value just loaded into R0, so it is deleted, producing faster code.

Answer 6

Synthesized vs Inherited Attributes

Aspect	Synthesized Attribute	Inherited Attribute
Definition	Value computed from attributes of the children of the node	Value computed from attributes of the parent and/or siblings
Direction of flow	Bottom-up (leaves → root)	Top-down / sideways (root, siblings → node)
Production form	For `A → X Y Z`, `A.s` depends on `X,Y,Z`	For `A → X Y Z`, `X.i` (or Y.i, Z.i) depends on `A` and other RHS symbols
Parsing fit	Naturally evaluated during bottom-up (LR) parsing	Naturally suited to top-down parsing / passing context down
Grammar class	An SDD using only synthesized attributes is S-attributed	A grammar with restricted inherited attributes is L-attributed
Example	`E → E1 + T { E.val = E1.val + T.val }`	`D → T L { L.type = T.type }` (type passed down to the list)

In short, synthesized attributes carry information up the parse tree, whereas inherited attributes carry context down (or across) the tree.

Answer 7

Bootstrapping

Bootstrapping is the technique of writing a compiler (or part of it) for a language in the same language it is meant to compile, and then using a smaller existing compiler to build a full self-hosting compiler in stages.

The idea is described with T-diagrams showing source language S, target/implementation language, and the implementing language. A typical bootstrapping process:

Write a small/simple compiler for a subset of language L in an existing language (e.g. assembly or C). Call it $C_0$ .
Rewrite the compiler for L in L itself, and compile this source using $C_0$ to obtain a working compiler $C_1$ .
Use $C_1$ to compile its own source again, optionally adding optimisations, producing an improved, fully self-hosting compiler $C_2$ .

Advantages: the compiler becomes self-hosting, it serves as a large test program for the language, and improvements to the compiler automatically benefit later versions.

Example: an early C compiler was first written in assembly, then rewritten in C and compiled by the earlier version.

Answer 8

NFA for (a|b)*abb

Using Thompson's construction, an NFA (with ε-moves) for the regular expression (a|b)*abb accepts any string of a's and b's that ends in the suffix abb.

A compact equivalent NFA (after collapsing the (a|b)* loop into a single self-looping start state) has states 0–3:

State	on `a`	on `b`
→0 (start)	{0, 1}	{0}
1	∅	{2}
2	∅	{3}
*3 (accept)	∅	∅

Description of the transition diagram:

State 0 has a self-loop on both a and b (this realises (a|b)*).
On a, state 0 can also move to state 1 (begin matching abb).
State 1 --b--> State 2, State 2 --b--> State 3.
State 3 is the single accepting state.

The nondeterminism at state 0 (staying in the loop vs. starting abb) is exactly what an NFA permits. For example, the string aabb is accepted: 0 --a--> 0 --a--> 1 --b--> 2 --b--> 3.

Answer 9

Operator Precedence Parsing

Operator-precedence parsing is a bottom-up (shift-reduce) parsing technique applicable to operator grammars — grammars in which no production has two adjacent nonterminals on the right side and no ε-production. It is mainly used for parsing arithmetic and boolean expressions.

The parser defines three precedence relations between pairs of terminals:

$a \lessdot b$ : a yields precedence to b (b has higher precedence / starts a handle)
$a \doteq b$ : a and b have equal precedence (same handle)
$a \gtrdot b$ : a takes precedence over b (end of a handle — reduce)

These relations are stored in an operator-precedence table. During parsing, the relation between the top-of-stack terminal and the current input terminal decides whether to shift ( $\lessdot$ or $\doteq$ ) or reduce ( $\gtrdot$ ) the handle delimited by ⋖ … ⋗.

Advantages: simple and easy to construct by hand; Disadvantages: works only for a small class of grammars, cannot handle the unary minus / token ambiguity well, and may accept some erroneous inputs.

Answer 10

Activation Records and Run-Time Storage Management

An activation record (or stack frame) is a contiguous block of storage that holds all the information needed for a single execution (activation) of a procedure. Each time a procedure is called, a new activation record is pushed onto the run-time control stack, and it is popped when the procedure returns. This naturally supports nested and recursive calls.

Typical fields of an activation record (from top to bottom):

Return value — space for the value returned to the caller.
Actual parameters — arguments supplied by the caller.
Control link (dynamic link) — pointer to the caller's activation record.
Access link (static link) — pointer used to reach non-local data in enclosing scopes.
Saved machine status — return address and saved registers.
Local data — local variables of the procedure.
Temporaries — values generated during expression evaluation.

Role in storage management: activation records implement stack-based allocation. The stack pointer / frame pointer is adjusted on call and return, giving automatic, efficient allocation and deallocation of locals, parameters, and bookkeeping data, and enabling recursion since each call gets its own private frame.

Answer 11

Basic Block and Flow Graph

Basic block: A basic block is a maximal sequence of consecutive intermediate-code (three-address) instructions such that:

Control enters only at the first instruction (the leader), and
Control leaves only at the last instruction,

with no jumps into or out of the middle. Thus the whole block executes as a straight-line unit.

Leaders are identified as: (1) the first instruction; (2) any target of a (conditional/unconditional) jump; (3) any instruction immediately following a jump. A basic block runs from a leader up to (but not including) the next leader.

Flow graph: A flow graph (control-flow graph) is a directed graph whose nodes are basic blocks and whose edges represent the flow of control. There is an edge from block $B_1$ to block $B_2$ if control can pass from the last statement of $B_1$ to the first statement of $B_2$ (i.e. $B_2$ is the jump target, or $B_2$ immediately follows $B_1$ and $B_1$ does not end in an unconditional jump). An entry and sometimes an exit node are added.

Flow graphs are the basis for control-flow and data-flow analysis used in code optimization (e.g. loop detection, dead-code elimination).

Answer 12

Top-Down vs Bottom-Up Parsing

Aspect	Top-Down Parsing	Bottom-Up Parsing
Direction	Builds the parse tree from the root (start symbol) down to the leaves	Builds the parse tree from the leaves (input) up to the root
Derivation	Produces a leftmost derivation	Produces a rightmost derivation in reverse
Basic operation	Expand a nonterminal by predicting/matching productions	Shift symbols and reduce handles
Also called	Predictive / recursive-descent parsing	Shift-reduce parsing
Grammar restrictions	Cannot handle left recursion; needs left-factoring	Handles left recursion; more powerful
Examples	Recursive-descent, LL(1)	LR(0), SLR, LALR, CLR, operator-precedence
Power	Accepts a smaller class of grammars	Accepts a larger class of grammars

In short, top-down parsing starts from the start symbol and tries to derive the input (predictive), whereas bottom-up parsing starts from the input tokens and reduces them back to the start symbol (shift-reduce).

Level	BSc CSIT (TU)
Stream	Science
Subject	Compiler Design and Construction (BSc CSIT, CSC365)
Year	2075 BS
Exam session	Regular (annual)
Full marks	60
Time allowed	180 minutes
Questions	12, all with step-by-step solutions