BE Computer Engineering (IOE, TU) Artificial Intelligence (IOE, CT 653) Question Paper 2079 Nepal

Q: Where can I find the BE Computer Engineering (IOE, TU) Artificial Intelligence (IOE, CT 653) question paper 2079?

The full BE Computer Engineering (IOE, TU) Artificial Intelligence (IOE, CT 653) 2079 (Regular, annual) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.

Q: Does the Artificial Intelligence (IOE, CT 653) 2079 paper come with solutions?

Yes. Every question on this Artificial Intelligence (IOE, CT 653) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.

Q: How many marks is the BE Computer Engineering (IOE, TU) Artificial Intelligence (IOE, CT 653) 2079 paper?

The BE Computer Engineering (IOE, TU) Artificial Intelligence (IOE, CT 653) 2079 paper carries 80 full marks and is meant to be completed in 180 minutes, across 13 questions.

Q: Is practising this Artificial Intelligence (IOE, CT 653) past paper free?

Yes — reading and attempting this Artificial Intelligence (IOE, CT 653) past paper on Kekkei is completely free.

Question

1. Long answer (12 marks)

(a) Define an intelligent agent. With the help of a block diagram, explain the structure of a learning agent, clearly describing the role of the performance element, learning element, critic, and problem generator. (7 marks)

(b) Consider an automated vacuum-cleaning robot operating in a house. Specify its task environment using the PEAS (Performance measure, Environment, Actuators, Sensors) framework, and classify the environment along the dimensions: observable vs. partially observable, deterministic vs. stochastic, episodic vs. sequential, static vs. dynamic. (5 marks)

Answer 1

(a) Intelligent agent and the learning agent (7 marks)

Intelligent agent: An intelligent agent is anything that perceives its environment through sensors and acts upon that environment through actuators so as to maximize its expected performance measure. It maps a sequence of percepts to actions and behaves rationally.

Structure of a learning agent

A learning agent can be divided into four conceptual components:

              Performance standard
                     |
                     v
  Sensors --> [ Critic ] --feedback--> [ Learning Element ]
     ^                                        |
     |                                  changes (knowledge)
     |                                        v
Environment <-- Actuators <-- [ Performance Element ] <--+
                                        ^               |
                                        |               |
                                  learning goals        |
                                 [ Problem Generator ]---+

Performance element: Selects external actions based on percepts. It is the part that, in a non-learning agent, would be the whole agent (it decides what to do now).
Learning element: Responsible for making improvements. Using feedback from the critic, it modifies the knowledge/components of the performance element so future decisions are better.
Critic: Observes the agent's behaviour and tells the learning element how well the agent is doing with respect to a fixed performance standard. The standard must be external because the percepts themselves give no clue about success.
Problem generator: Suggests exploratory actions that may be suboptimal in the short term but lead to new, informative experiences, preventing the agent from always exploiting what it already knows.

(b) PEAS for an automated vacuum-cleaning robot (5 marks)

PEAS	Specification
Performance measure	Amount of dirt cleaned, area covered, time/energy used, avoiding collisions, minimum noise
Environment	Rooms, floor surfaces, furniture, walls, dust/dirt, pets, charging dock
Actuators	Wheels/motors for movement, suction motor, brushes, dirt-bin mechanism
Sensors	Bump/contact sensors, dirt/optical sensors, cliff sensors, infrared proximity sensors, battery sensor, camera/odometry

Environment classification

Partially observable — the robot senses only its immediate surroundings, not the whole house at once.
Stochastic — outcomes are uncertain (wheel slip, unexpectedly moved furniture, new dirt appearing).
Sequential — current actions (where it moves) affect which areas remain to be cleaned later.
Dynamic — the world can change while the robot deliberates (people walk, pets move, new dirt appears).

Answer 2

(a) Admissibility and consistency of a heuristic (6 marks)

Let $h(n)$ be the heuristic estimate of the cost from node $n$ to a goal, and $h^*(n)$ the true optimal cost.

Admissibility: $h$ is admissible if it never overestimates the cost to reach the goal:

$$0 \le h(n) \le h^*(n) \quad \text{for every node } n.$$

Consistency (monotonicity): $h$ is consistent if for every node $n$ and every successor $n'$ reached by an action of cost $c(n,n')$ :

$$h(n) \le c(n,n') + h(n').$$

Every consistent heuristic is admissible

Proof by induction on the number of steps $k$ from $n$ to the goal along an optimal path.

Base case ( $k = 0$ ): $n$ is the goal, $h(n) = 0 = h^*(n)$ , so $h(n) \le h^*(n)$ .
Inductive step: Assume $h(n') \le h^*(n')$ for the optimal successor $n'$ that is $k$ steps from the goal. By consistency, $h(n) \le c(n,n') + h(n')$ . Since $n'$ lies on the optimal path, $h^*(n) = c(n,n') + h^*(n')$ . Therefore

$$h(n) \le c(n,n') + h(n') \le c(n,n') + h^*(n') = h^*(n).$$

Hence $h(n) \le h^*(n)$ for all $n$ , i.e. consistency implies admissibility. (The converse is not true.) $\blacksquare$
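The remark that the converse fails can be checked on the graph used in part (b). The sketch below is an illustration, not part of the model answer. It assumes the edges are directed away from S, and the helper names (`true_costs`, `is_admissible`, `inconsistent_edges`) are hypothetical.

```python
# Graph and heuristic from part (b); edges assumed directed away from S.
GRAPH = {
    "S": {"A": 3, "B": 5},
    "A": {"C": 4, "G": 12},
    "B": {"C": 2},
    "C": {"G": 4},
    "G": {},
}
H = {"S": 10, "A": 7, "B": 6, "C": 3, "G": 0}


def true_costs(graph, goal):
    """Compute h*(n) by repeated relaxation (non-negative costs assumed)."""
    h_star = {n: float("inf") for n in graph}
    h_star[goal] = 0
    for _ in range(len(graph)):
        for n, succs in graph.items():
            for s, cost in succs.items():
                h_star[n] = min(h_star[n], cost + h_star[s])
    return h_star


def is_admissible(graph, h, goal):
    """h(n) <= h*(n) for every node."""
    h_star = true_costs(graph, goal)
    return all(h[n] <= h_star[n] for n in graph)


def inconsistent_edges(graph, h):
    """Edges (n, n') that violate h(n) <= c(n, n') + h(n')."""
    return [(n, s) for n, succs in graph.items()
            for s, cost in succs.items() if h[n] > cost + h[s]]


print(is_admissible(GRAPH, H, "G"))   # True
print(inconsistent_edges(GRAPH, H))   # [('B', 'C')]
```

Under these assumptions the heuristic is admissible, yet the edge B-C breaks consistency because $h(B) = 6 > 2 + h(C) = 5$.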

(b) A* trace (10 marks)

Using $f(n) = g(n) + h(n)$ . Edges: S-A=3, S-B=5, A-C=4, B-C=2, C-G=4, A-G=12. Heuristics: S=10, A=7, B=6, C=3, G=0.

Iteration 1 — expand S ( $g=0$ ):

A: $g=3, f=3+7=10$
B: $g=5, f=5+6=11$

Open = {A(10), B(11)}, Closed = {S}

Iteration 2 — expand A (lowest $f=10$ , $g=3$ ):

C via A: $g=3+4=7, f=7+3=10$
G via A: $g=3+12=15, f=15+0=15$

Open = {C(10), B(11), G(15)}, Closed = {S, A}

Iteration 3 — expand C (lowest $f=10$ , $g=7$ ):

G via C: $g=7+4=11, f=11+0=11$ (better than previous G=15, so update)

Open = {G(11), B(11)}, Closed = {S, A, C}

Iteration 4 — expand G (tie at $f=11$ ; G is the goal): Goal reached with $f=11$ .

(Note: B also has $f=11$, but expanding B would reach C with $g=5+2=7$, the same $g$ already found via A, so nothing improves and no path through B can reach the goal for less than 11.)

Result

Optimal path: $S \to A \to C \to G$ (the path $S \to B \to C \to G$ ties at the same cost, $5+2+4=11$)

Total cost: $3 + 4 + 4 = \mathbf{11}$ .
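The trace above can be reproduced with a short program. This is a minimal sketch rather than a reference implementation. It assumes the edges are directed away from S and breaks ties on $f$ alphabetically, so it expands B before G; the returned path and cost are the same. The function name `a_star` is hypothetical.

```python
import heapq

GRAPH = {
    "S": {"A": 3, "B": 5},
    "A": {"C": 4, "G": 12},
    "B": {"C": 2},
    "C": {"G": 4},
    "G": {},
}
H = {"S": 10, "A": 7, "B": 6, "C": 3, "G": 0}


def a_star(graph, h, start, goal):
    """Return (cost, path) for a cheapest path, or (None, []) if none exists."""
    open_heap = [(h[start], start)]      # entries are (f, node)
    g = {start: 0}
    parent = {start: None}
    closed = set()
    while open_heap:
        _, node = heapq.heappop(open_heap)
        if node in closed:
            continue                     # stale queue entry
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return g[goal], path[::-1]
        closed.add(node)
        for succ, cost in graph[node].items():
            new_g = g[node] + cost
            if new_g < g.get(succ, float("inf")):
                g[succ] = new_g
                parent[succ] = node
                closed.discard(succ)     # re-open if a cheaper route appears
                heapq.heappush(open_heap, (new_g + h[succ], succ))
    return None, []


print(a_star(GRAPH, H, "S", "G"))  # (11, ['S', 'A', 'C', 'G'])
```

Re-opening closed nodes is a deliberate choice here: it keeps the search correct even when the heuristic is admissible but not consistent.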

Answer 3

(a) Statements in first-order predicate logic (6 marks)

1. Every student who studies hard passes the examination.

$$\forall x\,[\, Student(x) \wedge StudiesHard(x) \Rightarrow Passes(x) \,]$$

2. Some students do not like every subject.

$$\exists x\,[\, Student(x) \wedge \exists y\,( Subject(y) \wedge \neg Likes(x,y) ) \,]$$

Equivalently $\exists x\,[Student(x) \wedge \neg \forall y\,(Subject(y) \Rightarrow Likes(x,y))]$ .

3. All birds can fly except penguins.

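$$\forall x\,[\, Bird(x) \wedge \neg Penguin(x) \Rightarrow CanFly(x) \,]$$

This states only that non-penguin birds can fly. If the sentence is also read as saying that penguins are birds that cannot fly, add $\forall x\,[Penguin(x) \Rightarrow Bird(x) \wedge \neg CanFly(x)]$.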

(b) Resolution refutation: Curiosity killed the cat (6 marks)

Predicates

$Dog(x)$ , $Owns(x,y)$ , $AnimalLover(x)$ , $Animal(x)$ , $Kills(x,y)$ , $Cat(x)$ ; constants: $Jack$ , $Curiosity$ , $Tuna$ .

Knowledge base in FOL

$\forall x\,[(\exists y\, Dog(y) \wedge Owns(x,y)) \Rightarrow AnimalLover(x)]$
$\forall x\,[AnimalLover(x) \Rightarrow \forall y\,(Animal(y) \Rightarrow \neg Kills(x,y))]$
$Kills(Jack,Tuna) \vee Kills(Curiosity,Tuna)$
$Cat(Tuna)$
$\forall x\,[Cat(x) \Rightarrow Animal(x)]$ (cats are animals)
$Dog(D)$ and $Owns(Jack,D)$ (Jack owns a dog $D$ )

Clause form

C1: $\neg Dog(y) \vee \neg Owns(x,y) \vee AnimalLover(x)$
C2: $\neg AnimalLover(x) \vee \neg Animal(y) \vee \neg Kills(x,y)$
C3: $Kills(Jack,Tuna) \vee Kills(Curiosity,Tuna)$
C4: $Cat(Tuna)$
C5: $\neg Cat(x) \vee Animal(x)$
C6a: $Dog(D)$ , C6b: $Owns(Jack,D)$
Negated goal C7: $\neg Kills(Curiosity,Tuna)$

Refutation

C4 + C5 $\Rightarrow$ R1: $Animal(Tuna)$
C6a + C6b + C1 ( $y=D, x=Jack$ ) $\Rightarrow$ R2: $AnimalLover(Jack)$
C3 + C7 (resolve on $Kills(Curiosity,Tuna)$ ) $\Rightarrow$ R3: $Kills(Jack,Tuna)$
C2 + R2 ( $x=Jack$ ) $\Rightarrow$ R4: $\neg Animal(y) \vee \neg Kills(Jack,y)$
R4 + R1 ( $y=Tuna$ ) $\Rightarrow$ R5: $\neg Kills(Jack,Tuna)$
R5 + R3 $\Rightarrow$ $\square$ (empty clause)

The empty clause is derived, so the negation leads to a contradiction. Hence Curiosity killed the cat is proved.
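The refutation can also be checked mechanically. The sketch below is a simplification, not a full first-order prover: it assumes the clauses have already been grounded with the substitutions used above ($x=Jack$, $y=D$ or $y=Tuna$) and then applies plain propositional resolution. Literals are strings, `~` marks negation, and the function names are hypothetical.

```python
from itertools import combinations


def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit


def resolvents(c1, c2):
    """All clauses obtained by resolving c1 and c2 on one complementary pair."""
    return [frozenset((c1 - {lit}) | (c2 - {negate(lit)}))
            for lit in c1 if negate(lit) in c2]


def refutes(clauses):
    """True if the empty clause is derivable by saturation."""
    known = set(clauses)
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            for r in resolvents(c1, c2):
                if not r:
                    return True          # empty clause: contradiction
                new.add(r)
        if new <= known:
            return False                 # saturated without contradiction
        known |= new


# Ground instances of C1..C6 from the clause form above.
KB = [
    frozenset({"~Dog(D)", "~Owns(Jack,D)", "AnimalLover(Jack)"}),
    frozenset({"~AnimalLover(Jack)", "~Animal(Tuna)", "~Kills(Jack,Tuna)"}),
    frozenset({"Kills(Jack,Tuna)", "Kills(Curiosity,Tuna)"}),
    frozenset({"Cat(Tuna)"}),
    frozenset({"~Cat(Tuna)", "Animal(Tuna)"}),
    frozenset({"Dog(D)"}),
    frozenset({"Owns(Jack,D)"}),
]
NEGATED_GOAL = frozenset({"~Kills(Curiosity,Tuna)"})

print(refutes(KB + [NEGATED_GOAL]))  # True
print(refutes(KB))                   # False
```

The second call shows that the knowledge base is consistent on its own; the contradiction appears only once the negated goal is added.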

Answer 4

(a) Feed-forward MLP and backpropagation (8 marks)

Architecture

  x1 ---\        (hidden)        (output)
  x2 ----+--> [ h1 ]  w_jk -->  [ o1 ]
  x3 ----+--> [ h2 ]  ------->  [ o2 ]
   ...   /--> [ hj ]
 Input layer   v_ij weights   w_jk weights

A fully connected, acyclic network: input layer $\to$ one hidden layer (weights $v_{ij}$ ) $\to$ output layer (weights $w_{jk}$ ). Signals flow only forward.

Assumptions / activation function

The activation $f$ is differentiable and non-linear (e.g. sigmoid $f(net)=\frac{1}{1+e^{-net}}$ , so $f'(net)=f(net)(1-f(net))$ ). Non-linearity lets the network represent non-linearly-separable functions; differentiability allows gradient computation.
Error is measured by the squared error $E = \tfrac{1}{2}\sum_k (t_k - o_k)^2$ (half the sum of squared output errors for one training pattern); weights are updated by gradient descent with learning rate $\eta$.

Derivation of the update rule

For an output-layer weight $w_{jk}$ (hidden $j$ to output $k$ ), with $net_k=\sum_j w_{jk} o_j$ and $o_k=f(net_k)$ :

$$\frac{\partial E}{\partial w_{jk}} = -(t_k - o_k)\,f'(net_k)\,o_j = -\delta_k\, o_j,\quad \delta_k=(t_k-o_k)f'(net_k)$$

$$\boxed{\;\Delta w_{jk} = \eta\,\delta_k\,o_j\;}$$

For a hidden-layer weight $v_{ij}$ (input $i$ to hidden $j$ ), the error is backpropagated from all output units:

$$\delta_j = f'(net_j)\sum_k \delta_k\, w_{jk}$$

$$\boxed{\;\Delta v_{ij} = \eta\,\delta_j\,x_i\;}$$

Weights are updated as $w \leftarrow w + \Delta w$ and iterated until the error converges. This is the backpropagation of error.
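The update rules can be written out directly in code. The following is a minimal sketch under the same assumptions as the derivation (sigmoid units, one hidden layer, no bias terms, a single training pattern). The weight layout `V[i][j]`, `W[j][k]` and all function names are illustrative choices.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, V, W):
    """V[i][j]: input i -> hidden j; W[j][k]: hidden j -> output k."""
    hidden = [sigmoid(sum(V[i][j] * x[i] for i in range(len(x))))
              for j in range(len(V[0]))]
    out = [sigmoid(sum(W[j][k] * hidden[j] for j in range(len(hidden))))
           for k in range(len(W[0]))]
    return hidden, out


def error(x, t, V, W):
    """E = 1/2 * sum_k (t_k - o_k)^2."""
    _, out = forward(x, V, W)
    return 0.5 * sum((tk - ok) ** 2 for tk, ok in zip(t, out))


def deltas(x, t, V, W):
    """delta_k for output units and delta_j for hidden units."""
    hidden, out = forward(x, V, W)
    d_out = [(t[k] - out[k]) * out[k] * (1 - out[k]) for k in range(len(out))]
    d_hid = [hidden[j] * (1 - hidden[j])
             * sum(d_out[k] * W[j][k] for k in range(len(out)))
             for j in range(len(hidden))]
    return hidden, d_out, d_hid


def backprop_step(x, t, V, W, eta):
    """One gradient-descent update; returns new (V, W)."""
    hidden, d_out, d_hid = deltas(x, t, V, W)
    new_W = [[W[j][k] + eta * d_out[k] * hidden[j] for k in range(len(d_out))]
             for j in range(len(hidden))]
    new_V = [[V[i][j] + eta * d_hid[j] * x[i] for j in range(len(d_hid))]
             for i in range(len(x))]
    return new_V, new_W
```

A quick sanity check for such code is to compare $-\delta_k o_j$ and $-\delta_j x_i$ against a finite-difference estimate of $\partial E / \partial w$ for a few weights.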

(b) XOR and the single-layer perceptron (4 marks)

A single-layer perceptron computes a single linear decision boundary $w_1x_1 + w_2x_2 + b = 0$ and can only classify linearly separable data. XOR outputs:

$x_1$	$x_2$	XOR
0	0	0
0	1	1
1	0	1
1	1	0

The two classes (0s at opposite corners, 1s at the other two corners) cannot be separated by any single straight line, so a single perceptron fails.

Multilayer solution: Use a hidden layer that builds intermediate features, e.g. $h_1 = OR(x_1,x_2)$ and $h_2 = AND(x_1,x_2)$ , then output $= AND(h_1, \neg h_2) = h_1 \wedge \neg h_2$ . Geometrically each hidden unit draws one line, and the output unit combines the two half-planes into the non-linear XOR region. Thus an MLP with one hidden layer solves XOR.
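The construction can be verified with hand-set weights. In this sketch the thresholds (0.5 and 1.5) and the step activation are illustrative assumptions; any weights realising OR, AND and "$h_1$ and not $h_2$" would do.

```python
def step(z):
    """Threshold activation: fires when the net input is positive."""
    return 1 if z > 0 else 0


def xor_mlp(x1, x2):
    h1 = step(x1 + x2 - 0.5)      # hidden unit 1: OR(x1, x2)
    h2 = step(x1 + x2 - 1.5)      # hidden unit 2: AND(x1, x2)
    return step(h1 - h2 - 0.5)    # output: h1 AND NOT h2


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_mlp(a, b))
```

The four printed rows match the XOR truth table above.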

Answer 5

Let $b$ = branching factor, $d$ = depth of the shallowest goal, $m$ = maximum depth of the tree.

Property	BFS	DFS
Completeness	Yes (finite $b$ )	No (can loop in infinite/deep branches)
Optimality	Yes, if all step costs are equal	No
Time complexity	$O(b^d)$	$O(b^m)$
Space complexity	$O(b^d)$ (stores whole frontier)	$O(bm)$ (only current path)

BFS guarantees the shallowest/optimal solution but its memory demand is exponential, which is usually the limiting factor. DFS uses little memory but is neither complete nor optimal.

Iterative Deepening DFS (IDDFS) is preferred when memory is limited but completeness and optimality (for equal step costs) are still required, and the solution depth is unknown. It runs depth-limited DFS with limits 0, 1, 2, ... until the goal is found, so it has DFS's linear space $O(bd)$ yet BFS's completeness and optimality, with time $O(b^d)$ (re-expanding shallow nodes adds only a constant-factor overhead).
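A minimal sketch of iterative deepening follows. The example graph, the `max_depth` cap and the function names are assumptions made for illustration.

```python
def depth_limited(graph, node, goal, limit, path):
    """DFS that gives up below the given depth limit."""
    if node == goal:
        return path
    if limit == 0:
        return None
    for succ in graph.get(node, []):
        if succ not in path:                      # avoid cycles on this path
            found = depth_limited(graph, succ, goal, limit - 1, path + [succ])
            if found is not None:
                return found
    return None


def iddfs(graph, start, goal, max_depth=20):
    """Run depth-limited DFS with limits 0, 1, 2, ... up to max_depth."""
    for limit in range(max_depth + 1):
        found = depth_limited(graph, start, goal, limit, [start])
        if found is not None:
            return found
    return None


# Hypothetical graph: the goal is 2 steps away via B but 3 steps via A.
EXAMPLE = {"S": ["A", "B"], "A": ["C"], "B": ["G"], "C": ["G"]}
print(iddfs(EXAMPLE, "S", "G"))  # ['S', 'B', 'G']
```

Plain DFS on the same graph would follow A first and return the deeper path S, A, C, G; the increasing depth limit is what recovers the shallowest solution.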

Answer 6

Hill-climbing is a local-search technique that starts from an arbitrary state and repeatedly moves to the neighbouring state with the best (highest) value of an objective/heuristic function, stopping when no neighbour is better. It is essentially greedy local optimisation that keeps no search tree, only the current state.

current <- initial state
loop:
    neighbour <- highest-valued successor of current
    if value(neighbour) <= value(current): return current
    current <- neighbour
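The pseudocode translates almost line for line into Python. The one-dimensional `LANDSCAPE` below is a made-up example, chosen so that one starting point gets stuck on a local maximum; `value` and `neighbours` are hypothetical helpers.

```python
def hill_climb(start, value, neighbours):
    """Steepest-ascent hill climbing; returns the state where it stops."""
    current = start
    while True:
        best = max(neighbours(current), key=value, default=None)
        if best is None or value(best) <= value(current):
            return current                 # no strictly better neighbour
        current = best


# Hypothetical landscape: index 1 is a local maximum, index 4 the global one.
LANDSCAPE = [1, 3, 2, 5, 9, 4]


def value(i):
    return LANDSCAPE[i]


def neighbours(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(LANDSCAPE)]


print(hill_climb(0, value, neighbours))  # 1 (stuck on the local maximum)
print(hill_climb(2, value, neighbours))  # 4 (reaches the global maximum)
```

The two runs differ only in their starting state, which is why random restarts are an effective remedy.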

Problems and remedies

Local maximum: A peak higher than all its neighbours but lower than the global maximum; the search stops there. Remedies: random-restart hill climbing, or simulated annealing, which accepts a worse (downhill) move with some probability so the search can leave the peak.
Plateau: A flat region where neighbours have equal value, giving no gradient to follow. Remedies: allow a bounded number of sideways moves to traverse the plateau, or random restart.
Ridge: A sloping region where progress requires moving through states that individually look worse along the available single-step moves (axes are not aligned with the ridge). Remedies: use larger or compound moves / multiple directions at once, or apply random restarts.

Generally, random-restart hill climbing and simulated annealing overcome all three by escaping poor local optima.

Answer 7

Knowledge representation (KR) is the area of AI concerned with how facts, objects, relationships and rules about the world are formally encoded in a form a computer can store and reason over (e.g. logic, semantic networks, frames, rules) to derive new knowledge.

Semantic network for the given facts

Nodes are concepts; labelled arcs are relations. is-a arcs allow inheritance of properties.

            [ Animal ] --can--> [ Breathe ]
               ^
             is-a
               |
            [ Bird ] --can--> [ Fly ]
               ^
             is-a
               |
            [ Tweety ]

Tweety is-a Bird; Bird is-a Animal.
Bird can Fly; Animal can Breathe.
By inheritance, Tweety inherits can fly (from Bird) and can breathe (from Animal).
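Inheritance over is-a links amounts to walking up the hierarchy and collecting properties. A minimal sketch, assuming each node has at most one is-a parent and using hypothetical dictionary names:

```python
# The semantic network above as two small dictionaries.
IS_A = {"Tweety": "Bird", "Bird": "Animal"}
CAN = {"Bird": {"Fly"}, "Animal": {"Breathe"}}


def abilities(node):
    """Collect 'can' properties from the node and all its is-a ancestors."""
    found = set()
    while node is not None:
        found |= CAN.get(node, set())
        node = IS_A.get(node)          # follow the is-a arc upward
    return found


print(sorted(abilities("Tweety")))  # ['Breathe', 'Fly']
```

Nothing in this structure can say that a particular bird does not fly, which is the limitation discussed below.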

Advantage: Natural, intuitive representation that supports property inheritance through is-a links, making reasoning and visualisation easy.

Limitation: No standard formal semantics for quantifiers, negation or exceptions (e.g. handling a non-flying bird like a penguin is awkward), so reasoning can be ambiguous.

Answer 8

Architecture of an expert system

        +-------------+        +------------------+
 User-->| User        |<------>| Inference Engine |
        | Interface   |        +------------------+
        +-------------+              ^      ^
               ^                     |      |
               |                     v      v
        +--------------+     +-------------+  +-----------------+
        | Explanation  |     | Knowledge   |  | Working Memory  |
        | Facility     |     | Base (facts |  | (case facts)    |
        +--------------+     |  + rules)   |  +-----------------+
                             +-------------+
                                    ^
                          +----------------------+
                          | Knowledge            |
                          | Acquisition (expert) |
                          +----------------------+

Knowledge base: Domain facts and IF–THEN production rules.
Inference engine: Applies rules to the facts to derive conclusions (forward/backward chaining).
Working memory: Holds the current problem facts.
User interface: Lets the user pose problems and receive answers.
Explanation facility: Justifies how/why a conclusion was reached.
Knowledge acquisition module: Lets experts update/extend the knowledge base.

Forward vs. backward chaining

	Forward chaining	Backward chaining
Direction	Data-driven: from facts to conclusions	Goal-driven: from a hypothesis back to facts
Method	Fire rules whose conditions are satisfied, adding new facts until goal appears	Take the goal, find rules concluding it, recursively prove their conditions
Best for	Many possible conclusions from data (monitoring, design)	Verifying a specific hypothesis (diagnosis)

Example. Rule: IF the patient has fever AND cough THEN flu.

Forward: given facts fever and cough, the engine fires the rule and concludes flu.
Backward: to check the hypothesis flu, the engine asks whether fever and cough hold, querying the user/facts for each.
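Both strategies can be sketched over the same rule base. This is a toy illustration, assuming rules are (conditions, conclusion) pairs with no cyclic dependencies; a real expert-system shell would also query the user for unknown facts.

```python
# Each rule: (set of conditions, conclusion).
RULES = [({"fever", "cough"}, "flu")]


def forward_chain(facts, rules):
    """Data-driven: fire rules until no new fact can be added."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts


def backward_chain(goal, facts, rules):
    """Goal-driven: prove the goal from facts or via a rule concluding it."""
    if goal in facts:
        return True
    return any(all(backward_chain(c, facts, rules) for c in conditions)
               for conditions, conclusion in rules if conclusion == goal)


print(sorted(forward_chain({"fever", "cough"}, RULES)))  # ['cough', 'fever', 'flu']
print(backward_chain("flu", {"fever", "cough"}, RULES))  # True
print(backward_chain("flu", {"fever"}, RULES))           # False
```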

Answer 9

Supervised vs. unsupervised learning

	Supervised learning	Unsupervised learning
Data	Labelled (input–output pairs)	Unlabelled (inputs only)
Goal	Learn a mapping $x \to y$ to predict labels	Discover hidden structure/patterns
Tasks	Classification, regression	Clustering, dimensionality reduction, association
Examples	Spam detection, house-price prediction (linear regression, SVM, decision tree)	Customer segmentation, k-means clustering, PCA

Overfitting

Overfitting occurs when a model learns the training data too well — including its noise and random fluctuations — so it has very low training error but poor generalisation (high error on unseen test data). It typically happens with overly complex models, too many parameters, or too little training data.

Two techniques to reduce it:

Regularisation (e.g. L1/L2 penalty, dropout in neural nets) to discourage overly complex models.
Cross-validation / more training data / early stopping / pruning to select simpler models that generalise. (Any two are acceptable.)

Answer 10

Phases of Natural Language Processing

Lexical (morphological) analysis: Breaks text into tokens/words and analyses word structure (stems, prefixes, suffixes). e.g. "unhappiness" $\to$ un + happy + ness.
Syntactic analysis (parsing): Checks grammatical structure and builds a parse tree using grammar rules; rejects ungrammatical sequences (e.g. "boy the apple eats").
Semantic analysis: Determines the literal meaning of the sentence by mapping syntactic structures to meaning representations and checking meaningfulness.
Discourse integration: Interprets a sentence in the context of preceding sentences (e.g. resolving what "he" or "it" refers to across sentences).
Pragmatic analysis: Derives the intended/real-world meaning considering context, speaker intent and world knowledge (e.g. "Can you pass the salt?" is a request, not a yes/no question).

Why ambiguity is a fundamental challenge

Natural language allows a single expression to have multiple valid interpretations at every level (lexical, syntactic, semantic, pragmatic), so the system must choose the intended one using context — which is hard to compute reliably.

Example (syntactic/semantic ambiguity): "I saw the man with the telescope." — Did I use a telescope to see the man, or did I see a man who had a telescope? Both parses are grammatically valid.

Answer 11

Unification is the process of finding a substitution $\theta$ (a set of variable bindings) that makes two predicate-logic expressions syntactically identical, i.e. $E_1\theta = E_2\theta$ . The most general such substitution is the Most General Unifier (MGU) — every other unifier is an instance of it.

(a) `P(x, f(y))` and `P(a, f(g(z)))`

Same predicate $P$ , same arity — proceed.
Unify arguments: $x$ with $a$ $\Rightarrow$ $\{x/a\}$ .
Unify $f(y)$ with $f(g(z))$ : same function $f$ $\Rightarrow$ unify $y$ with $g(z)$ $\Rightarrow$ $\{y/g(z)\}$ (no occurs-check violation since $y$ does not occur in $g(z)$ ).

MGU $\theta = \{x/a,\; y/g(z)\}$ . Applying it gives $P(a, f(g(z)))$ on both sides. ✔

(b) `Q(x, x)` and `Q(a, b)`

First arguments: $x$ with $a$ $\Rightarrow$ $\{x/a\}$ .
Second arguments under this substitution: $x\theta = a$ must unify with $b$ . But $a$ and $b$ are distinct constants — they cannot be made equal.

Unification fails (no MGU exists), because the single variable $x$ would have to equal two different constants $a$ and $b$ at the same time.
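Both cases can be run through a small unifier. The sketch below is one common textbook formulation with an occurs check. The term encoding is an assumption made for illustration: variables are strings starting with `?`, constants are other strings, and compound terms are tuples whose first element is the functor.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")


def walk(t, s):
    """Follow variable bindings in substitution s."""
    while is_var(t) and t in s:
        t = s[t]
    return t


def occurs(v, t, s):
    """True if variable v occurs inside term t under s."""
    t = walk(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:])


def unify(a, b, s=None):
    """Return a most general unifier as a dict, or None on failure."""
    s = {} if s is None else s
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return None if occurs(a, b, s) else {**s, a: b}
    if is_var(b):
        return None if occurs(b, a, s) else {**s, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None                          # distinct constants or functors


# (a) P(x, f(y)) and P(a, f(g(z)))
print(unify(("P", "?x", ("f", "?y")), ("P", "a", ("f", ("g", "?z")))))
# {'?x': 'a', '?y': ('g', '?z')}

# (b) Q(x, x) and Q(a, b)
print(unify(("Q", "?x", "?x"), ("Q", "a", "b")))  # None
```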

Answer 12

Turing Test: Proposed by Alan Turing (1950) as the "Imitation Game." A human interrogator holds a text-only conversation with two unseen participants — one human and one machine — and must decide which is which. If the interrogator cannot reliably distinguish the machine from the human, the machine is said to have passed the test and to exhibit human-level intelligent behaviour.

Is passing it sufficient for true intelligence? Arguably not. The test measures only the ability to imitate human conversational behaviour (the appearance of intelligence), not genuine understanding, consciousness, reasoning or self-awareness. A system could pass through clever pattern matching, tricks or deception without any real comprehension.

Major criticism — Searle's Chinese Room argument: A person manipulating Chinese symbols by following rules can produce correct Chinese responses without understanding Chinese at all. Likewise a machine could pass the Turing Test by symbol manipulation while having no actual understanding — so behaviour-based testing does not establish real intelligence.

Answer 13

Minimax algorithm: A decision rule for two-player, zero-sum, perfect-information games (e.g. tic-tac-toe, chess). One player (MAX) tries to maximise the score; the opponent (MIN) tries to minimise it. The game tree is generated to some depth and evaluated by a utility/evaluation function at the leaves; values are then backed up: at MAX nodes take the maximum of the children's values, at MIN nodes take the minimum. MAX then chooses the move leading to the best backed-up value, assuming optimal play by MIN.

minimax(node, _):    if node is terminal (or at the depth limit), return utility(node)
minimax(node, MAX):  return max over children of minimax(child, MIN)
minimax(node, MIN):  return min over children of minimax(child, MAX)

Alpha-beta pruning improves efficiency by keeping two bounds during the depth-first search:

$\alpha$ = best value MAX can guarantee so far,
$\beta$ = best value MIN can guarantee so far.

Whenever a node's value can no longer affect the final decision (i.e. $\alpha \ge \beta$ ), the remaining children of that node are pruned (not explored). Because only branches that cannot change the backed-up value are cut, alpha-beta returns exactly the same minimax value/move as plain minimax. With good move ordering it reduces the effective time complexity from $O(b^d)$ to about $O(b^{d/2})$ , effectively doubling the searchable depth.
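The claim that pruning does not change the result can be checked on a small tree. In this sketch the game tree is a hypothetical nested list whose integer leaves are utilities, and the optional `visited` list is added only to count how many leaves are evaluated.

```python
def minimax(node, maximizing):
    """Plain minimax over a nested-list tree with integer leaves."""
    if isinstance(node, int):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf"),
              visited=None):
    """Minimax with alpha-beta pruning; records evaluated leaves in visited."""
    if isinstance(node, int):
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, best)
            if alpha >= beta:
                break                    # MIN above will never allow this
        return best
    best = float("inf")
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, best)
        if alpha >= beta:
            break                        # MAX above already has better
    return best


# Hypothetical depth-2 tree: MAX at the root, three MIN nodes below it.
TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
seen = []
print(minimax(TREE, True))                   # 3
print(alphabeta(TREE, True, visited=seen))   # 3
print(len(seen))                             # 7 of the 9 leaves
```

In the second MIN node the first leaf (2) already falls below $\alpha = 3$, so its remaining leaves 4 and 6 are never evaluated.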

Level	BE Computer Engineering (IOE, TU)
Subject	Artificial Intelligence (IOE, CT 653)
Year	2079 BS
Exam session	Regular (annual)
Full marks	80
Time allowed	180 minutes
Questions	13, all with step-by-step solutions

Section A: Long Answer Questions

(a) Intelligent agent and the learning agent (7 marks)

Structure of a learning agent

(b) PEAS for an automated vacuum-cleaning robot (5 marks)

Environment classification

(a) Admissibility and consistency of a heuristic (6 marks)

Every consistent heuristic is admissible

(b) A* trace (10 marks)

Result

(a) Statements in first-order predicate logic (6 marks)

(b) Resolution refutation: Curiosity killed the cat (6 marks)

Predicates

Knowledge base in FOL

Clause form

Refutation

(a) Feed-forward MLP and backpropagation (8 marks)

Architecture

Assumptions / activation function

Derivation of the update rule

(b) XOR and the single-layer perceptron (4 marks)

Section B: Short Answer Questions

Problems and remedies

Semantic network for the given facts

Architecture of an expert system

Forward vs. backward chaining

Supervised vs. unsupervised learning

Overfitting

Phases of Natural Language Processing

Why ambiguity is a fundamental challenge

(a) `P(x, f(y))` and `P(a, f(g(z)))`

(b) `Q(x, x)` and `Q(a, b)`
