Browse papers
A

Section A: Long Answer Questions

Attempt all / any as specified.

4 questions
1long12 marks

Define an intelligent agent and explain the structure of a rational agent in terms of the PEAS (Performance measure, Environment, Actuators, Sensors) framework.

(a) Describe, with a suitable diagram, the working of a model-based reflex agent and a utility-based agent, and clearly distinguish between them. (8)

(b) Classify the task environment of an automated taxi driver agent along the dimensions: fully/partially observable, deterministic/stochastic, episodic/sequential, static/dynamic, and discrete/continuous. Justify each classification. (4)

Intelligent Agent

An intelligent agent is any entity that perceives its environment through sensors and acts upon that environment through actuators so as to achieve its goals. A rational agent is one that, for each possible percept sequence, selects an action that is expected to maximise its performance measure, given the evidence provided by the percept sequence and its built-in knowledge.

PEAS Framework

The task of an agent is fully specified by the PEAS description:

ComponentMeaning
P – Performance measureThe criterion that defines success of the agent's behaviour
E – EnvironmentThe surroundings in which the agent operates
A – ActuatorsThe means by which the agent acts on the environment
S – SensorsThe means by which the agent perceives the environment

Example (automated taxi): P = safe, fast, legal, comfortable trip, maximise profit; E = roads, traffic, pedestrians, customers; A = steering, accelerator, brake, signal, horn; S = cameras, GPS, speedometer, odometer, sensors.

(a) Model-based Reflex Agent vs Utility-based Agent (8)

Model-based reflex agent

It keeps an internal state that depends on the percept history and reflects the unobserved aspects of the current state. It uses a model of the world (how the world evolves and how the agent's actions affect it) to update this state, then applies condition–action rules to choose an action.

        +---------------------------------------------+
Percepts|  Sensors --> [State] <-- How world evolves   |
  --->  |               |        What my actions do   |
        |               v                             |
        |        Condition-action rules --> Actuators | ---> Actions
        +---------------------------------------------+

Utility-based agent

Goals alone give only a binary distinction (happy/unhappy). A utility function maps a state (or sequence of states) to a real number expressing the degree of desirability. The agent chooses the action that maximises expected utility, allowing it to trade off conflicting goals (e.g. speed vs safety) and handle uncertainty.

        +-----------------------------------------------------+
Percepts|  Sensors --> [State] --> What it will be like if I   |
  --->  |                          do action A                |
        |                          --> How happy will I be    |
        |                              (Utility) --> Actuators | ---> Actions
        +-----------------------------------------------------+

Distinction

Model-based reflex agentUtility-based agent
Acts on condition–action rulesActs to maximise a utility (happiness) function
Maintains internal state but no explicit notion of "how good"Quantifies desirability of states with a utility value
Cannot resolve conflicting goals wellResolves trade-offs and handles uncertainty rationally
No look-ahead of qualityEvaluates expected outcome quality before acting

(b) Task Environment of an Automated Taxi Driver (4)

DimensionClassificationJustification
ObservabilityPartially observableSensors cannot capture everything – other drivers' intentions, what is around a blind corner, etc.
DeterminismStochasticTraffic, weather and behaviour of other agents are unpredictable; the same action can give different outcomes.
Episodic/SequentialSequentialCurrent driving decisions affect future states (e.g. braking now affects later position).
Static/DynamicDynamicThe world keeps changing while the agent deliberates – other cars move continuously.
Discrete/ContinuousContinuousSpeed, steering angle, location and time all take continuous values.

(It is also multi-agent, since other vehicles are independent agents.)

intelligent-agentsagent-environment
2long16 marks

Consider the following graph where each node represents a city and edge labels denote actual path costs. The straight-line distance heuristic h(n) to the goal G is given as: h(A)=10, h(B)=6, h(C)=4, h(D)=7, h(G)=0. Edges: A-B = 3, A-D = 5, B-C = 4, C-G = 5, D-G = 9.

(a) State the conditions for a heuristic to be admissible and consistent. Verify whether the given heuristic h(n) is admissible. (5)

(b) Apply the A* search algorithm to find the optimal path from A to G. Show the contents of the open and closed lists, and the f(n) = g(n) + h(n) values at each step. (8)

(c) Why is A* search guaranteed to be optimal when the heuristic is admissible? Explain briefly. (3)

(a) Admissibility and Consistency (5)

Admissible heuristic: h(n)h(n) never overestimates the true cost h(n)h^*(n) from node nn to the goal, i.e.

0h(n)h(n)for every node n.0 \le h(n) \le h^*(n) \quad \text{for every node } n.

Consistent (monotone) heuristic: for every node nn and every successor nn' reached by action with step cost c(n,n)c(n,n'),

h(n)c(n,n)+h(n),h(goal)=0.h(n) \le c(n,n') + h(n'), \qquad h(\text{goal}) = 0.

Consistency implies admissibility.

Verification of the given heuristic. The true (optimal) costs to G are:

Nodeh(n)h(n) givenTrue cost h(n)h^*(n)hhh \le h^*?
A10A→B→C→G = 3+4+5 = 12yes
B6B→C→G = 4+5 = 9yes
C4C→G = 5yes
D7D→G = 9yes
G00yes

Since h(n)h(n)h(n) \le h^*(n) for every node, the heuristic is admissible.

(b) A* Search from A to G (8)

We expand the node with the smallest f(n)=g(n)+h(n)f(n)=g(n)+h(n). Edges (undirected costs): A-B=3, A-D=5, B-C=4, C-G=5, D-G=9.

Step 1 – Expand A (g=0)

  • B: g=3, f = 3+6 = 9
  • D: g=5, f = 5+7 = 12

Open = {B(9), D(12)} , Closed = {A}

Step 2 – Expand B (smallest f=9, g=3)

  • C: g = 3+4 = 7, f = 7+4 = 11

Open = {C(11), D(12)} , Closed = {A, B}

Step 3 – Expand C (f=11, g=7)

  • G: g = 7+5 = 12, f = 12+0 = 12

Open = {G(12), D(12)} , Closed = {A, B, C}

Step 4 – Expand G (f=12) → Goal reached.

(The alternative path A→D→G has g = 5+9 = 14, f = 14, which is larger and never chosen.)

Optimal path: A → B → C → G with total cost 12.

StepNode expandedOpen list (node, f)
1AB(9), D(12)
2BC(11), D(12)
3CG(12), D(12)
4Ggoal

(c) Why A* is Optimal with an Admissible Heuristic (3)

With an admissible heuristic, f(n)=g(n)+h(n)f(n)=g(n)+h(n) never overestimates the true cost of the cheapest solution through nn. Suppose a sub-optimal goal G2G_2 (cost C2>CC_2 > C^*) is on the open list at the same time as a node nn on an optimal path. Then f(G2)=g(G2)=C2>Cf(n)f(G_2)=g(G_2)=C_2 > C^* \ge f(n), so A* always expands nn before G2G_2. Hence A* can never select a sub-optimal goal for expansion before reaching the optimal one, guaranteeing an optimal solution (for tree search; graph search additionally needs consistency).

informed-searcha-starheuristics
3long12 marks

(a) Represent the following statements in First Order Predicate Logic: "All students who study hard pass the exam. Ram is a student. Ram studies hard." Hence prove that "Ram passes the exam" using resolution refutation. Show the conversion to clausal (CNF) form and the unification steps. (8)

(b) Explain the role of unification and the Most General Unifier (MGU) in resolution-based inference, with an example. **(4)

(a) FOPL Representation and Resolution Refutation (8)

Statements in First Order Predicate Logic

  1. All students who study hard pass the exam:
x[(Student(x)StudiesHard(x))Passes(x)]\forall x\,[\,(Student(x) \wedge StudiesHard(x)) \rightarrow Passes(x)\,]
  1. Ram is a student: Student(Ram)Student(Ram)
  2. Ram studies hard: StudiesHard(Ram)StudiesHard(Ram)

Goal to prove: Passes(Ram)Passes(Ram).

Conversion to clausal (CNF) form

Eliminate implication in (1): x[¬Student(x)¬StudiesHard(x)Passes(x)]\forall x\,[\,\neg Student(x) \vee \neg StudiesHard(x) \vee Passes(x)\,]. The clauses are:

  • C1: ¬Student(x)¬StudiesHard(x)Passes(x)\neg Student(x) \vee \neg StudiesHard(x) \vee Passes(x)
  • C2: Student(Ram)Student(Ram)
  • C3: StudiesHard(Ram)StudiesHard(Ram)

Negate the goal (refutation): add C4: ¬Passes(Ram)C4:\ \neg Passes(Ram).

Resolution steps

#ResolveUnifier (MGU)Resolvent
1C1 & C2{x/Ram}\{x/Ram\}¬StudiesHard(Ram)Passes(Ram)\neg StudiesHard(Ram) \vee Passes(Ram) (C5)
2C5 & C3Passes(Ram)Passes(Ram) (C6)
3C6 & C4\square (empty clause)

The derivation of the empty clause \square shows a contradiction, so the negated goal is false; therefore Passes(Ram)Passes(Ram) is proved.

(b) Unification and the Most General Unifier (4)

Unification is the process of finding a substitution θ\theta that makes two literals (or terms) syntactically identical. It is essential in resolution because two clauses can be resolved only when a literal in one and the complementary literal in the other can be made equal by such a substitution.

The Most General Unifier (MGU) is the least constraining unifier: any other unifier can be obtained from it by an additional substitution. Using the MGU keeps inferences as general as possible.

Example. Unify Knows(John,x)Knows(John, x) and Knows(y,Mother(y))Knows(y, Mother(y)):

  • y/Johny/John, then x/Mother(John)x/Mother(John).
  • MGU θ={y/John, x/Mother(John)}\theta = \{y/John,\ x/Mother(John)\}, giving the common instance Knows(John,Mother(John))Knows(John, Mother(John)).

A more specific unifier such as {y/John,x/Mother(John)}\{y/John, x/Mother(John)\} together with extra bindings would be less general; the MGU above is the most general one.

predicate-logicinferenceresolution
4long12 marks

(a) Draw the architecture of a feed-forward multilayer perceptron and explain how the backpropagation algorithm is used to train it. Derive the weight update rule for an output-layer weight using gradient descent. (8)

(b) Explain the role of an activation function in a neural network. Compare the sigmoid and ReLU activation functions, and state one advantage of each. **(4)

(a) Multilayer Perceptron and Backpropagation (8)

Architecture (feed-forward MLP)

  Inputs        Hidden layer        Output layer
   x1 ---\        (o)----\
   x2 ----+----->(o)-----+----->( o ) ---> y1
   x3 ---/        (o)----/        ( o ) ---> y2
            w_ij            w_jk

Neurons are arranged in layers; every neuron in one layer connects to every neuron in the next via weighted edges. Signals flow only forward (input → hidden → output). Each neuron computes a weighted sum plus bias and passes it through a non-linear activation ff.

Backpropagation training (idea)

  1. Forward pass: propagate the input through the network to compute the output y^\hat{y}.
  2. Compute error using a loss function, e.g. E=12k(tkok)2E = \tfrac{1}{2}\sum_k (t_k - o_k)^2.
  3. Backward pass: propagate the error gradient from the output layer back to the input layer using the chain rule, computing E/w\partial E/\partial w for each weight.
  4. Update weights by gradient descent: wwηEww \leftarrow w - \eta\,\dfrac{\partial E}{\partial w}. Repeat for many epochs until the error converges.

Derivation of the output-layer weight update

Let output neuron kk have net input netk=jwjkojnet_k=\sum_j w_{jk}o_j, output ok=f(netk)o_k=f(net_k), target tkt_k, and E=12k(tkok)2E=\tfrac12\sum_k(t_k-o_k)^2. By the chain rule:

Ewjk=Eokoknetknetkwjk.\frac{\partial E}{\partial w_{jk}} = \frac{\partial E}{\partial o_k}\cdot\frac{\partial o_k}{\partial net_k}\cdot\frac{\partial net_k}{\partial w_{jk}}.

Now Eok=(tkok)\dfrac{\partial E}{\partial o_k} = -(t_k-o_k), oknetk=f(netk)\dfrac{\partial o_k}{\partial net_k}=f'(net_k), and netkwjk=oj\dfrac{\partial net_k}{\partial w_{jk}}=o_j. Hence

Ewjk=(tkok)f(netk)oj.\frac{\partial E}{\partial w_{jk}} = -(t_k-o_k)\,f'(net_k)\,o_j.

Defining the output-layer error term δk=(tkok)f(netk)\delta_k=(t_k-o_k)f'(net_k), the weight update rule is

wjkwjk+ηδkoj\boxed{\,w_{jk} \leftarrow w_{jk} + \eta\,\delta_k\,o_j\,}

where η\eta is the learning rate. For a sigmoid, f(netk)=ok(1ok)f'(net_k)=o_k(1-o_k), so δk=(tkok)ok(1ok)\delta_k=(t_k-o_k)o_k(1-o_k).

(b) Activation Functions: Sigmoid vs ReLU (4)

An activation function introduces non-linearity into a neuron, enabling the network to learn complex, non-linear mappings. Without it, any stack of layers would collapse into a single linear transformation. It also bounds/transforms the neuron output.

Sigmoid σ(x)=11+ex\sigma(x)=\dfrac{1}{1+e^{-x}}ReLU f(x)=max(0,x)f(x)=\max(0,x)
Range(0, 1)[0, ∞)
GradientSaturates for large $x
CostExpensive (exponential)Cheap (threshold)

One advantage of each:

  • Sigmoid: smooth output in (0,1), useful as a probability at the output layer for binary classification.
  • ReLU: avoids vanishing gradients for positive inputs and is computationally cheap, giving faster training in deep networks.
neural-networksmachine-learning
B

Section B: Short Answer Questions

Attempt all / any as specified.

8 questions
5short8 marks

Compare Breadth-First Search (BFS) and Depth-First Search (DFS) on the basis of completeness, optimality, time complexity, and space complexity. Under what circumstances would you prefer Iterative Deepening Depth-First Search (IDDFS) over both, and why?

BFS vs DFS

Let bb = branching factor, dd = depth of the shallowest goal, mm = maximum depth of the search tree.

CriterionBFSDFS
CompletenessComplete (finds a goal if one exists, bb finite)Not complete in infinite/cyclic spaces; complete only in finite spaces
OptimalityOptimal if all step costs are equalNot optimal (may return a deeper, costlier solution)
Time complexityO(bd)O(b^d)O(bm)O(b^m)
Space complexityO(bd)O(b^d) – stores whole frontier (its weakness)O(bm)O(bm) – only one path + siblings (its strength)

Summary: BFS guarantees the shallowest (optimal for unit costs) solution but uses huge memory; DFS uses little memory but may go deep, miss shallow goals, and is neither complete nor optimal in general.

When to prefer IDDFS

Iterative Deepening DFS runs DFS with increasing depth limits 0,1,2,0,1,2,\dots until the goal is found. It is preferred when:

  • The search space is large or infinite and memory is limited, and
  • The goal depth is unknown.

It combines the best of both: like DFS it needs only O(bd)O(bd) space, and like BFS it is complete and optimal (for unit step costs) with time O(bd)O(b^d). The repeated re-expansion of shallow nodes adds only a small constant overhead, so IDDFS is the preferred uninformed strategy when both memory efficiency and a shallowest-goal guarantee are required.

uninformed-searchsearch-strategies
6short6 marks

Explain semantic networks and frames as knowledge representation schemes. Represent the fact "A sparrow is a bird that can fly and has wings; Tweety is a sparrow" using a semantic network, and show how property inheritance applies to Tweety.

Semantic Networks and Frames

Semantic network: a graphical knowledge-representation scheme in which nodes represent objects, concepts or events and labelled directed edges (links) represent relationships between them. Common links are is-a (subclass/inheritance), instance-of, and property links such as has-part or can.

Frames: a structured representation where knowledge about a stereotyped object/situation is stored as a record of slots (attributes) and their fillers (values), which may be default values, constraints, or procedures (demons). Frames can be linked in an inheritance hierarchy, so a frame can inherit slot values from its parent frame. Frames are essentially a more structured form of semantic networks.

Semantic Network for the Given Fact

           is-a
  Sparrow --------> Bird
     |                |
  instance-of    can ----> Fly
     |                |
   Tweety        has-part-> Wings

Links used:

  • Bird --can--> Fly
  • Bird --has-part--> Wings
  • Sparrow --is-a--> Bird
  • Tweety --instance-of--> Sparrow

Property Inheritance to Tweety

Properties are inherited downward through is-a / instance-of links. Since Tweety is-a Sparrow and Sparrow is-a Bird, Tweety inherits the properties attached to Bird:

  • Tweety can Fly (inherited from Bird)
  • Tweety has Wings (inherited from Bird)

Thus, without storing these facts on Tweety directly, the network deduces that Tweety can fly and has wings through inheritance.

knowledge-representationsemantic-networks
7short8 marks

(a) Draw and explain the basic architecture of an expert system, clearly describing the function of the knowledge base and the inference engine. (5)

(b) Differentiate between forward chaining and backward chaining inference with a suitable example. **(3)

(a) Architecture of an Expert System (5)

   +-----------+      +----------------+      +-----------------+
   |   User    |<---->| User Interface |<---->| Inference Engine|
   +-----------+      +----------------+      +--------+--------+
                          ^                            |
                          |                            v
                  +---------------+            +-----------------+
                  | Explanation   |            |  Knowledge Base |
                  | Facility      |            | (rules + facts) |
                  +---------------+            +-----------------+
                          ^                            ^
                          |                            |
                  +---------------+            +-----------------+
                  | Working Memory|            | Knowledge       |
                  | (facts)       |            | Acquisition     |<-- Expert
                  +---------------+            +-----------------+

Knowledge Base: stores the domain knowledge as a set of IF–THEN production rules and facts gathered from human experts. It is the heart of the expert system and is kept separate from the reasoning mechanism so that knowledge can be updated independently.

Inference Engine: the reasoning component that applies the rules in the knowledge base to the facts in working memory to derive new conclusions / recommendations. It performs match → select (conflict resolution) → execute cycles, using forward or backward chaining, and drives the problem-solving process.

Other components: working memory (current facts), user interface, explanation facility (justifies conclusions), and knowledge-acquisition module (adds new knowledge).

(b) Forward vs Backward Chaining (3)

Forward chainingBackward chaining
Data-driven – starts from known factsGoal-driven – starts from a hypothesis/goal
Applies rules whose premises match facts to derive new facts until the goal is reachedLooks for rules whose conclusion is the goal, then tries to prove their premises
Good for monitoring/design (many possible conclusions)Good for diagnosis (a specific goal to confirm)

Example. Rules: R1: IF fever AND cough THEN flu; R2: IF flu THEN take rest. Facts: fever, cough.

  • Forward: fever+cough ⇒ flu (R1) ⇒ take rest (R2). Conclusion derived from data.
  • Backward: to prove goal take rest, R2 needs flu; to prove flu, R1 needs fever and cough, which are known ⇒ goal confirmed.
expert-systemsknowledge-base
8short6 marks

What is a heuristic function? Explain the working of hill-climbing search and discuss the problems of local maxima, plateau, and ridges. State one technique to overcome these problems.

Heuristic Function

A heuristic function h(n)h(n) is a problem-specific function that estimates the cost (or distance) from node nn to the goal. It uses domain knowledge to guide a search toward promising states, reducing the number of nodes explored. (e.g. straight-line distance in route finding, number of misplaced tiles in the 8-puzzle).

Hill-Climbing Search

Hill climbing is a local search that continually moves in the direction of increasing value (or decreasing cost). Working:

  1. Start with an initial state.
  2. Evaluate neighbours using the heuristic.
  3. Move to the neighbour with the best (highest) heuristic value.
  4. Repeat until no neighbour is better than the current state; return the current state.

It keeps only the current node (no backtracking, no search tree), so it is memory-efficient but greedy.

Problems

  • Local maximum: a peak higher than its neighbours but lower than the global maximum; the algorithm stops here, missing the true optimum.
  • Plateau: a flat region where neighbours have equal value, giving no gradient to follow, so the search wanders or halts.
  • Ridge: a sequence of local maxima oriented diagonally; every single-step move lowers the value, so the search cannot climb the ridge efficiently.

Technique to Overcome

Use random-restart hill climbing (run hill climbing from many random initial states and keep the best result) or simulated annealing (occasionally accept worse moves with a probability that decreases over time), which let the search escape local maxima, plateaus and ridges.

heuristicslocal-search
9short6 marks

Differentiate between supervised, unsupervised, and reinforcement learning with one application example of each. How does the problem of overfitting arise in supervised learning, and how can it be reduced?

Types of Machine Learning

TypeTraining dataGoalExample application
SupervisedLabelled data (input–output pairs)Learn a mapping from inputs to known outputsEmail spam classification; house-price prediction
UnsupervisedUnlabelled dataDiscover hidden structure / groupingsCustomer segmentation (clustering); dimensionality reduction
ReinforcementNo fixed dataset; an agent interacts with an environment and receives rewardsLearn a policy that maximises cumulative rewardGame playing (e.g. AlphaGo); robot navigation

Overfitting in Supervised Learning

Overfitting occurs when a model learns the training data too well, capturing not only the underlying pattern but also the noise and random fluctuations. As a result it gives very low training error but high test/generalisation error — it performs poorly on unseen data. It typically arises when the model is too complex (too many parameters) relative to the amount of training data, or when training runs too long.

Ways to reduce overfitting:

  • Use more training data.
  • Cross-validation to tune and select models.
  • Regularisation (L1/L2) to penalise large weights.
  • Pruning (decision trees) / dropout and early stopping (neural networks).
  • Reduce model complexity / number of features.
machine-learningsupervised-learning
10short6 marks

List and briefly explain the major stages of Natural Language Processing (NLP): lexical (morphological) analysis, syntactic analysis, semantic analysis, and pragmatic analysis. Give one example illustrating ambiguity handled at the syntactic level.

Major Stages of Natural Language Processing

  1. Lexical / Morphological analysis — Breaks the input text into tokens (words) and analyses the structure of individual words (root, prefix, suffix, part of speech). E.g. "unhappiness"un- + happy + -ness. Identifies valid words and their basic grammatical category.

  2. Syntactic analysis (parsing) — Arranges the tokens into a grammatical structure (parse tree) according to the rules of grammar, checking that the sentence is well-formed and revealing relationships such as subject–verb–object. Sentences that violate grammar (e.g. "Boy the apple eat") are rejected.

  3. Semantic analysis — Derives the literal meaning of the sentence by mapping syntactic structures to meaning, checking that combinations of words make sense (e.g. "colourless green ideas" is grammatical but semantically anomalous).

  4. Pragmatic analysis — Interprets the sentence in context, using real-world knowledge to find the intended meaning (e.g. "Can you pass the salt?" is a request, not a question about ability).

Example of Syntactic-Level Ambiguity

Sentence: "I saw the man with the telescope."

This has two valid parse trees (structural / syntactic ambiguity):

  • with the telescope attaches to sawI used a telescope to see the man.
  • with the telescope attaches to the manthe man who had a telescope was seen.

Resolving which prepositional-phrase attachment is correct is an ambiguity handled at the syntactic level.

nlpnatural-language-processing
11short6 marks

Explain the Minimax algorithm for two-player games using a suitable game tree. How does alpha-beta pruning improve its efficiency? Illustrate which branches are pruned in a small example tree.

Minimax Algorithm

Minimax is a decision rule for two-player, zero-sum, perfect-information games (e.g. tic-tac-toe, chess). The two players are MAX (tries to maximise the score) and MIN (tries to minimise it). Assuming both play optimally:

  • At MAX nodes choose the child with the maximum value.
  • At MIN nodes choose the child with the minimum value.
  • At terminal/leaf nodes use the utility (evaluation) function. Values are computed bottom-up to give the optimal move at the root.

Example game tree (leaf utilities shown):

            MAX
           /    \
        MIN      MIN
        / \      / \
       3   5    6   9
  • Left MIN = min(3,5) = 3
  • Right MIN = min(6,9) = 6
  • Root MAX = max(3,6) = 6 → MAX chooses the right branch.

Alpha-Beta Pruning

Alpha-beta pruning returns the same minimax value but skips branches that cannot influence the final decision, so it explores far fewer nodes.

  • α\alpha = best (highest) value found so far for MAX along the path.
  • β\beta = best (lowest) value found so far for MIN along the path.
  • Prune whenever αβ\alpha \ge \beta (further siblings cannot change the result).

With perfect move ordering, time improves from O(bd)O(b^d) to about O(bd/2)O(b^{d/2}), effectively doubling the search depth.

Pruning illustration (same tree):

            MAX
           /    \
        MIN      MIN
        / \      / \
       3   5    6   [9]
  • Evaluate left MIN = 3, so root has α=3\alpha = 3.
  • At right MIN, first child = 6. Since the right MIN will be 6\le 6 and could only lower its value, examine next child 9: 9 > 6 so MIN keeps 6. (If instead the first right child had been 3\le 3, the remaining sibling would be pruned because that MIN value α\le \alpha could never beat the left branch's 3 at MAX.)

General rule: a sibling is pruned once the running MIN value drops to α\le \alpha (or a running MAX value rises to β\ge \beta), because the opponent would never let that branch be reached.

adversarial-searchgame-playing
12short4 marks

Convert the following First Order Logic sentences into Conjunctive Normal Form (CNF), showing the steps of eliminating implications, moving negations inward, skolemization, and dropping universal quantifiers:

(i) ∀x (Person(x) → ∃y (Loves(x, y)))

(ii) ∀x (Bird(x) ∧ ¬Penguin(x) → CanFly(x))

Conversion of FOL Sentences to CNF

(i) x(Person(x)yLoves(x,y))\forall x\,(Person(x) \rightarrow \exists y\,Loves(x, y))

Step 1 – Eliminate implication (AB¬ABA\rightarrow B \equiv \neg A \vee B):

x(¬Person(x)yLoves(x,y))\forall x\,(\neg Person(x) \vee \exists y\,Loves(x, y))

Step 2 – Move negations inward: already in negation normal form (no change).

Step 3 – Skolemize. yy is existentially quantified inside the scope of x\forall x, so replace it with a Skolem function f(x)f(x) depending on xx:

x(¬Person(x)Loves(x,f(x)))\forall x\,(\neg Person(x) \vee Loves(x, f(x)))

Step 4 – Drop universal quantifiers:

¬Person(x)Loves(x,f(x))\boxed{\neg Person(x) \vee Loves(x, f(x))}

This is a single clause in CNF. (Reading: every person loves someone f(x)f(x), the person they love.)

(ii) x(Bird(x)¬Penguin(x)CanFly(x))\forall x\,(Bird(x) \wedge \neg Penguin(x) \rightarrow CanFly(x))

Step 1 – Eliminate implication:

x(¬(Bird(x)¬Penguin(x))CanFly(x))\forall x\,(\neg(Bird(x) \wedge \neg Penguin(x)) \vee CanFly(x))

Step 2 – Move negations inward (De Morgan, double negation):

x(¬Bird(x)Penguin(x)CanFly(x))\forall x\,(\neg Bird(x) \vee Penguin(x) \vee CanFly(x))

Step 3 – Skolemize: no existential quantifiers → nothing to do.

Step 4 – Drop universal quantifiers:

¬Bird(x)Penguin(x)CanFly(x)\boxed{\neg Bird(x) \vee Penguin(x) \vee CanFly(x)}

This is a single disjunctive clause already in CNF.

predicate-logicknowledge-representation

Frequently asked questions

Where can I find the BE Computer Engineering (IOE, TU) Artificial Intelligence (IOE, CT 653) question paper 2078?
The full BE Computer Engineering (IOE, TU) Artificial Intelligence (IOE, CT 653) 2078 (regular) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.
Does the Artificial Intelligence (IOE, CT 653) 2078 paper come with solutions?
Yes. Every question on this Artificial Intelligence (IOE, CT 653) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.
How many marks is the BE Computer Engineering (IOE, TU) Artificial Intelligence (IOE, CT 653) 2078 paper?
The BE Computer Engineering (IOE, TU) Artificial Intelligence (IOE, CT 653) 2078 paper carries 80 full marks and is meant to be completed in 180 minutes, across 12 questions.
Is practising this Artificial Intelligence (IOE, CT 653) past paper free?
Yes — reading and attempting this Artificial Intelligence (IOE, CT 653) past paper on Kekkei is completely free.