BSc CSIT (TU) Science Image Processing (BSc CSIT, CSC413) Question Paper 2080 Nepal

Q: Where can I find the BSc CSIT (TU) Image Processing (BSc CSIT, CSC413) question paper 2080?

The full BSc CSIT (TU) Image Processing (BSc CSIT, CSC413) 2080 (Regular (annual)) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.

Q: Does the Image Processing (BSc CSIT, CSC413) 2080 paper come with solutions?

Yes. Every question on this Image Processing (BSc CSIT, CSC413) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.

Q: How many marks is the BSc CSIT (TU) Image Processing (BSc CSIT, CSC413) 2080 paper?

The BSc CSIT (TU) Image Processing (BSc CSIT, CSC413) 2080 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.

Q: Is practising this Image Processing (BSc CSIT, CSC413) past paper free?

Yes — reading and attempting this Image Processing (BSc CSIT, CSC413) past paper on Kekkei is completely free.

Question

1. Long answer (10 marks)

Explain image enhancement techniques in the spatial and frequency domains with suitable examples and their comparative advantages.


Answer 1

Image Enhancement

Image enhancement is the process of improving the visual appearance of an image (or making it more suitable for a specific analysis task) by accentuating certain features while suppressing others. The two broad approaches are spatial-domain and frequency-domain methods.

1. Spatial-Domain Techniques

These operate directly on the pixels of the image. A spatial operation is expressed as:

$$g(x,y) = T[f(x,y)]$$

where $f$ is the input image, $g$ is the output image, and $T$ is an operator defined over a neighbourhood of $(x,y)$.

(a) Point processing (intensity transformations) — $T$ acts on a single pixel.

Negative: $s = L-1-r$ (highlights white detail in dark regions).
Log transform: $s = c\log(1+r)$ (expands dark values, compresses bright ones).
Power-law (gamma): $s = c\,r^{\gamma}$ .
Contrast stretching and thresholding.
Histogram equalization: redistributes intensities to span the full range for better contrast.
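
As an illustration of point processing, histogram equalization can be sketched in a few lines. This is a minimal sketch, not a reference implementation: it assumes an 8-bit grayscale image stored as a list of rows, and the function name `equalize` and the toy image are made up for the example. Each level $r_k$ is mapped to $s_k=(L-1)\,\mathrm{CDF}(r_k)$.

```python
def equalize(image, levels=256):
    """Histogram-equalize a grayscale image stored as a list of rows of ints."""
    pixels = [p for row in image for p in row]
    n = len(pixels)

    # Histogram: how many pixels take each gray level.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Map each level r_k to s_k = (L - 1) * CDF(r_k), rounded to an integer.
    mapping = []
    running = 0
    for count in hist:
        running += count
        mapping.append(round((levels - 1) * running / n))

    return [[mapping[p] for p in row] for row in image]


# Low-contrast toy image (assumed data): only levels 100..103 are used.
img = [[100, 100, 101, 101],
       [102, 102, 103, 103]]
out = equalize(img)
```

After equalization the four input levels are spread over most of the 0 to 255 range while keeping their order, which is the contrast gain described above.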

(b) Spatial filtering (mask/neighbourhood processing) — a kernel is convolved with the image.

Smoothing (low-pass): averaging / Gaussian masks reduce noise, at the cost of blurring the image.
Sharpening (high-pass): Laplacian, unsharp masking, high-boost filtering enhance edges.
Order-statistic filters: median filter for impulse noise.

2. Frequency-Domain Techniques

The image is transformed with the 2-D DFT, $F(u,v)=\mathcal{F}\{f(x,y)\}$, multiplied by a filter transfer function $H(u,v)$, and transformed back:

$$g(x,y) = \mathcal{F}^{-1}\{H(u,v)\,F(u,v)\}$$

Low-pass filters (Ideal, Butterworth, Gaussian) attenuate high frequencies → smoothing / noise removal / blurring.
High-pass filters attenuate low frequencies → sharpening / edge enhancement.
Homomorphic filtering simultaneously compresses dynamic range and enhances contrast by separating illumination and reflectance.
Band-pass / band-reject (notch) filters remove periodic noise.

3. Comparative Advantages

Aspect	Spatial domain	Frequency domain
Operation	Direct on pixels (convolution)	Multiplication after DFT
Cost	Cheap for small kernels	Cheaper for large kernels (2-D FFT: $O(N^2\log N)$ for an $N\times N$ image)
Intuition	Easy, local control	Frequency-selective, global control
Best for	Point ops, small smoothing/sharpening masks	Periodic-noise removal, homomorphic filtering, large-support filters
Design	Kernel design can be ad hoc	Filter directly specified by frequency response

Example: Removing periodic interference (e.g. scan-line noise) is awkward in the spatial domain but trivial with a frequency-domain notch filter; conversely a quick 3×3 median to kill salt-and-pepper noise is simplest in the spatial domain.

Conclusion: Spatial methods are simple and computationally light for local operations, whereas frequency methods give precise, global, frequency-selective control and are efficient for large filters via the FFT. The two are linked by the convolution theorem, so a designer chooses the domain that is more convenient for the task.

Answer 2

Image Transforms

An image transform maps an image from the spatial domain into another domain (usually a frequency / sequency domain) using a set of orthogonal basis functions. The forward transform expresses the image as a weighted sum of these basis images; the coefficients reveal information (frequency content, energy compaction) that is hard to see in the pixel domain and is useful for compression, filtering, feature extraction and enhancement. A general 2-D linear transform of an $N\times N$ image is:

$$T(u,v)=\sum_{x=0}^{N-1}\sum_{y=0}^{N-1} f(x,y)\,g(x,y,u,v)$$

where $g(\cdot)$ is the forward transformation kernel. The transform is separable when the kernel factors as $g(x,y,u,v)=g_1(x,u)\,g_2(y,v)$, which holds for all four transforms below.

1. Fourier Transform (DFT)

Uses complex sinusoidal basis functions. The 2-D DFT is:

$$F(u,v)=\frac{1}{MN}\sum_{x=0}^{M-1}\sum_{y=0}^{N-1} f(x,y)\,e^{-j2\pi(ux/M+vy/N)}$$

Produces magnitude (energy at each frequency) and phase (structure/position).
Applications: frequency-domain filtering, fast convolution, periodic-noise removal and image analysis; in practice it is computed via the FFT.

2. Discrete Cosine Transform (DCT)

Uses real cosine basis functions:

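$$C(u,v)=\alpha(u)\alpha(v)\sum_{x=0}^{N-1}\sum_{y=0}^{N-1} f(x,y)\cos\!\frac{(2x+1)u\pi}{2N}\cos\!\frac{(2y+1)v\pi}{2N}$$

where, for an $N\times N$ image in the orthonormal form, $\alpha(0)=\sqrt{1/N}$ and $\alpha(u)=\sqrt{2/N}$ for $u\neq 0$.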

Excellent energy compaction — most signal energy packs into a few low-frequency coefficients; real-valued (no phase).
Applications: the heart of JPEG image compression (8×8 block DCT), MPEG video.

3. Walsh–Hadamard Transform

Basis functions are the Hadamard matrix entries, taking only the values $+1$ and $-1$:

$$H_{2N}=\begin{bmatrix}H_N & H_N\\ H_N & -H_N\end{bmatrix},\quad H_1=[1]$$

No multiplications are needed (only additions/subtractions), so it is very fast and cheap in hardware. In the Walsh ordering, the basis functions are arranged by sequency (number of sign changes) rather than frequency.
Applications: low-cost compression, image coding, spread-spectrum, fast computation.
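
The recursion for $H_{2N}$ can be sketched directly in code. This is a minimal sketch under stated assumptions: the helper names `hadamard`, `sequency` and `wht` are hypothetical, the transform is left unnormalized, and the rows come out in natural (Hadamard) order, so they would still need sorting by sequency to obtain the Walsh ordering.

```python
def hadamard(n):
    """Return the n x n Hadamard matrix (n a power of two), built with the
    recursion H_2N = [[H_N, H_N], [H_N, -H_N]] starting from H_1 = [1]."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        top = [row + row for row in h]
        bottom = [row + [-v for v in row] for row in h]
        h = top + bottom
    return h


def sequency(row):
    """Number of sign changes along one basis row."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)


def wht(signal):
    """Unnormalized 1-D Walsh-Hadamard transform in natural order.
    Only additions and subtractions of the samples are used."""
    h = hadamard(len(signal))
    return [sum(x if s > 0 else -x for s, x in zip(row, signal)) for row in h]


H4 = hadamard(4)
row_sequencies = [sequency(row) for row in H4]   # natural order, not sorted
coeffs = wht([4, 2, 2, 0])
```

For $N=4$ the natural-order rows have sequencies 0, 3, 1, 2, which shows why a reordering step is needed before the coefficients can be read as increasing sequency.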

4. Haar Transform

The oldest wavelet transform; basis functions are rectangular Haar wavelets that are localized in both space and frequency.

Captures both average (approximation) and detail (edges) at multiple resolutions.
Applications: multiresolution analysis, edge detection, the basis of wavelet-based compression (JPEG-2000-style), feature extraction.

Comparison

Transform	Basis	Real/Complex	Strength
Fourier	Complex sinusoids	Complex	Frequency analysis, filtering
DCT	Cosines	Real	Best energy compaction (JPEG)
Hadamard	±1 (Walsh)	Real	Fastest, cheapest
Haar	Wavelets	Real	Localized, multiresolution

Conclusion: All four are orthogonal transforms used for compression and analysis. Fourier suits frequency filtering, DCT dominates practical compression, Hadamard wins on computational cost, and Haar provides spatial-frequency localization for multiresolution processing.

Answer 3

Image Segmentation

Image segmentation is the process of partitioning an image into multiple meaningful, non-overlapping regions (sets of pixels) that are homogeneous with respect to some property (intensity, colour, texture), so that objects and boundaries can be identified. Formally the image $R$ is divided into regions $R_1,R_2,\dots,R_n$ such that:

$\bigcup_{i} R_i = R$ (complete),
$R_i \cap R_j = \varnothing$ for $i\neq j$ (disjoint),
each $R_i$ is connected and satisfies a homogeneity predicate $P(R_i)=\text{TRUE}$ , while $P(R_i\cup R_j)=\text{FALSE}$ for adjacent regions.

Segmentation algorithms exploit two basic properties of intensity: discontinuity (edges) and similarity (regions/thresholds).

1. Thresholding

Separates objects from background using one or more intensity thresholds. For a single threshold $T$ :

$$g(x,y)=\begin{cases}1 & f(x,y)>T\\ 0 & f(x,y)\le T\end{cases}$$

Global thresholding: a single $T$ for the whole image; $T$ chosen from the histogram. Otsu's method picks $T$ that maximises between-class variance.
Adaptive/local thresholding: $T$ varies across the image to handle uneven illumination.
Pros: simple, fast, works well for bimodal histograms. Cons: sensitive to noise and illumination; ignores spatial context.
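
Global thresholding with Otsu's criterion can be sketched as follows. This is a minimal sketch, assuming an 8-bit image stored as a list of rows; the names `otsu_threshold` and `threshold` and the toy image are made up for the example. The score $w_0 w_1(\mu_0-\mu_1)^2$ is proportional to the between-class variance, so maximising it picks the same $T$.

```python
def otsu_threshold(image, levels=256):
    """Return the threshold T that maximises the between-class variance."""
    pixels = [p for row in image for p in row]
    n = len(pixels)
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total_sum = sum(i * h for i, h in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0     # number of pixels with value <= t (class 0)
    sum0 = 0   # sum of intensities in class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = n - w0
        if w0 == 0 or w1 == 0:
            continue   # one class is empty, the variance is undefined
        mu0 = sum0 / w0
        mu1 = (total_sum - sum0) / w1
        score = w0 * w1 * (mu0 - mu1) ** 2   # proportional to between-class variance
        if score > best_score:
            best_score, best_t = score, t
    return best_t


def threshold(image, t):
    """g(x,y) = 1 where f(x,y) > T, else 0."""
    return [[1 if p > t else 0 for p in row] for row in image]


# Toy bimodal image (assumed data): dark values near 10, bright values near 200.
img = [[10, 12, 200],
       [11, 210, 205]]
T = otsu_threshold(img)
mask = threshold(img, T)
```

On this toy image every $T$ between the two clusters gives the same score, and the sketch keeps the first one it meets.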

2. Edge-Based Segmentation

Detects boundaries where intensity changes abruptly (discontinuity), then links edges into contours.

First-derivative (gradient) operators: Roberts, Prewitt, Sobel detect edges as gradient maxima; magnitude $|\nabla f|=\sqrt{G_x^2+G_y^2}$ .
Second-derivative: Laplacian / Laplacian of Gaussian (LoG, Marr–Hildreth) find edges at zero-crossings.
Canny edge detector: Gaussian smoothing → gradient → non-maximum suppression → double thresholding with hysteresis; gives thin, well-localized edges.
Detected edge pixels are then linked (local processing / Hough transform) to form region boundaries.
Pros: good localization of boundaries. Cons: sensitive to noise, edges may be broken/incomplete.
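
The gradient-magnitude step can be sketched with the Sobel masks. This is a minimal sketch, assuming a grayscale image stored as a list of rows; the names `apply_mask` and `sobel_magnitude` are hypothetical, border pixels are simply left at zero, and the masks are applied by correlation (which only flips the sign of $G_x$ and $G_y$, not the magnitude).

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to vertical edges
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to horizontal edges


def apply_mask(image, mask, r, c):
    """Sum of products of a 3x3 mask with the neighbourhood centred at (r, c)."""
    return sum(mask[i][j] * image[r + i - 1][c + j - 1]
               for i in range(3) for j in range(3))


def sobel_magnitude(image):
    """Gradient magnitude sqrt(Gx^2 + Gy^2); the one-pixel border stays 0."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = apply_mask(image, SOBEL_X, r, c)
            gy = apply_mask(image, SOBEL_Y, r, c)
            out[r][c] = math.hypot(gx, gy)
    return out


# Toy image (assumed data): a vertical step edge between columns 2 and 3.
img = [[0, 0, 0, 10, 10, 10] for _ in range(5)]
mag = sobel_magnitude(img)
```

The response is non-zero only in the two columns next to the step and zero in the flat areas; thresholding `mag` would give the candidate edge pixels that are then linked into boundaries.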

3. Region-Based Segmentation

Groups pixels directly into regions based on similarity.

Region growing: starts from seed pixels and appends neighbouring pixels that satisfy a similarity criterion (e.g. intensity within a tolerance) until no more pixels qualify.
Region splitting and merging: uses a quadtree — split a region into quadrants if it is not homogeneous ( $P(R_i)=\text{FALSE}$ ), then merge adjacent homogeneous regions.
Pros: produces connected regions, more robust to noise, uses spatial connectivity. Cons: needs seed/criterion selection, can be computationally heavy, sensitive to seed choice.
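
Region growing from a single seed can be sketched as a breadth-first search. This is a minimal sketch under stated assumptions: 4-connectivity, a similarity criterion of "within `tol` of the seed intensity", and a hypothetical function name `region_grow`; other criteria (for example comparing with the running region mean) are equally valid.

```python
from collections import deque


def region_grow(image, seed, tol):
    """Grow a 4-connected region from `seed`, adding neighbours whose
    intensity is within `tol` of the seed pixel's intensity."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(image[nr][nc] - seed_value) <= tol):
                region.add((nr, nc))
                frontier.append((nr, nc))
    return region


# Toy image (assumed data): a dark 2x2 block plus one isolated dark pixel.
img = [[10, 11, 50],
       [12, 10, 52],
       [51, 53, 11]]
region = region_grow(img, seed=(0, 0), tol=5)
```

The pixel at (2, 2) has a similar intensity but is not connected to the seed, so it is left out; this is the spatial-connectivity advantage over plain thresholding.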

Summary

Method	Based on	Strength	Weakness
Thresholding	Similarity (intensity)	Simple, fast	Poor with non-uniform illumination
Edge-based	Discontinuity	Good boundary localization	Broken edges, noise-sensitive
Region-based	Similarity + connectivity	Connected, robust regions	Slow, seed-dependent

Conclusion: Thresholding and region methods rely on pixel similarity, while edge methods rely on discontinuity. In practice they are often combined (e.g. edges + region growing) for reliable segmentation.

Answer 4

Spatial resolution refers to the smallest discernible detail in an image — it is determined by the number of pixels (sampling density), usually stated as rows × columns or dots-per-inch (dpi). More pixels (finer sampling) → higher spatial resolution and more spatial detail; reducing it causes blocky/pixelated images.

Intensity (gray-level) resolution refers to the smallest discernible change in gray level — it is determined by the number of bits used to quantize the intensity. For $k$ bits there are $L=2^{k}$ gray levels (e.g. 8 bits → 256 levels). More levels → finer tonal/brightness detail; reducing it causes false contouring (banding).

	Spatial resolution	Intensity resolution
Determined by	Sampling (number of pixels)	Quantization (bits per pixel)
Measures	Spatial detail	Brightness/gray-level detail
Degradation effect	Pixelation/blockiness	False contouring (banding)

In short, spatial resolution = how many pixels, while intensity resolution = how many gray levels per pixel.

Answer 5

Power-law (gamma) transformation is a point intensity transformation of the form:

$$s = c\,r^{\gamma}$$

where $r$ is the input intensity (often normalized to $[0,1]$), $s$ is the output intensity, and $c$ and $\gamma$ are positive constants.

$\gamma < 1$ : maps a narrow range of dark input values to a wider range of output values → brightens the image and expands contrast in dark regions.
$\gamma > 1$ : maps a narrow range of bright input values to a wider output range → darkens the image and expands contrast in bright regions.
$\gamma = 1$ : linear (identity) mapping.

Gamma correction uses this law to compensate for the non-linear response of display devices (CRT/LCD), which inherently follow $s\propto r^{\gamma}$ with $\gamma\approx 1.8$ to $2.5$; the image is pre-corrected with the inverse power law $s=r^{1/\gamma}$ so that the displayed image looks correct. The power-law transform is also widely used for general contrast manipulation, MRI/X-ray enhancement and aerial-image enhancement.

Unlike the fixed log transform, gamma gives a family of curves controlled by $\gamma$ , so the amount of brightening/darkening can be tuned.
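
The effect of $\gamma$ can be checked numerically. This is a minimal sketch, assuming 8-bit intensities that are normalized to $[0,1]$ before applying $s=c\,r^{\gamma}$ and rescaled afterwards; the function name `gamma_transform` and the sample values are made up for the example.

```python
def gamma_transform(image, gamma, c=1.0, levels=256):
    """Apply s = c * r**gamma to every pixel, with r normalized to [0, 1]."""
    top = levels - 1

    def apply(p):
        s = c * (p / top) ** gamma                 # power law on normalized intensity
        return min(top, max(0, round(s * top)))    # rescale and clip to [0, L-1]

    return [[apply(p) for p in row] for row in image]


img = [[0, 64, 128, 255]]
brighter = gamma_transform(img, 0.5)   # gamma < 1 lifts dark values
darker = gamma_transform(img, 2.0)     # gamma > 1 pushes values down
same = gamma_transform(img, 1.0)       # gamma = 1 is the identity
```

With $c=1$ the end points 0 and 255 stay fixed for any $\gamma$, while the input level 64 moves up to 128 for $\gamma=0.5$ and down to 16 for $\gamma=2$.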

Answer 6

A median filter is a non-linear, order-statistic spatial filter used mainly to remove impulse (salt-and-pepper) noise while preserving edges.

Operation: A window (e.g. 3×3) slides over the image. For each pixel, the gray values inside the window are sorted and the median value replaces the centre pixel:

$$g(x,y)=\operatorname{median}\{ f(s,t) : (s,t)\in S_{xy}\}$$

where $S_{xy}$ is the set of window coordinates centred at $(x,y)$.

Why it works for noise removal:

Salt-and-pepper noise creates extreme (very high or very low) outlier pixels. Outliers fall at the ends of the sorted list, so the median rejects them, whereas a mean (averaging) filter would smear them across neighbours.
Because the output is an actual pixel value from the neighbourhood (not an average), the median filter preserves sharp edges and causes far less blurring than a linear averaging filter.

Limitations: Less effective against Gaussian noise; very large windows blur fine detail; computationally costlier (requires sorting).

Summary: The median filter is the preferred choice for removing impulse noise with good edge preservation.
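
The operation can be sketched as follows. This is a minimal sketch, assuming an odd square window and a grayscale image stored as a list of rows; the function name `median_filter` is hypothetical and border pixels are simply copied unchanged, which is only one of several possible border policies.

```python
def median_filter(image, size=3):
    """Replace each interior pixel by the median of its size x size window."""
    k = size // 2
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]   # border pixels are left as they are
    for r in range(k, rows - k):
        for c in range(k, cols - k):
            window = sorted(image[r + i][c + j]
                            for i in range(-k, k + 1)
                            for j in range(-k, k + 1))
            out[r][c] = window[len(window) // 2]   # middle of the sorted list
    return out


# Toy image (assumed data): a step edge (10 | 200) with one "salt" pixel at (2, 1).
img = [[10, 10, 10, 200, 200],
       [10, 10, 10, 200, 200],
       [10, 255, 10, 200, 200],
       [10, 10, 10, 200, 200],
       [10, 10, 10, 200, 200]]
out = median_filter(img)
```

The impulse at (2, 1) is replaced by 10, while the pixels on either side of the step keep their values of 10 and 200, so the edge is not smeared.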

Answer 7

Properties of the 2-D DFT

For an $M\times N$ image $f(x,y)$ with transform $F(u,v)=\frac{1}{MN}\sum_{x}\sum_{y} f(x,y)e^{-j2\pi(ux/M+vy/N)}$ , the key properties are:

Separability: the 2-D DFT can be computed as two successive 1-D DFTs — first along rows, then along columns — allowing use of the FFT for speed.
Linearity: $\mathcal{F}\{a f_1 + b f_2\}=aF_1+bF_2$ .
Translation (shift): a spatial shift adds a phase term, $f(x-x_0,y-y_0)\leftrightarrow F(u,v)e^{-j2\pi(ux_0/M+vy_0/N)}$ ; multiplying $f$ by an exponential shifts the spectrum (used to centre the spectrum with $(-1)^{x+y}$ ).
Periodicity: $F(u,v)=F(u+M,v)=F(u,v+N)$ — the DFT is periodic in both directions.
Conjugate symmetry: for a real image, $F(u,v)=F^{*}(-u,-v)$ , so $|F(u,v)|=|F(-u,-v)|$ (magnitude spectrum is symmetric).
Rotation: rotating the image by an angle rotates its spectrum by the same angle (shown in polar coordinates).
Convolution theorem: spatial convolution corresponds to multiplication in the frequency domain, $f(x,y)*h(x,y)\leftrightarrow F(u,v)H(u,v)$ (basis of frequency-domain filtering).
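Distributivity, scaling and DC term: the DFT distributes over addition but not over multiplication; $a\,f(x,y)\leftrightarrow a\,F(u,v)$, and scaling the spatial coordinates inversely scales the frequency axes; the DC term $F(0,0)=\frac{1}{MN}\sum_{x}\sum_{y} f(x,y)$ is the average intensity.
Parseval (energy conservation): total energy is preserved between domains up to a constant factor; with the $\frac{1}{MN}$ convention used here, $\sum_{x}\sum_{y}|f(x,y)|^{2}=MN\sum_{u}\sum_{v}|F(u,v)|^{2}$.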

These properties make the 2-D DFT a practical tool for filtering, convolution and analysis.
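
Several of these properties can be verified numerically on a tiny image. This is a minimal sketch, assuming a naive $O(M^2N^2)$ DFT with the $\frac{1}{MN}$ factor on the forward transform as in the definition above; the function name `dft2` and the 4×4 test image are made up for the example, and a real application would use an FFT library instead.

```python
import cmath


def dft2(f):
    """Naive 2-D DFT with the 1/(MN) factor on the forward transform."""
    M, N = len(f), len(f[0])
    F = [[0j] * N for _ in range(M)]
    for u in range(M):
        for v in range(N):
            acc = 0j
            for x in range(M):
                for y in range(N):
                    acc += f[x][y] * cmath.exp(-2j * cmath.pi * (u * x / M + v * y / N))
            F[u][v] = acc / (M * N)
    return F


f = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
M, N = 4, 4
F = dft2(f)

# DC term: F(0,0) should equal the average intensity.
dc_error = abs(F[0][0] - sum(map(sum, f)) / (M * N))

# Conjugate symmetry for a real image (indices taken modulo M and N).
symmetry_error = max(abs(F[u][v] - F[-u % M][-v % N].conjugate())
                     for u in range(M) for v in range(N))

# Translation: a circular shift by (x0, y0) multiplies F by a pure phase term.
x0, y0 = 1, 2
shifted = [[f[(x - x0) % M][(y - y0) % N] for y in range(N)] for x in range(M)]
G = dft2(shifted)
shift_error = max(
    abs(G[u][v] - F[u][v] * cmath.exp(-2j * cmath.pi * (u * x0 / M + v * y0 / N)))
    for u in range(M) for v in range(N))

# Parseval with this convention: sum |f|^2 = MN * sum |F|^2.
energy_space = sum(p * p for row in f for p in row)
energy_freq = M * N * sum(abs(c) ** 2 for row in F for c in row)
```

All three error terms should come out at floating-point round-off level, and the two energy values should agree.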

Answer 8

All three are intensity-discontinuity detectors that use spatial masks (typically based on the second derivative / Laplacian), but they detect different geometric features.

Feature	What it detects	Idea / mask	Output
Point detection	An isolated point (single pixel) that differs sharply from its surroundings	Laplacian mask, e.g. centre $+8$, neighbours $-1$; flag pixel if $|R|\ge T$	Isolated points
Line detection	Thin lines one pixel wide in a particular orientation	Directional masks for horizontal, vertical, $+45^\circ$, $-45^\circ$; choose the largest response	Line segments
Edge detection	A boundary between two regions of different intensity (a step/ramp change)	Gradient (Sobel/Prewitt – first derivative) or Laplacian zero-crossings (second derivative)	Region boundaries

Distinguishing points:

A point is a 0-D discontinuity (single pixel), a line is a 1-D feature (thin, oriented), and an edge separates two extended regions.
Point and line detection typically use the Laplacian (second derivative); edge detection commonly uses the gradient (first derivative) whose magnitude peaks at edges, or the second derivative's zero-crossings.
Edges are the most common and important for segmentation; points and lines are special, sparser features.

Answer 9

The hit-or-miss transform (HMT) is a basic morphological operation used to detect specific shapes/patterns in a binary image — it locates pixel configurations that simultaneously match the foreground and match the background around them.

It uses a pair of disjoint structuring elements $B=(B_1,B_2)$ : $B_1$ must fit inside the object (a hit) and $B_2$ must fit inside the background/complement (a miss). The transform is defined as:

$$A \circledast B = (A \ominus B_1)\,\cap\,(A^{c} \ominus B_2)$$

where $\ominus$ is erosion, $A^{c}$ is the complement of $A$ , and $B_1, B_2$ are disjoint ( $B_1\cap B_2=\varnothing$ ).

Interpretation: A pixel survives only if the foreground pattern $B_1$ fits the object and the background pattern $B_2$ fits the surrounding background at that location — i.e. an exact local shape match.

Applications: template/shape matching, corner and endpoint detection, isolated-pixel detection, and as a building block for thinning, thickening and skeletonization.

In short, the hit-or-miss transform is the morphological tool for exact pattern/shape detection in binary images.
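
The definition $(A\ominus B_1)\cap(A^{c}\ominus B_2)$ can be sketched for isolated-pixel detection. This is a minimal sketch under stated assumptions: structuring elements are given as lists of (row, column) offsets, the names `erode` and `hit_or_miss` are hypothetical, and offsets that fall outside the image are treated as not matching, which is one possible border convention.

```python
def erode(image, se):
    """Binary erosion: 1 where every offset in `se` lands on a 1 inside the image."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            fits = all(0 <= r + dr < rows and 0 <= c + dc < cols
                       and image[r + dr][c + dc] == 1
                       for dr, dc in se)
            out[r][c] = 1 if fits else 0
    return out


def hit_or_miss(image, b1, b2):
    """(A eroded by B1) AND (complement of A eroded by B2)."""
    complement = [[1 - p for p in row] for row in image]
    hit = erode(image, b1)
    miss = erode(complement, b2)
    return [[h & m for h, m in zip(hr, mr)] for hr, mr in zip(hit, miss)]


# B1: the centre must be foreground; B2: all 8 neighbours must be background.
B1 = [(0, 0)]
B2 = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]

A = [[0, 0, 0, 0, 0],
     [0, 1, 0, 0, 0],
     [0, 0, 0, 1, 1],
     [0, 0, 0, 0, 0]]
isolated = hit_or_miss(A, B1, B2)
```

Only the pixel at (1, 1) survives: the two touching pixels in row 2 fail the background test, so the pattern matches exactly one location.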

Answer 10

Predictive coding is a lossless or lossy compression technique that reduces redundancy by coding the difference (prediction error) between each pixel and a value predicted from previously coded neighbouring pixels, instead of coding the pixel itself. Because neighbouring pixels are highly correlated, this error is small and concentrated near zero, so it can be entropy-coded with far fewer bits.

Basic scheme (DPCM):

A predictor estimates the current pixel from past pixels: $\hat{f}_n = \operatorname{round}\!\Big(\sum_{i} a_i f_{n-i}\Big)$ (e.g. $\hat{f}_n=f_{n-1}$ ).
Compute the prediction error / residual: $e_n = f_n - \hat{f}_n$ .
Encode $e_n$ (quantize if lossy) and apply entropy coding (Huffman/arithmetic).
The decoder uses the same predictor and adds the decoded error back to reconstruct the pixel: $f_n = \hat{f}_n + e_n$ .

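Lossless vs lossy: if the residual is coded exactly, reconstruction is perfect (lossless, e.g. used in lossless JPEG); if a quantizer is inserted on the residual (lossy DPCM), compression is higher but reconstruction is approximate, and the encoder must predict from the reconstructed pixels rather than the originals so that it stays in step with the decoder.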

Advantages: simple, removes inter-pixel (spatial/temporal) redundancy, low complexity. Used in lossless JPEG, JPEG-LS, and motion-compensated prediction in video.
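
The lossless case can be sketched with the previous-pixel predictor $\hat{f}_n=f_{n-1}$. This is a minimal sketch, assuming one image row of integers and a prediction of 0 for the first pixel; the names `dpcm_encode` and `dpcm_decode` are hypothetical and the entropy-coding stage is left out.

```python
def dpcm_encode(row):
    """Return the residuals e_n = f_n - f_hat_n with f_hat_n = f_(n-1)."""
    residuals = []
    prev = 0                      # assumed prediction for the first pixel
    for f in row:
        residuals.append(f - prev)
        prev = f
    return residuals


def dpcm_decode(residuals):
    """Rebuild the row with the same predictor: f_n = f_hat_n + e_n."""
    row = []
    prev = 0
    for e in residuals:
        f = prev + e
        row.append(f)
        prev = f
    return row


row = [100, 102, 103, 103, 101, 104]
residuals = dpcm_encode(row)      # [100, 2, 1, 0, -2, 3]
restored = dpcm_decode(residuals)
```

Apart from the first sample, the residuals are small values clustered around zero, which is what lets an entropy coder spend fewer bits on them than on the raw pixels.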

Answer 11

A Gaussian filter is a linear, low-pass smoothing filter whose kernel weights follow a 2-D Gaussian function:

$$h(x,y)=\frac{1}{2\pi\sigma^{2}}\,e^{-\frac{x^{2}+y^{2}}{2\sigma^{2}}}$$

The standard deviation $\sigma$ controls the amount of smoothing (larger $\sigma$ → wider kernel → more blurring).

Role / uses:

Noise reduction: smooths the image by weighted averaging, suppressing Gaussian/random noise while giving the centre pixel the most weight, so it blurs less harshly than a box (mean) filter.
Edge/feature preparation: used as the pre-smoothing stage in edge detectors (e.g. Canny, Laplacian of Gaussian / Marr–Hildreth) to reduce noise before differentiation.
Scale-space / multiresolution: repeated Gaussian smoothing builds image pyramids (e.g. for SIFT).

Advantages:

It is separable (a 2-D Gaussian = product of two 1-D Gaussians) → computationally efficient.
It has no sharp cutoff, so its frequency response is smooth → no ringing artifacts (unlike the ideal low-pass filter).
Rotationally symmetric and optimally trades off spatial and frequency localization.

Trade-off: smoothing also blurs edges; larger $\sigma$ removes more noise but loses more detail.
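
The separability property can be sketched by building the 2-D mask from a 1-D one. This is a minimal sketch under stated assumptions: the continuous Gaussian is sampled at integer offsets and truncated at a chosen `radius`, and the weights are normalized to sum to 1 instead of using the $\frac{1}{2\pi\sigma^{2}}$ constant, a common choice for discrete masks. The function names are hypothetical.

```python
import math


def gaussian_kernel_1d(sigma, radius):
    """Sampled 1-D Gaussian on [-radius, radius], normalized to sum to 1."""
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_kernel_2d(sigma, radius):
    """2-D mask as the outer product of two 1-D Gaussians (separability)."""
    g = gaussian_kernel_1d(sigma, radius)
    return [[a * b for b in g] for a in g]


kernel = gaussian_kernel_2d(sigma=1.0, radius=2)   # 5x5 mask
kernel_sum = sum(map(sum, kernel))
centre = kernel[2][2]
```

In practice the separable form is used by filtering the rows with the 1-D mask and then the columns, which costs about $2(2r+1)$ multiplications per pixel instead of $(2r+1)^2$ for a mask of radius $r$.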

Answer 12

Image Restoration Techniques

Image restoration aims to recover an original image from a degraded version by modelling the degradation process and applying its inverse — unlike enhancement, it is objective and based on the known/estimated degradation. The degradation model is:

$$g(x,y) = f(x,y)*h(x,y) + \eta(x,y) \;\Longleftrightarrow\; G(u,v)=F(u,v)H(u,v)+N(u,v)$$

where $h$ is the degradation (blur) function, $*$ denotes convolution and $\eta$ is additive noise. The goal is to compute an estimate $\hat{f}$ of the original image $f$.

Main techniques:

Noise-only restoration (spatial filters): when only noise is present, use mean filters (arithmetic, geometric), order-statistic filters (median, max, min) and adaptive filters to remove noise based on its statistics.
Inverse filtering: estimate $\hat{F}(u,v)=G(u,v)/H(u,v)$ . Simple but amplifies noise badly where $H(u,v)$ is small/zero.
Wiener (minimum mean-square-error) filtering: balances inverse filtering and noise smoothing:

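$$\hat{F}(u,v)=\left[\frac{|H|^{2}}{|H|^{2}+S_\eta/S_f}\,\frac{1}{H}\right]G(u,v)$$

where $H=H(u,v)$, and $S_\eta(u,v)$ and $S_f(u,v)$ are the power spectra of the noise and of the undegraded image. It incorporates the noise-to-signal power ratio $S_\eta/S_f$ and gives the minimum mean-square-error estimate; when that ratio is zero it reduces to the inverse filter.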

Constrained least-squares filtering: restores using a smoothness (Laplacian) constraint, needing only the noise variance — more robust than Wiener when statistics are unknown.
Geometric correction and blind deconvolution: correct geometric distortions with a spatial transformation, or estimate $h$ from the degraded image itself when it is unknown.

Summary: Restoration models degradation ( $h$ + noise) and inverts it; in practice Wiener and constrained least-squares filters are preferred over plain inverse filtering because they control noise amplification.
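
The difference between inverse and Wiener filtering shows up at a single frequency where $H(u,v)$ is small. This is a minimal sketch, assuming the noise-to-signal ratio $S_\eta/S_f$ is replaced by a constant `K` (a common simplification when the spectra are unknown); the function names and the numbers are made up for the example. It uses the equivalent form $\frac{H^{*}}{|H|^{2}+K}$, which follows from $|H|^{2}/H=H^{*}$.

```python
def inverse_estimate(G, H):
    """Inverse filter at one frequency: F_hat = G / H."""
    return G / H


def wiener_estimate(G, H, K):
    """Wiener filter at one frequency: F_hat = conj(H) / (|H|^2 + K) * G,
    where the constant K stands in for S_eta / S_f."""
    return H.conjugate() / (abs(H) ** 2 + K) * G


# One frequency sample (assumed values): weak response H, noticeable noise N.
F_true = complex(1.0, 0.0)
H = complex(0.01, 0.0)
N = complex(0.1, 0.0)
G = F_true * H + N                 # degradation model G = F*H + N

inv = inverse_estimate(G, H)       # the noise is divided by the tiny H
wien = wiener_estimate(G, H, K=0.01)

inv_error = abs(inv - F_true)
wien_error = abs(wien - F_true)
```

The inverse estimate is dominated by the amplified noise term $N/H$, while the Wiener estimate stays bounded; with `K = 0` the two filters coincide.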

Level	BSc CSIT (TU)
Stream	Science
Subject	Image Processing (BSc CSIT, CSC413)
Year	2080 BS
Exam session	Regular (annual)
Full marks	60
Time allowed	180 minutes
Questions	12, all with step-by-step solutions