Section A: Long Answer Questions

Attempt any TWO questions.

3 questions·10 marks each
1 · long · 10 marks

Explain image enhancement techniques in the spatial and frequency domains with suitable examples and their comparative advantages.

Image Enhancement

Image enhancement is the process of improving the visual appearance of an image (or making it more suitable for a specific analysis task) by accentuating certain features while suppressing others. The two broad approaches are spatial-domain and frequency-domain methods.

1. Spatial-Domain Techniques

These operate directly on the pixels of the image. A spatial operation is expressed as:

$$g(x,y) = T[f(x,y)]$$

where $f$ is the input image, $g$ is the output and $T$ is an operator defined over a neighbourhood of $(x,y)$.

(a) Point processing (intensity transformations): $T$ acts on a single pixel.

  • Negative: $s = L-1-r$ (highlights white detail in dark regions).
  • Log transform: $s = c\log(1+r)$ (expands dark values, compresses bright ones).
  • Power-law (gamma): $s = c\,r^{\gamma}$.
  • Contrast stretching and thresholding.
  • Histogram equalization: redistributes intensities to span the full range for better contrast.
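As an illustration of the last point-processing item, histogram equalization can be sketched in a few lines. This is a minimal sketch under stated assumptions: the image is a 2-D list of integer gray levels, and the mapping $s = \text{round}((L-1)\cdot\text{CDF}(r))$ is the common textbook form; the function name `equalize` is hypothetical.

```python
def equalize(image, levels=256):
    """Histogram-equalize a 2-D list of integer gray levels in [0, levels-1]."""
    flat = [p for row in image for p in row]
    n = len(flat)

    # Histogram of gray levels.
    hist = [0] * levels
    for p in flat:
        hist[p] += 1

    # Cumulative distribution function (CDF) of the gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running / n)

    # Map each input level r to s = round((L-1) * CDF(r)).
    mapping = [round((levels - 1) * c) for c in cdf]
    return [[mapping[p] for p in row] for row in image]


# A low-contrast patch whose values occupy only 4 adjacent levels.
low_contrast = [[100, 101], [102, 103]]
print(equalize(low_contrast))
```

After equalization the four adjacent levels are spread across the available range, which is the contrast gain described above.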

(b) Spatial filtering (mask/neighbourhood processing) — a kernel is convolved with the image.

  • Smoothing (low-pass): averaging / Gaussian masks reduce noise at the cost of blurring the image.
  • Sharpening (high-pass): Laplacian, unsharp masking, high-boost filtering enhance edges.
  • Order-statistic filters: median filter for impulse noise.

2. Frequency-Domain Techniques

The image is transformed with the 2-D DFT $F(u,v)=\mathcal{F}\{f(x,y)\}$, multiplied by a filter transfer function $H(u,v)$, and transformed back:

$$g(x,y) = \mathcal{F}^{-1}\{H(u,v)\,F(u,v)\}$$
  • Low-pass filters (Ideal, Butterworth, Gaussian) attenuate high frequencies → smoothing / noise removal / blurring.
  • High-pass filters attenuate low frequencies → sharpening / edge enhancement.
  • Homomorphic filtering simultaneously compresses dynamic range and enhances contrast by separating illumination and reflectance.
  • Band-pass / band-reject (notch) filters remove periodic noise.

3. Comparative Advantages

| Aspect | Spatial domain | Frequency domain |
|---|---|---|
| Operation | Direct on pixels (convolution) | Multiplication after DFT |
| Cost | Cheap for small kernels | Cheaper for large kernels (FFT $O(N\log N)$) |
| Intuition | Easy, local control | Frequency-selective, global control |
| Best for | Point ops, small smoothing/sharpening masks | Periodic-noise removal, homomorphic filtering, large-support filters |
| Design | Kernel design can be ad hoc | Filter directly specified by frequency response |

Example: Removing periodic interference (e.g. scan-line noise) is awkward in the spatial domain but trivial with a frequency-domain notch filter; conversely a quick 3×3 median to kill salt-and-pepper noise is simplest in the spatial domain.

Conclusion: Spatial methods are simple and computationally light for local operations, whereas frequency methods give precise, global, frequency-selective control and are efficient for large filters via the FFT. The two are linked by the convolution theorem, so a designer chooses the domain that is more convenient for the task.

enhancement
2 · long · 10 marks

What are image transforms? Explain the Fourier, cosine, Hadamard, and Haar transforms and their applications in image processing.

Image Transforms

An image transform maps an image from the spatial domain into another domain (usually a frequency / sequency domain) using a set of orthogonal basis functions. The forward transform expresses the image as a weighted sum of these basis images; the coefficients reveal information (frequency, energy compaction) that is hard to see in the pixel domain and is useful for compression, filtering, feature extraction and enhancement. A general 2-D linear transform is:

$$T(u,v)=\sum_{x=0}^{N-1}\sum_{y=0}^{N-1} f(x,y)\,g(x,y,u,v)$$

where $g(\cdot)$ is the forward transformation kernel. The transform is separable when the kernel factors as $g(x,y,u,v)=g_1(x,u)\,g_2(y,v)$, which holds for all four transforms below.

1. Fourier Transform (DFT)

Uses complex sinusoidal basis functions. The 2-D DFT is:

$$F(u,v)=\frac{1}{MN}\sum_{x=0}^{M-1}\sum_{y=0}^{N-1} f(x,y)\,e^{-j2\pi(ux/M+vy/N)}$$
  • Produces magnitude (energy at each frequency) and phase (structure/position).
  • Applications: frequency-domain filtering, fast convolution, periodic-noise removal, image analysis. In practice it is computed via the FFT.
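The definition above can be evaluated directly for a tiny image. The following is a minimal sketch, not an efficient implementation: it uses the naive quadruple loop rather than the FFT, keeps the $1/MN$ factor on the forward transform as in the formula above, and the function name `dft2` is hypothetical.

```python
import cmath


def dft2(f):
    """Naive 2-D DFT of a 2-D list, with the 1/(MN) factor on the forward transform."""
    M, N = len(f), len(f[0])
    F = [[0j] * N for _ in range(M)]
    for u in range(M):
        for v in range(N):
            total = 0j
            for x in range(M):
                for y in range(N):
                    angle = -2 * cmath.pi * (u * x / M + v * y / N)
                    total += f[x][y] * cmath.exp(1j * angle)
            F[u][v] = total / (M * N)
    return F


image = [[1, 2], [3, 4]]
F = dft2(image)
print(abs(F[0][0]))  # DC term, equal to the mean intensity of the image
print([[round(abs(c), 3) for c in row] for row in F])  # magnitude spectrum
```

With this normalisation the coefficient at $(0,0)$ is the average gray level, which is a quick sanity check for any DFT routine.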

2. Discrete Cosine Transform (DCT)

Uses real cosine basis functions:

$$C(u,v)=\alpha(u)\alpha(v)\sum_{x}\sum_{y} f(x,y)\cos\!\frac{(2x+1)u\pi}{2N}\cos\!\frac{(2y+1)v\pi}{2N}$$
  • Excellent energy compaction — most signal energy packs into a few low-frequency coefficients; real-valued (no phase).
  • Applications: the heart of JPEG image compression (8×8 block DCT), MPEG video.
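Energy compaction is easy to observe numerically. The sketch below computes a separable 2-D DCT of an 8×8 block by applying a 1-D DCT to rows and then to columns. It assumes the orthonormal scaling $\alpha(0)=\sqrt{1/N}$, $\alpha(u)=\sqrt{2/N}$ for $u>0$, which the text above does not spell out; the function names are hypothetical.

```python
import math


def dct_1d(signal):
    """Orthonormal 1-D DCT-II (assumed scaling: alpha(0)=sqrt(1/N), else sqrt(2/N))."""
    n = len(signal)
    coeffs = []
    for u in range(n):
        alpha = math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)
        total = sum(signal[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                    for x in range(n))
        coeffs.append(alpha * total)
    return coeffs


def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to [u][v]


# A smooth 8x8 block (gentle ramp), typical of natural-image content.
block = [[x + y for y in range(8)] for x in range(8)]
C = dct_2d(block)
total_energy = sum(c * c for row in C for c in row)
print(C[0][0] ** 2 / total_energy)  # fraction of energy held by the DC coefficient
```

For smooth blocks like this one, most of the energy sits in the DC and lowest-frequency coefficients, which is what makes coarse quantization of the remaining coefficients effective in block-based coders.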

3. Walsh–Hadamard Transform

Basis functions are the Hadamard matrix entries, taking only values $+1$ and $-1$:

$$H_{2N}=\begin{bmatrix}H_N & H_N\\ H_N & -H_N\end{bmatrix},\quad H_1=[1]$$
  • No multiplications needed (only additions/subtractions) → very fast and cheap hardware. Ordered by sequency (number of sign changes) rather than frequency.
  • Applications: low-cost compression, image coding, spread-spectrum, fast computation.
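The recursion above translates almost literally into code. This is a minimal sketch, assuming natural (Hadamard) ordering rather than sequency ordering and an unnormalised forward transform; `hadamard` and `wht` are hypothetical names.

```python
def hadamard(n):
    """Build the n x n Hadamard matrix (n a power of two) via H_2N = [[H, H], [H, -H]]."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    H = [[1]]
    while len(H) < n:
        top = [row + row for row in H]
        bottom = [row + [-v for v in row] for row in H]
        H = top + bottom
    return H


def wht(signal):
    """Unnormalised Walsh-Hadamard transform using only additions and subtractions."""
    H = hadamard(len(signal))
    return [sum(s if h == 1 else -s for h, s in zip(row, signal)) for row in H]


x = [1, 2, 3, 4]
X = wht(x)
print(X)
# Applying the transform twice and dividing by N recovers the input,
# because H is symmetric and H * H = N * I.
print([v // len(x) for v in wht(X)])
```

A production implementation would use the fast butterfly form ($N\log N$ additions) instead of the full matrix; the matrix form is used here only because it mirrors the definition.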

4. Haar Transform

The oldest wavelet transform; basis functions are rectangular Haar wavelets that are localized in both space and frequency.

  • Captures both average (approximation) and detail (edges) at multiple resolutions.
  • Applications: multiresolution analysis, edge detection, the basis of wavelet-based compression (JPEG-2000-style), feature extraction.
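One level of a 1-D Haar decomposition shows the approximation/detail split concretely. The sketch below assumes the unnormalised average/half-difference form for readability (the orthonormal Haar transform scales by $1/\sqrt{2}$ instead); function names are hypothetical.

```python
def haar_step(signal):
    """One Haar level: pairwise averages (approximation) and half-differences (detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the signal: each pair is (a + d, a - d)."""
    out = []
    for a, d in zip(approx, detail):
        out += [a + d, a - d]
    return out


row = [4, 4, 10, 2]
approx, detail = haar_step(row)
print(approx, detail)
print(haar_inverse(approx, detail))
```

The detail coefficient is zero over the flat pair and large over the pair containing the jump, which is why Haar details respond to edges. Repeating `haar_step` on the approximation gives the coarser resolution levels.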

Comparison

| Transform | Basis | Real/Complex | Strength |
|---|---|---|---|
| Fourier | Complex sinusoids | Complex | Frequency analysis, filtering |
| DCT | Cosines | Real | Best energy compaction (JPEG) |
| Hadamard | ±1 (Walsh) | Real | Fastest, cheapest |
| Haar | Wavelets | Real | Localized, multiresolution |

Conclusion: All four are orthogonal transforms used for compression and analysis. Fourier suits frequency filtering, DCT dominates practical compression, Hadamard wins on computational cost, and Haar provides spatial-frequency localization for multiresolution processing.

transforms, fourier
3 · long · 10 marks

Explain image segmentation. Discuss thresholding, edge-based, and region-based segmentation methods in detail.

Image Segmentation

Image segmentation is the process of partitioning an image into multiple meaningful, non-overlapping regions (sets of pixels) that are homogeneous with respect to some property (intensity, colour, texture), so that objects and boundaries can be identified. Formally the image $R$ is divided into regions $R_1,R_2,\dots,R_n$ such that:

  • $\bigcup_{i} R_i = R$ (complete),
  • $R_i \cap R_j = \varnothing$ for $i\neq j$ (disjoint),
  • each $R_i$ is connected and satisfies a homogeneity predicate $P(R_i)=\text{TRUE}$, while $P(R_i\cup R_j)=\text{FALSE}$ for adjacent regions.

Segmentation algorithms exploit two basic properties of intensity: discontinuity (edges) and similarity (regions/thresholds).

1. Thresholding

Separates objects from background using one or more intensity thresholds. For a single threshold $T$:

$$g(x,y)=\begin{cases}1 & f(x,y)>T\\ 0 & f(x,y)\le T\end{cases}$$
  • Global thresholding: a single $T$ for the whole image; $T$ chosen from the histogram. Otsu's method picks the $T$ that maximises between-class variance.
  • Adaptive/local thresholding: $T$ varies across the image to handle uneven illumination.
  • Pros: simple, fast, works well for bimodal histograms. Cons: sensitive to noise and illumination; ignores spatial context.
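Otsu's criterion can be sketched as an exhaustive search over candidate thresholds. This is a minimal sketch, assuming integer gray levels in a flat list and the same convention as the formula above (class 0 is $f \le T$, class 1 is $f > T$); the function name is hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold T that maximises the between-class variance."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0, sum0 = 0, 0  # pixel count and intensity sum of class 0 (values <= t)
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (the 1/total^2 factor is dropped).
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


pixels = [10, 10, 11, 12, 199, 200, 201, 202]  # clearly bimodal sample
T = otsu_threshold(pixels)
binary = [1 if p > T else 0 for p in pixels]
print(T, binary)
```

On a bimodal sample like this one, any threshold between the two clusters separates them equally well; the search simply returns the first such value it meets.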

2. Edge-Based Segmentation

Detects boundaries where intensity changes abruptly (discontinuity), then links edges into contours.

  • First-derivative (gradient) operators: Roberts, Prewitt, Sobel detect edges as gradient maxima; magnitude $|\nabla f|=\sqrt{G_x^2+G_y^2}$.
  • Second-derivative: Laplacian / Laplacian of Gaussian (LoG, Marr–Hildreth) find edges at zero-crossings.
  • Canny edge detector: Gaussian smoothing → gradient → non-maximum suppression → double thresholding with hysteresis; gives thin, well-localized edges.
  • Detected edge pixels are then linked (local processing / Hough transform) to form region boundaries.
  • Pros: good localization of boundaries. Cons: sensitive to noise, edges may be broken/incomplete.
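The gradient-magnitude step can be illustrated with the Sobel masks. This is a minimal sketch, assuming a 2-D list image and leaving border pixels at zero (one of several possible border policies); the function name is hypothetical.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # responds to horizontal intensity change
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]    # responds to vertical intensity change


def sobel_magnitude(img):
    """Gradient magnitude sqrt(Gx^2 + Gy^2) at interior pixels; borders stay 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = sum(SOBEL_X[i][j] * img[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * img[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            out[r][c] = math.hypot(gx, gy)
    return out


# Vertical step edge between the third and fourth columns.
step = [[0, 0, 0, 10, 10] for _ in range(4)]
for row in sobel_magnitude(step):
    print(row)
```

The response is zero in the flat area and non-zero only in the columns next to the step; thresholding this map gives candidate edge pixels that still need linking, as noted above.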

3. Region-Based Segmentation

Groups pixels directly into regions based on similarity.

  • Region growing: starts from seed pixels and appends neighbouring pixels that satisfy a similarity criterion (e.g. intensity within a tolerance) until no more pixels qualify.
  • Region splitting and merging: uses a quadtree. Split a region into quadrants if it is not homogeneous ($P(R_i)=\text{FALSE}$), then merge adjacent regions whose union is homogeneous.
  • Pros: produces connected regions, more robust to noise, uses spatial connectivity. Cons: results depend on the choice of seeds and similarity criterion; can be computationally heavy.
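Region growing from a single seed can be sketched as a breadth-first search. The sketch assumes 4-connectivity and a fixed criterion (absolute difference from the seed value within a tolerance); other criteria, such as comparing against the running region mean, are equally common. The function name is hypothetical.

```python
from collections import deque


def region_grow(img, seed, tol):
    """Grow a 4-connected region from `seed`, accepting pixels within `tol` of the seed value."""
    h, w = len(img), len(img[0])
    seed_val = img[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < h and 0 <= nc < w
                    and (nr, nc) not in region
                    and abs(img[nr][nc] - seed_val) <= tol):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


img = [
    [10, 11, 50],
    [12, 10, 52],
    [51, 53, 11],
]
print(sorted(region_grow(img, seed=(0, 0), tol=3)))
```

The pixel at the bottom-right has a similar intensity but is not connected to the seed through similar pixels, so it is left out. This is the spatial-connectivity advantage over plain thresholding.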

Summary

| Method | Based on | Strength | Weakness |
|---|---|---|---|
| Thresholding | Similarity (intensity) | Simple, fast | Poor with non-uniform illumination |
| Edge-based | Discontinuity | Good boundary localization | Broken edges, noise-sensitive |
| Region-based | Similarity + connectivity | Connected, robust regions | Slow, seed-dependent |

Conclusion: Thresholding and region methods rely on pixel similarity, while edge methods rely on discontinuity. In practice they are often combined (e.g. edges + region growing) for reliable segmentation.

segmentation

Section B: Short Answer Questions

Attempt any EIGHT questions.

9 questions·5 marks each
4 · short · 5 marks

What is the difference between spatial and intensity resolution?

Spatial resolution refers to the smallest discernible detail in an image — it is determined by the number of pixels (sampling density), usually stated as rows × columns or dots-per-inch (dpi). More pixels (finer sampling) → higher spatial resolution and more spatial detail; reducing it causes blocky/pixelated images.

Intensity (gray-level) resolution refers to the smallest discernible change in gray level. It is determined by the number of bits used to quantize the intensity. For $k$ bits there are $L=2^{k}$ gray levels (e.g. 8 bits → 256 levels). More levels → finer tonal/brightness detail; reducing it causes false contouring (banding).

| | Spatial resolution | Intensity resolution |
|---|---|---|
| Determined by | Sampling (number of pixels) | Quantization (bits per pixel) |
| Measures | Spatial detail | Brightness/gray-level detail |
| Degradation effect | Pixelation/blockiness | False contouring (banding) |

In short, spatial resolution = how many pixels, while intensity resolution = how many gray levels per pixel.

fundamentals
5 · short · 5 marks

Explain power-law (gamma) transformation.

Power-law (gamma) transformation is a point intensity transformation of the form:

$$s = c\,r^{\gamma}$$

where $r$ is the input intensity (often normalized to $[0,1]$), $s$ is the output intensity, and $c$ and $\gamma$ are positive constants.

  • $\gamma < 1$: maps a narrow range of dark input values to a wider range of output values → brightens the image and expands contrast in dark regions.
  • $\gamma > 1$: maps a narrow range of bright input values to a wider output range → darkens the image and expands contrast in bright regions.
  • $\gamma = 1$: linear (identity) mapping.

Gamma correction uses this law to compensate for the non-linear response of display devices (CRT/LCD), which inherently follow $s\propto r^{\gamma}$ with $\gamma\approx 1.8$ to $2.5$; an inverse power law is applied so the displayed image looks correct. It is also widely used for general contrast manipulation, MRI/X-ray enhancement and aerial-image enhancement.

Unlike the fixed log transform, gamma gives a family of curves controlled by $\gamma$, so the amount of brightening/darkening can be tuned.
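The effect of $\gamma$ on a mid-dark pixel can be checked with a short sketch. It assumes 8-bit input that is normalized to $[0,1]$ before applying $s = c\,r^{\gamma}$ and rescaled afterwards; the function name and the clamping step are illustrative choices.

```python
def gamma_transform(image, gamma, c=1.0, levels=256):
    """Apply s = c * r^gamma to a 2-D list of gray levels, with r normalized to [0, 1]."""
    top = levels - 1
    out = []
    for row in image:
        new_row = []
        for p in row:
            s = c * (p / top) ** gamma                       # power law on normalized intensity
            new_row.append(min(top, max(0, round(s * top))))  # rescale and clamp
        out.append(new_row)
    return out


patch = [[0, 64, 255]]
print(gamma_transform(patch, 0.5))  # gamma < 1 brightens mid-dark values
print(gamma_transform(patch, 2.0))  # gamma > 1 darkens them
print(gamma_transform(patch, 1.0))  # gamma = 1 leaves the patch unchanged
```

Black and white stay fixed for $c=1$; only the values in between move, which is why the transform changes contrast without clipping the range.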

enhancement
6 · short · 5 marks

What is the use of a median filter in noise removal?

A median filter is a non-linear, order-statistic spatial filter used mainly to remove impulse (salt-and-pepper) noise while preserving edges.

Operation: A window (e.g. 3×3) slides over the image. For each pixel, the gray values inside the window are sorted and the median value replaces the centre pixel:

$$g(x,y)=\operatorname{median}\{ f(s,t) : (s,t)\in S_{xy}\}$$

Why it works for noise removal:

  • Salt-and-pepper noise creates extreme (very high or very low) outlier pixels. Outliers fall at the ends of the sorted list, so the median rejects them, whereas a mean (averaging) filter would smear them across neighbours.
  • Because the output is an actual pixel value from the neighbourhood (not an average), the median filter preserves sharp edges and causes far less blurring than a linear averaging filter.

Limitations: Less effective against Gaussian noise; very large windows blur fine detail; computationally costlier (requires sorting).

Summary: The median filter is the preferred choice for removing impulse noise with good edge preservation.
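Both claims (outlier rejection and edge preservation) can be checked on toy images. This is a minimal sketch of a 3×3 median filter, assuming border pixels are simply copied from the input; the function name is hypothetical.

```python
def median_filter3(img):
    """3x3 median filter on a 2-D list; border pixels are copied unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            window = [img[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = sorted(window)[4]  # middle of the 9 sorted values
    return out


# A "salt" pixel in a flat region is rejected as an outlier.
noisy = [[10, 10, 10], [10, 255, 10], [10, 10, 10]]
print(median_filter3(noisy))

# A clean vertical step edge passes through unchanged.
edge = [[0, 0, 100, 100] for _ in range(4)]
print(median_filter3(edge) == edge)
```

A mean filter on the same inputs would leave a visible bump where the salt pixel was and would turn the step into a ramp.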

filtering
7 · short · 5 marks

Explain the properties of the 2D DFT.

Properties of the 2-D DFT

For an $M\times N$ image $f(x,y)$ with transform $F(u,v)=\frac{1}{MN}\sum_{x}\sum_{y} f(x,y)\,e^{-j2\pi(ux/M+vy/N)}$, the key properties are:

  1. Separability: the 2-D DFT can be computed as two successive 1-D DFTs (first along rows, then along columns), allowing use of the FFT for speed.

  2. Linearity: $\mathcal{F}\{a f_1 + b f_2\}=aF_1+bF_2$.

  3. Translation (shift): a spatial shift adds a phase term, $f(x-x_0,y-y_0)\leftrightarrow F(u,v)\,e^{-j2\pi(ux_0/M+vy_0/N)}$; multiplying $f$ by an exponential shifts the spectrum (used to centre the spectrum with $(-1)^{x+y}$).

  4. Periodicity: $F(u,v)=F(u+M,v)=F(u,v+N)$, i.e. the DFT is periodic in both directions.

  5. Conjugate symmetry: for a real image, $F(u,v)=F^{*}(-u,-v)$, so $|F(u,v)|=|F(-u,-v)|$ (magnitude spectrum is symmetric).

  6. Rotation: rotating the image by an angle rotates its spectrum by the same angle (shown in polar coordinates).

  7. Convolution theorem: spatial convolution corresponds to multiplication in the frequency domain, $f(x,y)*h(x,y)\leftrightarrow F(u,v)H(u,v)$ (basis of frequency-domain filtering).

  8. Scaling / distributivity, and DC term: $F(0,0)=\frac{1}{MN}\sum_{x}\sum_{y} f(x,y)$ = average intensity; scaling spatial coordinates inversely scales the frequency axes.

  9. Parseval's theorem (energy conservation): total signal energy is preserved between the two domains, up to a constant factor that depends on the DFT normalisation.

These properties make the 2-D DFT a practical tool for filtering, convolution and analysis.

fourier
8 · short · 5 marks

Differentiate between point, line, and edge detection.

All three are intensity-discontinuity detectors that use spatial masks (typically based on the second derivative / Laplacian), but they detect different geometric features.

| Feature | What it detects | Idea / mask | Output |
|---|---|---|---|
| Point detection | An isolated point (single pixel) that differs sharply from its surroundings | Laplacian mask, e.g. centre $+8$, neighbours $-1$; flag a pixel if $\lvert R \rvert \ge T$ (mask response $R$, threshold $T$) | Isolated points |
| Line detection | Thin lines one pixel wide in a particular orientation | Directional masks for horizontal, vertical, $+45^\circ$, $-45^\circ$; choose the largest response | Line segments |
| Edge detection | A boundary between two regions of different intensity (a step/ramp change) | Gradient (Sobel/Prewitt, first derivative) or Laplacian zero-crossings (second derivative) | Region boundaries |

Distinguishing points:

  • A point is a 0-D discontinuity (single pixel), a line is a 1-D feature (thin, oriented), and an edge separates two extended regions.
  • Point and line detection typically use the Laplacian (second derivative); edge detection commonly uses the gradient (first derivative) whose magnitude peaks at edges, or the second derivative's zero-crossings.
  • Edges are the most common and important for segmentation; points and lines are special, sparser features.
edge-detection
9 · short · 5 marks

What is hit-or-miss transformation?

The hit-or-miss transform (HMT) is a basic morphological operation used to detect specific shapes/patterns in a binary image — it locates pixel configurations that simultaneously match the foreground and match the background around them.

It uses a pair of disjoint structuring elements $B=(B_1,B_2)$: $B_1$ must fit inside the object (a hit) and $B_2$ must fit inside the background/complement (a miss). The transform is defined as:

$$A \circledast B = (A \ominus B_1)\,\cap\,(A^{c} \ominus B_2)$$

where $\ominus$ is erosion, $A^{c}$ is the complement of $A$, and $B_1, B_2$ are disjoint ($B_1\cap B_2=\varnothing$).

Interpretation: A pixel survives only if the foreground pattern $B_1$ fits the object and the background pattern $B_2$ fits the surrounding background at that location, i.e. an exact local shape match.

Applications: template/shape matching, corner and endpoint detection, isolated-pixel detection, and as a building block for thinning, thickening and skeletonization.

In short, the hit-or-miss transform is the morphological tool for exact pattern/shape detection in binary images.
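The definition $(A \ominus B_1)\cap(A^{c} \ominus B_2)$ can be sketched directly with two erosions. The example detects isolated foreground pixels, using $B_1$ = the centre and $B_2$ = its 8 neighbours. Structuring elements are given as lists of (row, column) offsets, and pixels outside the image are assumed to be background; both are illustrative choices, and the function names are hypothetical.

```python
def erode(img, offsets, outside=0):
    """Binary erosion: 1 where every offset of the structuring element lands on a 1."""
    h, w = len(img), len(img[0])

    def at(r, c):
        return img[r][c] if 0 <= r < h and 0 <= c < w else outside

    return [[int(all(at(r + dr, c + dc) for dr, dc in offsets))
             for c in range(w)] for r in range(h)]


def hit_or_miss(img, b1, b2):
    """Hit-or-miss: (A eroded by B1) AND (complement of A eroded by B2)."""
    complement = [[1 - p for p in row] for row in img]
    hit = erode(img, b1, outside=0)          # B1 must fit the foreground
    miss = erode(complement, b2, outside=1)  # B2 must fit the background
    return [[a & b for a, b in zip(ra, rb)] for ra, rb in zip(hit, miss)]


B1 = [(0, 0)]
B2 = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]

A = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
for row in hit_or_miss(A, B1, B2):
    print(row)
```

Only the single pixel in the second row survives; the two-pixel blob fails the miss condition because each of its pixels has a foreground neighbour.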

morphology
10 · short · 5 marks

Explain predictive coding in compression.

Predictive coding is a lossless or lossy compression technique that reduces redundancy by coding the difference (prediction error) between each pixel and a value predicted from previously coded neighbouring pixels, instead of coding the pixel itself. Because neighbouring pixels are highly correlated, this error is small and concentrated near zero, so it can be entropy-coded with far fewer bits.

Basic scheme (DPCM):

  1. A predictor estimates the current pixel from past pixels: $\hat{f}_n = \operatorname{round}\!\Big(\sum_{i} a_i f_{n-i}\Big)$ (e.g. $\hat{f}_n=f_{n-1}$).
  2. Compute the prediction error / residual: $e_n = f_n - \hat{f}_n$.
  3. Encode $e_n$ (quantize if lossy) and apply entropy coding (Huffman/arithmetic).
  4. The decoder uses the same predictor and adds the decoded error back to reconstruct the pixel: $f_n = \hat{f}_n + e_n$.

Lossless vs lossy: if the residual is coded exactly, reconstruction is perfect (lossless, e.g. used in lossless JPEG); if a quantizer is inserted (DPCM), it becomes lossy with higher compression.

Advantages: simple, removes inter-pixel (spatial/temporal) redundancy, low complexity. Used in lossless JPEG, JPEG-LS, and motion-compensated prediction in video.
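The lossless case with the previous-pixel predictor $\hat{f}_n=f_{n-1}$ can be sketched as follows. The sketch assumes the first pixel is predicted as 0 (so it is sent as-is) and omits the entropy-coding stage; function names are hypothetical.

```python
def predictive_encode(pixels):
    """Lossless predictive coding: residual e_n = f_n - f_(n-1), first pixel predicted as 0."""
    residuals, prediction = [], 0
    for p in pixels:
        residuals.append(p - prediction)
        prediction = p  # the next pixel is predicted from the current one
    return residuals


def predictive_decode(residuals):
    """Decoder mirrors the predictor: f_n = prediction + e_n."""
    pixels, prediction = [], 0
    for e in residuals:
        prediction = prediction + e
        pixels.append(prediction)
    return pixels


row = [100, 102, 101, 105, 105, 104]
residuals = predictive_encode(row)
print(residuals)
print(predictive_decode(residuals) == row)
```

Apart from the first sample, the residuals are small numbers clustered around zero, so an entropy coder can represent them with fewer bits than the raw 8-bit pixels. In the lossy variant, the encoder has to predict from its own reconstructed values so that encoder and decoder stay in step.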

compression
11 · short · 5 marks

What is the role of a Gaussian filter?

A Gaussian filter is a linear, low-pass smoothing filter whose kernel weights follow a 2-D Gaussian function:

$$h(x,y)=\frac{1}{2\pi\sigma^{2}}\,e^{-\frac{x^{2}+y^{2}}{2\sigma^{2}}}$$

The standard deviation $\sigma$ controls the amount of smoothing (larger $\sigma$ → wider kernel → more blurring).

Role / uses:

  • Noise reduction: smooths the image by weighted averaging, suppressing Gaussian/random noise while giving the centre pixel the most weight, so it blurs less harshly than a box (mean) filter.
  • Edge/feature preparation: used as the pre-smoothing stage in edge detectors (e.g. Canny, Laplacian of Gaussian / Marr–Hildreth) to reduce noise before differentiation.
  • Scale-space / multiresolution: repeated Gaussian smoothing builds image pyramids (e.g. for SIFT).

Advantages:

  • It is separable (a 2-D Gaussian = product of two 1-D Gaussians) → computationally efficient.
  • It has no sharp cutoff, so its frequency response is smooth → no ringing artifacts (unlike the ideal low-pass filter).
  • Rotationally symmetric and optimally trades off spatial and frequency localization.

Trade-off: smoothing also blurs edges; larger $\sigma$ removes more noise but loses more detail.
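The separability property can be verified numerically: the outer product of two normalised 1-D Gaussian kernels equals the normalised, directly sampled 2-D kernel. This is a minimal sketch, assuming the kernel is truncated at a chosen radius and renormalised to sum to 1 (so the $1/2\pi\sigma^2$ constant drops out); function names are hypothetical.

```python
import math


def gaussian_kernel_1d(sigma, radius):
    """Sampled 1-D Gaussian on [-radius, radius], normalised to sum to 1."""
    weights = [math.exp(-(x * x) / (2 * sigma * sigma)) for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_kernel_2d(sigma, radius):
    """2-D kernel built as the outer product of two 1-D kernels (separability)."""
    k = gaussian_kernel_1d(sigma, radius)
    return [[a * b for b in k] for a in k]


def gaussian_kernel_2d_direct(sigma, radius):
    """2-D kernel sampled directly from exp(-(x^2 + y^2) / (2 sigma^2)), then normalised."""
    span = range(-radius, radius + 1)
    raw = [[math.exp(-(x * x + y * y) / (2 * sigma * sigma)) for y in span] for x in span]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]


K = gaussian_kernel_2d(sigma=1.0, radius=2)
D = gaussian_kernel_2d_direct(sigma=1.0, radius=2)
print(max(abs(K[i][j] - D[i][j]) for i in range(5) for j in range(5)))  # close to zero
```

In practice this means a $(2r+1)\times(2r+1)$ Gaussian blur can be applied as one horizontal pass followed by one vertical pass with the 1-D kernel, reducing the work per pixel from $(2r+1)^2$ to $2(2r+1)$ multiplications.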

filtering
12 · short · 5 marks

Write short notes on image restoration techniques.

Image Restoration Techniques

Image restoration aims to recover an original image from a degraded version by modelling the degradation process and applying its inverse — unlike enhancement, it is objective and based on the known/estimated degradation. The degradation model is:

$$g(x,y) = f(x,y)*h(x,y) + \eta(x,y) \;\Longleftrightarrow\; G(u,v)=F(u,v)H(u,v)+N(u,v)$$

where $h$ is the degradation (blur) function and $\eta$ is additive noise. The goal is to compute an estimate $\hat{f}$ of the original image $f$.

Main techniques:

  1. Noise-only restoration (spatial filters): when only noise is present, use mean filters (arithmetic, geometric), order-statistic filters (median, max, min) and adaptive filters to remove noise based on its statistics.

  2. Inverse filtering: estimate $\hat{F}(u,v)=G(u,v)/H(u,v)$. Simple but amplifies noise badly where $H(u,v)$ is small/zero.

  3. Wiener (minimum mean-square-error) filtering: balances inverse filtering and noise smoothing:

$$\hat{F}(u,v)=\left[\frac{|H|^{2}}{|H|^{2}+S_\eta/S_f}\,\frac{1}{H}\right]G(u,v)$$

It incorporates the noise-to-signal power ratio $S_\eta/S_f$ and gives the linear estimate with minimum mean-square error.

  4. Constrained least-squares filtering: restores using a smoothness (Laplacian) constraint, needing only the noise variance; more robust than Wiener when statistics are unknown.

  5. Geometric correction / blind deconvolution: correct geometric distortions, or estimate $h$ when it is unknown.

Summary: Restoration models degradation (hh + noise) and inverts it; in practice Wiener and constrained least-squares filters are preferred over plain inverse filtering because they control noise amplification.
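The difference between inverse and Wiener filtering shows up at a single frequency where $H$ is small. The sketch below works on one frequency sample at a time and assumes the ratio $S_\eta/S_f$ is replaced by a constant `nsr`, a common simplification when the spectra are unknown; the function names and the numbers in the example are illustrative only.

```python
def inverse_filter(G, H):
    """Inverse filter at one frequency: F_hat = G / H."""
    return G / H


def wiener_filter(G, H, nsr):
    """Wiener filter at one frequency: F_hat = [H* / (|H|^2 + nsr)] * G.

    H* / (|H|^2 + nsr) is the same as (1/H) * |H|^2 / (|H|^2 + nsr);
    `nsr` stands in for S_eta / S_f and is assumed constant here.
    """
    H = complex(H)
    return (H.conjugate() / (abs(H) ** 2 + nsr)) * G


# One frequency where the blur almost wipes out the signal (|H| is small).
F_true, H, noise = 1.0, 0.01, 0.1
G = F_true * H + noise  # degraded observation at this frequency

print(abs(inverse_filter(G, H) - F_true))       # large error: noise divided by a tiny H
print(abs(wiener_filter(G, H, 0.01) - F_true))  # much smaller error
```

With `nsr = 0` the Wiener expression reduces to the inverse filter. A positive `nsr` damps the gain exactly where $|H|$ is small, which is how it limits noise amplification.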

restoration

Frequently asked questions

Where can I find the BSc CSIT (TU) Image Processing (BSc CSIT, CSC413) question paper 2080?
The full BSc CSIT (TU) Image Processing (BSc CSIT, CSC413) 2080 (regular) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.
How many marks is the BSc CSIT (TU) Image Processing (BSc CSIT, CSC413) 2080 paper?
The BSc CSIT (TU) Image Processing (BSc CSIT, CSC413) 2080 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.