Browse papers
A

Section A: Long Answer Questions

Attempt any TWO questions.

3 questions·10 marks each
1long10 marks

Explain the MPEG video compression standard. Discuss I-frames, P-frames and B-frames, motion estimation and compensation, and the group of pictures (GOP) structure.

MPEG Video Compression Standard

MPEG (Moving Picture Experts Group) defines a family of standards (MPEG-1, MPEG-2, MPEG-4) for compressing digital video and audio. Video compression in MPEG exploits two kinds of redundancy:

  • Spatial redundancy within a single frame, removed using block-based DCT, quantization and entropy coding (like JPEG).
  • Temporal redundancy between successive frames, removed using motion estimation and motion compensation.

Frame Types (Picture Types)

FrameNameCodingCompressionRandom access
IIntra-codedCoded independently (like JPEG), no reference to other framesLowestYes (entry point)
PPredictivePredicted from the previous I or P frame (forward prediction)MediumNo
BBidirectionalPredicted from both the previous and next I/P framesHighestNo
  • I-frame: self-contained reference frame; provides resilience to errors and a point for random access/seeking.
  • P-frame: stores only the motion vectors plus the residual (difference) relative to the previous reference frame.
  • B-frame: interpolated from past and future reference frames, giving the best compression but requiring reordering of frames for decoding.

Motion Estimation and Compensation

Each frame is divided into macroblocks (typically 16×16 pixels). For each macroblock, motion estimation searches a window in the reference frame to find the best matching block, producing a motion vector (dx, dy) that minimizes a cost such as SAD (sum of absolute differences).

Motion compensation then forms a prediction of the current block by shifting the reference block by the motion vector. Only the motion vector and the residual error (current block − predicted block) are coded via DCT, quantization and entropy coding. Since the residual is usually small, compression is high.

Group of Pictures (GOP)

A GOP is a repeating sequence of I, P and B frames between successive I-frames, e.g.:

I B B P B B P B B P B B I ...
  • The GOP length (e.g. 12 or 15 frames) trades compression efficiency against random-access granularity and error resilience.
  • Open GOP allows B-frames to reference frames in an adjacent GOP; closed GOP does not.
  • Because B-frames depend on future frames, the display order differs from the decoding (bitstream) order.

Summary

MPEG achieves high compression by combining intra-frame (DCT-based) coding for I-frames with inter-frame motion-compensated prediction for P and B-frames, organized into GOP structures that balance efficiency, error recovery and random access.

compressionmpeg
2long10 marks

Explain entropy and source coding for multimedia. Discuss Huffman coding, run-length encoding and arithmetic coding, and construct a Huffman code for a given set of symbol frequencies.

Entropy and Source Coding for Multimedia

Entropy measures the average information content of a source. For a source with symbols sis_i of probability pip_i:

H=ipilog2pi(bits/symbol)H = -\sum_{i} p_i \log_2 p_i \quad \text{(bits/symbol)}

Entropy gives the theoretical lower bound on the average number of bits per symbol for lossless coding (Shannon's source coding theorem). Source (entropy) coding assigns shorter codewords to more probable symbols to approach this bound.

Huffman Coding

A prefix-free, variable-length code built bottom-up: repeatedly merge the two least-probable symbols into a new node until one tree remains; assign 0/1 along branches. It is optimal among integer-length prefix codes. Limitation: each symbol uses a whole number of bits, so it can be up to ~1 bit/symbol away from entropy.

Run-Length Encoding (RLE)

Replaces runs of identical symbols by a (value, count) pair, e.g. AAAAABBB → 5A3B. Very effective for data with long runs (e.g. quantized DCT coefficients, fax images, GIF) but poor for noisy/varied data.

Arithmetic Coding

Encodes the entire message as a single fractional number in [0, 1). It does not require integer bit lengths, so it can come arbitrarily close to entropy and often outperforms Huffman, especially for skewed probabilities. Cost: more computation and patent/complexity issues historically.

Worked Huffman Example

Symbols and frequencies: A=45, B=13, C=12, D=16, E=9, F=5 (total 100).

Merge least-probable repeatedly:

  1. F(5)+E(9) = 14
  2. C(12)+B(13) = 25
  3. (14)+D(16) = 30
  4. (25)+(30) = 55
  5. A(45)+(55) = 100

Resulting codes (one valid assignment):

SymbolFreqCodeLength
A4501
C121003
B131013
F511004
E911014
D161113

Average length =(451+123+133+54+94+163)/100=224/100=2.24= (45\cdot1 + 12\cdot3 + 13\cdot3 + 5\cdot4 + 9\cdot4 + 16\cdot3)/100 = 224/100 = 2.24 bits/symbol, well below the fixed 3 bits needed for 6 symbols.

codinghuffman
3long10 marks

Explain digital audio representation. Discuss sampling, quantization, PCM, and audio compression techniques including MPEG audio (MP3) with the role of psychoacoustic models.

Digital Audio Representation

Sound is a continuous (analog) pressure wave. To process it on a computer it is converted to digital form by sampling and quantization (Analog-to-Digital Conversion).

Sampling

The analog signal is measured at regular intervals at the sampling rate fsf_s. By the Nyquist–Shannon theorem, fsf_s must be at least twice the highest frequency fmaxf_{max} to avoid aliasing:

fs2fmaxf_s \ge 2 f_{max}

CD-quality audio uses fs=44.1kHzf_s = 44.1\,\text{kHz} (covers up to ~22 kHz of audible sound).

Quantization

Each sample's amplitude is rounded to the nearest level among 2n2^n levels, where nn is the bit depth (e.g. 16 bits → 65,536 levels). More bits give a larger dynamic range and lower quantization noise; the signal-to-quantization-noise ratio is approximately 6.02n6.02n dB.

PCM (Pulse Code Modulation)

PCM stores the stream of quantized sample values directly as binary numbers — the standard uncompressed representation used in WAV and audio CDs. Uncompressed bitrate:

bitrate=fs×bit depth×channels\text{bitrate} = f_s \times \text{bit depth} \times \text{channels}

e.g. 44100×16×21.41Mbps44100 \times 16 \times 2 \approx 1.41\,\text{Mbps} for stereo CD audio.

Audio Compression and MPEG Audio (MP3)

PCM is large, so compression is used. MP3 (MPEG-1/2 Audio Layer III) is a lossy perceptual codec:

  1. The signal is split into frequency subbands (filter bank + MDCT).
  2. A psychoacoustic model analyses the audio to decide what the human ear cannot perceive.
  3. Bits are allocated and coefficients quantized so that quantization noise stays below the audible threshold.
  4. The result is Huffman-coded and packed into frames.

Role of the Psychoacoustic Model

It exploits limits of human hearing:

  • Absolute threshold of hearing: very quiet sounds are inaudible and can be discarded.
  • Frequency (simultaneous) masking: a loud tone hides nearby quieter tones in frequency.
  • Temporal masking: sounds just before/after a loud sound are masked in time.

By removing or coarsely quantizing inaudible components, MP3 achieves roughly 10:1 compression (e.g. 128 kbps) with little perceived quality loss.

audio
B

Section B: Short Answer Questions

Attempt any EIGHT questions.

9 questions·5 marks each
4short5 marks

Define entropy in the context of coding. How is it related to compression?

Entropy is the average information content (uncertainty) of a source, measured in bits per symbol:

H=ipilog2piH = -\sum_i p_i \log_2 p_i

where pip_i is the probability of symbol ii.

Relation to compression: By Shannon's source-coding theorem, HH is the theoretical minimum average number of bits per symbol needed to represent the source without loss. A good entropy coder (Huffman, arithmetic) assigns shorter codes to frequent symbols and approaches HH. The closer the average code length is to HH, the better the compression; a uniform/random source (high entropy) compresses poorly, while a skewed source (low entropy) compresses well.

entropy
5short5 marks

Explain arithmetic coding and how it differs from Huffman coding.

Arithmetic Coding

Arithmetic coding represents an entire message as a single number in the interval [0,1)[0, 1). Starting from the full interval, the range is repeatedly subdivided in proportion to each symbol's probability; the next symbol selects its sub-interval, which becomes the new working interval. After processing all symbols, any number within the final (very narrow) interval — written in binary — uniquely identifies the message. More probable symbols shrink the interval less, so they cost fewer bits.

Differences from Huffman Coding

AspectHuffmanArithmetic
Unit of codingOne codeword per symbolOne number for the whole message
Code lengthInteger (whole) bits per symbolCan use fractional bits per symbol
OptimalityOptimal among integer prefix codes; up to ~1 bit/symbol above entropyCan come arbitrarily close to entropy
Skewed probabilitiesInefficient when p0.5p \gg 0.5Very efficient
AdaptivityPossible but needs tree rebuildingNaturally supports adaptive models
ComplexitySimple, fastMore computation; precision/overflow handling

In short: Huffman is simpler but limited to whole-bit codewords, whereas arithmetic coding uses fractional bits and generally achieves better compression, especially for highly skewed symbol distributions.

arithmetic-coding
6short5 marks

Explain motion estimation and motion compensation in video compression.

Motion Estimation and Motion Compensation

These techniques remove temporal redundancy between consecutive video frames (successive frames are usually very similar).

Motion Estimation: The current frame is divided into macroblocks (e.g. 16×16 pixels). For each macroblock, the encoder searches a window in a reference frame to find the best-matching block, minimizing a cost such as SAD (Sum of Absolute Differences). The displacement to that best match is a motion vector (dx,dy)(dx, dy). Search strategies include full (exhaustive) search and faster methods like three-step search or diamond search.

Motion Compensation: Using the motion vector, the encoder forms a prediction of the current block by copying the displaced reference block. It then computes the residual = current block − predicted block. Only the motion vector and the (small) residual, coded with DCT/quantization/entropy coding, are transmitted instead of the full block.

Benefit: Since neighbouring frames differ little, motion vectors plus residuals are far cheaper to store than full frames, giving the large compression gains of P- and B-frames in MPEG/H.26x codecs.

video
7short5 marks

Write short notes on MP3 audio compression and psychoacoustic models.

MP3 Audio Compression and Psychoacoustic Models

MP3 (MPEG-1/2 Audio Layer III) is a widely used lossy, perceptual audio compression format. Its goal is to discard sound information the human ear cannot perceive, achieving about 10:1 compression (e.g. 1.4 Mbps CD audio → ~128 kbps) with little audible loss.

Encoding steps:

  1. Split the PCM signal into many frequency subbands using a filter bank and a Modified DCT (MDCT).
  2. Apply a psychoacoustic model to determine the masking threshold per band.
  3. Quantize each band, allocating more bits where the ear is sensitive and fewer (or zero) where sound is masked.
  4. Huffman-code the quantized values and pack them into frames with headers.

Psychoacoustic Models exploit limits of human hearing:

  • Absolute threshold of hearing: very quiet sounds are inaudible and removed.
  • Frequency (simultaneous) masking: a loud tone masks nearby quieter tones in frequency.
  • Temporal masking: sounds just before/after a loud sound are masked in time.

By keeping quantization noise below these thresholds, MP3 removes perceptually irrelevant data while preserving the audio the listener actually hears.

mp3
8short5 marks

What is computer animation? Differentiate between frame-based and key-frame animation.

Computer Animation

Computer animation is the process of creating the illusion of motion by displaying a rapid sequence of computer-generated images (frames). When shown fast enough (typically 24–30 fps), persistence of vision makes the viewer perceive continuous movement. It includes 2D and 3D animation and is used in films, games, simulations and multimedia presentations.

Frame-based vs Key-frame Animation

AspectFrame-based AnimationKey-frame Animation
DefinitionEvery individual frame is drawn/specified one by oneOnly important key frames are defined; the system fills in the rest
In-betweensCreated manuallyGenerated automatically by interpolation (tweening)
EffortHigh — animator draws all framesLower — animator draws only key poses
ControlFull control over each frameSmooth automatic transitions; less per-frame control
ExampleTraditional cel/flipbook animationFlash/3D software animating between key poses

Summary: In frame-based animation the artist produces every frame, whereas in key-frame animation only key frames are specified and the intermediate frames are computed automatically by interpolation, saving effort and producing smooth motion.

animation
9short5 marks

What is hypermedia? Differentiate between hypertext and hypermedia.

Hypermedia

Hypermedia is an extension of hypertext in which the linked information is not limited to text but also includes multiple media types — text, images, audio, video, graphics and animation — connected by hyperlinks that let users navigate non-linearly. The World Wide Web is the most common example of hypermedia.

Hypertext vs Hypermedia

AspectHypertextHypermedia
ContentText only, linked non-linearlyText plus images, audio, video, animation
ScopeA subset of hypermediaSuperset that includes hypertext
NodesText documentsAny media object (text, image, sound, video)
ExampleEarly text-only help systems, classic linked documentsModern web pages, interactive multimedia/e-learning

Summary: Hypertext links only text nodes, while hypermedia generalizes this to link any kind of media. All hypertext is hypermedia, but not all hypermedia is plain hypertext.

hypermedia
10short5 marks

Write short notes on multimedia streaming and the issues involved.

Multimedia Streaming

Streaming is the delivery of multimedia (audio/video) over a network so that it can be played while it is being downloaded, without waiting for the whole file. Data is buffered briefly and rendered continuously. Examples: YouTube, Netflix, live broadcasts.

Types:

  • On-demand (stored) streaming: pre-recorded content delivered when requested.
  • Live streaming: content captured and transmitted in real time.

Common protocols: RTP/RTSP, and HTTP-based adaptive streaming (HLS, MPEG-DASH).

Issues Involved

  • Bandwidth limitations: insufficient or variable bandwidth causes buffering/stalls; adaptive bitrate switching helps.
  • Latency and delay: especially critical for live/interactive streams.
  • Jitter: variation in packet arrival times degrades playback; handled by buffering.
  • Packet loss: the underlying IP network is best-effort, so loss causes glitches; needs error concealment/retransmission.
  • Synchronization: audio and video (lip-sync) must stay aligned.
  • Quality of Service (QoS): networks must provide adequate throughput and bounded delay.
  • Scalability: serving many simultaneous users (mitigated by CDNs/multicast).
streaming
11short5 marks

Explain the Discrete Cosine Transform (DCT) and its importance in image compression.

Discrete Cosine Transform (DCT)

The DCT transforms a block of pixel values from the spatial domain into the frequency domain, expressing the block as a sum of cosine basis functions of increasing frequency. JPEG and MPEG apply the 2-D DCT on 8×8 blocks. The 2-D forward DCT is:

F(u,v)=14C(u)C(v)x=07y=07f(x,y)cos ⁣[(2x+1)uπ16]cos ⁣[(2y+1)vπ16]F(u,v) = \frac{1}{4} C(u)C(v) \sum_{x=0}^{7}\sum_{y=0}^{7} f(x,y)\cos\!\left[\frac{(2x+1)u\pi}{16}\right]\cos\!\left[\frac{(2y+1)v\pi}{16}\right]

where C(k)=12C(k)=\tfrac{1}{\sqrt2} for k=0k=0 and 11 otherwise.

Importance in Image Compression

  • Energy compaction: for natural images, most signal energy concentrates in a few low-frequency coefficients near the top-left (the DC coefficient and low AC terms), while most high-frequency coefficients are near zero.
  • Perceptual relevance: the human eye is less sensitive to high frequencies, so those coefficients can be quantized coarsely or discarded with little visible loss.
  • After DCT, quantization zeroes out many high-frequency coefficients; a zig-zag scan groups the zeros, which then compress very well with run-length and entropy (Huffman) coding.

In short: DCT concentrates image information into a few coefficients, enabling lossy compression (as in JPEG/MPEG) by removing perceptually unimportant high-frequency data.

dct
12short5 marks

Write short notes on common multimedia file formats (JPEG, GIF, PNG, MPEG).

Common Multimedia File Formats

JPEG (.jpg / .jpeg) — Joint Photographic Experts Group. A lossy image format using 8×8 DCT + quantization + entropy coding. Excellent for photographs/continuous-tone images with adjustable quality, but introduces blocking artefacts at high compression and does not support transparency.

GIF (.gif) — Graphics Interchange Format. A lossless format limited to a 256-colour palette (8-bit) using LZW compression. Supports transparency and simple animation; best for logos, icons and small graphics, poor for photos.

PNG (.png) — Portable Network Graphics. A lossless raster format supporting true colour (24-bit) and an alpha channel for smooth transparency; designed as a patent-free replacement for GIF. Larger files than JPEG for photos, but no quality loss and no 256-colour limit.

MPEG (.mpg / .mp4) — Moving Picture Experts Group. A family of lossy video (and audio) compression standards using DCT plus motion-compensated inter-frame prediction (I/P/B frames). Used for digital video, DVD, streaming and broadcast.

FormatTypeMediaKey feature
JPEGLossyImageDCT, great for photos, no transparency
GIFLosslessImage256 colours, transparency, animation
PNGLosslessImageTrue colour, alpha transparency
MPEGLossyVideo/AudioMotion-compensated DCT video
file-format

Frequently asked questions

Where can I find the BSc CSIT (TU) Multimedia Computing (BSc CSIT, CSC467) question paper 2080?
The full BSc CSIT (TU) Multimedia Computing (BSc CSIT, CSC467) 2080 (regular) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.
Does the Multimedia Computing (BSc CSIT, CSC467) 2080 paper come with solutions?
Yes. Every question on this Multimedia Computing (BSc CSIT, CSC467) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.
How many marks is the BSc CSIT (TU) Multimedia Computing (BSc CSIT, CSC467) 2080 paper?
The BSc CSIT (TU) Multimedia Computing (BSc CSIT, CSC467) 2080 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.
Is practising this Multimedia Computing (BSc CSIT, CSC467) past paper free?
Yes — reading and attempting this Multimedia Computing (BSc CSIT, CSC467) past paper on Kekkei is completely free.