
Section A: Long Answer Questions

Attempt any TWO questions.

3 questions·10 marks each
Question 1 · long · 10 marks

What is multimedia synchronization? Explain intra-media and inter-media synchronization, and discuss the reference model for multimedia synchronization.

Multimedia Synchronization

Multimedia synchronization is the process of maintaining the correct temporal (time), spatial (position), and content relationships between different media objects (text, audio, video, images, animation) during their capture, storage, transmission, and presentation. Its purpose is to ensure that media are played back at the right time and in the right order, so the presentation is meaningful to the user (e.g., audio matching the lip movements in a video).

Intra-media (Intra-stream) Synchronization

This maintains the temporal relationship within a single continuous media stream.

  • It ensures a fixed, constant playout rate for the units (frames/samples) of one stream.
  • Example: a video must display its frames at a constant rate (e.g., 25 frames/second); audio samples must be played at a constant sampling rate (e.g., 44,100 samples/second).
  • Failure causes jitter, frames appearing too fast/slow, or jerky playback.
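As a rough illustration of intra-media timing, the sketch below computes the ideal playout instants for a constant-rate stream and the worst deviation of the observed instants from them. The function names and the sample timings are hypothetical, chosen only for this example.

```python
def ideal_playout_times_ms(num_units, rate_hz):
    """Ideal playout instant of each unit (frame/sample) at a constant rate."""
    period_ms = 1000.0 / rate_hz
    return [n * period_ms for n in range(num_units)]


def max_jitter_ms(actual_times_ms, rate_hz):
    """Largest deviation of the observed playout instants from the ideal ones."""
    ideal = ideal_playout_times_ms(len(actual_times_ms), rate_hz)
    return max(abs(a - i) for a, i in zip(actual_times_ms, ideal))


if __name__ == "__main__":
    # Four video frames at 25 frames/second should appear every 40 ms.
    print(ideal_playout_times_ms(4, 25))
    print(max_jitter_ms([0, 42, 79, 125], 25))
```

A player would typically buffer units and release them on the ideal schedule so that this deviation stays small.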

Inter-media (Inter-stream) Synchronization

This maintains the temporal relationship between two or more different media streams.

  • The most common case is lip synchronization (audio aligned with video). The tolerable skew is roughly $\pm 80$ ms before users notice the mismatch.
  • It also covers relationships such as a slide appearing together with its narration, or subtitles matching speech.
  • Failure causes audio leading/lagging the video, or annotations appearing at the wrong moment.
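A minimal sketch of an inter-media check, assuming we know the actual playout instants of an audio unit and of the video frame that should accompany it. The helper names are hypothetical, and the 80 ms default is simply the approximate lip-sync tolerance quoted above.

```python
def playout_skew_ms(audio_playout_ms, video_playout_ms):
    """Skew between matching units: positive means the audio plays later than the video."""
    return audio_playout_ms - video_playout_ms


def is_lip_synced(audio_playout_ms, video_playout_ms, tolerance_ms=80.0):
    """True when the absolute skew stays within the tolerance."""
    return abs(playout_skew_ms(audio_playout_ms, video_playout_ms)) <= tolerance_ms


if __name__ == "__main__":
    print(is_lip_synced(1000, 1040))  # 40 ms skew
    print(is_lip_synced(1000, 1120))  # 120 ms skew
```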

Reference Model for Multimedia Synchronization

A four-layer reference model is used to describe synchronization at different levels of abstraction:

| Layer | Function | Synchronization handled |
| --- | --- | --- |
| Media Layer | Operates on a single continuous media stream as a sequence of Logical Data Units (LDUs). | Intra-media (device-level playout rate). |
| Stream Layer | Operates on continuous streams as a whole; provides Quality of Service (QoS) and ensures intra-stream timing and basic inter-stream timing. | Intra- and simple inter-stream sync. |
| Object Layer | Operates on all media types together; hides the difference between continuous and discrete (time-independent) media. | Inter-media (e.g., text + audio + video). |
| Specification Layer | The top, application/authoring layer; lets authors specify synchronization relationships using models such as timeline, hierarchical, Petri-net (OCPN), reference-point, or scripting. | High-level specification of sync. |

Each layer offers an interface (service) to the layer above and uses the services of the layer below. The Specification Layer is open (no fixed implementation) and includes models like the Object Composition Petri Net (OCPN), timeline-based, and reference-point models for expressing how media should be coordinated.

Question 2 · long · 10 marks

Explain digital image representation and color models (RGB, CMYK, YUV/YCbCr). Discuss how color is sampled and the concept of chroma subsampling in multimedia.

Digital Image Representation and Color Models

Digital Image Representation

A digital image is a 2-D array (matrix) of pixels (picture elements). Each pixel stores a numeric value representing brightness/color.

  • Spatial resolution: number of pixels ($\text{width} \times \text{height}$), e.g., $1920 \times 1080$.
  • Color/bit depth: number of bits per pixel. A grayscale image typically uses 8 bits (0–255); a true-color image uses 24 bits (8 bits each for R, G, B).
  • Storage (uncompressed) $= \text{width} \times \text{height} \times \text{bytes per pixel}$.

Color Models

RGB (Red, Green, Blue): an additive model used by displays, cameras and scanners. Colors are produced by adding red, green and blue light; $(0,0,0)$ = black, $(255,255,255)$ = white. Device-dependent.

CMYK (Cyan, Magenta, Yellow, Black): a subtractive model used in color printing. Inks absorb (subtract) light; combining C, M, Y gives a muddy dark, so a separate K (black) channel is added for true black and detail. Conversion (simplified): $C = 1-R'$, $M = 1-G'$, $Y = 1-B'$ on normalized values, then $K = \min(C,M,Y)$; in practice the black component $K$ is then taken out of $C$, $M$ and $Y$ so that it is printed with black ink only.
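The conversion can be sketched as below. This is an illustrative version only: it follows the simplified formula above and then removes the black component from C, M and Y using the commonly used normalization $(C-K)/(1-K)$, which is an assumption beyond the simplified formula. Real printing workflows use device color profiles instead. The function name is hypothetical.

```python
def rgb_to_cmyk(r, g, b):
    """Convert 8-bit RGB to CMYK fractions in [0, 1] (naive, profile-free)."""
    # Normalize to [0, 1] and take the complements.
    c, m, y = 1 - r / 255, 1 - g / 255, 1 - b / 255
    k = min(c, m, y)
    if k == 1.0:
        # Pure black: print with black ink only.
        return 0.0, 0.0, 0.0, 1.0
    # Remove the black component from the colored inks (assumed normalization).
    return (c - k) / (1 - k), (m - k) / (1 - k), (y - k) / (1 - k), k


if __name__ == "__main__":
    print(rgb_to_cmyk(255, 0, 0))      # red
    print(rgb_to_cmyk(0, 0, 0))        # black
    print(rgb_to_cmyk(255, 255, 255))  # white (no ink)
```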

YUV / YCbCr: separates luminance (Y = brightness) from chrominance (color). $Y$ carries the gray-scale information; $U$/$Cb$ and $V$/$Cr$ carry color difference. Conversion from RGB (BT.601):

$$Y = 0.299R + 0.587G + 0.114B$$

$$Cb = -0.169R - 0.331G + 0.5B + 128,\quad Cr = 0.5R - 0.419G - 0.081B + 128$$

This model exploits the fact that the human eye is more sensitive to brightness than to color, which is the basis for compression.
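A small sketch of the RGB to YCbCr conversion, using exactly the rounded BT.601 coefficients quoted above. The function name and the clamping to the 8-bit range are assumptions made for this example.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit (Y, Cb, Cr)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.169 * r - 0.331 * g + 0.5 * b + 128
    cr = 0.5 * r - 0.419 * g - 0.081 * b + 128
    # Round to integers and clamp to the valid 0..255 range.
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))


if __name__ == "__main__":
    print(rgb_to_ycbcr(255, 255, 255))  # white: full luma, neutral chroma
    print(rgb_to_ycbcr(128, 128, 128))  # gray: chroma stays at 128
```

Any gray pixel (R = G = B) maps to Cb = Cr = 128, which shows that all brightness information ends up in Y.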

Color Sampling and Chroma Subsampling

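Because the eye perceives less detail in color than in luminance, the chrominance channels can be sampled at a lower resolution than luminance; this is chroma subsampling. It reduces data with little visible quality loss. It is written $J{:}a{:}b$ and describes a region $J$ pixels wide (usually 4) and 2 rows high: $a$ is the number of chroma samples in the first row, and $b$ is the number of further chroma samples in the second row (0 means the second row reuses those of the first).

| Scheme | Meaning | Chroma samples per plane (relative to luma) |
| --- | --- | --- |
| 4:4:4 | No subsampling | Full (each pixel has its own color) |
| 4:2:2 | Color sampled at half horizontal rate | 1/2 |
| 4:2:0 | Color sampled at half horizontal and half vertical rate | 1/4 (used in JPEG, MPEG, H.264) |

Example: 4:2:0 stores one Cb and one Cr value for every $2 \times 2$ block of luma pixels, so each pixel costs $1 + \tfrac14 + \tfrac14 = 1.5$ samples instead of 3, halving the total data versus 4:4:4.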
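The following sketch illustrates 4:2:0 subsampling on a single chroma plane stored as a list of rows. Averaging each 2×2 block is one possible choice (an assumption here; encoders may filter differently), and both function names are hypothetical. Even image dimensions are assumed.

```python
def subsample_420(plane):
    """Replace every 2x2 block of a chroma plane by its average value."""
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def total_samples(width, height, j, a, b):
    """Luma plus both chroma planes for a J:a:b scheme (region J wide, 2 rows high)."""
    luma = width * height
    chroma_per_plane = luma * (a + b) // (2 * j)
    return luma + 2 * chroma_per_plane


if __name__ == "__main__":
    print(subsample_420([[100, 102, 50, 50], [98, 100, 50, 50]]))
    print(total_samples(1920, 1080, 4, 4, 4), total_samples(1920, 1080, 4, 2, 0))
```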

Question 3 · long · 10 marks

Explain the JPEG image compression standard. Describe its main steps - DCT, quantization, zig-zag ordering and entropy (Huffman) coding - with the help of a block diagram.

JPEG Image Compression Standard

JPEG (Joint Photographic Experts Group) is a widely used lossy compression standard for continuous-tone still images. It achieves high compression by discarding high-frequency detail the human eye cannot easily perceive.

Block Diagram (encoder)

Input Image -> Color Transform (RGB->YCbCr) -> 8x8 Block split
   -> [Forward DCT] -> [Quantization] -> [Zig-zag scan]
   -> [DC: DPCM] [AC: Run-Length] -> [Entropy (Huffman) coding] -> Compressed bitstream

Main Steps

1. Color transform and downsampling. RGB is converted to YCbCr, and the chroma channels are subsampled (commonly 4:2:0).

2. Block splitting. The image is divided into $8 \times 8$ pixel blocks, processed independently. Pixel values are level-shifted (subtract 128).

3. Forward DCT (Discrete Cosine Transform). Each $8 \times 8$ block is transformed from the spatial domain to the frequency domain:

$$F(u,v)=\tfrac{1}{4}C(u)C(v)\sum_{x=0}^{7}\sum_{y=0}^{7} f(x,y)\cos\frac{(2x+1)u\pi}{16}\cos\frac{(2y+1)v\pi}{16}$$

where $C(k) = 1/\sqrt{2}$ for $k = 0$ and $C(k) = 1$ otherwise.

The result is one DC coefficient (top-left, proportional to the block's average value) and 63 AC coefficients. Energy is concentrated in the low-frequency (top-left) coefficients.
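A direct, unoptimized sketch of the forward DCT, written straight from the formula with $C(0) = 1/\sqrt{2}$ and $C(k) = 1$ otherwise. Real codecs use fast factorizations; the function name `dct_8x8` is hypothetical.

```python
import math


def dct_8x8(block):
    """Forward 2-D DCT of an 8x8 block of level-shifted pixel values."""

    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            total = 0.0
            for x in range(8):
                for y in range(8):
                    total += (
                        block[x][y]
                        * math.cos((2 * x + 1) * u * math.pi / 16)
                        * math.cos((2 * y + 1) * v * math.pi / 16)
                    )
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out


if __name__ == "__main__":
    # A flat block of pixel value 138 becomes 10 after subtracting 128.
    flat = [[138 - 128] * 8 for _ in range(8)]
    coeffs = dct_8x8(flat)
    print(round(coeffs[0][0], 6))  # DC term: 8 times the block mean
```

For a flat block all the energy lands in the DC term and every AC coefficient is (numerically) zero, which is the energy-compaction property in its simplest form.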

4. Quantization. Each DCT coefficient is divided by a value from a quantization table $Q$ and rounded:

$$F_Q(u,v)=\text{round}\!\left(\frac{F(u,v)}{Q(u,v)}\right)$$

High-frequency coefficients are divided by larger values, so many become zero. This is the main lossy step and controls the quality/compression trade-off.
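The effect of quantization can be sketched with a toy table. The table below, with divisors that grow with $u+v$, is a made-up stand-in and not the table from the JPEG standard; the function names are likewise hypothetical.

```python
def make_toy_table(base=8, step=4):
    """Hypothetical quantization table: larger divisors at higher frequencies."""
    return [[base + step * (u + v) for v in range(8)] for u in range(8)]


def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[round(coeffs[u][v] / table[u][v]) for v in range(8)] for u in range(8)]


def dequantize(levels, table):
    """Decoder side: multiply back; the rounding loss is not recoverable."""
    return [[levels[u][v] * table[u][v] for v in range(8)] for u in range(8)]


if __name__ == "__main__":
    coeffs = [[0.0] * 8 for _ in range(8)]
    coeffs[0][0], coeffs[0][1], coeffs[7][7] = 80.0, -31.0, 5.0
    table = make_toy_table()
    levels = quantize(coeffs, table)
    restored = dequantize(levels, table)
    print(levels[0][0], levels[0][1], levels[7][7])
    print(restored[0][0], restored[0][1], restored[7][7])
```

The small high-frequency coefficient is rounded to zero and cannot be restored, while the low-frequency ones come back only approximately.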

5. Zig-zag ordering. The $8 \times 8$ quantized block is read in a zig-zag pattern from low to high frequency. This groups the many trailing zeros together into a long run, making them easy to compress.
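One way to generate the zig-zag visiting order is to walk the anti-diagonals of the block and alternate direction on each one. This is a sketch with hypothetical function names; implementations usually hard-code the 64-entry order instead.

```python
def zigzag_indices(n=8):
    """(row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col identifies one anti-diagonal
        diagonal = [(row, s - row) for row in range(n) if 0 <= s - row < n]
        # Even diagonals are walked upwards (row decreasing), odd ones downwards.
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order


def zigzag_scan(block):
    """Flatten a square block into a list following the zig-zag order."""
    return [block[row][col] for row, col in zigzag_indices(len(block))]


if __name__ == "__main__":
    print(zigzag_indices()[:6])
```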

6. Entropy (Huffman) coding.

  • The DC coefficient is coded differentially (DPCM) relative to the previous block's DC.
  • The AC coefficients are run-length encoded as (run-of-zeros, value) pairs and then Huffman coded (variable-length codes giving short codes to frequent symbols). This is lossless.
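The two preparation steps before Huffman coding can be sketched as follows. This is deliberately simplified: actual JPEG also splits values into size categories, caps zero runs at 15 with a special symbol, and has its own end-of-block code. Here a plain `(0, 0)` pair stands in for end-of-block, and the function names are hypothetical.

```python
def dpcm_dc(dc_values):
    """Code each block's DC value as the difference from the previous block's DC."""
    previous = 0
    differences = []
    for dc in dc_values:
        differences.append(dc - previous)
        previous = dc
    return differences


def rle_ac(ac_coeffs):
    """Turn zig-zag ordered AC coefficients into (run_of_zeros, value) pairs."""
    pairs = []
    run = 0
    for coeff in ac_coeffs:
        if coeff == 0:
            run += 1
        else:
            pairs.append((run, coeff))
            run = 0
    if run > 0:
        pairs.append((0, 0))  # simplified end-of-block marker for the trailing zeros
    return pairs


if __name__ == "__main__":
    print(dpcm_dc([10, 12, 9]))
    print(rle_ac([-3, 0, 0, 2, 0, 0, 0]))
```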

At the decoder, the steps are reversed: entropy decoding -> de-zig-zag -> dequantization -> inverse DCT -> color transform back to RGB, reconstructing an approximation of the original image.


Section B: Short Answer Questions

Attempt any EIGHT questions.

9 questions·5 marks each
Question 4 · short · 5 marks

What is Huffman coding? Construct a Huffman code for a given set of symbols and explain its working.

Huffman Coding

Huffman coding is a lossless, variable-length, prefix-free entropy coding technique that assigns shorter codes to more frequent symbols and longer codes to rarer symbols, minimizing the average code length. No code is a prefix of another, so the stream is uniquely decodable.

Algorithm

  1. List all symbols with their frequencies (probabilities).
  2. Repeatedly take the two lowest-frequency nodes and merge them into a new node whose frequency is their sum.
  3. Continue until one tree remains.
  4. Label left edges 0 and right edges 1; the path from root to a leaf is that symbol's code.

Example

Symbols and frequencies: A=45, B=13, C=12, D=16, E=9, F=5 (total 100).

Building the tree (merge two smallest each step): F(5)+E(9)=14; C(12)+B(13)=25; 14+D(16)=30; 25+30=55; 55+A(45)=100.

Resulting codes:

| Symbol | Freq | Code | Length |
| --- | --- | --- | --- |
| A | 45 | 0 | 1 |
| C | 12 | 100 | 3 |
| B | 13 | 101 | 3 |
| F | 5 | 1100 | 4 |
| E | 9 | 1101 | 4 |
| D | 16 | 111 | 3 |

Average length $= (45\cdot1 + 13\cdot3 + 12\cdot3 + 16\cdot3 + 9\cdot4 + 5\cdot4)/100 = 224/100 = 2.24$ bits/symbol, versus 3 bits with fixed-length coding, a clear saving. Decoding walks the tree bit-by-bit from the root until a leaf is reached.
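The construction can be sketched with a priority queue. This version keeps, for each subtree, a dictionary of partial codes rather than explicit tree nodes; the lower-frequency subtree of each merge gets bit 0 and the other gets bit 1, which is one valid labeling among several. The function name is hypothetical, and at least two symbols are assumed.

```python
import heapq
from itertools import count


def huffman_codes(freqs):
    """Build a Huffman code table {symbol: bitstring} from {symbol: frequency}."""
    tie_breaker = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tie_breaker), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f_low, _, low = heapq.heappop(heap)     # smallest frequency
        f_high, _, high = heapq.heappop(heap)   # second smallest
        merged = {sym: "0" + code for sym, code in low.items()}
        merged.update({sym: "1" + code for sym, code in high.items()})
        heapq.heappush(heap, (f_low + f_high, next(tie_breaker), merged))
    return heap[0][2]


if __name__ == "__main__":
    freqs = {"A": 45, "B": 13, "C": 12, "D": 16, "E": 9, "F": 5}
    codes = huffman_codes(freqs)
    average = sum(freqs[s] * len(codes[s]) for s in freqs) / sum(freqs.values())
    print(codes, average)
```

With the frequencies of the example, this labeling reproduces the code table and the 2.24 bits/symbol average given above.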

Question 5 · short · 5 marks

Explain run-length encoding (RLE) with an example.

Run-Length Encoding (RLE)

Run-length encoding is a simple lossless compression technique that replaces sequences of consecutive identical symbols (a run) with a single (symbol, count) pair. It is most effective on data with many long runs, such as simple graphics, fax images, and the long zero-runs in JPEG.

Working

Scan the data; whenever a value repeats consecutively, store it once together with the number of repetitions.

Example

Input string:

AAAAABBBCCDAA

RLE output (value, count):

5A 3B 2C 1D 2A   ->  encoded as: A5 B3 C2 D1 A2

The 13-character input is represented by 10 characters. For an image scan line of 12 white pixels, 2 black pixels and 3 white pixels (17 pixels in total), the output is W12 B2 W3, only 7 characters, which is a large saving.

Note: RLE can expand data with few runs (e.g., ABCD becomes A1B1C1D1), so it works best on highly repetitive data.
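A minimal sketch of RLE on strings, using (symbol, count) pairs as in the example above. The function names are hypothetical.

```python
from itertools import groupby


def rle_encode(data):
    """Collapse each run of identical symbols into a (symbol, count) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(data)]


def rle_decode(pairs):
    """Expand (symbol, count) pairs back into the original string."""
    return "".join(symbol * n for symbol, n in pairs)


if __name__ == "__main__":
    pairs = rle_encode("AAAAABBBCCDAA")
    print(pairs)
    print(rle_decode(pairs))
    print(rle_encode("ABCD"))  # no runs: one pair per symbol, so the data grows
```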

Question 6 · short · 5 marks

Explain the role of DCT and quantization in JPEG compression.

Role of DCT and Quantization in JPEG

Discrete Cosine Transform (DCT)

The DCT converts each $8 \times 8$ block of pixels from the spatial domain to the frequency domain, producing one DC coefficient (proportional to the block's average) and 63 AC coefficients.

$$F(u,v)=\tfrac{1}{4}C(u)C(v)\sum_{x=0}^{7}\sum_{y=0}^{7} f(x,y)\cos\frac{(2x+1)u\pi}{16}\cos\frac{(2y+1)v\pi}{16}$$

where $C(k) = 1/\sqrt{2}$ for $k = 0$ and $C(k) = 1$ otherwise.

  • It concentrates the image energy into a few low-frequency coefficients (top-left), while high-frequency coefficients are usually small.
  • DCT itself is reversible/lossless (apart from rounding); it does not compress, it prepares the data so that the redundant high-frequency information can be removed efficiently.

Quantization

Quantization divides each DCT coefficient by a value from a quantization table and rounds the result:

$$F_Q(u,v)=\text{round}\!\left(\frac{F(u,v)}{Q(u,v)}\right)$$

  • Larger divisors are used for high frequencies (which the eye is less sensitive to), so many small high-frequency coefficients become zero.
  • This is the main lossy step in JPEG (chroma subsampling, when used, also discards data) and the main source of compression; the quantization table (scaled by a quality factor) controls the quality vs. compression trade-off.
  • The resulting runs of zeros are then efficiently compressed by zig-zag scanning, run-length and Huffman coding.

Summary: DCT re-organizes the data by frequency so that quantization can discard perceptually unimportant high-frequency detail, enabling high compression with little visible loss.

Question 7 · short · 5 marks

Differentiate between I-frames, P-frames and B-frames in MPEG.

I-frames, P-frames and B-frames in MPEG

MPEG video compresses a sequence using three frame (picture) types arranged in a Group of Pictures (GOP), exploiting both spatial and temporal redundancy.

| Feature | I-frame (Intra) | P-frame (Predictive) | B-frame (Bi-directional) |
| --- | --- | --- | --- |
| Coding | Coded independently, like a JPEG image (intra-coded only) | Predicted from a previous I or P frame using motion compensation | Predicted from both previous and following I/P frames |
| References | None | One (past) | Two (past and future) |
| Compression | Lowest (largest size) | Higher than I | Highest (smallest size) |
| Error resilience / random access | Best: acts as an access point; errors do not propagate | Errors propagate to later P/B frames | Not used as a reference (in classic MPEG), so errors don't propagate |

I-frame: a self-contained reference frame; needed for random access (seeking) and to start/refresh a GOP.

P-frame: stores only the motion vectors and differences relative to a preceding reference frame, giving good compression.

B-frame: interpolated from frames before and after it, giving the highest compression; this requires the encoder/decoder to reorder frames (decode order differs from display order).

A typical GOP display order: I B B P B B P B B P ...
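The reordering mentioned above can be sketched as follows: a B-frame can only be decoded after the later reference frame it depends on, so each I/P frame is moved ahead of the B-frames that precede it in display order. The frame labels and the function name are hypothetical, and the sketch ignores the handling of trailing B-frames across GOP boundaries.

```python
def display_to_decode_order(display_order):
    """Reorder frames so every B-frame follows both of its reference frames."""
    decode_order = []
    pending_b = []
    for frame in display_order:
        if frame.startswith("B"):
            pending_b.append(frame)         # must wait for the next I/P frame
        else:
            decode_order.append(frame)      # reference frame goes first
            decode_order.extend(pending_b)  # then the B-frames that needed it
            pending_b = []
    decode_order.extend(pending_b)          # trailing B-frames, if any
    return decode_order


if __name__ == "__main__":
    print(display_to_decode_order(["I0", "B1", "B2", "P3", "B4", "B5", "P6"]))
```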

Question 8 · short · 5 marks

Differentiate between the RGB and CMYK color models.

RGB vs CMYK Color Models

| Feature | RGB | CMYK |
| --- | --- | --- |
| Full form | Red, Green, Blue | Cyan, Magenta, Yellow, Black (Key) |
| Type | Additive (adds light) | Subtractive (subtracts/absorbs light) |
| Primary action | Mixing colored light | Mixing colored inks/pigments |
| Black & white | All channels 0 = black; all max = white | No ink = white (paper); C+M+Y+K = black |
| Number of components | 3 | 4 |
| Used by | Monitors, TVs, cameras, scanners, web | Color printers, printing presses |
| Color gamut | Wider (more vivid, bright colors) | Narrower (cannot reproduce some bright RGB colors) |

Key idea: RGB is based on emitted light, so increasing all values moves toward white. CMYK is based on ink/pigment, so increasing values absorbs more light and moves toward black. A separate K (black) channel is added in CMYK because mixing C, M, Y inks yields a muddy brown rather than true black, and using black ink also saves the costlier colored inks. Images created in RGB on screen must be converted to CMYK for accurate printing, which may shift some out-of-gamut colors.

Question 9 · short · 5 marks

Explain sampling and quantization of digital audio.

Sampling and Quantization of Digital Audio

Converting an analog (continuous) audio signal into digital form requires two steps performed by an Analog-to-Digital Converter (ADC): sampling (discretizing time) and quantization (discretizing amplitude).

Sampling

Sampling measures the amplitude of the continuous signal at regular time intervals. The number of samples per second is the sampling rate / frequency ($f_s$), measured in Hz.

  • By the Nyquist–Shannon theorem, $f_s$ must be at least twice the highest frequency in the signal to avoid aliasing: $f_s \ge 2f_{max}$.
  • Human hearing extends to ~20 kHz, so CD-quality audio uses $f_s = 44{,}100$ Hz (44.1 kHz).

Quantization

Each sampled amplitude is rounded to the nearest of a finite set of levels, represented by a fixed number of bits, the bit depth $n$. The number of levels $= 2^n$.

  • 8-bit gives 256 levels; 16-bit (CD quality) gives $65{,}536$ levels.
  • The rounding error is called quantization error/noise; more bits = smaller error and higher signal-to-noise ratio.

Storage / Bit Rate

$$\text{Bit rate} = f_s \times \text{bit depth} \times \text{channels}$$

Example (CD stereo): $44{,}100 \times 16 \times 2 = 1{,}411{,}200$ bits/s $\approx 1.41$ Mbps.

Together, sampling rate and bit depth determine the quality and size of the digital audio.
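The two steps can be sketched on a synthetic sine tone. The uniform quantizer below, which maps amplitudes in $[-1, 1]$ onto $2^n$ equal bins and reconstructs at bin centers, is one simple choice assumed for illustration; all function names are hypothetical.

```python
import math


def sample_sine(freq_hz, fs_hz, n_samples):
    """Sampling: read a sine wave of amplitude 1 at regular intervals of 1/fs."""
    return [math.sin(2 * math.pi * freq_hz * k / fs_hz) for k in range(n_samples)]


def quantize_sample(x, bits):
    """Quantization: map an amplitude in [-1, 1] to an integer level 0 .. 2^bits - 1."""
    levels = 2 ** bits
    step = 2.0 / levels
    return min(levels - 1, int((x + 1.0) / step))


def dequantize_sample(level, bits):
    """Reconstruct the amplitude at the center of the quantization bin."""
    step = 2.0 / 2 ** bits
    return -1.0 + (level + 0.5) * step


def bit_rate(fs_hz, bits, channels):
    """Uncompressed bit rate in bits per second."""
    return fs_hz * bits * channels


if __name__ == "__main__":
    samples = sample_sine(1000, 44100, 441)
    worst = max(abs(x - dequantize_sample(quantize_sample(x, 8), 8)) for x in samples)
    print(worst <= 1 / 256)          # error never exceeds half a step at 8 bits
    print(bit_rate(44100, 16, 2))    # CD stereo
```

Each extra bit halves the step size, and with it the maximum quantization error.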

Question 10 · short · 5 marks

Differentiate between lossy and lossless compression with examples.

Lossy vs Lossless Compression

| Feature | Lossless Compression | Lossy Compression |
| --- | --- | --- |
| Data recovery | Original is exactly reconstructed (no data lost) | Some data permanently discarded; only an approximation is recovered |
| Compression ratio | Lower (typically 2:1 to 3:1) | Much higher (10:1 to 100:1) |
| Quality | No quality loss | Quality reduced (often imperceptibly) |
| Reversibility | Fully reversible | Irreversible |
| Basis | Removes statistical / coding redundancy | Removes perceptually irrelevant information |
| Used for | Text, program files, medical/legal images, archives | Photos, audio, video streaming |
| Examples | Huffman, RLE, LZW, ZIP, PNG, GIF, FLAC | JPEG, MPEG, MP3, AAC, H.264 |

Lossless is essential where every bit matters (e.g., executables, text, medical imaging) — the decompressed file is bit-for-bit identical to the original.

Lossy sacrifices unimportant detail (exploiting limits of human vision/hearing) to achieve far higher compression, which is acceptable for everyday photos, music and video where small quality loss is unnoticeable.
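The difference can be demonstrated in a few lines. The lossless half uses Python's standard `zlib` module (DEFLATE); the lossy half is only a toy stand-in that coarsens sample values to multiples of a step, not a real codec. The function names are hypothetical.

```python
import zlib


def lossless_round_trip(data):
    """Compress and decompress with zlib; the result equals the input exactly."""
    return zlib.decompress(zlib.compress(data))


def lossy_round_trip(samples, step=16):
    """Toy lossy scheme: keep only the multiple of `step` at or below each value."""
    return [(s // step) * step for s in samples]


if __name__ == "__main__":
    data = b"AAAAABBBCCDAA" * 10
    print(lossless_round_trip(data) == data)       # bit-for-bit identical
    print(lossy_round_trip([0, 5, 17, 200, 255]))  # only an approximation
```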

Question 11 · short · 5 marks

What are the characteristics of multimedia data? Explain the storage requirements.

Characteristics of Multimedia Data and Storage Requirements

Characteristics of Multimedia Data

  1. Large volume / high data size — images, audio and especially video produce huge amounts of data, demanding large storage and high bandwidth.
  2. Real-time / continuous nature — audio and video are continuous (time-dependent) media that must be captured and played back at a fixed rate; late data is useless.
  3. Temporal requirements / QoS — needs guaranteed throughput, low delay and low jitter; requires intra- and inter-media synchronization.
  4. Heterogeneity — combines different media types (text, graphics, image, audio, video, animation) with different formats and needs.
  5. Voluminous and compressible — contains much redundancy, so it is usually stored/transmitted in compressed form (JPEG, MPEG, MP3).
  6. Interactivity — often requires random access, fast-forward, rewind and user interaction.

Storage Requirements (with examples)

Uncompressed multimedia is very large. Storage size formulas:

  • Image: $\text{size} = \text{width} \times \text{height} \times \text{bytes/pixel}$. Example: a $1024 \times 768$ 24-bit image $= 1024 \times 768 \times 3 = 2{,}359{,}296$ bytes $\approx 2.36$ MB.
  • Audio: $\text{bit rate} = \text{rate} \times \text{bit depth} \times \text{channels}$. CD stereo $= 44{,}100 \times 16 \times 2 \approx 1.41$ Mbps, i.e. about 10.6 MB per minute.
  • Video: frame size $\times$ frames/sec. Raw SD video can exceed 20 MB/second (over 1 GB/minute).
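The formulas above can be checked with a few helper functions. The names are hypothetical, and the SD video parameters (720 × 576 pixels, 24 bits per pixel, 25 frames/second) are an assumed example, not figures taken from the answer.

```python
def image_bytes(width, height, bits_per_pixel):
    """Uncompressed image size in bytes."""
    return width * height * bits_per_pixel // 8


def audio_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Uncompressed audio size in bytes for the given duration."""
    return sample_rate_hz * bits_per_sample * channels * seconds // 8


def video_bytes_per_second(width, height, bits_per_pixel, frames_per_second):
    """Uncompressed video data rate in bytes per second."""
    return image_bytes(width, height, bits_per_pixel) * frames_per_second


if __name__ == "__main__":
    print(image_bytes(1024, 768, 24))                 # the 24-bit image example
    print(audio_bytes(44100, 16, 2, 60))              # one minute of CD stereo
    print(video_bytes_per_second(720, 576, 24, 25))   # assumed SD parameters
```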

Because of these sizes, multimedia systems rely heavily on compression and on storage/transmission media with large capacity and high transfer rates (e.g., DVD, Blu-ray, SSDs, streaming servers with QoS).

Question 12 · short · 5 marks

What is multimedia synchronization? Differentiate intra-media and inter-media synchronization.

Multimedia Synchronization: Intra-media vs Inter-media

Multimedia synchronization is the maintenance of the correct temporal, spatial and content relationships between media objects during capture, storage, transmission and, most importantly, playback, so the presentation is coherent (e.g., sound matching the picture).

Difference between Intra-media and Inter-media Synchronization

| Aspect | Intra-media (intra-stream) | Inter-media (inter-stream) |
| --- | --- | --- |
| Scope | Timing within a single media stream | Timing between two or more media streams |
| Goal | Maintain a constant, correct playout rate of units in one stream | Keep different streams temporally aligned with each other |
| Example | Displaying video frames at exactly 25 fps; playing audio samples at 44.1 kHz | Lip-sync: audio matched to video; subtitles matched to speech |
| Problem if it fails | Jitter, jerky or too-fast/too-slow playback | Audio leads/lags video; annotations appear at the wrong time |
| Tolerance | Depends on the stream's frame/sample rate | Lip-sync skew tolerable up to about $\pm 80$ ms |

In short: intra-media synchronization ensures each individual stream plays smoothly at its own correct rate, whereas inter-media synchronization ensures separate streams stay coordinated with one another.


Frequently asked questions

Where can I find the BSc CSIT (TU) Multimedia Computing (BSc CSIT, CSC467) question paper 2079?
The full BSc CSIT (TU) Multimedia Computing (BSc CSIT, CSC467) 2079 (regular) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.
How many marks is the BSc CSIT (TU) Multimedia Computing (BSc CSIT, CSC467) 2079 paper?
The BSc CSIT (TU) Multimedia Computing (BSc CSIT, CSC467) 2079 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.