BSc CSIT (TU) Science Multimedia Computing (BSc CSIT, CSC467) Question Paper 2080 Nepal

Q: Where can I find the BSc CSIT (TU) Multimedia Computing (BSc CSIT, CSC467) question paper 2080?

The full BSc CSIT (TU) Multimedia Computing (BSc CSIT, CSC467) 2080 (Regular (annual)) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.

Q: Does the Multimedia Computing (BSc CSIT, CSC467) 2080 paper come with solutions?

Yes. Every question on this Multimedia Computing (BSc CSIT, CSC467) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.

Q: How many marks is the BSc CSIT (TU) Multimedia Computing (BSc CSIT, CSC467) 2080 paper?

The BSc CSIT (TU) Multimedia Computing (BSc CSIT, CSC467) 2080 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.

Q: Is practising this Multimedia Computing (BSc CSIT, CSC467) past paper free?

Yes — reading and attempting this Multimedia Computing (BSc CSIT, CSC467) past paper on Kekkei is completely free.

Question

1Long answer10 marks

Explain the MPEG video compression standard. Discuss I-frames, P-frames and B-frames, motion estimation and compensation, and the group of pictures (GOP) structure.

compressionmpeg

Answer 1

MPEG Video Compression Standard

MPEG (Moving Picture Experts Group) defines a family of standards (MPEG-1, MPEG-2, MPEG-4) for compressing digital video and audio. Video compression in MPEG exploits two kinds of redundancy:

Spatial redundancy within a single frame, removed using block-based DCT, quantization and entropy coding (like JPEG).
Temporal redundancy between successive frames, removed using motion estimation and motion compensation.

Frame Types (Picture Types)

Frame	Name	Coding	Compression	Random access
I	Intra-coded	Coded independently (like JPEG), no reference to other frames	Lowest	Yes (entry point)
P	Predictive	Predicted from the previous I or P frame (forward prediction)	Medium	No
B	Bidirectional	Predicted from both the previous and next I/P frames	Highest	No

I-frame: self-contained reference frame; provides resilience to errors and a point for random access/seeking.
P-frame: stores only the motion vectors plus the residual (difference) relative to the previous reference frame.
B-frame: interpolated from past and future reference frames, giving the best compression but requiring reordering of frames for decoding.

Motion Estimation and Compensation

Each frame is divided into macroblocks (typically 16×16 pixels). For each macroblock, motion estimation searches a window in the reference frame to find the best matching block, producing a motion vector (dx, dy) that minimizes a cost such as SAD (sum of absolute differences).

Motion compensation then forms a prediction of the current block by shifting the reference block by the motion vector. Only the motion vector and the residual error (current block − predicted block) are coded via DCT, quantization and entropy coding. Since the residual is usually small, compression is high.

Group of Pictures (GOP)

A GOP is a repeating sequence of I, P and B frames between successive I-frames, e.g.:

I B B P B B P B B P B B I ...

The GOP length (e.g. 12 or 15 frames) trades compression efficiency against random-access granularity and error resilience.
Open GOP allows B-frames to reference frames in an adjacent GOP; closed GOP does not.
Because B-frames depend on future frames, the display order differs from the decoding (bitstream) order.

Summary

MPEG achieves high compression by combining intra-frame (DCT-based) coding for I-frames with inter-frame motion-compensated prediction for P and B-frames, organized into GOP structures that balance efficiency, error recovery and random access.

Answer 2

Entropy and Source Coding for Multimedia

Entropy measures the average information content of a source. For a source with symbols $s_i$ of probability $p_i$ :

H = -\sum_{i} p_i \log_2 p_i \quad \text{(bits/symbol)}

Entropy gives the theoretical lower bound on the average number of bits per symbol for lossless coding (Shannon's source coding theorem). Source (entropy) coding assigns shorter codewords to more probable symbols to approach this bound.

Huffman Coding

A prefix-free, variable-length code built bottom-up: repeatedly merge the two least-probable symbols into a new node until one tree remains; assign 0/1 along branches. It is optimal among integer-length prefix codes. Limitation: each symbol uses a whole number of bits, so it can be up to ~1 bit/symbol away from entropy.

Run-Length Encoding (RLE)

Replaces runs of identical symbols by a (value, count) pair, e.g. AAAAABBB → 5A3B. Very effective for data with long runs (e.g. quantized DCT coefficients, fax images, GIF) but poor for noisy/varied data.

Arithmetic Coding

Encodes the entire message as a single fractional number in [0, 1). It does not require integer bit lengths, so it can come arbitrarily close to entropy and often outperforms Huffman, especially for skewed probabilities. Cost: more computation and patent/complexity issues historically.

Worked Huffman Example

Symbols and frequencies: A=45, B=13, C=12, D=16, E=9, F=5 (total 100).

Merge least-probable repeatedly:

F(5)+E(9) = 14
C(12)+B(13) = 25
(14)+D(16) = 30
(25)+(30) = 55
A(45)+(55) = 100

Resulting codes (one valid assignment):

Symbol	Freq	Code	Length
A	45	0	1
C	12	100	3
B	13	101	3
F	5	1100	4
E	9	1101	4
D	16	111	3

Average length $= (45\cdot1 + 12\cdot3 + 13\cdot3 + 5\cdot4 + 9\cdot4 + 16\cdot3)/100 = 224/100 = 2.24$ bits/symbol, well below the fixed 3 bits needed for 6 symbols.

Answer 3

Digital Audio Representation

Sound is a continuous (analog) pressure wave. To process it on a computer it is converted to digital form by sampling and quantization (Analog-to-Digital Conversion).

Sampling

The analog signal is measured at regular intervals at the sampling rate $f_s$ . By the Nyquist–Shannon theorem, $f_s$ must be at least twice the highest frequency $f_{max}$ to avoid aliasing:

f_s \ge 2 f_{max}

CD-quality audio uses $f_s = 44.1\,\text{kHz}$ (covers up to ~22 kHz of audible sound).

Quantization

Each sample's amplitude is rounded to the nearest level among $2^n$ levels, where $n$ is the bit depth (e.g. 16 bits → 65,536 levels). More bits give a larger dynamic range and lower quantization noise; the signal-to-quantization-noise ratio is approximately $6.02n$ dB.

PCM (Pulse Code Modulation)

PCM stores the stream of quantized sample values directly as binary numbers — the standard uncompressed representation used in WAV and audio CDs. Uncompressed bitrate:

\text{bitrate} = f_s \times \text{bit depth} \times \text{channels}

e.g. $44100 \times 16 \times 2 \approx 1.41\,\text{Mbps}$ for stereo CD audio.

Audio Compression and MPEG Audio (MP3)

PCM is large, so compression is used. MP3 (MPEG-1/2 Audio Layer III) is a lossy perceptual codec:

The signal is split into frequency subbands (filter bank + MDCT).
A psychoacoustic model analyses the audio to decide what the human ear cannot perceive.
Bits are allocated and coefficients quantized so that quantization noise stays below the audible threshold.
The result is Huffman-coded and packed into frames.

Role of the Psychoacoustic Model

It exploits limits of human hearing:

Absolute threshold of hearing: very quiet sounds are inaudible and can be discarded.
Frequency (simultaneous) masking: a loud tone hides nearby quieter tones in frequency.
Temporal masking: sounds just before/after a loud sound are masked in time.

By removing or coarsely quantizing inaudible components, MP3 achieves roughly 10:1 compression (e.g. 128 kbps) with little perceived quality loss.

Answer 4

Entropy is the average information content (uncertainty) of a source, measured in bits per symbol:

H = -\sum_i p_i \log_2 p_i

where $p_i$ is the probability of symbol $i$ .

Relation to compression: By Shannon's source-coding theorem, $H$ is the theoretical minimum average number of bits per symbol needed to represent the source without loss. A good entropy coder (Huffman, arithmetic) assigns shorter codes to frequent symbols and approaches $H$ . The closer the average code length is to $H$ , the better the compression; a uniform/random source (high entropy) compresses poorly, while a skewed source (low entropy) compresses well.

Answer 5

Arithmetic Coding

Arithmetic coding represents an entire message as a single number in the interval $[0, 1)$ . Starting from the full interval, the range is repeatedly subdivided in proportion to each symbol's probability; the next symbol selects its sub-interval, which becomes the new working interval. After processing all symbols, any number within the final (very narrow) interval — written in binary — uniquely identifies the message. More probable symbols shrink the interval less, so they cost fewer bits.

Differences from Huffman Coding

Aspect	Huffman	Arithmetic
Unit of coding	One codeword per symbol	One number for the whole message
Code length	Integer (whole) bits per symbol	Can use fractional bits per symbol
Optimality	Optimal among integer prefix codes; up to ~1 bit/symbol above entropy	Can come arbitrarily close to entropy
Skewed probabilities	Inefficient when $p \gg 0.5$	Very efficient
Adaptivity	Possible but needs tree rebuilding	Naturally supports adaptive models
Complexity	Simple, fast	More computation; precision/overflow handling

In short: Huffman is simpler but limited to whole-bit codewords, whereas arithmetic coding uses fractional bits and generally achieves better compression, especially for highly skewed symbol distributions.

Answer 6

Motion Estimation and Motion Compensation

These techniques remove temporal redundancy between consecutive video frames (successive frames are usually very similar).

Motion Estimation: The current frame is divided into macroblocks (e.g. 16×16 pixels). For each macroblock, the encoder searches a window in a reference frame to find the best-matching block, minimizing a cost such as SAD (Sum of Absolute Differences). The displacement to that best match is a motion vector $(dx, dy)$ . Search strategies include full (exhaustive) search and faster methods like three-step search or diamond search.

Motion Compensation: Using the motion vector, the encoder forms a prediction of the current block by copying the displaced reference block. It then computes the residual = current block − predicted block. Only the motion vector and the (small) residual, coded with DCT/quantization/entropy coding, are transmitted instead of the full block.

Benefit: Since neighbouring frames differ little, motion vectors plus residuals are far cheaper to store than full frames, giving the large compression gains of P- and B-frames in MPEG/H.26x codecs.

Answer 7

MP3 Audio Compression and Psychoacoustic Models

MP3 (MPEG-1/2 Audio Layer III) is a widely used lossy, perceptual audio compression format. Its goal is to discard sound information the human ear cannot perceive, achieving about 10:1 compression (e.g. 1.4 Mbps CD audio → ~128 kbps) with little audible loss.

Encoding steps:

Split the PCM signal into many frequency subbands using a filter bank and a Modified DCT (MDCT).
Apply a psychoacoustic model to determine the masking threshold per band.
Quantize each band, allocating more bits where the ear is sensitive and fewer (or zero) where sound is masked.
Huffman-code the quantized values and pack them into frames with headers.

Psychoacoustic Models exploit limits of human hearing:

Absolute threshold of hearing: very quiet sounds are inaudible and removed.
Frequency (simultaneous) masking: a loud tone masks nearby quieter tones in frequency.
Temporal masking: sounds just before/after a loud sound are masked in time.

By keeping quantization noise below these thresholds, MP3 removes perceptually irrelevant data while preserving the audio the listener actually hears.

Answer 8

Computer Animation

Computer animation is the process of creating the illusion of motion by displaying a rapid sequence of computer-generated images (frames). When shown fast enough (typically 24–30 fps), persistence of vision makes the viewer perceive continuous movement. It includes 2D and 3D animation and is used in films, games, simulations and multimedia presentations.

Frame-based vs Key-frame Animation

Aspect	Frame-based Animation	Key-frame Animation
Definition	Every individual frame is drawn/specified one by one	Only important key frames are defined; the system fills in the rest
In-betweens	Created manually	Generated automatically by interpolation (tweening)
Effort	High — animator draws all frames	Lower — animator draws only key poses
Control	Full control over each frame	Smooth automatic transitions; less per-frame control
Example	Traditional cel/flipbook animation	Flash/3D software animating between key poses

Summary: In frame-based animation the artist produces every frame, whereas in key-frame animation only key frames are specified and the intermediate frames are computed automatically by interpolation, saving effort and producing smooth motion.

Answer 9

Hypermedia

Hypermedia is an extension of hypertext in which the linked information is not limited to text but also includes multiple media types — text, images, audio, video, graphics and animation — connected by hyperlinks that let users navigate non-linearly. The World Wide Web is the most common example of hypermedia.

Hypertext vs Hypermedia

Aspect	Hypertext	Hypermedia
Content	Text only, linked non-linearly	Text plus images, audio, video, animation
Scope	A subset of hypermedia	Superset that includes hypertext
Nodes	Text documents	Any media object (text, image, sound, video)
Example	Early text-only help systems, classic linked documents	Modern web pages, interactive multimedia/e-learning

Summary: Hypertext links only text nodes, while hypermedia generalizes this to link any kind of media. All hypertext is hypermedia, but not all hypermedia is plain hypertext.

Answer 10

Multimedia Streaming

Streaming is the delivery of multimedia (audio/video) over a network so that it can be played while it is being downloaded, without waiting for the whole file. Data is buffered briefly and rendered continuously. Examples: YouTube, Netflix, live broadcasts.

Types:

On-demand (stored) streaming: pre-recorded content delivered when requested.
Live streaming: content captured and transmitted in real time.

Common protocols: RTP/RTSP, and HTTP-based adaptive streaming (HLS, MPEG-DASH).

Issues Involved

Bandwidth limitations: insufficient or variable bandwidth causes buffering/stalls; adaptive bitrate switching helps.
Latency and delay: especially critical for live/interactive streams.
Jitter: variation in packet arrival times degrades playback; handled by buffering.
Packet loss: the underlying IP network is best-effort, so loss causes glitches; needs error concealment/retransmission.
Synchronization: audio and video (lip-sync) must stay aligned.
Quality of Service (QoS): networks must provide adequate throughput and bounded delay.
Scalability: serving many simultaneous users (mitigated by CDNs/multicast).

Answer 11

Discrete Cosine Transform (DCT)

The DCT transforms a block of pixel values from the spatial domain into the frequency domain, expressing the block as a sum of cosine basis functions of increasing frequency. JPEG and MPEG apply the 2-D DCT on 8×8 blocks. The 2-D forward DCT is:

F(u,v) = \frac{1}{4} C(u)C(v) \sum_{x=0}^{7}\sum_{y=0}^{7} f(x,y)\cos\!\left[\frac{(2x+1)u\pi}{16}\right]\cos\!\left[\frac{(2y+1)v\pi}{16}\right]

where $C(k)=\tfrac{1}{\sqrt2}$ for $k=0$ and $1$ otherwise.

Importance in Image Compression

Energy compaction: for natural images, most signal energy concentrates in a few low-frequency coefficients near the top-left (the DC coefficient and low AC terms), while most high-frequency coefficients are near zero.
Perceptual relevance: the human eye is less sensitive to high frequencies, so those coefficients can be quantized coarsely or discarded with little visible loss.
After DCT, quantization zeroes out many high-frequency coefficients; a zig-zag scan groups the zeros, which then compress very well with run-length and entropy (Huffman) coding.

In short: DCT concentrates image information into a few coefficients, enabling lossy compression (as in JPEG/MPEG) by removing perceptually unimportant high-frequency data.

Answer 12

Common Multimedia File Formats

JPEG (.jpg / .jpeg) — Joint Photographic Experts Group. A lossy image format using 8×8 DCT + quantization + entropy coding. Excellent for photographs/continuous-tone images with adjustable quality, but introduces blocking artefacts at high compression and does not support transparency.

GIF (.gif) — Graphics Interchange Format. A lossless format limited to a 256-colour palette (8-bit) using LZW compression. Supports transparency and simple animation; best for logos, icons and small graphics, poor for photos.

PNG (.png) — Portable Network Graphics. A lossless raster format supporting true colour (24-bit) and an alpha channel for smooth transparency; designed as a patent-free replacement for GIF. Larger files than JPEG for photos, but no quality loss and no 256-colour limit.

MPEG (.mpg / .mp4) — Moving Picture Experts Group. A family of lossy video (and audio) compression standards using DCT plus motion-compensated inter-frame prediction (I/P/B frames). Used for digital video, DVD, streaming and broadcast.

Format	Type	Media	Key feature
JPEG	Lossy	Image	DCT, great for photos, no transparency
GIF	Lossless	Image	256 colours, transparency, animation
PNG	Lossless	Image	True colour, alpha transparency
MPEG	Lossy	Video/Audio	Motion-compensated DCT video

Level	BSc CSIT (TU)
Stream	Science
Subject	Multimedia Computing (BSc CSIT, CSC467)
Year	2080 BS
Exam session	Regular (annual)
Full marks	60
Time allowed	180 minutes
Questions	12, all with step-by-step solutions

BSc CSIT (TU) Science Multimedia Computing (BSc CSIT, CSC467) Question Paper 2080 Nepal

Section A: Long Answer Questions

MPEG Video Compression Standard

Frame Types (Picture Types)

Motion Estimation and Compensation

Group of Pictures (GOP)

Summary

Entropy and Source Coding for Multimedia

Huffman Coding

Run-Length Encoding (RLE)

Arithmetic Coding

Worked Huffman Example

Digital Audio Representation

Sampling

Quantization

PCM (Pulse Code Modulation)

Audio Compression and MPEG Audio (MP3)

Role of the Psychoacoustic Model

Section B: Short Answer Questions

Arithmetic Coding

Differences from Huffman Coding

Motion Estimation and Motion Compensation

MP3 Audio Compression and Psychoacoustic Models

Computer Animation

Frame-based vs Key-frame Animation

Hypermedia

Hypertext vs Hypermedia

Multimedia Streaming

Issues Involved

Discrete Cosine Transform (DCT)

Importance in Image Compression

Common Multimedia File Formats

Frequently asked questions