Section A: Long Answer Questions

Attempt any TWO questions.

3 questions · 10 marks each

Question 1 · Long answer · 10 marks

Explain the different types of data and methods of data collection. Describe the construction of a frequency distribution and its graphical representation (histogram, frequency polygon, ogive).

Types of Data

By source:

  • Primary data — collected first-hand by the investigator for the specific purpose (original data).
  • Secondary data — already collected by someone else and reused (e.g. census reports, journals).

By nature/measurement:

  • Qualitative (categorical) data — attributes that cannot be measured numerically (e.g. gender, religion).
  • Quantitative (numerical) data — measurable, further split into discrete (countable, e.g. number of students) and continuous (any value in a range, e.g. height).
  • Also classified by measurement scale: nominal, ordinal, interval, ratio.

Methods of Data Collection

Primary methods: direct personal interview, indirect oral investigation, questionnaire (mailed), schedules filled by enumerators, direct observation, and information from local agents/correspondents.

Secondary sources: published sources (government/CBS reports, journals, newspapers) and unpublished records (office files, research records).

Construction of a Frequency Distribution

  1. Find the range $R = \text{max} - \text{min}$.
  2. Decide the number of classes $k$ (often by Sturges' rule, $k \approx 1 + 3.322\log_{10} N$).
  3. Compute the class width $h \approx R/k$ (rounded up).
  4. Form class intervals (exclusive or inclusive) and choose limits.
  5. Tally each observation into its class using tally marks.
  6. Count tallies to get the frequency $f$ of each class.

The result is a table of class intervals against frequencies; cumulative frequencies (less-than / more-than) may be added.
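
The construction steps can be sketched in Python. This is only an illustration under stated assumptions: the `marks` list is made-up sample data, the function name `frequency_distribution` is hypothetical, classes are exclusive (lower limit included, upper limit excluded), and both the Sturges class count and the class width are rounded up.

```python
import math

def frequency_distribution(data):
    """Build an exclusive-class frequency table (assumed convention: [lower, upper))."""
    n = len(data)
    lo, hi = min(data), max(data)
    data_range = hi - lo                             # step 1: R = max - min
    k = math.ceil(1 + 3.322 * math.log10(n))         # step 2: Sturges' rule, rounded up
    h = math.ceil(data_range / k) or 1               # step 3: class width, rounded up
    # Step 4: add one more class if the maximum would sit on the last upper limit.
    while lo + k * h <= hi:
        k += 1
    freqs = [0] * k
    for x in data:                                   # steps 5-6: tally and count
        freqs[int((x - lo) // h)] += 1
    table, cumulative = [], 0
    for i, f in enumerate(freqs):
        cumulative += f                              # less-than cumulative frequency
        table.append((lo + i * h, lo + (i + 1) * h, f, cumulative))
    return table

# Hypothetical sample: marks of 20 students.
marks = [12, 25, 37, 41, 18, 29, 33, 45, 22, 39,
         15, 27, 31, 48, 20, 35, 26, 43, 30, 38]
table = frequency_distribution(marks)
for lower, upper, f, cf in table:
    print(f"{lower}-{upper}: f={f}, cf={cf}")
```

Each row holds the lower limit, upper limit, class frequency and less-than cumulative frequency. In an exam answer the class limits are usually chosen as convenient round numbers rather than starting exactly at the minimum.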

Graphical Representation

  • Histogram: adjacent rectangles with class intervals on the X-axis and frequency on the Y-axis; the area of each bar is proportional to the class frequency, so with equal class widths the heights are proportional too. Used for continuous data, so the bars touch each other. The mode can be located graphically within the tallest bar (the modal class).
  • Frequency polygon: a line graph obtained by plotting frequencies against class mid-points and joining the points by straight lines; the ends are joined to the X-axis at the mid-points of the adjacent empty classes. It can be drawn over the histogram by joining the mid-points of the tops of the bars.
  • Ogive (cumulative frequency curve): a smooth curve obtained by plotting cumulative frequencies against class boundaries. The less-than ogive rises to the right; the more-than ogive falls to the right. The X-coordinate of their point of intersection gives the median; quartiles can also be read off.
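
To make the three graphs concrete, the sketch below computes the coordinates that would be plotted for each one. It assumes equal class widths; the function name `graph_points` and the four-class example are hypothetical, and no plotting library is used.

```python
def graph_points(lower_bounds, width, freqs):
    """Return plotting coordinates for a histogram, frequency polygon and less-than ogive."""
    # Histogram: one (left edge, right edge, height) bar per class; bars share edges.
    bars = [(lb, lb + width, f) for lb, f in zip(lower_bounds, freqs)]

    # Frequency polygon: (mid-point, frequency), closed on the X-axis at the
    # mid-points of the empty classes just before and just after the data.
    mids = [lb + width / 2 for lb in lower_bounds]
    polygon = [(mids[0] - width, 0)] + list(zip(mids, freqs)) + [(mids[-1] + width, 0)]

    # Less-than ogive: (upper boundary, cumulative frequency), starting from
    # zero at the first lower boundary.
    ogive = [(lower_bounds[0], 0)]
    running = 0
    for lb, f in zip(lower_bounds, freqs):
        running += f
        ogive.append((lb + width, running))
    return bars, polygon, ogive

# Hypothetical classes 0-10, 10-20, 20-30, 30-40 with frequencies 3, 7, 6, 4.
bars, polygon, ogive = graph_points([0, 10, 20, 30], 10, [3, 7, 6, 4])
print(polygon)
print(ogive)
```

The last ogive point always has the total frequency as its Y-value, which is a quick check on the cumulative column.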

Question 2 · Long answer · 10 marks

Define skewness and kurtosis. Explain how they describe the shape of a distribution. Compute the Karl Pearson and Bowley coefficients of skewness for given data.

Skewness

Skewness measures the lack of symmetry of a distribution. A symmetric distribution has skewness $0$; a distribution with a longer right tail is positively skewed and one with a longer left tail is negatively skewed.

Shape relations:

  • Symmetric: $\text{Mean} = \text{Median} = \text{Mode}$.
  • Positive skew: $\text{Mean} > \text{Median} > \text{Mode}$.
  • Negative skew: $\text{Mean} < \text{Median} < \text{Mode}$.

Kurtosis

Kurtosis measures the peakedness (and tail weight) of a distribution relative to the normal curve. Using $\beta_2 = \mu_4/\mu_2^2$:

  • $\beta_2 = 3$: mesokurtic (normal).
  • $\beta_2 > 3$: leptokurtic (more peaked, heavy tails).
  • $\beta_2 < 3$: platykurtic (flatter).
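
As a rough illustration of the classification above, the following sketch computes $\beta_2$ from the second and fourth central moments of raw (ungrouped) data. The function names `beta2` and `classify` and both sample lists are assumptions made for the example.

```python
def beta2(values):
    """Return beta_2 = mu_4 / mu_2**2 using central moments about the mean."""
    n = len(values)
    mean = sum(values) / n
    mu2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
    mu4 = sum((x - mean) ** 4 for x in values) / n   # fourth central moment
    return mu4 / mu2 ** 2

def classify(b2, tol=1e-9):
    """Name the shape by comparing beta_2 with the normal-curve value 3."""
    if abs(b2 - 3) <= tol:
        return "mesokurtic"
    return "leptokurtic" if b2 > 3 else "platykurtic"

flat = [1, 2, 3, 4, 5]                        # evenly spread values
peaked = [0, 0, 0, 0, 0, 0, 0, 0, 10, -10]    # mass at the centre, two extreme values
print(beta2(flat), classify(beta2(flat)))
print(beta2(peaked), classify(beta2(peaked)))
```

For the evenly spread list, $\mu_2 = 2$ and $\mu_4 = 6.8$, giving $\beta_2 = 1.7$ (platykurtic); for the second list, $\mu_2 = 20$ and $\mu_4 = 2000$, giving $\beta_2 = 5$ (leptokurtic).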

Coefficients of Skewness

Karl Pearson's coefficient:

$$S_k = \frac{\text{Mean} - \text{Mode}}{\sigma} \quad\text{or}\quad S_k = \frac{3(\text{Mean} - \text{Median})}{\sigma}$$

Bowley's (quartile) coefficient:

$$S_k = \frac{Q_3 + Q_1 - 2Q_2}{Q_3 - Q_1}$$

Worked example

For a distribution with Mean $= 50$, Mode $= 44$, $\sigma = 12$:

$$S_k = \frac{50 - 44}{12} = 0.5 \;(\text{positively skewed}).$$

For $Q_1 = 30,\; Q_2 = 40,\; Q_3 = 55$ (Bowley):

$$S_k = \frac{55 + 30 - 2(40)}{55 - 30} = \frac{5}{25} = 0.2 \;(\text{positively skewed}).$$

(Substitute the actual values given in the paper; the method is identical.)
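
The same substitutions can be checked with a short Python sketch. The function names `pearson_skewness` and `bowley_skewness` are hypothetical helpers written for this example; the inputs are the illustrative values used in the worked example.

```python
def pearson_skewness(mean, sigma, mode=None, median=None):
    """Karl Pearson's coefficient: mode form if a mode is given, else median form."""
    if mode is not None:
        return (mean - mode) / sigma
    if median is None:
        raise ValueError("supply either mode or median")
    return 3 * (mean - median) / sigma

def bowley_skewness(q1, q2, q3):
    """Bowley's quartile coefficient: (Q3 + Q1 - 2*Q2) / (Q3 - Q1)."""
    return (q3 + q1 - 2 * q2) / (q3 - q1)

sk_pearson = pearson_skewness(mean=50, sigma=12, mode=44)
sk_bowley = bowley_skewness(q1=30, q2=40, q3=55)
print(sk_pearson, sk_bowley)   # both positive, so positively skewed
```

The median form is the usual fallback when the mode is ill-defined; Bowley's coefficient always lies between $-1$ and $+1$.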

Question 3 · Long answer · 10 marks

Define the binomial distribution. State its properties, mean and variance. A coin is tossed 6 times; find the probability of getting exactly 4 heads and at least 4 heads.

Binomial Distribution

A discrete random variable $X$ follows a binomial distribution if it counts the number of successes in $n$ independent trials, each with constant success probability $p$ (and $q = 1-p$). The probability mass function is:

$$P(X = x) = \binom{n}{x} p^x q^{\,n-x}, \quad x = 0,1,2,\dots,n.$$

Properties

  • It is a discrete distribution with parameters $n$ and $p$.
  • Trials are independent (Bernoulli trials) with two outcomes (success/failure).
  • $p$ remains constant across trials.
  • Mean $= np$, Variance $= npq$, so Variance $\le$ Mean (since $q \le 1$).
  • Standard deviation $= \sqrt{npq}$.
  • It is positively skewed if $p < 0.5$, symmetric if $p = 0.5$, and negatively skewed if $p > 0.5$.

Numerical (coin tossed 6 times)

Here $n = 6$, $p = P(\text{head}) = \tfrac12$, $q = \tfrac12$. So

$$P(X = x) = \binom{6}{x}\left(\tfrac12\right)^{6} = \frac{\binom{6}{x}}{64}.$$

Exactly 4 heads:

$$P(X = 4) = \frac{\binom{6}{4}}{64} = \frac{15}{64} \approx 0.2344.$$

At least 4 heads $= P(X=4) + P(X=5) + P(X=6)$:

$$= \frac{\binom{6}{4} + \binom{6}{5} + \binom{6}{6}}{64} = \frac{15 + 6 + 1}{64} = \frac{22}{64} = \frac{11}{32} \approx 0.3438.$$
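
The two probabilities can be verified with exact fractions. This is a minimal sketch using Python's standard `math.comb` and `fractions.Fraction`; the helper name `binomial_pmf` is an assumption for the example.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, x, p):
    """P(X = x) = C(n, x) * p**x * q**(n - x) with q = 1 - p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

p = Fraction(1, 2)                       # fair coin
exactly_4 = binomial_pmf(6, 4, p)        # Fraction(15, 64)
at_least_4 = sum(binomial_pmf(6, x, p) for x in range(4, 7))   # Fraction(11, 32)

print(exactly_4, float(exactly_4))
print(at_least_4, float(at_least_4))
```

Using `Fraction` keeps the results as $15/64$ and $11/32$ rather than rounded decimals, which matches how the answer is written by hand.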

Section B: Short Answer Questions

Attempt any EIGHT questions.

9 questions · 5 marks each

Question 4 · Short answer · 5 marks

Define skewness.

Skewness is a measure of the degree of asymmetry (lack of symmetry) of a frequency distribution about its mean. If the longer tail lies to the right it is positively skewed ($\text{Mean} > \text{Mode}$); if to the left it is negatively skewed ($\text{Mean} < \text{Mode}$); a symmetric distribution has zero skewness. A common measure is Karl Pearson's coefficient $S_k = \dfrac{\text{Mean}-\text{Mode}}{\sigma}$.

Question 5 · Short answer · 5 marks

What is kurtosis?

Kurtosis measures the peakedness or flatness of a frequency distribution compared with the normal distribution. It is based on the fourth moment, $\beta_2 = \dfrac{\mu_4}{\mu_2^{2}}$. A distribution is mesokurtic (normal) when $\beta_2 = 3$, leptokurtic (more peaked, heavy tails) when $\beta_2 > 3$, and platykurtic (flatter) when $\beta_2 < 3$.

Question 6 · Short answer · 5 marks

State the properties of the binomial distribution.

Properties of the binomial distribution $B(n,p)$ with $q = 1-p$:

  • It is a discrete probability distribution with two parameters $n$ and $p$.
  • Each of the $n$ trials is independent with only two outcomes (success/failure) and constant success probability $p$.
  • pmf: $P(X=x) = \binom{n}{x}p^x q^{\,n-x}$.
  • Mean $= np$ and Variance $= npq$; hence the variance is less than the mean whenever $p > 0$ (since then $q < 1$).
  • Standard deviation $= \sqrt{npq}$.
  • The distribution is symmetric when $p = 0.5$, positively skewed when $p < 0.5$, and negatively skewed when $p > 0.5$.
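
These properties can be checked numerically from the pmf. The sketch below is illustrative only: `binomial_moments` is a hypothetical helper, and $n = 10$ with $p = 0.3, 0.5, 0.7$ are arbitrary example values. The sign of the third central moment is used here as the indicator of the direction of skew.

```python
from math import comb

def binomial_moments(n, p):
    """Return (mean, variance, third central moment) computed directly from the pmf."""
    q = 1 - p
    pmf = [comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)]
    mean = sum(x * px for x, px in enumerate(pmf))
    variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))
    third = sum((x - mean) ** 3 * px for x, px in enumerate(pmf))
    return mean, variance, third

for p in (0.3, 0.5, 0.7):
    mean, variance, third = binomial_moments(10, p)
    # Expect mean = np, variance = npq, and the sign of `third` to follow the skew.
    print(p, round(mean, 6), round(variance, 6), round(third, 6))
```

For $n = 10$, $p = 0.3$ the computed mean and variance agree with $np = 3$ and $npq = 2.1$ up to floating-point rounding.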

Question 7 · Short answer · 5 marks

Define quartiles.

Quartiles are the three values that divide an ordered data set into four equal parts, each containing 25% of the observations.

  • First quartile $Q_1$ (lower quartile): 25% of the data lie below it.
  • Second quartile $Q_2$: the median, with 50% below it.
  • Third quartile $Q_3$ (upper quartile): 75% of the data lie below it.

For grouped data, $Q_i = L + \dfrac{\frac{iN}{4} - C}{f}\times h$, where $L$ is the lower boundary of the quartile class, $C$ the preceding cumulative frequency, $f$ the class frequency and $h$ the class width.
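
A minimal sketch of the grouped-data formula, assuming equal class widths and continuous class boundaries. The function name `grouped_quartile` and the five-class table are made up for illustration.

```python
def grouped_quartile(i, lower_bounds, width, freqs):
    """Q_i = L + ((i*N/4 - C) / f) * h for equal-width classes."""
    if i not in (1, 2, 3):
        raise ValueError("i must be 1, 2 or 3")
    total = sum(freqs)                  # N
    target = i * total / 4              # position iN/4
    cumulative = 0                      # C: cumulative frequency before the class
    for lower, f in zip(lower_bounds, freqs):
        if cumulative + f >= target:    # first class whose cumulative frequency reaches iN/4
            return lower + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("frequencies must not all be zero")

# Hypothetical classes 0-10, ..., 40-50 with N = 40.
bounds = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 10, 5]
q1, q2, q3 = (grouped_quartile(i, bounds, 10, freqs) for i in (1, 2, 3))
print(q1, q2, q3)
```

For this table $Q_1 = 10 + \frac{10 - 5}{8}\times 10 = 16.25$ and $Q_3 = 30 + \frac{30 - 25}{10}\times 10 = 35$.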

Question 8 · Short answer · 5 marks

What is the geometric mean?

The geometric mean (GM) of $n$ positive observations is the $n$-th root of their product:

$$GM = \sqrt[n]{x_1 x_2 \cdots x_n} = \left(\prod_{i=1}^{n} x_i\right)^{1/n}.$$

In computational form, $GM = \text{antilog}\!\left(\dfrac{\sum \log x_i}{n}\right)$. It is appropriate for averaging ratios, rates of growth, and index numbers. It is not meaningful if any value is zero or negative: a single zero makes the GM zero, and the logarithmic form is undefined.

Example: the GM of $2, 4, 8$ is $\sqrt[3]{2\cdot4\cdot8} = \sqrt[3]{64} = 4$.
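
The logarithmic form translates directly into code. The sketch below uses natural logarithms with `math.exp` as the antilog (any base gives the same result); the function name `geometric_mean` is an assumption for the example.

```python
import math

def geometric_mean(values):
    """GM = antilog(mean of logs); requires strictly positive values."""
    if any(x <= 0 for x in values):
        raise ValueError("geometric mean needs strictly positive values")
    return math.exp(sum(math.log(x) for x in values) / len(values))

gm = geometric_mean([2, 4, 8])
print(gm)   # approximately 4, up to floating-point rounding
```

Working with logs avoids overflow when the product of many large values would be too big to hold directly.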

Question 9 · Short answer · 5 marks

Define independent events with an example.

Two events $A$ and $B$ are independent if the occurrence of one does not affect the probability of the other, i.e.

$$P(A \cap B) = P(A)\cdot P(B).$$

Equivalently, $P(A\mid B) = P(A)$ (provided $P(B) > 0$).

Example: Tossing a fair coin twice. The outcome of the first toss (say a head) does not influence the second toss, so

$$P(\text{head then head}) = \tfrac12 \times \tfrac12 = \tfrac14.$$
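
The product rule can be confirmed by listing the sample space for two tosses. This is a small sketch that assumes the four outcomes are equally likely; the helper `prob` and the event names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses: HH, HT, TH, TT (assumed equally likely).
sample_space = list(product("HT", repeat=2))

def prob(event):
    """Classical probability: favourable outcomes / total outcomes."""
    favourable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favourable), len(sample_space))

p_a = prob(lambda o: o[0] == "H")                     # A: head on the first toss
p_b = prob(lambda o: o[1] == "H")                     # B: head on the second toss
p_ab = prob(lambda o: o[0] == "H" and o[1] == "H")    # A and B together

print(p_a, p_b, p_ab, p_ab == p_a * p_b)
```

Here $P(A) = P(B) = \tfrac12$ and $P(A \cap B) = \tfrac14$, so the equality holds and the events are independent.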

Question 10 · Short answer · 5 marks

What is an ogive curve?

An ogive is the graph of a cumulative frequency distribution — a smooth curve obtained by plotting cumulative frequencies (Y-axis) against the upper or lower class boundaries (X-axis).

  • Less-than ogive: plot less-than cumulative frequencies against upper boundaries; the curve rises to the right.
  • More-than ogive: plot more-than cumulative frequencies against lower boundaries; the curve falls to the right.

The abscissa of the point where the two ogives intersect gives the median; quartiles, deciles and percentiles can also be read from an ogive.
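
The intersection can also be located numerically. The sketch below treats both ogives as straight-line segments between class boundaries (an assumption, since a hand-drawn ogive is a smooth curve) and interpolates within the class where they cross. The function name `median_from_ogives` and the sample table are made up for illustration.

```python
def median_from_ogives(boundaries, freqs):
    """Find the X-coordinate where the less-than and more-than ogives intersect.

    `boundaries` has one more entry than `freqs` (all class boundaries in order).
    """
    total = sum(freqs)
    less_than, running = [0], 0
    for f in freqs:
        running += f
        less_than.append(running)                 # cumulative count below each boundary
    more_than = [total - c for c in less_than]    # count at or above each boundary

    for j in range(len(freqs)):
        d0 = less_than[j] - more_than[j]          # gap between the ogives at the left boundary
        d1 = less_than[j + 1] - more_than[j + 1]  # gap at the right boundary
        if d0 <= 0 <= d1 and d1 > d0:             # the ogives cross inside class j
            t = -d0 / (d1 - d0)
            return boundaries[j] + t * (boundaries[j + 1] - boundaries[j])
    raise ValueError("ogives do not intersect")

boundaries = [0, 10, 20, 30, 40, 50]
freqs = [5, 8, 12, 10, 5]
median = median_from_ogives(boundaries, freqs)
print(round(median, 4))
```

Under this straight-line assumption the result coincides with the interpolation formula for the median of grouped data, $L + \frac{N/2 - C}{f}\times h$, which for this table is $20 + \frac{20 - 13}{12}\times 10$.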

Question 11 · Short answer · 5 marks

Find the mean of the first 10 natural numbers.

The first 10 natural numbers are $1,2,3,\dots,10$.

$$\sum x = \frac{n(n+1)}{2} = \frac{10\times 11}{2} = 55.$$

$$\bar{x} = \frac{\sum x}{n} = \frac{55}{10} = \boxed{5.5}.$$
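
As a quick check, the closed-form sum can be compared with a direct summation. The function name `mean_first_n` is hypothetical.

```python
def mean_first_n(n):
    """Mean of 1..n using the closed form n(n+1)/2 for the sum."""
    total = n * (n + 1) // 2
    return total / n

direct = sum(range(1, 11)) / 10      # add the numbers 1..10 one by one
print(mean_first_n(10), direct)      # both 5.5
```

In general the mean of the first $n$ natural numbers simplifies to $\frac{n+1}{2}$.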

Question 12 · Short answer · 5 marks

Define a discrete and continuous variable.

Discrete variable — a quantitative variable that can take only isolated, countable values (usually whole numbers), with gaps between possible values. Example: number of students in a class, number of cars (you cannot have 2.5 cars).

Continuous variable — a quantitative variable that can take any value within a given range (including fractions and decimals), limited only by measuring precision. Example: height, weight, temperature, or time.

In short, discrete data are obtained by counting while continuous data are obtained by measuring.

Frequently asked questions

Where can I find the BSc CSIT (TU) Statistics I (BSc CSIT, STA164) question paper 2077?
The full BSc CSIT (TU) Statistics I (BSc CSIT, STA164) 2077 (regular) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.
Does the Statistics I (BSc CSIT, STA164) 2077 paper come with solutions?
Yes. Every question on this Statistics I (BSc CSIT, STA164) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.
How many marks is the BSc CSIT (TU) Statistics I (BSc CSIT, STA164) 2077 paper?
The BSc CSIT (TU) Statistics I (BSc CSIT, STA164) 2077 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.
Is practising this Statistics I (BSc CSIT, STA164) past paper free?
Yes — reading and attempting this Statistics I (BSc CSIT, STA164) past paper on Kekkei is completely free.