BSc CSIT (TU) Science Statistics I (BSc CSIT, STA164) Question Paper 2078 Nepal

Q: Where can I find the BSc CSIT (TU) Statistics I (BSc CSIT, STA164) question paper 2078?

The full BSc CSIT (TU) Statistics I (BSc CSIT, STA164) 2078 (Regular, annual) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.

Q: Does the Statistics I (BSc CSIT, STA164) 2078 paper come with solutions?

Yes. Every question on this Statistics I (BSc CSIT, STA164) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.

Q: How many marks is the BSc CSIT (TU) Statistics I (BSc CSIT, STA164) 2078 paper?

The BSc CSIT (TU) Statistics I (BSc CSIT, STA164) 2078 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.

Q: Is practising this Statistics I (BSc CSIT, STA164) past paper free?

Yes. Reading and attempting this Statistics I (BSc CSIT, STA164) past paper on Kekkei is completely free.

Question

1. Long answer (10 marks)

Explain the measures of central tendency. Compute mean, median and mode for the given grouped frequency distribution and establish the empirical relationship among them.


Answer 1

Measures of Central Tendency

A measure of central tendency is a single value that represents the centre or typical value of a data set. The three principal measures are:

Mean ( $\bar{x}$ ): the arithmetic average of all observations.
Median ( $M$ ): the middle value when data are arranged in order.
Mode ( $Z$ ): the value that occurs most frequently.

Formulae for Grouped Data

Mean: $\bar{x} = \dfrac{\sum f_i x_i}{\sum f_i}$ , where $x_i$ is the mid-value of each class.

Median: $M = L + \dfrac{\frac{N}{2} - C}{f}\times h$ , where $L$ = lower boundary of the median class, $N = \sum f$ , $C$ = cumulative frequency before the median class, $f$ = frequency of the median class, $h$ = class width.

Mode: $Z = L + \dfrac{f_1 - f_0}{2f_1 - f_0 - f_2}\times h$ , where $f_1$ = frequency of the modal class, $f_0$ and $f_2$ are the frequencies of the preceding and following classes.

Worked Example

Consider the distribution:

Class	0–10	10–20	20–30	30–40	40–50
$f$	5	8	15	9	3

Here $N = 40$ .

Mean: mid-values $x = 5, 15, 25, 35, 45$ .

$$\sum f x = 25 + 120 + 375 + 315 + 135 = 970,\quad \bar{x} = \frac{970}{40} = 24.25.$$

Median: $N/2 = 20$ . Cumulative frequencies: $5, 13, 28, 37, 40$ . The median class is 20–30 ( $L = 20$ , $C = 13$ , $f = 15$ , $h = 10$ ).

$$M = 20 + \frac{20 - 13}{15}\times 10 = 20 + 4.67 = 24.67.$$

Mode: modal class is 20–30 ( $f_1 = 15$ , $f_0 = 8$ , $f_2 = 9$ , $L = 20$ , $h = 10$ ).

$$Z = 20 + \frac{15 - 8}{2(15) - 8 - 9}\times 10 = 20 + \frac{7}{13}\times 10 = 20 + 5.38 = 25.38.$$

Empirical Relationship

For a moderately skewed (asymmetrical) distribution:

$$\text{Mode} = 3\,\text{Median} - 2\,\text{Mean}.$$

Check: $3(24.67) - 2(24.25) = 74.01 - 48.50 = 25.51 \approx 25.38$ , which agrees closely with the computed mode. Under this relation the median lies between the mean and the mode, one-third of the way from the mean to the mode. The relation is empirical, so it holds only approximately and only for moderately skewed distributions.
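The worked example can be reproduced with a short Python sketch. The function names and the `(lower, upper, frequency)` tuple layout are illustrative choices, not part of the original solution. The sketch assumes equal-width classes and a single modal class.

```python
# Grouped data from the worked example: (lower boundary, upper boundary, frequency).
classes = [(0, 10, 5), (10, 20, 8), (20, 30, 15), (30, 40, 9), (40, 50, 3)]

def grouped_mean(classes):
    """Mean = sum(f * mid-value) / sum(f)."""
    total = sum(f for _, _, f in classes)
    return sum(f * (lo + hi) / 2 for lo, hi, f in classes) / total

def grouped_median(classes):
    """Median = L + (N/2 - C) / f * h, interpolating inside the median class."""
    half = sum(f for _, _, f in classes) / 2
    cumulative = 0  # C: cumulative frequency before the current class
    for lo, hi, f in classes:
        if cumulative + f >= half:
            return lo + (half - cumulative) / f * (hi - lo)
        cumulative += f

def grouped_mode(classes):
    """Mode = L + (f1 - f0) / (2*f1 - f0 - f2) * h for the modal class."""
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))  # first class with the highest frequency
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0               # preceding class, 0 if none
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0  # following class, 0 if none
    lo, hi, _ = classes[i]
    return lo + (f1 - f0) / (2 * f1 - f0 - f2) * (hi - lo)

mean = grouped_mean(classes)
median = grouped_median(classes)
mode = grouped_mode(classes)
empirical_mode = 3 * median - 2 * mean  # Mode = 3 Median - 2 Mean

print(round(mean, 2), round(median, 2), round(mode, 2), round(empirical_mode, 2))
```

With the unrounded median the empirical estimate is about 25.50 rather than 25.51. The small difference comes from rounding the median to 24.67 in the hand calculation.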

Answer 2

Poisson Distribution

The Poisson distribution gives the probability of a given number of independent rare events occurring in a fixed interval of time or space, when the events occur at a constant average rate. A discrete random variable $X$ follows a Poisson distribution with parameter $\lambda > 0$ if its probability mass function is

$$P(X = x) = \frac{e^{-\lambda}\lambda^{x}}{x!},\quad x = 0, 1, 2, \dots$$

where $\lambda$ is the mean (average) number of occurrences.

Properties

It is a discrete distribution defined for $x = 0, 1, 2, \dots, \infty$ .
Mean = Variance = $\lambda$ (a characteristic property).
It has a single parameter $\lambda$ .
It is the limiting form of the binomial distribution when $n \to \infty$ , $p \to 0$ , with $np = \lambda$ finite.
It is positively skewed; skewness $= 1/\sqrt{\lambda}$ and kurtosis $= 3 + 1/\lambda$ , both decreasing as $\lambda$ increases.
The sum of independent Poisson variates is also Poisson (additive property).

Applications

Number of accidents, telephone calls, or customer arrivals per unit time.
Number of printing errors per page or defects per unit length.
Number of radioactive decay emissions; arrivals in queueing theory.

Numerical Solution

Given mean $\lambda = 2$ , we have $e^{-2} \approx 0.1353$ .

$$P(X = 0) = \frac{e^{-2}\,2^{0}}{0!} = e^{-2} = 0.1353.$$

$$P(X = 1) = \frac{e^{-2}\,2^{1}}{1!} = 2e^{-2} = 0.2707.$$

$$P(X \geq 2) = 1 - P(X=0) - P(X=1) = 1 - 0.1353 - 0.2707 = 0.5940.$$

Results: $P(0) = 0.1353$ , $P(1) = 0.2707$ , $P(\geq 2) = 0.5940$ .
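The same three probabilities can be checked numerically. This is a minimal sketch using only the standard library. The helper name `poisson_pmf` is an illustrative choice.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) = e^(-lam) * lam^x / x! for x = 0, 1, 2, ..."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

lam = 2
p0 = poisson_pmf(0, lam)
p1 = poisson_pmf(1, lam)
p_at_least_2 = 1 - p0 - p1  # complement rule: 1 - P(0) - P(1)

print(f"{p0:.4f} {p1:.4f} {p_at_least_2:.4f}")
```

Rounded to four decimal places, the values match the hand calculation above.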

Answer 3

Regression

Regression is a statistical method that estimates the average relationship between two (or more) variables and is used to predict the value of a dependent variable from a known value of an independent variable. The line that gives the best estimate is the line of regression, fitted by the principle of least squares (minimising the sum of squared deviations of observed points from the line).

Derivation of the Regression Line of Y on X

Let the line be $Y = a + bX$ . By least squares we minimise $\sum (Y - a - bX)^2$ . Setting the partial derivatives to zero gives the normal equations:

$$\sum Y = na + b\sum X,$$

$$\sum XY = a\sum X + b\sum X^2.$$

Solving for $b$ gives the regression coefficient of $Y$ on $X$ :

$$b_{yx} = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - (\sum X)^2} = \frac{\text{Cov}(X,Y)}{\sigma_x^2} = r\frac{\sigma_y}{\sigma_x}.$$

The line of regression of Y on X is therefore

$$Y - \bar{Y} = b_{yx}(X - \bar{X}).$$

Derivation of the Regression Line of X on Y

By symmetry, taking $X = a' + b'Y$ and minimising $\sum (X - a' - b'Y)^2$ gives

$$b_{xy} = \frac{n\sum XY - \sum X \sum Y}{n\sum Y^2 - (\sum Y)^2} = \frac{\text{Cov}(X,Y)}{\sigma_y^2} = r\frac{\sigma_x}{\sigma_y}.$$

The line of regression of X on Y is

$$X - \bar{X} = b_{xy}(Y - \bar{Y}).$$

Properties of the Regression Coefficients

The correlation coefficient is the geometric mean of the two regression coefficients: $r = \pm\sqrt{b_{yx}\cdot b_{xy}}$ .
Both regression coefficients have the same sign, which is also the sign of $r$ .
If one regression coefficient is greater than 1 in absolute value, the other must be less than 1 in absolute value (since their product is $r^2 \leq 1$ ).
The arithmetic mean of the two coefficients is at least $r$ in absolute value: $\left|\frac{b_{yx} + b_{xy}}{2}\right| \geq |r|$ .
Regression coefficients are independent of the change of origin but not of scale.
The two regression lines intersect at the means $(\bar{X}, \bar{Y})$ .
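The formulas and properties above can be checked on a small data set. The five `(X, Y)` pairs below are hypothetical values chosen only for illustration. The sketch computes both regression coefficients from the sums in the normal equations and compares their product with $r^2$ .

```python
import math

# Hypothetical paired observations, used only to illustrate the formulas.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

n = len(X)
sum_x, sum_y = sum(X), sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_x2 = sum(x * x for x in X)
sum_y2 = sum(y * y for y in Y)

# Regression coefficients obtained by solving the normal equations.
b_yx = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b_xy = (n * sum_xy - sum_x * sum_y) / (n * sum_y2 - sum_y ** 2)

# Pearson correlation coefficient computed directly, for comparison.
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

x_bar, y_bar = sum_x / n, sum_y / n

def predict_y(x):
    """Line of regression of Y on X: Y - y_bar = b_yx * (x - x_bar)."""
    return y_bar + b_yx * (x - x_bar)

print(b_yx, b_xy, round(r, 4), round(b_yx * b_xy, 4), round(r ** 2, 4))
```

For these values $b_{yx} = 0.6$ and $b_{xy} = 1.0$ , so their product equals $r^2 = 0.6$ , and the fitted line passes through the point of means $(3, 4)$ .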

Answer 4

The harmonic mean (HM) of $n$ observations is the reciprocal of the arithmetic mean of the reciprocals of the values:

$$\text{HM} = \frac{n}{\sum_{i=1}^{n}\frac{1}{x_i}}.$$

For a frequency distribution, $\text{HM} = \dfrac{N}{\sum (f_i/x_i)}$ , where $N = \sum f_i$ . It is the appropriate average for rates and ratios (e.g. average speed over equal distances, price per unit) and gives more weight to smaller values.
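A short sketch of the average-speed case, assuming a hypothetical trip in which equal distances are covered at 40 km/h and 60 km/h. The function name is illustrative.

```python
def harmonic_mean(values):
    """HM = n / sum(1 / x_i); defined here for positive values only."""
    if any(v <= 0 for v in values):
        raise ValueError("harmonic mean requires positive values")
    return len(values) / sum(1 / v for v in values)

# Hypothetical trip: equal distances covered at 40 km/h and 60 km/h.
speeds = [40, 60]
average_speed = harmonic_mean(speeds)        # about 48 km/h
arithmetic_mean = sum(speeds) / len(speeds)  # 50 km/h

print(round(average_speed, 2), arithmetic_mean)
```

The harmonic mean (about 48 km/h) is below the arithmetic mean (50 km/h) because more time is spent travelling at the slower speed, which is the sense in which HM gives more weight to smaller values.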

Answer 5

For a moderately skewed (asymmetrical) distribution, the empirical relationship among the three measures of central tendency is

$$\text{Mode} = 3\,\text{Median} - 2\,\text{Mean}.$$

Equivalently, $\text{Mean} - \text{Mode} = 3(\text{Mean} - \text{Median})$ . Under this relation the median lies between the mean and the mode, dividing the distance from the mean to the mode in the ratio $1:2$ . For a perfectly symmetrical distribution, Mean = Median = Mode.

Answer 6

Properties of the Poisson Distribution

It is a discrete distribution with pmf $P(X=x) = \dfrac{e^{-\lambda}\lambda^x}{x!}$ , $x = 0,1,2,\dots$
Mean = Variance = $\lambda$ .
It has only one parameter, $\lambda$ (= $np$ in the binomial limit).
It is the limiting case of the binomial distribution when $n \to \infty$ , $p \to 0$ with $np = \lambda$ fixed.
It is positively skewed ( $\beta_1 = 1/\lambda$ ); the skewness decreases as $\lambda$ increases, approaching normality for large $\lambda$ .
Additive property: the sum of independent Poisson variates with parameters $\lambda_1, \lambda_2$ is Poisson with parameter $\lambda_1 + \lambda_2$ .
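Two of these properties can be illustrated numerically: the binomial limit and the equality of mean and variance. The sketch assumes $\lambda = 2$ and $n = 1000$ purely as example values. The mean and variance are approximated by truncating the infinite sum at $x = 60$ , where the remaining tail is negligible.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial distribution with n trials and success probability p."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """P(X = x) = e^(-lam) * lam^x / x!"""
    return math.exp(-lam) * lam ** x / math.factorial(x)

lam = 2
n = 1000     # assumed large number of trials
p = lam / n  # small success probability, chosen so that n * p = lam

# Largest gap between the two pmfs over x = 0..10.
max_gap = max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(11))

# Mean and variance of the Poisson pmf from a truncated sum.
xs = range(61)
mean = sum(x * poisson_pmf(x, lam) for x in xs)
variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in xs)

print(max_gap, mean, variance)
```

The gap between the binomial and Poisson probabilities is small for these values and shrinks further as $n$ grows with $np$ held fixed.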

Answer 7

A line of regression is the straight line that gives the best estimate (in the least-squares sense) of one variable for a given value of the other, by minimising the sum of squared deviations of the observed points from the line. For two variables $X$ and $Y$ there are two such lines:

Regression of Y on X: $Y - \bar{Y} = b_{yx}(X - \bar{X})$ , with $b_{yx} = r\dfrac{\sigma_y}{\sigma_x}$ , used to estimate $Y$ from $X$ .
Regression of X on Y: $X - \bar{X} = b_{xy}(Y - \bar{Y})$ , with $b_{xy} = r\dfrac{\sigma_x}{\sigma_y}$ , used to estimate $X$ from $Y$ .

Both lines pass through the point of means $(\bar{X}, \bar{Y})$ and coincide only when $r = \pm 1$ .

Answer 8

Conditional probability is the probability of an event $A$ occurring given that another event $B$ has already occurred (with $P(B) > 0$ ). It is defined as

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}.$$

It restricts the sample space to those outcomes in which $B$ occurs. From this, the multiplication rule follows: $P(A \cap B) = P(B)\,P(A\mid B)$ . If $A$ and $B$ are independent, then $P(A\mid B) = P(A)$ .

Example: drawing two cards without replacement, $P(\text{2nd is King} \mid \text{1st is King}) = \dfrac{3}{51}$ .
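The card example can be verified by brute-force enumeration, applying the definition $P(A \mid B) = P(A \cap B)/P(B)$ directly. The deck representation below is an illustrative choice. Exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction
from itertools import permutations

# A standard 52-card deck as (rank, suit) pairs.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]

# All equally likely ordered draws of two cards without replacement.
draws = list(permutations(deck, 2))

first_king = [d for d in draws if d[0][0] == "K"]       # event B: 1st card is a King
both_kings = [d for d in first_king if d[1][0] == "K"]  # event A and B: both are Kings

p_b = Fraction(len(first_king), len(draws))
p_a_and_b = Fraction(len(both_kings), len(draws))
p_a_given_b = p_a_and_b / p_b

print(p_a_given_b)  # 3/51 in lowest terms is 1/17
```

Restricting the count to draws in which the first card is a King is exactly the restriction of the sample space to outcomes in which $B$ occurs.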

Answer 9

Mean deviation (MD) is a measure of dispersion equal to the arithmetic mean of the absolute deviations of the observations from a central value (usually the mean or median).

About the mean: $\text{MD}_{\bar{x}} = \dfrac{\sum |x_i - \bar{x}|}{n}$ (for grouped data, $\dfrac{\sum f_i |x_i - \bar{x}|}{N}$ ).

About the median: $\text{MD}_{M} = \dfrac{\sum |x_i - M|}{n}$ .

Absolute values are used so that positive and negative deviations do not cancel. Mean deviation is least when taken about the median. The coefficient of mean deviation = $\dfrac{\text{MD}}{\text{central value}}$ .
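A small sketch comparing the mean deviation about the mean with that about the median. The five observations are hypothetical and include one large value, so that the two centres differ.

```python
import statistics

def mean_deviation(data, centre):
    """Arithmetic mean of the absolute deviations from `centre`."""
    return sum(abs(x - centre) for x in data) / len(data)

# Hypothetical observations with one large value.
data = [2, 4, 6, 8, 30]

mean = statistics.mean(data)      # 10
median = statistics.median(data)  # 6

md_about_mean = mean_deviation(data, mean)      # (8 + 6 + 4 + 2 + 20) / 5
md_about_median = mean_deviation(data, median)  # (4 + 2 + 0 + 2 + 24) / 5

print(md_about_mean, md_about_median)
```

Here the mean deviation about the median (6.4) is smaller than that about the mean (8.0), consistent with the minimal property stated above.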

Answer 10

A probability mass function (pmf) gives the probability that a discrete random variable $X$ takes each of its possible values. If $X$ takes values $x_1, x_2, \dots$ , the pmf is

$$p(x_i) = P(X = x_i),$$

and it must satisfy two conditions:

$p(x_i) \geq 0$ for all $i$ (non-negativity), and
$\sum_i p(x_i) = 1$ (total probability is one).

Example: for a fair die, $p(x) = \tfrac{1}{6}$ for $x = 1,2,\dots,6$ . The pmf is the discrete counterpart of the probability density function used for continuous variables.
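The two conditions can be checked mechanically. In this sketch a pmf is stored as a dictionary from values to probabilities. Both that layout and the name `is_valid_pmf` are illustrative assumptions. Exact fractions avoid floating-point rounding when testing that the total equals one.

```python
from fractions import Fraction

def is_valid_pmf(pmf):
    """Check non-negativity and that the probabilities sum to exactly one."""
    return all(p >= 0 for p in pmf.values()) and sum(pmf.values()) == 1

# Fair die: p(x) = 1/6 for x = 1, ..., 6.
die = {x: Fraction(1, 6) for x in range(1, 7)}

# Hypothetical counterexample: probabilities sum to 5/6, so this is not a pmf.
not_a_pmf = {0: Fraction(1, 2), 1: Fraction(1, 3)}

print(is_valid_pmf(die), is_valid_pmf(not_a_pmf))
```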

Answer 11

Properties of a Good Measure of Central Tendency

According to Yule and Kendall, an ideal average should:

Be rigidly (clearly) defined by a mathematical formula.
Be based on all the observations of the data.
Be easy to understand and simple to compute.
Be least affected by sampling fluctuations (sampling stability).
Be suitable for further algebraic / mathematical treatment.
Not be unduly affected by extreme values (outliers).

The arithmetic mean satisfies all of these except the last, since it is affected by extreme values.

Answer 12

The mode is the value that occurs most frequently. Arranging the data $2, 3, 3, 5, 7, 3, 8$ by frequency:

$3$ occurs three times,
every other value occurs once.

Since $3$ has the highest frequency, the mode = 3.
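The frequency count can be reproduced with `collections.Counter` from the Python standard library. Collecting every value that attains the highest frequency is a design choice made here, so that data with several modes would also be handled.

```python
from collections import Counter

data = [2, 3, 3, 5, 7, 3, 8]

counts = Counter(data)          # maps each value to its frequency
highest = max(counts.values())  # the largest frequency
modes = [value for value, freq in counts.items() if freq == highest]

print(counts[3], modes)
```

For this data set the list contains the single value 3, which occurs three times.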

Level	BSc CSIT (TU)
Stream	Science
Subject	Statistics I (BSc CSIT, STA164)
Year	2078 BS
Exam session	Regular (annual)
Full marks	60
Time allowed	180 minutes
Questions	12, all with step-by-step solutions