BSc CSIT (TU) Science Statistics II (BSc CSIT, STA210) Question Paper 2081 Nepal

Q: Where can I find the BSc CSIT (TU) Statistics II (BSc CSIT, STA210) question paper 2081?

The full BSc CSIT (TU) Statistics II (BSc CSIT, STA210) 2081 Regular (annual) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.

Q: Does the Statistics II (BSc CSIT, STA210) 2081 paper come with solutions?

Yes. Every question on this Statistics II (BSc CSIT, STA210) past paper includes a step-by-step solution, plus instant AI feedback when you attempt it on Kekkei.

Q: How many marks is the BSc CSIT (TU) Statistics II (BSc CSIT, STA210) 2081 paper?

The BSc CSIT (TU) Statistics II (BSc CSIT, STA210) 2081 paper carries 60 full marks and is meant to be completed in 180 minutes, across 12 questions.

Q: Is practising this Statistics II (BSc CSIT, STA210) past paper free?

Yes — reading and attempting this Statistics II (BSc CSIT, STA210) past paper on Kekkei is completely free.

Question

1. Long answer (10 marks)

What is sampling? Explain different methods of probability and non-probability sampling with their merits and demerits.

Answer 1

Sampling

Sampling is the process of selecting a representative subset (the sample) from a larger group (the population) so that conclusions about the whole population can be drawn by studying only the sample. It saves cost, time and effort and is often the only feasible approach when the population is large or testing is destructive.

Sampling methods are broadly classified into probability and non-probability sampling.

A. Probability Sampling

Every unit of the population has a known, non-zero chance of selection. Results can be generalised and sampling error can be estimated.

1. Simple Random Sampling (SRS): Each unit has an equal chance of selection (lottery method or random numbers).

Merits: unbiased, simple, error measurable.
Demerits: needs a complete sampling frame; units may be geographically scattered.

2. Stratified Sampling: The population is divided into internally homogeneous strata and a random sample is drawn from each.

Merits: high precision, ensures representation of every subgroup.
Demerits: requires prior knowledge of strata; complex to organise.

3. Systematic Sampling: Every $k^{th}$ unit is selected after a random start, where $k = N/n$ is the sampling interval.

Merits: simple, evenly spread over the frame.
Demerits: biased if the list has a periodic/hidden pattern.

4. Cluster / Multistage Sampling: The population is divided into clusters; some clusters are selected at random and studied (fully or in further stages).

Merits: economical for wide geographical areas, no full frame needed.
Demerits: lower precision than SRS of the same size when units within a cluster are alike (high intra-cluster correlation).
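The selection rules for the first three probability methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the frame of 100 unit labels, the "urban"/"rural" strata, the 10% sampling fraction and the seed are all hypothetical and do not come from the question paper.

```python
import random

def simple_random_sample(frame, n, rng):
    """SRS without replacement: every unit has the same chance of selection."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Take every k-th unit after a random start, with k = N // n."""
    k = len(frame) // n
    start = rng.randrange(k)              # random start among the first k units
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, rng):
    """Draw an SRS of the same fraction from each stratum (proportional allocation)."""
    sample = []
    for units in strata.values():
        size = max(1, round(len(units) * fraction))
        sample.extend(rng.sample(units, size))
    return sample

rng = random.Random(2081)                 # fixed seed so the run is repeatable
population = list(range(1, 101))          # hypothetical frame, N = 100
srs = simple_random_sample(population, 10, rng)
sys_sample = systematic_sample(population, 10, rng)
strata = {"urban": population[:60], "rural": population[60:]}  # hypothetical strata
strat = stratified_sample(strata, 0.10, rng)

print(srs, sys_sample, strat, sep="\n")
```

With $N = 100$ and $n = 10$ the sampling interval is $k = 10$, so consecutive units in the systematic sample are always 10 apart, and the stratified sample contains 6 urban and 4 rural units.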

B. Non-Probability Sampling

Selection is based on judgement/convenience; probability of selection is unknown and sampling error cannot be measured.

1. Convenience Sampling — units that are easiest to reach are chosen. Merit: fast, cheap. Demerit: highly biased, not representative.

2. Judgement (Purposive) Sampling — expert selects typical units. Merit: useful for small/specialised studies. Demerit: depends on investigator's bias.

3. Quota Sampling — units chosen until fixed quotas per category are met. Merit: quick, ensures group coverage. Demerit: selection within quota is biased.

4. Snowball Sampling — existing subjects refer further subjects. Merit: good for rare/hidden populations. Demerit: sample not representative.

Conclusion

Probability sampling allows valid statistical inference and is preferred for scientific surveys, whereas non-probability sampling is cheaper and quicker but cannot guarantee representativeness.

Answer 2

Analysis of Variance (ANOVA)

ANOVA is a statistical technique (developed by R. A. Fisher) used to test whether the means of three or more populations are equal by partitioning the total variation in the data into components attributable to different sources. It compares the variance between groups with the variance within groups using the F-test.

Assumptions: observations are independent, drawn from normal populations, and the populations have equal variances (homoscedasticity).

One-Way ANOVA Procedure

Used when one factor (with $k$ levels/treatments) classifies the data.

Step 1: Hypotheses

$$H_0: \mu_1 = \mu_2 = \dots = \mu_k \quad (\text{all means equal})$$

$$H_1: \text{at least one mean differs}$$

Step 2: Compute totals. Let $T$ = grand total of all $N$ observations and $C = \dfrac{T^2}{N}$ (correction factor).

Step 3: Sum of squares

$$SST = \sum x_{ij}^2 - C \quad(\text{Total})$$

$$SSB = \sum_{i=1}^{k} \frac{T_i^2}{n_i} - C \quad(\text{Between treatments})$$

$$SSE = SST - SSB \quad(\text{Within / Error})$$

where $T_i$ and $n_i$ are the total and size of the $i^{th}$ group.

Step 4: Degrees of freedom. Between = $k-1$, Within = $N-k$, Total = $N-1$.

Step 5: Mean squares and F-ratio

$$MSB = \frac{SSB}{k-1}, \qquad MSE = \frac{SSE}{N-k}, \qquad F = \frac{MSB}{MSE}$$

ANOVA Table

Source of Variation	Sum of Squares	d.f.	Mean Square	F-ratio
Between treatments	$SSB$	$k-1$	$MSB = SSB/(k-1)$	$MSB/MSE$
Within (Error)	$SSE$	$N-k$	$MSE = SSE/(N-k)$
Total	$SST$	$N-1$

Step 6: Decision. Compare the calculated $F$ with the table value $F_{\alpha,(k-1,N-k)}$. If $F_{cal} > F_{tab}$, reject $H_0$ and conclude that the treatment means differ significantly; otherwise do not reject $H_0$.
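The correction-factor computation can be checked numerically. The sketch below follows the sums of squares and the F-ratio defined in the procedure; the three groups of observations are made-up illustrative data, and the table value of $F$ still has to be looked up separately.

```python
def one_way_anova(groups):
    """Return (SSB, SSE, SST, F) using the correction-factor formulas."""
    all_obs = [x for g in groups for x in g]
    N, k = len(all_obs), len(groups)
    T = sum(all_obs)                       # grand total
    C = T ** 2 / N                         # correction factor
    SST = sum(x ** 2 for x in all_obs) - C
    SSB = sum(sum(g) ** 2 / len(g) for g in groups) - C
    SSE = SST - SSB
    MSB = SSB / (k - 1)                    # between-treatments mean square
    MSE = SSE / (N - k)                    # within (error) mean square
    return SSB, SSE, SST, MSB / MSE

# Hypothetical data: three treatments, three observations each.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
SSB, SSE, SST, F = one_way_anova(groups)
print(SSB, SSE, SST, F)   # 24.0 6.0 30.0 12.0
```

Here $T = 63$, $C = 441$, and the calculated $F = 12$ has $(2, 6)$ degrees of freedom; it would be compared with the tabulated $F_{\alpha,(2,6)}$.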

Answer 3

Theory of Estimation

Estimation is the branch of statistical inference concerned with using sample data to estimate the unknown value of a population parameter (such as $\mu$, $\sigma^2$ or $P$). The sample statistic used for this purpose is called an estimator, and a particular numerical value of it is an estimate. Estimation is of two types: point estimation and interval estimation.

Point Estimation vs Interval Estimation

Basis	Point Estimation	Interval Estimation
Meaning	Gives a single value as the estimate of the parameter	Gives a range (interval) within which the parameter is expected to lie
Example	$\bar{x}$ estimates $\mu$ ; $s^2$ estimates $\sigma^2$	$\bar{x} \pm Z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$ for $\mu$
Reliability	Does not indicate the error/precision	Carries a confidence level (e.g. 95%) and shows precision
Form	Number	Interval $(L, U)$

A point estimate by itself tells nothing about how close it is to the true value, whereas an interval estimate quantifies the uncertainty.

Properties of a Good Estimator

1. Unbiasedness: The expected value of the estimator equals the parameter: $E(\hat{\theta}) = \theta$ (e.g. $E(\bar{x}) = \mu$).

2. Consistency: As the sample size $n \to \infty$, the estimator converges in probability to the true parameter: $\hat{\theta} \xrightarrow{P} \theta$.

3. Efficiency: Among all unbiased estimators, the efficient one has the smallest variance, giving the most precise estimate.

4. Sufficiency: The estimator uses all the information in the sample about the parameter, so no other statistic can add more information.

An estimator that is unbiased, consistent, efficient and sufficient is regarded as the best (ideal) estimator.
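Unbiasedness can be verified exactly on a tiny example by listing every possible sample. The sketch below assumes a hypothetical population of four values and samples of size 2 drawn with replacement; it shows that $\bar{x}$ and the variance with divisor $n-1$ average out to the true parameters, while the variance with divisor $n$ does not.

```python
from itertools import product

def variance_with_divisor(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / divisor

population = [2, 4, 6, 8]                       # hypothetical population
N = len(population)
mu = sum(population) / N                        # population mean
sigma2 = sum((x - mu) ** 2 for x in population) / N   # population variance

n = 2
samples = list(product(population, repeat=n))   # all 16 samples with replacement

mean_of_xbar = sum(sum(s) / n for s in samples) / len(samples)
mean_s2_unbiased = sum(variance_with_divisor(s, n - 1) for s in samples) / len(samples)
mean_s2_biased = sum(variance_with_divisor(s, n) for s in samples) / len(samples)

print(mu, mean_of_xbar)                            # 5.0 5.0
print(sigma2, mean_s2_unbiased, mean_s2_biased)    # 5.0 5.0 2.5
```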

Answer 4

Karl Pearson's Coefficient of Correlation

Karl Pearson's coefficient of correlation, denoted $r$, is a numerical measure of the degree and direction of linear relationship between two quantitative variables $X$ and $Y$. It is defined as the ratio of the covariance of the variables to the product of their standard deviations:

$$r = \frac{\text{Cov}(X,Y)}{\sigma_X\,\sigma_Y} = \frac{\sum (X-\bar{X})(Y-\bar{Y})}{\sqrt{\sum (X-\bar{X})^2}\,\sqrt{\sum (Y-\bar{Y})^2}}$$

Properties

Range: $-1 \le r \le +1$. $r=+1$ means perfect positive, $r=-1$ perfect negative, and $r=0$ no linear correlation.
Sign: indicates the direction, positive ($X$ and $Y$ move together) or negative (they move in opposite directions).
Independent of change of origin and scale (unit-free): $r$ is unchanged if a constant is added to every value of a variable, or if every value is multiplied by a positive constant.
Symmetric: $r_{XY} = r_{YX}$.
Geometric mean of regression coefficients: $r = \pm\sqrt{b_{xy}\cdot b_{yx}}$, taking the common sign of the two regression coefficients.
It measures only linear association; $r=0$ does not imply the variables are unrelated (a non-linear relation may exist).
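The deviation form of the formula and several of the properties listed above can be checked with a short function. The data values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's r from deviations about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]          # hypothetical data
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)          # 6 / sqrt(10 * 6), about 0.7746

# Change of origin and (positive) scale leaves r unchanged.
r_shifted = pearson_r([10 * x + 3 for x in xs], ys)

# Symmetry: r_XY equals r_YX.
r_swapped = pearson_r(ys, xs)

# A perfect non-linear relation (y = x^2 on symmetric x) still gives r = 0.
r_quadratic = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(round(r, 4), round(r_shifted, 4), round(r_swapped, 4), r_quadratic)
```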

Answer 5

Regression Coefficients

In the two linear regression equations of $Y$ on $X$ and $X$ on $Y$, the slopes are called the regression coefficients. They measure the average change in one variable for a unit change in the other.

Regression coefficient of $Y$ on $X$: $b_{yx} = r\dfrac{\sigma_y}{\sigma_x} = \dfrac{\text{Cov}(X,Y)}{\sigma_x^2}$
Regression coefficient of $X$ on $Y$: $b_{xy} = r\dfrac{\sigma_x}{\sigma_y} = \dfrac{\text{Cov}(X,Y)}{\sigma_y^2}$

Properties

Geometric mean gives correlation: $r = \pm\sqrt{b_{yx}\cdot b_{xy}}$, the sign being that of the regression coefficients.
Same sign: both $b_{yx}$ and $b_{xy}$ have the same sign, which is also the sign of $r$.
Product $\le 1$: since $r^2 = b_{yx}\,b_{xy} \le 1$, the product of the two regression coefficients cannot exceed unity; hence if one exceeds 1 in magnitude, the other must be less than 1 in magnitude.
Independent of origin but not of scale: adding a constant to the values does not change them, but changing the scale does.
Arithmetic mean: the arithmetic mean of the two regression coefficients is greater than or equal to the correlation coefficient $r$ (when $r>0$).
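A small numerical check of these properties, as a sketch on hypothetical data: the coefficients are computed as $\text{Cov}(X,Y)/\sigma_x^2$ and $\text{Cov}(X,Y)/\sigma_y^2$ (the common factor $1/n$ cancels), and $r$ is then recovered as their geometric mean.

```python
from math import sqrt

def regression_coefficients(xs, ys):
    """Return (b_yx, b_xy): slopes of Y on X and of X on Y."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sxx, sxy / syy

xs = [1, 2, 3, 4, 5]          # hypothetical data
ys = [2, 4, 5, 4, 5]
b_yx, b_xy = regression_coefficients(xs, ys)      # 0.6 and 1.0

# r is the geometric mean, carrying the common sign of the coefficients.
sign = 1 if b_yx >= 0 else -1
r = sign * sqrt(b_yx * b_xy)

print(b_yx, b_xy, round(r, 4))
print(b_yx * b_xy <= 1)                # product cannot exceed unity
print((b_yx + b_xy) / 2 >= r)          # arithmetic mean >= r when r > 0
```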

Answer 6

Sampling Distribution

If all possible samples of a fixed size $n$ are drawn from a population and a statistic (e.g. the sample mean $\bar{x}$) is computed for each, the probability distribution of that statistic over all such samples is called its sampling distribution. For example, the sampling distribution of the mean has its own mean $\mu_{\bar{x}} = \mu$ and describes how sample means vary from sample to sample. It forms the theoretical basis of statistical inference (estimation and testing of hypotheses).

Standard Error

The standard error (S.E.) is the standard deviation of the sampling distribution of a statistic. It measures the magnitude of the sampling fluctuation, i.e. how much the statistic is expected to vary from the true parameter due to chance.

For the sample mean (population s.d. $\sigma$, sample size $n$):

$$\text{S.E.}(\bar{x}) = \frac{\sigma}{\sqrt{n}}$$

For a sample proportion $p$: $\text{S.E.}(p) = \sqrt{\dfrac{PQ}{n}}$, where $P$ is the population proportion and $Q = 1 - P$.

Uses of standard error:

Smaller S.E. means a more precise/reliable estimate; it decreases as $n$ increases.
It is used to set up confidence intervals and to compute test statistics in hypothesis testing.
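The definition of the standard error as the standard deviation of the sampling distribution can be illustrated by enumeration. The sketch below assumes a hypothetical population of five values and all samples of size 2 drawn with replacement, and compares the standard deviation of the 25 sample means with $\sigma/\sqrt{n}$.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [1, 3, 5, 7, 9]        # hypothetical population, mu = 5
n = 2
sigma = pstdev(population)          # population s.d. = sqrt(8)

# Sampling distribution of the mean: one mean per possible sample.
sample_means = [mean(s) for s in product(population, repeat=n)]

se_empirical = pstdev(sample_means)   # s.d. of the sampling distribution
se_formula = sigma / sqrt(n)          # sigma / sqrt(n)

print(mean(sample_means), se_empirical, se_formula)
```

Both routes give a standard error of 2 (up to floating-point rounding), and the mean of the sample means equals the population mean.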

Answer 7

Confidence Interval for a Population Mean

A confidence interval (CI) is a range of values, computed from a sample, that is expected to contain the true population mean $\mu$ with a stated probability called the confidence level $(1-\alpha)$ (e.g. 95%).

Case 1: population variance $\sigma^2$ known (or large sample, $n \ge 30$). The statistic $\dfrac{\bar{x}-\mu}{\sigma/\sqrt{n}}$ follows the standard normal distribution (exactly for a normal population, approximately for large $n$ by the central limit theorem), so the $100(1-\alpha)\%$ CI is

$$\bar{x} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

where $Z_{\alpha/2}=1.96$ for 95% and $2.58$ for 99%. (If $\sigma$ is unknown in a large sample, replace it by the sample s.d. $s$.)

Case 2: $\sigma$ unknown and small sample ($n<30$) from a normal population. Use the $t$-distribution with $n-1$ degrees of freedom:

$$\bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$$

Steps: (1) compute $\bar{x}$ and the standard error; (2) choose the confidence level and find $Z_{\alpha/2}$ or $t_{\alpha/2,\,n-1}$; (3) form the limits $\bar{x} \pm (\text{critical value})\times \text{S.E.}$

Interpretation: we are 95% confident that the interval contains $\mu$; over repeated sampling, about 95% of such intervals would capture the true mean.
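The three steps for the known-variance case can be sketched with Python's standard library, which provides the normal quantile. The numbers ($\bar{x} = 50$, $\sigma = 10$, $n = 100$) are hypothetical; the small-sample case would instead need a $t$ table or a statistics package for $t_{\alpha/2,\,n-1}$.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, level=0.95):
    """CI for the mean when sigma is known (or n is large)."""
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)   # Z_{alpha/2}, about 1.96 for 95%
    margin = z * sigma / sqrt(n)              # critical value x standard error
    return xbar - margin, xbar + margin

# Hypothetical sample summary.
low, high = z_confidence_interval(xbar=50, sigma=10, n=100)
print(round(low, 2), round(high, 2))   # 48.04 51.96
```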

Answer 8

F-Test for Equality of Two Population Variances

The F-test is used to test whether two independent normal populations have equal variances. It compares the ratio of two sample variances.

Step 1: Hypotheses

$$H_0: \sigma_1^2 = \sigma_2^2 \qquad H_1: \sigma_1^2 \neq \sigma_2^2$$

Step 2: Test statistic. From samples of sizes $n_1, n_2$ with unbiased sample variances $s_1^2$ and $s_2^2$, where $s^2 = \dfrac{\sum(x-\bar{x})^2}{n-1}$, the statistic is

$$F = \frac{s_1^2}{s_2^2}, \qquad s_1^2 > s_2^2$$

The larger sample variance is placed in the numerator (relabelling the samples if necessary) so that $F \ge 1$.

Step 3: Degrees of freedom. Numerator $= n_1 - 1$, denominator $= n_2 - 1$, where $n_1$ is the size of the sample with the larger variance.

Step 4: Decision. Compare $F_{cal}$ with the table value $F_{\alpha,(n_1-1,\,n_2-1)}$. Strictly, for the two-sided $H_1$ above with the larger variance in the numerator, the upper critical value is $F_{\alpha/2,(n_1-1,\,n_2-1)}$; the $F_{\alpha}$ value applies to the one-sided alternative $\sigma_1^2 > \sigma_2^2$.

If $F_{cal} > F_{tab}$: reject $H_0$; the population variances differ significantly.
If $F_{cal} \le F_{tab}$: do not reject $H_0$; the variances may be regarded as equal.

Assumptions: both samples are random and independent and drawn from normal populations. (This test is often used as a preliminary check before the pooled-variance t-test for two means, which assumes equal variances.)
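The calculated part of the test (Steps 2 and 3) can be sketched as follows. The two samples are hypothetical, and the function only returns $F_{cal}$ with its degrees of freedom; the critical value still comes from an F table.

```python
from statistics import variance   # unbiased sample variance, divisor n - 1

def variance_ratio(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator) with the larger variance on top."""
    va, vb = variance(sample_a), variance(sample_b)
    if va >= vb:
        return va / vb, len(sample_a) - 1, len(sample_b) - 1
    return vb / va, len(sample_b) - 1, len(sample_a) - 1

# Hypothetical samples.
a = [10, 12, 14, 16, 18]               # s^2 = 10
b = [11, 12, 13, 14, 15, 16, 17]       # s^2 = 28/6
F, df1, df2 = variance_ratio(a, b)
print(round(F, 3), df1, df2)           # 2.143 4 6
```

Swapping the two arguments gives the same result, because the function always places the larger variance in the numerator.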

Answer 9

Index Numbers

An index number is a statistical measure that expresses the relative change in the level of a variable (price, quantity, value, production, etc.) over time or place, compared with a chosen base period taken as 100. Price index numbers are widely used to measure inflation and changes in the cost of living.

Notation: $p_0, q_0$ = price and quantity in the base year; $p_1, q_1$ = price and quantity in the current year.

Laspeyres' Price Index

Uses base-year quantities $(q_0)$ as weights:

$$P_{01}^{L} = \frac{\sum p_1 q_0}{\sum p_0 q_0} \times 100$$

Merit: requires only one set of weights (base-year), so easy to compute over time. Demerit: tends to overstate the price rise (consumers shift away from costlier goods).

Paasche's Price Index

Uses current-year quantities $(q_1)$ as weights:

$$P_{01}^{P} = \frac{\sum p_1 q_1}{\sum p_0 q_1} \times 100$$

Merit: reflects the current consumption pattern. Demerit: current-year weights must be collected every period (costly); tends to understate the price rise.

Note: Fisher's ideal index is the geometric mean of the two: $P_{01}^{F} = \sqrt{P_{01}^{L}\times P_{01}^{P}}$.
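The three formulas can be compared on a small basket. The prices and quantities below are hypothetical; they are chosen so that consumers buy less of the item whose price rose most, which is the situation in which Laspeyres' index exceeds Paasche's.

```python
from math import sqrt

def laspeyres(p0, q0, p1):
    """Base-year quantities as weights."""
    return sum(a * b for a, b in zip(p1, q0)) / sum(a * b for a, b in zip(p0, q0)) * 100

def paasche(p0, p1, q1):
    """Current-year quantities as weights."""
    return sum(a * b for a, b in zip(p1, q1)) / sum(a * b for a, b in zip(p0, q1)) * 100

# Hypothetical basket of three commodities.
p0, q0 = [10, 20, 5], [5, 2, 10]     # base-year prices and quantities
p1, q1 = [15, 22, 6], [3, 3, 10]     # current-year prices and quantities

L = laspeyres(p0, q0, p1)            # 179/140 * 100
P = paasche(p0, p1, q1)              # 171/140 * 100
F = sqrt(L * P)                      # Fisher's ideal index

print(round(L, 2), round(P, 2), round(F, 2))   # 127.86 122.14 124.97
```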

Answer 10

Addition and Multiplication Theorems of Probability

1. Addition Theorem (probability of $A$ OR $B$)

For any two events $A$ and $B$:

$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$

If $A$ and $B$ are mutually exclusive (cannot occur together, $A \cap B = \varnothing$):

$$P(A \cup B) = P(A) + P(B)$$

Example: Drawing one card from a pack, $P(\text{King or Queen}) = \dfrac{4}{52} + \dfrac{4}{52} = \dfrac{8}{52} = \dfrac{2}{13}$ (mutually exclusive). For a King or a Heart (not mutually exclusive, since the King of Hearts is counted in both): $\dfrac{4}{52}+\dfrac{13}{52}-\dfrac{1}{52} = \dfrac{16}{52} = \dfrac{4}{13}$.

2. Multiplication Theorem (probability of $A$ AND $B$)

For any two events with $P(A) > 0$:

$$P(A \cap B) = P(A)\cdot P(B\mid A)$$

where $P(B\mid A)$ is the conditional probability of $B$ given $A$. If $A$ and $B$ are independent, $P(B\mid A)=P(B)$, so:

$$P(A \cap B) = P(A)\cdot P(B)$$

Example: Two cards are drawn one after another without replacement; the probability that both are aces is

$$P = \frac{4}{52}\times\frac{3}{51} = \frac{1}{221}$$

For two fair coins tossed independently, $P(\text{both heads}) = \dfrac12\times\dfrac12 = \dfrac14$.
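The card and coin examples can be confirmed by counting equally likely outcomes with exact fractions. This is only a verification sketch; the helper name `prob` and the way the deck is encoded are choices made for the illustration.

```python
from fractions import Fraction
from itertools import permutations, product

ranks = ["A"] + [str(v) for v in range(2, 11)] + ["J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(r, s) for r in ranks for s in suits]          # 52 cards

def prob(event, outcomes):
    """Classical probability: favourable outcomes / total outcomes."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

# Addition theorem
p_king_or_queen = prob(lambda c: c[0] in ("K", "Q"), deck)            # 2/13
p_king = prob(lambda c: c[0] == "K", deck)
p_heart = prob(lambda c: c[1] == "hearts", deck)
p_king_and_heart = prob(lambda c: c == ("K", "hearts"), deck)
p_king_or_heart = prob(lambda c: c[0] == "K" or c[1] == "hearts", deck)  # 4/13

# Multiplication theorem
two_cards = list(permutations(deck, 2))    # ordered draws without replacement
p_two_aces = prob(lambda d: d[0][0] == "A" and d[1][0] == "A", two_cards)  # 1/221
coins = list(product("HT", repeat=2))
p_both_heads = prob(lambda t: t == ("H", "H"), coins)                  # 1/4

print(p_king_or_queen, p_king_or_heart, p_two_aces, p_both_heads)
```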

Answer 11

Poisson Distribution

The Poisson distribution is a discrete probability distribution that gives the probability of a given number of independent events occurring in a fixed interval of time, space or volume, when the events occur at a known constant average rate $\lambda$. It is typically used for rare events (large $n$, small $p$). The probability mass function is

$$P(X = x) = \frac{e^{-\lambda}\,\lambda^{x}}{x!}, \qquad x = 0,1,2,\dots,\;\; \lambda > 0$$

where $\lambda$ is the average number of occurrences per interval. It is the limiting form of the binomial distribution as $n \to \infty$, $p \to 0$ with $np = \lambda$ held fixed.

Mean and Variance

A distinctive feature is that the mean and variance are equal:

$$\text{Mean} = \lambda, \qquad \text{Variance} = \lambda$$

Applications

Number of telephone calls arriving at an exchange per minute.
Number of printing/typing errors per page of a book.
Number of accidents on a highway or defects per unit length of cloth/wire.
Number of radioactive decays per second; arrivals in queueing (servers, network packets).
Number of customers arriving at a counter per hour.
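The probability mass function, the equality of mean and variance, and the binomial limit can all be checked numerically. In this sketch $\lambda = 2$, $n = 1000$ and $p = 0.002$ are hypothetical values, and the infinite sums are truncated at $x = 59$, where the remaining tail is negligible.

```python
from math import comb, exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) = e^{-lam} lam^x / x!"""
    return exp(-lam) * lam ** x / factorial(x)

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial(n, p) variable."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

lam = 2.0
probs = [poisson_pmf(x, lam) for x in range(60)]      # truncated support

total = sum(probs)
mean = sum(x * p for x, p in enumerate(probs))
var = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
print(round(total, 6), round(mean, 6), round(var, 6))  # 1.0 2.0 2.0

# Binomial with large n and small p (np = 2) is close to Poisson(2).
n, p = 1000, 0.002
for x in range(4):
    print(x, round(binomial_pmf(x, n, p), 4), round(poisson_pmf(x, lam), 4))
```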

Answer 12

Random Variable

A random variable is a real-valued function that assigns a numerical value to each outcome of a random experiment (sample space). It is usually denoted by capital letters $X, Y, Z$. For example, in tossing two coins, the number of heads $X$ can take the values $0, 1, 2$.

Random variables are of two types: discrete and continuous.

Discrete vs Continuous Random Variables

Basis	Discrete Random Variable	Continuous Random Variable
Values taken	Countable / isolated values (finite or countably infinite)	Any value within an interval (uncountable)
Probability described by	Probability mass function $P(X=x)$	Probability density function $f(x)$
Probability of a single point	Can be non-zero	Always zero; only $P(a\le X\le b)$ is meaningful
Summation/Integration	$\sum P(x) = 1$	$\int_{-\infty}^{\infty} f(x)\,dx = 1$
Examples	Number of heads in coin tosses, number of defective items, number of calls	Height, weight, temperature, time taken

Examples:

Discrete: the number of students present in a class (0, 1, 2, …).
Continuous: the height of a student (e.g. any value between 150 cm and 180 cm).
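The contrast between the two types can be made concrete with a short sketch. The discrete part builds the probability mass function of the number of heads in two coin tosses; the continuous part assumes, purely for illustration, that height is uniformly distributed between 150 cm and 180 cm, so that probabilities are interval lengths divided by 30.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Discrete: X = number of heads in two tosses of a fair coin.
outcomes = list(product("HT", repeat=2))               # HH, HT, TH, TT
counts = Counter(o.count("H") for o in outcomes)
pmf = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}
print(pmf)                   # P(0) = 1/4, P(1) = 1/2, P(2) = 1/4
print(sum(pmf.values()))     # 1

# Continuous: hypothetical uniform density f(x) = 1/30 on [150, 180].
def uniform_interval_probability(a, b, low=150, high=180):
    """P(a <= X <= b) as the area under the flat density between a and b."""
    return Fraction(b - a, high - low)

print(uniform_interval_probability(160, 170))   # 1/3
print(uniform_interval_probability(165, 165))   # 0, a single point has no area
```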

Level	BSc CSIT (TU)
Stream	Science
Subject	Statistics II (BSc CSIT, STA210)
Year	2081 BS
Exam session	Regular (annual)
Full marks	60
Time allowed	180 minutes
Questions	12, all with step-by-step solutions

