BSc CSIT (TU) Science B.Sc. II Year Statistics (STA201) (Model) Question Paper 2075 Nepal
This is the official BSc CSIT (TU) (Science stream) B.Sc. II Year Statistics (STA201) model question paper for 2075, with worked solutions. It carries 100 full marks and a time allowance of 180 minutes, across 26 questions.
Group A
Attempt any FOUR questions. [4 x 10 = 40]
Define Negative binomial distribution. Derive its mean and variance.
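Definition. The negative binomial distribution gives the probability that the $r$-th success occurs on the $(x+r)$-th trial in a sequence of independent Bernoulli trials with success probability $p$ (failure probability $q = 1 - p$). If $X$ is the number of failures before the $r$-th success, its p.m.f. is
$$P(X = x) = \binom{x+r-1}{r-1} p^r q^x, \qquad x = 0, 1, 2, \dots$$
Mean and variance (via the m.g.f.). The moment generating function is
$$M_X(t) = E(e^{tX}) = \sum_{x=0}^{\infty} \binom{x+r-1}{r-1} p^r (q e^t)^x = p^r (1 - q e^t)^{-r}, \qquad q e^t < 1.$$
Differentiating and evaluating at $t = 0$:
$$M_X'(t) = r p^r q e^t (1 - q e^t)^{-r-1} \;\Rightarrow\; E(X) = M_X'(0) = \frac{rq}{p},$$
$$M_X''(t) = r p^r q e^t (1 - q e^t)^{-r-1} + r(r+1) p^r q^2 e^{2t} (1 - q e^t)^{-r-2} \;\Rightarrow\; E(X^2) = M_X''(0) = \frac{rq}{p} + \frac{r(r+1) q^2}{p^2}.$$
Using $\operatorname{Var}(X) = E(X^2) - [E(X)]^2$,
$$\operatorname{Var}(X) = \frac{rq}{p} + \frac{r(r+1) q^2}{p^2} - \frac{r^2 q^2}{p^2} = \frac{rq}{p} + \frac{r q^2}{p^2} = \frac{rq(p + q)}{p^2} = \frac{rq}{p^2}.$$
Thus Mean $= rq/p$ and Variance $= rq/p^2$. Note that variance $>$ mean (over-dispersion), since $0 < p < 1$.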
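As a numerical sanity check, the sketch below sums the negative binomial p.m.f. directly and compares the result with the standard closed forms, mean $rq/p$ and variance $rq/p^2$. The function name `nb_moments`, the truncation point and the example values $r = 3$, $p = 0.4$ are illustrative assumptions, not part of the question.

```python
from math import comb

def nb_moments(r, p, terms=2000):
    """Approximate mean and variance of the number of failures before the
    r-th success by truncating the infinite series after `terms` terms."""
    q = 1.0 - p
    mean = 0.0
    second = 0.0
    for x in range(terms):
        prob = comb(x + r - 1, r - 1) * p**r * q**x  # P(X = x)
        mean += x * prob
        second += x * x * prob
    return mean, second - mean**2

r, p = 3, 0.4  # hypothetical example values
mean, var = nb_moments(r, p)
print(mean, r * (1 - p) / p)       # both close to 4.5
print(var, r * (1 - p) / p**2)     # both close to 11.25
```

The truncated sums agree with the closed forms to many decimal places because the tail terms decay geometrically.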
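Explain Gamma distribution. Show that a gamma distribution with parameter $\lambda$ tends to Normal distribution as $\lambda \to \infty$ (i.e. for large value of parameter $\lambda$).
Gamma distribution. A continuous random variable $X$ follows a gamma distribution with shape parameter $\lambda > 0$ (and scale 1) if its p.d.f. is
$$f(x) = \frac{e^{-x} x^{\lambda - 1}}{\Gamma(\lambda)}, \qquad 0 < x < \infty.$$
Its mean is $\lambda$ and variance $\lambda$.
Normal limit as $\lambda \to \infty$. Standardise by
$$Z = \frac{X - \lambda}{\sqrt{\lambda}}.$$
The m.g.f. of $X$ is $M_X(t) = (1 - t)^{-\lambda}$ for $t < 1$. Then
$$M_Z(t) = e^{-t\sqrt{\lambda}}\, M_X\!\left(\frac{t}{\sqrt{\lambda}}\right) = e^{-t\sqrt{\lambda}} \left(1 - \frac{t}{\sqrt{\lambda}}\right)^{-\lambda}, \qquad t < \sqrt{\lambda}.$$
Taking logarithms and expanding in a Taylor series:
$$\log M_Z(t) = -t\sqrt{\lambda} - \lambda \log\!\left(1 - \frac{t}{\sqrt{\lambda}}\right) = -t\sqrt{\lambda} + \lambda\left(\frac{t}{\sqrt{\lambda}} + \frac{t^2}{2\lambda} + \frac{t^3}{3\lambda^{3/2}} + \cdots\right).$$
This simplifies to
$$\log M_Z(t) = \frac{t^2}{2} + \frac{t^3}{3\sqrt{\lambda}} + \cdots \;\to\; \frac{t^2}{2} \quad \text{as } \lambda \to \infty.$$
Hence $M_Z(t) \to e^{t^2/2}$, which is the m.g.f. of the standard normal distribution. Therefore, for large $\lambda$, the standardised gamma variate tends to $N(0, 1)$; that is, the gamma distribution is approximately $N(\lambda, \lambda)$.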
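The limit can be illustrated numerically. The sketch below evaluates the log m.g.f. of the standardised gamma variate, $-t\sqrt{\lambda} - \lambda\log(1 - t/\sqrt{\lambda})$, for increasing shape parameter (written `lam` in the code) and shows it approaching $t^2/2$. The function name and the chosen values are illustrative assumptions.

```python
from math import log, sqrt

def log_mgf_standardised_gamma(t, lam):
    """log m.g.f. of Z = (X - lam) / sqrt(lam) for X ~ Gamma(shape=lam, scale=1).
    Only valid for t < sqrt(lam)."""
    s = sqrt(lam)
    return -t * s - lam * log(1.0 - t / s)

t = 1.0
for lam in (10, 100, 10_000, 1_000_000):
    # The printed value drifts towards t**2 / 2 = 0.5 as lam grows.
    print(lam, log_mgf_standardised_gamma(t, lam))
```

The gap from $0.5$ shrinks roughly like $t^3/(3\sqrt{\lambda})$, matching the leading correction term in the series expansion.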
Define the probability density function and distribution function for a bivariate random variable. Write down the properties of the bivariate distribution function.
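Bivariate random variable. Let $(X, Y)$ be a pair of jointly distributed random variables.
Joint distribution function (c.d.f.):
$$F(x, y) = P(X \le x,\; Y \le y), \qquad -\infty < x, y < \infty.$$
Joint probability density function (continuous case): If $F(x, y)$ is differentiable, the joint p.d.f. is
$$f(x, y) = \frac{\partial^2 F(x, y)}{\partial x\, \partial y}, \qquad f(x, y) \ge 0, \quad \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\, dx\, dy = 1.$$
Equivalently,
$$F(x, y) = \int_{-\infty}^{x}\int_{-\infty}^{y} f(u, v)\, dv\, du.$$
Properties of the bivariate distribution function $F(x, y)$:
- $0 \le F(x, y) \le 1$.
- $F(x, y)$ is monotonically non-decreasing in each argument.
- $F(-\infty, y) = F(x, -\infty) = 0$.
- $F(\infty, \infty) = 1$.
- The marginal distribution functions are $F_X(x) = F(x, \infty)$ and $F_Y(y) = F(\infty, y)$.
- For $a_1 < a_2$ and $b_1 < b_2$: $P(a_1 < X \le a_2,\; b_1 < Y \le b_2) = F(a_2, b_2) - F(a_1, b_2) - F(a_2, b_1) + F(a_1, b_1) \ge 0$.
- $F(x, y)$ is right-continuous in each variable.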
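Derive Student's $t$-distribution. If $F$ follows an $F$ distribution with $(n_1, n_2)$ degrees of freedom, then show that the $F$ distribution converts to the $t$-distribution when $n_1 = 1$.
Derivation of Student's $t$-distribution. Let $Z \sim N(0, 1)$ and $\chi^2$ be a chi-square variate with $n$ degrees of freedom, independent of $Z$. Define
$$t = \frac{Z}{\sqrt{\chi^2 / n}}.$$
Using the joint density of $Z$ and $\chi^2$ and the transformation, integrating out the chi-square variable gives the p.d.f. of $t$:
$$f(t) = \frac{1}{\sqrt{n}\, B\!\left(\tfrac{1}{2}, \tfrac{n}{2}\right)} \left(1 + \frac{t^2}{n}\right)^{-(n+1)/2}, \qquad -\infty < t < \infty.$$
This is the Student's $t$-distribution with $n$ degrees of freedom.
Relation between $F$ and $t$ when $n_1 = 1$. The $F$ statistic with $(n_1, n_2)$ degrees of freedom is
$$F = \frac{\chi_1^2 / n_1}{\chi_2^2 / n_2},$$
where $\chi_1^2$ and $\chi_2^2$ are independent chi-square variates with $n_1$ and $n_2$ degrees of freedom.
For $n_1 = 1$, $\chi_1^2 = Z^2$ where $Z \sim N(0, 1)$. Hence, writing $n_2 = n$,
$$F = \frac{Z^2}{\chi_2^2 / n} = \left(\frac{Z}{\sqrt{\chi_2^2 / n}}\right)^2 = t^2.$$
Therefore $t^2 \sim F(1, n)$; that is, the square of a $t$-variate with $n$ d.f. follows an $F$ distribution with $(1, n)$ d.f. Equivalently, $F_{1, n} = t_n^2$.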
Define the method of maximum likelihood estimation. What are the properties of maximum likelihood estimators?
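Method of maximum likelihood estimation (MLE). Let $x_1, x_2, \dots, x_n$ be a random sample from a population with density $f(x; \theta)$. The likelihood function is
$$L(\theta) = \prod_{i=1}^{n} f(x_i; \theta).$$
The maximum likelihood estimate $\hat{\theta}$ is the value of $\theta$ that maximises $L(\theta)$ (equivalently $\log L(\theta)$). It is obtained by solving
$$\frac{\partial \log L}{\partial \theta} = 0, \qquad \text{with } \frac{\partial^2 \log L}{\partial \theta^2} < 0 \text{ at } \theta = \hat{\theta}.$$
Properties of MLEs:
- Consistency: MLEs are consistent, i.e. $\hat{\theta} \to \theta$ in probability as $n \to \infty$.
- Asymptotic normality: for large $n$, $\hat{\theta}$ is approximately normally distributed with mean $\theta$ and variance $1/I(\theta)$, equal to the Cramer-Rao lower bound.
- Asymptotic efficiency: MLEs attain the minimum possible variance asymptotically.
- Sufficiency: if a sufficient statistic exists, the MLE is a function of it.
- Invariance: if $\hat{\theta}$ is the MLE of $\theta$, then $g(\hat{\theta})$ is the MLE of $g(\theta)$.
- Not always unbiased: MLEs may be biased for small samples, though the bias typically vanishes as $n \to \infty$.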
Differentiate between parametric and non-parametric tests. Explain the process of carrying out a one-sample run test with a suitable example.
Parametric vs non-parametric tests:
| Basis | Parametric test | Non-parametric test |
|---|---|---|
| Assumptions | Assume a specific population distribution (usually normal) | Distribution-free; no assumption about population form |
| Data type | Require interval/ratio (quantitative) data | Suitable for nominal/ordinal data |
| Parameters | Test hypotheses about parameters (e.g. $\mu$, $\sigma^2$) | Do not involve population parameters directly |
| Power | More powerful when assumptions hold | Less powerful but more robust |
| Examples | $t$-test, $Z$-test, $F$-test | Run test, sign test, Mann-Whitney U, Chi-square |
One-sample run test (test of randomness). A run is a sequence of identical symbols bounded by different symbols (or boundaries). The run test checks whether a sequence of two types of outcomes occurs in a random order.
Procedure:
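- Arrange the observations in the order obtained and classify each into one of two categories (e.g. above or below the median), denoted by two symbols, say $A$ and $B$.
- Let $n_1$ = number of $A$ symbols, $n_2$ = number of $B$ symbols, and $r$ = total number of runs.
- Hypotheses: $H_0$: the sequence is random; $H_1$: the sequence is not random.
- For large samples, under $H_0$, $r$ is approximately normal with
  $$\mu_r = \frac{2 n_1 n_2}{n_1 + n_2} + 1, \qquad \sigma_r^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}.$$
- Compute the test statistic $Z = (r - \mu_r)/\sigma_r$ and compare with the critical value (e.g. $\pm 1.96$ at 5%). Reject $H_0$ if $|Z|$ exceeds the critical value. For small samples, use the run-test tables.
Example: Suppose the sequence of defective (D) and non-defective (N) items is: N N D N D D N N D N. Here $n_1 = 6$ (N), $n_2 = 4$ (D), and the runs are NN | D | N | DD | NN | D | N giving $r = 7$. Then $\mu_r = \frac{2(6)(4)}{10} + 1 = 5.8$ and $\sigma_r^2 = \frac{48 \times 38}{100 \times 9} \approx 2.03$, so $Z = (7 - 5.8)/\sqrt{2.03} \approx 0.84$, which is compared with $\pm 1.96$. Since $|Z| < 1.96$, we do not reject $H_0$ and conclude the order is random.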
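The procedure above can be sketched in Python for the sequence N N D N D D N N D N. This is a minimal illustration of the large-sample normal approximation only; the function name `runs_test` and its return layout are assumptions made for this example, and for samples this small the run-test tables remain the proper reference.

```python
from math import sqrt

def runs_test(sequence):
    """Large-sample (normal approximation) one-sample runs test.
    Returns (n1, n2, r, mean, variance, z) for a two-symbol sequence."""
    symbols = list(dict.fromkeys(sequence))  # distinct symbols, in order of appearance
    if len(symbols) != 2:
        raise ValueError("sequence must contain exactly two distinct symbols")
    n1 = sum(1 for s in sequence if s == symbols[0])
    n2 = len(sequence) - n1
    # A new run starts at the first item and at every change of symbol.
    r = 1 + sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)
    n = n1 + n2
    mean = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n**2 * (n - 1))
    z = (r - mean) / sqrt(variance)
    return n1, n2, r, mean, variance, z

n1, n2, r, mean, variance, z = runs_test("NNDNDDNNDN")
print(n1, n2, r)                                          # 6 4 7
print(round(mean, 3), round(variance, 3), round(z, 3))    # 5.8 2.027 0.843
```

Since $|Z| \approx 0.84$ is well below $1.96$, the sketch reaches the same decision as the worked example.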
Group B
Attempt any Eight questions. [8 x 5 = 40]
Obtain the moment generating function of the negative exponential distribution and find its mean and variance.
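The negative exponential distribution has p.d.f.
$$f(x) = \theta e^{-\theta x}, \qquad x \ge 0,\; \theta > 0.$$
Moment generating function:
$$M_X(t) = E(e^{tX}) = \int_0^{\infty} e^{tx}\, \theta e^{-\theta x}\, dx = \theta \int_0^{\infty} e^{-(\theta - t)x}\, dx = \frac{\theta}{\theta - t}, \qquad t < \theta.$$
Mean:
$$M_X'(t) = \frac{\theta}{(\theta - t)^2} \;\Rightarrow\; E(X) = M_X'(0) = \frac{1}{\theta}.$$
Second moment:
$$M_X''(t) = \frac{2\theta}{(\theta - t)^3} \;\Rightarrow\; E(X^2) = M_X''(0) = \frac{2}{\theta^2}.$$
Variance:
$$\operatorname{Var}(X) = E(X^2) - [E(X)]^2 = \frac{2}{\theta^2} - \frac{1}{\theta^2} = \frac{1}{\theta^2}.$$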
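If $X$ is distributed as a beta distribution of the first kind with parameters $m$ and $n$, then find the mean, mode and variance of the beta distribution.
For a beta distribution of the first kind with parameters $m > 0$ and $n > 0$, i.e. $f(x) = \dfrac{x^{m-1}(1 - x)^{n-1}}{B(m, n)}$ for $0 < x < 1$:
- Mean: $E(X) = \dfrac{m}{m + n}$.
- Mode: $\dfrac{m - 1}{m + n - 2}$ (valid when $m > 1$ and $n > 1$).
- Variance: $\operatorname{Var}(X) = \dfrac{mn}{(m + n)^2 (m + n + 1)}$.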
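If $X$ and $Y$ are two independent rectangular (uniform) variates on $(0, 1)$, find the distribution of $XY$.
Let $U = XY$ where $X$ and $Y$ are independent, so the joint density is $f(x, y) = 1$ on the unit square.
For $0 < u < 1$, the c.d.f. is
$$F(u) = P(XY \le u) = \int_0^1 P\!\left(X \le \frac{u}{y}\right) dy.$$
For $y \le u$, $u/y \ge 1$ so the inner probability is $1$; for $y > u$ it equals $u/y$. Hence
$$F(u) = \int_0^u 1\, dy + \int_u^1 \frac{u}{y}\, dy = u - u \log u.$$
Differentiating gives the p.d.f. of $U$:
$$f(u) = \frac{d}{du}\,(u - u \log u) = 1 - \log u - 1 = -\log u.$$
Thus $U = XY$ has p.d.f. $f(u) = -\log u$ for $0 < u < 1$ (and $0$ otherwise).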
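Assuming the variate in question is the product $U = XY$ of two independent uniform $(0, 1)$ variates, its c.d.f. is $F(u) = u - u\log u$ on $0 < u < 1$. The sketch below compares that formula with a Monte Carlo estimate. The function names, sample size and seed are illustrative assumptions.

```python
import random
from math import log

def product_cdf(u):
    """Theoretical c.d.f. of U = X * Y for independent Uniform(0, 1) X and Y."""
    return u - u * log(u)

def empirical_cdf(u, n=200_000, seed=0):
    """Monte Carlo estimate of P(XY <= u) from n simulated pairs."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if rng.random() * rng.random() <= u)
    return hits / n

for u in (0.1, 0.5, 0.9):
    # The two columns should agree to about two or three decimal places.
    print(u, round(product_cdf(u), 4), round(empirical_cdf(u), 4))
```

Agreement between the two columns supports the derived c.d.f., and hence the p.d.f. $-\log u$ obtained by differentiating it.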
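The joint distribution of $X$ and $Y$ is given by
Examine whether the random variables $X$ and $Y$ are independent.
Marginal of $X$:
$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\, dy.$$
Marginal of $Y$:
$$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\, dx.$$
Check independence: $X$ and $Y$ are independent if and only if
$$f(x, y) = f_X(x)\, f_Y(y) \quad \text{for all } (x, y).$$
Since the joint density factorises into the product of the marginal densities for all $(x, y)$, the random variables $X$ and $Y$ are independent.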
Define the likelihood function and prove its properties.
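Likelihood function. For a random sample $x_1, x_2, \dots, x_n$ from a population with density $f(x; \theta)$, the likelihood function is the joint density viewed as a function of the parameter $\theta$:
$$L(\theta) = L(x_1, \dots, x_n; \theta) = \prod_{i=1}^{n} f(x_i; \theta).$$
Properties:
- Non-negativity and integration: $L \ge 0$, and as a density in the sample it integrates to 1 over the sample space:
  $$\int \cdots \int L\; dx_1 \cdots dx_n = 1.$$
- Score has zero expectation. Differentiating the identity $\int L\, d\mathbf{x} = 1$ with respect to $\theta$ (under regularity conditions allowing interchange of integral and derivative):
  $$\int \frac{\partial L}{\partial \theta}\, d\mathbf{x} = \int \frac{\partial \log L}{\partial \theta}\, L\, d\mathbf{x} = E\!\left[\frac{\partial \log L}{\partial \theta}\right] = 0.$$
- Information identity. Differentiating again gives
  $$E\!\left[\left(\frac{\partial \log L}{\partial \theta}\right)^2\right] = -E\!\left[\frac{\partial^2 \log L}{\partial \theta^2}\right] = I(\theta),$$
  the Fisher information, which is non-negative.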
These properties form the basis for maximum likelihood estimation and the Cramer-Rao bound.
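Examine whether the Minimum Variance Bound (MVB) estimator of $\mu$ exists or not, when sample observations $x_1, x_2, \dots, x_n$ are taken from a $N(\mu, \sigma^2)$ population. If it exists, find the MVB.
For a random sample $x_1, x_2, \dots, x_n$ from $N(\mu, \sigma^2)$ (with $\sigma^2$ known), the log-likelihood is
$$\log L = -\frac{n}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2.$$
Differentiating with respect to $\mu$:
$$\frac{\partial \log L}{\partial \mu} = \frac{1}{\sigma^2} \sum_{i=1}^{n} (x_i - \mu) = \frac{n}{\sigma^2}\,(\bar{x} - \mu).$$
This is of the form $A(\mu)\,[t - \mu]$ with $t = \bar{x}$ and $A(\mu) = n/\sigma^2$. By the Cramer-Rao theory, an MVB estimator exists because the score factorises in this form, and the MVB estimator of $\mu$ is the sample mean $\bar{x}$.
Minimum variance bound (Cramer-Rao lower bound):
$$\operatorname{Var}(\bar{x}) = \frac{1}{-E\!\left[\dfrac{\partial^2 \log L}{\partial \mu^2}\right]} = \frac{1}{n/\sigma^2} = \frac{\sigma^2}{n}.$$
Thus the MVB estimator of $\mu$ is $\bar{x}$ with minimum variance $\sigma^2/n$.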
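Define interval estimation. If $T$ is an unbiased estimator of $\theta$, then show that $T^2$ is a biased estimator of $\theta^2$.
Interval estimation. Interval estimation specifies a random interval $(T_1, T_2)$, computed from the sample, that contains the unknown parameter $\theta$ with a stated probability (confidence level) $1 - \alpha$:
$$P(T_1 \le \theta \le T_2) = 1 - \alpha.$$
The interval $(T_1, T_2)$ is called a $100(1 - \alpha)\%$ confidence interval and $1 - \alpha$ the confidence coefficient.
Proof that $T^2$ is biased for $\theta^2$. Since $T$ is unbiased for $\theta$, $E(T) = \theta$. By the definition of variance,
$$\operatorname{Var}(T) = E(T^2) - [E(T)]^2 = E(T^2) - \theta^2.$$
Unless $\operatorname{Var}(T) = 0$, we have
$$E(T^2) = \theta^2 + \operatorname{Var}(T) > \theta^2.$$
Hence $T^2$ overestimates $\theta^2$ by an amount equal to $\operatorname{Var}(T)$, so $T^2$ is a biased (positively biased) estimator of $\theta^2$.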
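A small exact illustration of the identity $E(T^2) = \theta^2 + \operatorname{Var}(T)$: take $T$ to be a single roll of a fair die, which is unbiased for its own mean $\theta = 7/2$. This choice of $T$ is a hypothetical example, not part of the question.

```python
from fractions import Fraction

# Illustrative assumption: T is one roll of a fair six-sided die,
# an unbiased estimator of theta = E(T) = 7/2.
outcomes = range(1, 7)
prob = Fraction(1, 6)

theta = sum(prob * x for x in outcomes)                  # E(T)
e_t_squared = sum(prob * x * x for x in outcomes)        # E(T^2)
var_t = sum(prob * (x - theta) ** 2 for x in outcomes)   # Var(T)

print(theta, theta**2, e_t_squared, var_t)   # 7/2 49/4 91/6 35/12
print(e_t_squared - theta**2 == var_t)       # True: the bias of T^2 equals Var(T)
```

Here $E(T^2) = 91/6$ exceeds $\theta^2 = 49/4$ by exactly $\operatorname{Var}(T) = 35/12$.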
State and prove the Neyman-Pearson lemma. Also write its applications.
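Statement. To test a simple null hypothesis $H_0: \theta = \theta_0$ against a simple alternative $H_1: \theta = \theta_1$, the most powerful (MP) critical region of size $\alpha$ is given by
$$W = \left\{ x : \frac{L_1}{L_0} \ge k \right\},$$
where $L_0$ and $L_1$ are the likelihoods under $H_0$ and $H_1$, and $k \ge 0$ is chosen so that $P(x \in W \mid H_0) = \int_W L_0\, dx = \alpha$.
Proof. Let $W$ be the region defined above and $W^*$ be any other region of size $\alpha$, so $\int_{W^*} L_0\, dx = \alpha$. The power of $W$ is $\int_W L_1\, dx$. Consider
$$\int_W L_1\, dx - \int_{W^*} L_1\, dx = \int_{W \cap \overline{W^*}} L_1\, dx - \int_{\overline{W} \cap W^*} L_1\, dx,$$
since the common part $W \cap W^*$ cancels.
Inside $W$, $L_1 \ge k L_0$; outside $W$, $L_1 < k L_0$. Therefore
$$\int_{W \cap \overline{W^*}} L_1\, dx \ge k \int_{W \cap \overline{W^*}} L_0\, dx, \qquad \int_{\overline{W} \cap W^*} L_1\, dx \le k \int_{\overline{W} \cap W^*} L_0\, dx.$$
Hence the difference satisfies
$$\int_W L_1\, dx - \int_{W^*} L_1\, dx \ge k \left[ \int_{W \cap \overline{W^*}} L_0\, dx - \int_{\overline{W} \cap W^*} L_0\, dx \right] = k \left[ \int_W L_0\, dx - \int_{W^*} L_0\, dx \right] = k(\alpha - \alpha) = 0.$$
Thus the power of $W$ is at least that of $W^*$, proving $W$ is most powerful.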
Applications:
- Construction of most powerful tests for simple hypotheses.
- Basis for likelihood ratio tests and uniformly most powerful (UMP) tests.
- Used to derive optimal critical regions for normal, binomial, Poisson, etc.
Obtain the values of Type I and Type II errors if is the critical region for testing against the alternative hypothesis , on the basis of a single observation from the population with density , .
Let the critical region be and the density be , .
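Type I error (rejecting $H_0$ when $H_0$ is true), with $W$ denoting the critical region:
$$\alpha = P(x \in W \mid H_0).$$
Type II error (accepting $H_0$ when $H_1$ is true), i.e. under $H_1$:
$$\beta = P(x \notin W \mid H_1).$$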
Thus the Type I error and the Type II error .
Two groups of rats, one group consisting of trained ones and another group of untrained ones, have the following number of trials to achieve a certain criterion:
| Group | Trials |
|---|---|
| Trained rats | 78, 64, 75, 45, 82 |
| Untrained rats | 110, 70, 53, 51 |
Use the Mann-Whitney U test to determine whether there is a difference between the two average numbers of trials of trained and untrained rats.
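Step 1: Rank all observations (combined, smallest = rank 1):
| Value | 45 | 51 | 53 | 64 | 70 | 75 | 78 | 82 | 110 |
|---|---|---|---|---|---|---|---|---|---|
| Rank | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| Group | T | U | U | T | U | T | T | T | U |
(T = trained, U = untrained.)
Step 2: Rank sums:
- Trained ($n_1 = 5$): ranks 1, 4, 6, 7, 8 → $R_1 = 26$.
- Untrained ($n_2 = 4$): ranks 2, 3, 5, 9 → $R_2 = 19$.
Step 3: Compute U statistics:
$$U_1 = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - R_1 = 20 + 15 - 26 = 9,$$
$$U_2 = n_1 n_2 + \frac{n_2(n_2 + 1)}{2} - R_2 = 20 + 10 - 19 = 11.$$
Check: $U_1 + U_2 = 20 = n_1 n_2$. Take $U = \min(U_1, U_2) = 9$.
Step 4: Decision. The null hypothesis $H_0$ is that the two groups have the same distribution of trials. For $n_1 = 5$, $n_2 = 4$ at the 5% level (two-tailed), the critical value of $U$ is $1$. Since the calculated $U = 9 > 1$, we do not reject $H_0$.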
Conclusion: There is no significant difference between the average number of trials of trained and untrained rats.
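The rank sums and U statistics for the rat data can be reproduced with a short Python sketch. The function name `mann_whitney_u` is a hypothetical helper written for this example, and it assumes there are no tied values (true for this data set).

```python
def mann_whitney_u(sample1, sample2):
    """Return (R1, R2, U1, U2) using ranks of the combined sample.
    Assumes no tied values, as in the rat data."""
    combined = sorted(sample1 + sample2)
    rank = {value: i + 1 for i, value in enumerate(combined)}  # smallest = rank 1
    n1, n2 = len(sample1), len(sample2)
    r1 = sum(rank[v] for v in sample1)
    r2 = sum(rank[v] for v in sample2)
    u1 = n1 * n2 + n1 * (n1 + 1) // 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) // 2 - r2
    return r1, r2, u1, u2

trained = [78, 64, 75, 45, 82]
untrained = [110, 70, 53, 51]
r1, r2, u1, u2 = mann_whitney_u(trained, untrained)
print(r1, r2, u1, u2, min(u1, u2))  # 26 19 9 11 9
```

The smaller statistic, 9, is the value compared against the tabulated critical value.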
Group C
Attempt ALL questions. [10 x 2 = 20]
Find the standard error of the sample proportion.
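If $P$ is the population proportion and $Q = 1 - P$, then for a random sample of size $n$ the sample proportion $p$ has standard error
$$SE(p) = \sqrt{\frac{PQ}{n}}.$$
When $P$ is unknown it is estimated by $p$, giving
$$SE(p) \approx \sqrt{\frac{pq}{n}}, \qquad q = 1 - p.$$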
What are one-tailed and two-tailed tests in testing of hypothesis?
One-tailed test: The critical (rejection) region lies entirely in one tail of the sampling distribution. It is used when the alternative hypothesis is directional, e.g. $H_1: \mu > \mu_0$ (right-tailed) or $H_1: \mu < \mu_0$ (left-tailed).
Two-tailed test: The critical region is split between both tails of the distribution. It is used when the alternative is non-directional, e.g. $H_1: \mu \ne \mu_0$, so that significantly large or small values lead to rejection of $H_0$.
Give an example for the outcome of a random experiment that is a two-dimensional random variable.
A two-dimensional (bivariate) random variable assigns a pair of real numbers $(X, Y)$ to each outcome of a random experiment.
Example: Select a student at random and record $(X, Y)$ where $X$ = the student's height and $Y$ = the student's weight. Each outcome gives an ordered pair $(x, y)$, so $(X, Y)$ is a two-dimensional random variable. (Another example: tossing two dice and recording the pair of numbers that appear.)
A plant produces steel sheets whose weights are normally distributed with a standard deviation of 2.4 kg. A sample of 10 had a mean weight of 31.4 kg. Find the 95% confidence limits for the population mean.
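Given $\sigma = 2.4$ kg, $n = 10$, $\bar{x} = 31.4$ kg. Since $\sigma$ is known, use the $Z$ value $z_{0.025} = 1.96$ for 95% confidence.
Standard error: $SE = \sigma/\sqrt{n} = 2.4/\sqrt{10} \approx 0.759$ kg.
95% confidence limits:
$$\bar{x} \pm 1.96 \times SE = 31.4 \pm 1.96 \times 0.759 = 31.4 \pm 1.49.$$
Thus the limits are $29.91$ kg and $32.89$ kg approximately, i.e. the 95% confidence interval for the population mean is $(29.91,\ 32.89)$ kg.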
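The arithmetic can be checked with a few lines of Python. The helper name `z_confidence_limits` is an assumption made for this sketch; it implements the known-$\sigma$ interval $\bar{x} \pm z\,\sigma/\sqrt{n}$ with $z = 1.96$ as the default.

```python
from math import sqrt

def z_confidence_limits(mean, sigma, n, z=1.96):
    """Confidence limits for a population mean when sigma is known."""
    se = sigma / sqrt(n)      # standard error of the sample mean
    margin = z * se
    return mean - margin, mean + margin

lower, upper = z_confidence_limits(31.4, 2.4, 10)
print(round(lower, 2), round(upper, 2))  # 29.91 32.89
```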
What are the four main features of the $F$-distribution curve?
Four main features of the $F$-distribution curve:
- It is a continuous distribution defined only for non-negative values ($F \ge 0$).
- It is positively skewed (skewed to the right), the skewness decreasing as the degrees of freedom increase.
- Its shape depends on two parameters: the numerator and denominator degrees of freedom $(\nu_1, \nu_2)$.
- The total area under the curve is 1, and it is unimodal; as both degrees of freedom become large it approaches the normal curve.
Write down the mean and variance of the hypergeometric distribution.
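For a hypergeometric distribution with population size $N$, $M$ successes in the population, and sample size $n$:
Mean: $E(X) = \dfrac{nM}{N} = np$, where $p = M/N$.
Variance:
$$\operatorname{Var}(X) = npq \cdot \frac{N - n}{N - 1},$$
where $q = 1 - p$ and $\dfrac{N - n}{N - 1}$ is the finite population correction factor.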
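These formulas, mean $np$ and variance $npq\,(N-n)/(N-1)$ with $p = M/N$, can be verified exactly by summing the hypergeometric p.m.f. with rational arithmetic. The function name and the example values $N = 20$, $M = 8$, $n = 5$ are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def hypergeom_moments(N, M, n):
    """Exact mean and variance from the p.m.f. C(M, x) C(N-M, n-x) / C(N, n)."""
    total = comb(N, n)
    mean = Fraction(0)
    second = Fraction(0)
    for x in range(max(0, n - (N - M)), min(n, M) + 1):
        prob = Fraction(comb(M, x) * comb(N - M, n - x), total)
        mean += x * prob
        second += x * x * prob
    return mean, second - mean**2

N, M, n = 20, 8, 5  # hypothetical example values
mean, var = hypergeom_moments(N, M, n)
p = Fraction(M, N)
print(mean == n * p)                                     # True
print(var == n * p * (1 - p) * Fraction(N - n, N - 1))   # True
```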
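Write down the recurrence relation of the Chi-square distribution with $n$ degrees of freedom.
For the chi-square distribution with $n$ degrees of freedom, the central moments satisfy the recurrence relation
$$\mu_{r+1} = 2r\,(\mu_r + n\,\mu_{r-1}), \qquad r = 1, 2, 3, \dots$$
with $\mu_0 = 1$ and $\mu_1 = 0$. Using this, $\mu_2 = 2n$, $\mu_3 = 8n$, and $\mu_4 = 48n + 12n^2$. (The raw moments satisfy $\mu'_{r+1} = (n + 2r)\,\mu'_r$.)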
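The recurrence $\mu_{r+1} = 2r(\mu_r + n\mu_{r-1})$ with $\mu_0 = 1$, $\mu_1 = 0$ is easy to iterate. The sketch below does so for a chosen number of degrees of freedom; the function name and the example value $n = 5$ are illustrative assumptions.

```python
def chi_square_central_moments(n, up_to=4):
    """Central moments mu_0 .. mu_up_to of a chi-square variate with n d.f.,
    built from mu_{r+1} = 2r (mu_r + n mu_{r-1})."""
    mu = [1, 0]  # mu_0 and mu_1
    for r in range(1, up_to):
        mu.append(2 * r * (mu[r] + n * mu[r - 1]))
    return mu

print(chi_square_central_moments(5))  # [1, 0, 10, 40, 540]
```

For $n = 5$ this reproduces $\mu_2 = 2n = 10$, $\mu_3 = 8n = 40$ and $\mu_4 = 48n + 12n^2 = 540$.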
Give the statement of Cramer-Rao's Inequality.
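Cramer-Rao Inequality. If $T$ is an unbiased estimator of a parameter $\theta$ based on a random sample, then under regularity conditions the variance of $T$ cannot be smaller than the reciprocal of the Fisher information:
$$\operatorname{Var}(T) \ge \frac{1}{I(\theta)}, \qquad I(\theta) = E\!\left[\left(\frac{\partial \log L}{\partial \theta}\right)^2\right] = -E\!\left[\frac{\partial^2 \log L}{\partial \theta^2}\right].$$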
The right-hand side is the minimum variance bound (MVB). An estimator attaining this bound is the most efficient (MVB) estimator.
What are the characteristics of a good estimator?
The four characteristics (properties) of a good estimator are:
- Unbiasedness: $E(\hat{\theta}) = \theta$; on average the estimator equals the parameter.
- Consistency: $\hat{\theta}_n \to \theta$ in probability as the sample size $n \to \infty$.
- Efficiency: it has the smallest variance among all unbiased estimators (minimum variance).
- Sufficiency: it utilises all the information in the sample relevant to the parameter.
Give the moment generating function of the Cauchy distribution.
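The Cauchy distribution does not possess a moment generating function, because the defining integral $E(e^{tX})$ diverges for every $t \ne 0$ (its moments, including the mean, do not exist).
Instead, the Cauchy distribution is characterised by its characteristic function. For the standard Cauchy distribution with p.d.f. $f(x) = \dfrac{1}{\pi(1 + x^2)}$, $-\infty < x < \infty$,
$$\phi_X(t) = E(e^{itX}) = e^{-|t|}.$$
For a general Cauchy with location $\mu$ and scale $\lambda$:
$$\phi_X(t) = e^{i\mu t - \lambda |t|}.$$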