Group A

Attempt any FOUR questions. [4 x 10 = 40]

6 questions·10 marks each
1 · long · 10 marks

Define Negative binomial distribution. Derive its mean and variance.

Definition. The negative binomial distribution gives the probability that the $r^{th}$ success occurs on the $(r+k)^{th}$ trial in a sequence of independent Bernoulli trials with success probability $p$ (failure probability $q = 1-p$). If $X$ is the number of failures before the $r^{th}$ success, its p.m.f. is

$$P(X = k) = \binom{k+r-1}{k} p^{r} q^{k}, \quad k = 0, 1, 2, \dots$$

Mean and variance (via the m.g.f.). The moment generating function is

$$M_X(t) = \left(\frac{p}{1 - q e^{t}}\right)^{r}.$$

Differentiating and evaluating at $t = 0$:

$$M_X'(t) = r p^{r} q e^{t} \left(1 - q e^{t}\right)^{-r-1}, \qquad E(X) = M_X'(0) = \frac{rq}{p}.$$

Differentiating once more and using $E(X^2) = M_X''(0) = \dfrac{rq}{p} + \dfrac{r(r+1)q^{2}}{p^{2}}$,

$$\operatorname{Var}(X) = E(X^2) - [E(X)]^2 = \frac{rq}{p} + \frac{rq^{2}}{p^{2}} = \frac{rq(p+q)}{p^{2}} = \frac{rq}{p^{2}}.$$

Thus Mean $= \dfrac{rq}{p}$ and Variance $= \dfrac{rq}{p^{2}}$. Note that variance $>$ mean (over-dispersion), since $1/p > 1$.
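
As a rough numerical check of these two formulas, the sketch below sums the p.m.f. directly. The function names and the values $r = 3$, $p = 0.4$ are illustrative assumptions, not part of the question.

```python
from math import comb


def nb_pmf(k: int, r: int, p: float) -> float:
    """P(X = k): probability of k failures before the r-th success."""
    return comb(k + r - 1, k) * p**r * (1 - p) ** k


def nb_mean_variance(r: int, p: float, terms: int = 500) -> tuple[float, float]:
    """Approximate mean and variance by summing the first `terms` p.m.f. values."""
    mean = sum(k * nb_pmf(k, r, p) for k in range(terms))
    second_moment = sum(k * k * nb_pmf(k, r, p) for k in range(terms))
    return mean, second_moment - mean**2


r, p = 3, 0.4  # illustrative values only
q = 1 - p
mean, variance = nb_mean_variance(r, p)
print(mean, r * q / p)         # truncated sum vs. rq/p
print(variance, r * q / p**2)  # truncated sum vs. rq/p^2
```

The truncated sums should agree with $rq/p$ and $rq/p^{2}$ to many decimal places, because the neglected tail is geometrically small.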

2 · long · 10 marks

Explain the gamma distribution. Show that a gamma distribution with parameter $\alpha$ tends to the normal distribution as $\alpha \to \infty$ (i.e. for large values of the parameter $\alpha$).

Gamma distribution. A continuous random variable $X$ follows a gamma distribution with shape parameter $\alpha > 0$ (and scale 1) if its p.d.f. is

$$f(x) = \frac{1}{\Gamma(\alpha)} e^{-x} x^{\alpha-1}, \quad x > 0.$$

Its mean is $E(X) = \alpha$ and its variance is $\operatorname{Var}(X) = \alpha$.

Normal limit as $\alpha \to \infty$. Standardise $X$ by

$$Z = \frac{X - \alpha}{\sqrt{\alpha}}.$$

The m.g.f. of $X$ is $M_X(t) = (1-t)^{-\alpha}$ for $t < 1$. Then

$$M_Z(t) = e^{-\sqrt{\alpha}\,t}\, M_X\!\left(\frac{t}{\sqrt{\alpha}}\right) = e^{-\sqrt{\alpha}\,t}\left(1 - \frac{t}{\sqrt{\alpha}}\right)^{-\alpha}.$$

Taking logarithms and expanding $\log(1 - t/\sqrt{\alpha})$ in a Taylor series:

$$\log M_Z(t) = -\sqrt{\alpha}\,t - \alpha\log\!\left(1 - \frac{t}{\sqrt{\alpha}}\right) = -\sqrt{\alpha}\,t + \alpha\left(\frac{t}{\sqrt{\alpha}} + \frac{t^{2}}{2\alpha} + \frac{t^{3}}{3\alpha^{3/2}} + \dots\right).$$

This simplifies to

$$\log M_Z(t) = \frac{t^{2}}{2} + \frac{t^{3}}{3\sqrt{\alpha}} + \dots \xrightarrow{\alpha \to \infty} \frac{t^{2}}{2}.$$

Hence $M_Z(t) \to e^{t^{2}/2}$, which is the m.g.f. of the standard normal distribution. Therefore, for large $\alpha$, the gamma distribution tends to $N(\alpha, \alpha)$.
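
The convergence of $\log M_Z(t)$ to $t^{2}/2$ can be illustrated numerically. This is only a sketch: the function name, the point $t = 0.5$ and the chosen values of $\alpha$ are assumptions made for the illustration.

```python
from math import log1p, sqrt


def log_mgf_standardised_gamma(t: float, alpha: float) -> float:
    """log M_Z(t) for Z = (X - alpha)/sqrt(alpha), X ~ Gamma(alpha, 1).

    Requires t < sqrt(alpha) so that the m.g.f. exists.
    """
    root = sqrt(alpha)
    # log1p(-x) computes log(1 - x) accurately for small x.
    return -root * t - alpha * log1p(-t / root)


t = 0.5                 # illustrative evaluation point
target = t * t / 2      # log of the standard normal m.g.f. at t
gaps = []
for alpha in (10, 1_000, 100_000):
    value = log_mgf_standardised_gamma(t, alpha)
    gaps.append(abs(value - target))
    print(alpha, value, target)
```

The gap to $t^{2}/2$ should shrink roughly like $t^{3}/(3\sqrt{\alpha})$, matching the leading remainder term in the expansion.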

3 · long · 10 marks

Define the probability density function and distribution function for a bivariate random variable. Write down the properties of the bivariate distribution function.

Bivariate random variable. Let $(X, Y)$ be a pair of jointly distributed random variables.

Joint distribution function (c.d.f.):

$$F(x, y) = P(X \le x, Y \le y), \quad -\infty < x, y < \infty.$$

Joint probability density function (continuous case): If $F(x,y)$ is differentiable, the joint p.d.f. is

$$f(x, y) = \frac{\partial^{2} F(x, y)}{\partial x\, \partial y}, \quad \text{with } f(x,y) \ge 0 \text{ and } \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(x,y)\,dx\,dy = 1.$$

Equivalently, $F(x,y) = \int_{-\infty}^{x}\int_{-\infty}^{y} f(u,v)\,dv\,du.$

Properties of the bivariate distribution function $F(x,y)$:

  1. $0 \le F(x, y) \le 1$.
  2. $F$ is monotonically non-decreasing in each argument.
  3. $F(-\infty, y) = F(x, -\infty) = F(-\infty, -\infty) = 0$.
  4. $F(+\infty, +\infty) = 1$.
  5. The marginal distribution functions are $F_X(x) = F(x, +\infty)$ and $F_Y(y) = F(+\infty, y)$.
  6. For $a_1 < a_2$ and $b_1 < b_2$:
$$P(a_1 < X \le a_2, b_1 < Y \le b_2) = F(a_2, b_2) - F(a_1, b_2) - F(a_2, b_1) + F(a_1, b_1) \ge 0.$$
  7. $F(x,y)$ is right-continuous in each variable.

4 · long · 10 marks

Derive Student's $t$-distribution. If $F$ follows an $F$ distribution with $(m, n)$ degrees of freedom, then show that the $F(m, n)$ distribution converts to the $t$-distribution when $m = 1$.

Derivation of Student's $t$-distribution. Let $Z \sim N(0,1)$ and let $\chi^{2}$ be a chi-square variate with $n$ degrees of freedom, independent of $Z$. Define

$$t = \frac{Z}{\sqrt{\chi^{2}/n}}.$$

Using the joint density of $Z$ and $\chi^{2}$ and the transformation, integrating out the chi-square variable gives the p.d.f. of $t$:

$$f(t) = \frac{1}{\sqrt{n\pi}}\, \frac{\Gamma\!\left(\frac{n+1}{2}\right)}{\Gamma\!\left(\frac{n}{2}\right)} \left(1 + \frac{t^{2}}{n}\right)^{-\frac{n+1}{2}}, \quad -\infty < t < \infty.$$

This is the Student's $t$-distribution with $n$ degrees of freedom.

Relation between $F$ and $t$ when $m = 1$. The $F$ statistic with $(m, n)$ degrees of freedom is the ratio of two independent chi-square variates, each divided by its degrees of freedom:

$$F = \frac{\chi_m^{2}/m}{\chi_n^{2}/n}.$$

For $m = 1$, $\chi_1^{2} = Z^{2}$ where $Z \sim N(0,1)$. Hence

$$F(1, n) = \frac{Z^{2}/1}{\chi_n^{2}/n} = \left(\frac{Z}{\sqrt{\chi_n^{2}/n}}\right)^{2} = t^{2}.$$

Therefore $F(1, n) = t_n^{2}$; that is, the square of a $t$-variate with $n$ d.f. follows an $F$ distribution with $(1, n)$ d.f. Equivalently, $|t_n| = \sqrt{F(1, n)}$, since the square root recovers only the magnitude of $t_n$.

5 · long · 10 marks

Define the method of maximum likelihood estimation. What are the properties of maximum likelihood estimators?

Method of maximum likelihood estimation (MLE). Let $x_1, x_2, \dots, x_n$ be a random sample from a population with density $f(x; \theta)$. The likelihood function is

$$L(\theta) = \prod_{i=1}^{n} f(x_i; \theta).$$

The maximum likelihood estimate $\hat{\theta}$ is the value of $\theta$ that maximises $L(\theta)$ (equivalently $\log L(\theta)$). It is obtained by solving

$$\frac{\partial \log L}{\partial \theta} = 0, \qquad \frac{\partial^{2} \log L}{\partial \theta^{2}} < 0 \;\text{(maximum condition)}.$$

Properties of MLEs:

  1. Consistency: MLEs are consistent, i.e. $\hat{\theta} \to \theta$ in probability as $n \to \infty$.
  2. Asymptotic normality: for large $n$, $\hat{\theta}$ is approximately normally distributed with mean $\theta$ and variance equal to the Cramer-Rao lower bound.
  3. Asymptotic efficiency: MLEs attain the minimum possible variance asymptotically.
  4. Sufficiency: if a sufficient statistic exists, the MLE is a function of it.
  5. Invariance: if $\hat{\theta}$ is the MLE of $\theta$, then $g(\hat{\theta})$ is the MLE of $g(\theta)$.
  6. Not always unbiased: MLEs may be biased for small samples, though the bias vanishes as $n \to \infty$.

6 · long · 10 marks

Differentiate between parametric and non-parametric tests. Explain the process of carrying out a one-sample run test with a suitable example.

Parametric vs non-parametric tests:

| Basis | Parametric test | Non-parametric test |
|---|---|---|
| Assumptions | Assume a specific population distribution (usually normal) | Distribution-free; no assumption about population form |
| Data type | Require interval/ratio (quantitative) data | Suitable for nominal/ordinal data |
| Parameters | Test hypotheses about parameters ($\mu, \sigma^2$) | Do not involve population parameters directly |
| Power | More powerful when assumptions hold | Less powerful but more robust |
| Examples | $t$-test, $F$-test, $z$-test | Run test, sign test, Mann-Whitney U, Chi-square |

One-sample run test (test of randomness). A run is a sequence of identical symbols bounded by different symbols (or boundaries). The run test checks whether a sequence of two types of outcomes occurs in a random order.

Procedure:

  1. Arrange the observations in the order obtained and classify each into one of two categories (e.g. above/below the median, denoted $+$ and $-$).
  2. Let $n_1$ = number of $+$ symbols, $n_2$ = number of $-$ symbols, and $R$ = total number of runs.
  3. Hypotheses: $H_0$: the sequence is random; $H_1$: the sequence is not random.
  4. For large samples, under $H_0$, $R$ is approximately normal with
$$E(R) = \frac{2 n_1 n_2}{n_1 + n_2} + 1, \qquad \operatorname{Var}(R) = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^{2}(n_1 + n_2 - 1)}.$$
  5. Compute the test statistic $Z = \dfrac{R - E(R)}{\sqrt{\operatorname{Var}(R)}}$ and compare with the critical value (e.g. $\pm 1.96$ at 5%). Reject $H_0$ if $|Z|$ exceeds the critical value. For small samples, use the run-test tables.

Example: Suppose the sequence of defective (D) and non-defective (N) items is: N N D N D D N N D N. Here $n_1 = 6$ (N), $n_2 = 4$ (D), and the runs are NN | D | N | DD | NN | D | N, giving $R = 7$. Then $E(R) = \frac{2(6)(4)}{10} + 1 = 5.8$ and $\operatorname{Var}(R) = \frac{2(6)(4)(48 - 10)}{(10)^{2}(9)} = \frac{1824}{900} \approx 2.027$, so $Z = \dfrac{7 - 5.8}{\sqrt{2.027}} \approx 0.84$ (using the normal approximation for illustration). Since $|Z| \approx 0.84 < 1.96$, we do not reject $H_0$ at the 5% level and conclude that the order is consistent with randomness.
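
The procedure can be sketched in a few lines of Python. The function name `runs_test` is a hypothetical helper, and the sketch uses the large-sample normal approximation even though the example sequence is short.

```python
from math import sqrt


def runs_test(sequence: str) -> tuple[int, float, float, float]:
    """Return (R, E(R), Var(R), Z) for a sequence made of exactly two symbols."""
    symbols = sorted(set(sequence))
    if len(symbols) != 2:
        raise ValueError("sequence must contain exactly two distinct symbols")
    n1 = sequence.count(symbols[0])
    n2 = sequence.count(symbols[1])
    n = n1 + n2
    # A new run starts wherever two neighbouring symbols differ.
    runs = 1 + sum(a != b for a, b in zip(sequence, sequence[1:]))
    mean = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n**2 * (n - 1))
    z = (runs - mean) / sqrt(variance)
    return runs, mean, variance, z


runs, mean_r, var_r, z = runs_test("NNDNDDNNDN")
print(runs, round(mean_r, 2), round(var_r, 3), round(z, 2))
```

For the sequence N N D N D D N N D N this gives 7 runs and an expected value of 5.8, in line with the worked example.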

Group B

Attempt any Eight questions. [8 x 5 = 40]

10 questions·5 marks each
7 · short · 5 marks

Obtain the moment generating function of the negative exponential distribution and find its mean and variance.

The negative exponential distribution has p.d.f.

$$f(x) = \theta e^{-\theta x}, \quad x > 0, \; \theta > 0.$$

Moment generating function:

$$M_X(t) = E(e^{tX}) = \int_0^{\infty} e^{tx}\,\theta e^{-\theta x}\,dx = \theta \int_0^{\infty} e^{-(\theta - t)x}\,dx = \frac{\theta}{\theta - t}, \quad t < \theta.$$

Mean: $E(X) = M_X'(0) = \dfrac{\theta}{(\theta - t)^2}\Big|_{t=0} = \dfrac{1}{\theta}.$

Second moment: $E(X^2) = M_X''(0) = \dfrac{2\theta}{(\theta - t)^3}\Big|_{t=0} = \dfrac{2}{\theta^{2}}.$

Variance: $\operatorname{Var}(X) = E(X^2) - [E(X)]^2 = \dfrac{2}{\theta^{2}} - \dfrac{1}{\theta^{2}} = \dfrac{1}{\theta^{2}}.$

8 · short · 5 marks

If $X$ is distributed as a beta distribution of the first kind with parameters $m = 3$ and $n = 4$, then find the mean, mode and variance of the beta distribution.

For a beta distribution of the first kind with parameters $m$ and $n$:

  • Mean: $\dfrac{m}{m+n} = \dfrac{3}{3+4} = \dfrac{3}{7} \approx 0.4286.$
  • Mode: $\dfrac{m-1}{m+n-2} = \dfrac{3-1}{3+4-2} = \dfrac{2}{5} = 0.40$ (valid since $m, n > 1$).
  • Variance: $\dfrac{mn}{(m+n)^{2}(m+n+1)} = \dfrac{3 \times 4}{(7)^{2}\,(8)} = \dfrac{12}{49 \times 8} = \dfrac{12}{392} = \dfrac{3}{98} \approx 0.0306.$
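
The three formulas can be checked with exact rational arithmetic. The helper name below is an assumption made for this sketch.

```python
from fractions import Fraction


def beta_first_kind_summary(m: int, n: int) -> tuple[Fraction, Fraction, Fraction]:
    """Mean, mode and variance of a beta distribution of the first kind.

    The mode formula is only valid for m > 1 and n > 1.
    """
    m, n = Fraction(m), Fraction(n)
    mean = m / (m + n)
    mode = (m - 1) / (m + n - 2)
    variance = m * n / ((m + n) ** 2 * (m + n + 1))
    return mean, mode, variance


mean, mode, variance = beta_first_kind_summary(3, 4)
print(mean, mode, variance)  # 3/7 2/5 3/98
```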

9 · short · 5 marks

If $X_1$ and $X_2$ are two independent rectangular (uniform) variates on $[0, 1]$, find the distribution of $X_1 X_2$.

Let $U = X_1 X_2$ where $X_1, X_2 \sim U(0,1)$ are independent, so the joint density is $f(x_1, x_2) = 1$ on the unit square.

For $0 < u < 1$, the c.d.f. is

$$G(u) = P(X_1 X_2 \le u) = \int_0^1 P\!\left(X_2 \le \frac{u}{x_1}\right) dx_1.$$

For $x_1 \le u$, $u/x_1 \ge 1$ so the inner probability is $1$; for $x_1 > u$ it equals $u/x_1$. Hence

$$G(u) = \int_0^{u} 1\, dx_1 + \int_u^{1} \frac{u}{x_1}\, dx_1 = u + u\big[\ln x_1\big]_u^1 = u - u\ln u.$$

Differentiating gives the p.d.f. of $U$:

$$g(u) = \frac{d}{du}(u - u\ln u) = 1 - (\ln u + 1) = -\ln u, \quad 0 < u < 1.$$

Thus $X_1 X_2$ has p.d.f. $g(u) = -\ln u$ for $0 < u < 1$ (and $0$ otherwise).
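
A quick simulation can be used to sanity-check the c.d.f. $G(u) = u - u\ln u$. The seed, sample size and evaluation point $u = 0.5$ below are arbitrary choices for the sketch, and the agreement is only approximate.

```python
import random
from math import log


def product_cdf(u: float) -> float:
    """G(u) = P(X1 * X2 <= u) for independent U(0, 1) variates, 0 < u < 1."""
    return u - u * log(u)


random.seed(0)      # fixed seed so the run is repeatable
n_samples = 200_000
u0 = 0.5
hits = sum(random.random() * random.random() <= u0 for _ in range(n_samples))
empirical = hits / n_samples
print(empirical, product_cdf(u0))
```

With this many samples the empirical proportion should lie within about 0.01 of $0.5 - 0.5\ln 0.5 \approx 0.8466$.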

10 · short · 5 marks

The joint distribution of $X$ and $Y$ is given by

$$f(x, y) = \begin{cases} 4xy, & 0 \le x \le 1,\; 0 \le y \le 1, \\ 0, & \text{otherwise.} \end{cases}$$

Examine whether the random variables $X$ and $Y$ are independent.

Marginal of $X$:

$$f_X(x) = \int_0^1 4xy\, dy = 4x\left[\frac{y^2}{2}\right]_0^1 = 2x, \quad 0 \le x \le 1.$$

Marginal of $Y$:

$$f_Y(y) = \int_0^1 4xy\, dx = 4y\left[\frac{x^2}{2}\right]_0^1 = 2y, \quad 0 \le y \le 1.$$

Check independence:

$$f_X(x)\, f_Y(y) = (2x)(2y) = 4xy = f(x, y).$$

Since the joint density factorises into the product of the marginal densities for all $(x, y)$, the random variables $X$ and $Y$ are independent.

11 · short · 5 marks

Define the likelihood function and prove its properties.

Likelihood function. For a random sample $x_1, x_2, \dots, x_n$ from a population with density $f(x; \theta)$, the likelihood function is the joint density viewed as a function of the parameter $\theta$:

$$L(\theta \mid x_1, \dots, x_n) = \prod_{i=1}^{n} f(x_i; \theta).$$

Properties:

  1. Non-negativity and integration: $L(\theta) \ge 0$, and as a density in the sample it integrates to 1 over the sample space:
$$\int \cdots \int L(\theta)\, dx_1 \cdots dx_n = 1.$$
  2. Score has zero expectation. Differentiating the identity $\int L\, dx = 1$ with respect to $\theta$ (under regularity conditions allowing interchange of integral and derivative), and using $\dfrac{\partial L}{\partial \theta} = \dfrac{\partial \log L}{\partial \theta}\, L$:
$$\int \frac{\partial L}{\partial \theta}\, dx = 0 \;\Rightarrow\; \int \frac{\partial \log L}{\partial \theta}\, L\, dx = 0 \;\Rightarrow\; E\!\left(\frac{\partial \log L}{\partial \theta}\right) = 0.$$
  3. Information identity. Differentiating again gives
$$E\!\left[\left(\frac{\partial \log L}{\partial \theta}\right)^{2}\right] = -E\!\left(\frac{\partial^{2} \log L}{\partial \theta^{2}}\right) = I(\theta),$$

the Fisher information, which is non-negative because it is the expectation of a square.

These properties form the basis for maximum likelihood estimation and the Cramer-Rao bound.

12 · short · 5 marks

Examine whether the Minimum Variance Bound (MVB) estimator of $\mu$ exists or not, when $n$ sample observations are taken from a $N(\mu, \sigma^2)$ population. If it exists, find the MVB.

For a random sample $x_1, \dots, x_n$ from $N(\mu, \sigma^2)$ (with $\sigma^2$ known), the log-likelihood is

$$\log L = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2.$$

Differentiating with respect to $\mu$:

$$\frac{\partial \log L}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu) = \frac{n}{\sigma^2}(\bar{x} - \mu).$$

This is of the form $\dfrac{\partial \log L}{\partial \mu} = A(\mu)\,(T - \mu)$ with $T = \bar{x}$ and $A(\mu) = n/\sigma^2$. By the Cramer-Rao theory, an MVB estimator exists because the score factorises in this form, and the MVB estimator of $\mu$ is the sample mean $\bar{x}$.

Minimum variance bound (Cramer-Rao lower bound):

$$I(\mu) = -E\!\left(\frac{\partial^{2}\log L}{\partial \mu^{2}}\right) = \frac{n}{\sigma^2}, \qquad \text{MVB} = \frac{1}{I(\mu)} = \frac{\sigma^{2}}{n}.$$

Thus the MVB estimator of $\mu$ is $\bar{x}$ with minimum variance $\dfrac{\sigma^{2}}{n}$.

13 · short · 5 marks

Define interval estimation. If $T$ is an unbiased estimator of $\theta$, then show that $T^2$ is a biased estimator of $\theta^2$.

Interval estimation. Interval estimation specifies a random interval $(T_1, T_2)$, computed from the sample, that contains the unknown parameter $\theta$ with a stated probability (confidence level) $1 - \alpha$:

$$P(T_1 < \theta < T_2) = 1 - \alpha.$$

The interval $(T_1, T_2)$ is called a confidence interval and $1-\alpha$ the confidence coefficient.

Proof that $T^2$ is biased for $\theta^2$. Since $T$ is unbiased for $\theta$, $E(T) = \theta$. By the definition of variance,

$$\operatorname{Var}(T) = E(T^2) - [E(T)]^2 \;\Rightarrow\; E(T^2) = \operatorname{Var}(T) + \theta^{2}.$$

Unless $\operatorname{Var}(T) = 0$, we have

$$E(T^2) = \theta^{2} + \operatorname{Var}(T) \ne \theta^{2}.$$

Hence $T^2$ overestimates $\theta^2$ by an amount equal to $\operatorname{Var}(T) > 0$, so $T^2$ is a biased (positively biased) estimator of $\theta^2$.

14 · short · 5 marks

State and prove the Neyman-Pearson lemma. Also write its applications.

Statement. To test a simple null hypothesis $H_0: \theta = \theta_0$ against a simple alternative $H_1: \theta = \theta_1$, the most powerful (MP) critical region $w$ of size $\alpha$ is given by

$$\frac{L_1}{L_0} = \frac{L(x \mid \theta_1)}{L(x \mid \theta_0)} \ge k \;\text{inside } w, \qquad \frac{L_1}{L_0} < k \;\text{outside } w,$$

where $k > 0$ is chosen so that $P(x \in w \mid H_0) = \alpha$.

Proof. Let $w$ be the region defined above and $w^{*}$ be any other region of size $\alpha$, so $\int_w L_0\,dx = \int_{w^*} L_0\,dx = \alpha$. The power of $w$ is $\int_w L_1\,dx$. Since the common part $w \cap w^{*}$ cancels,

$$\int_w L_1\,dx - \int_{w^*} L_1\,dx = \int_{w \setminus w^*} L_1\,dx - \int_{w^* \setminus w} L_1\,dx.$$

Inside $w$, $L_1 \ge k L_0$; outside $w$, $L_1 < k L_0$. Therefore

$$\int_{w\setminus w^*} L_1\,dx \ge k\!\int_{w\setminus w^*} L_0\,dx, \qquad \int_{w^*\setminus w} L_1\,dx \le k\!\int_{w^*\setminus w} L_0\,dx.$$

Hence the difference in power is at least $k\big(\int_{w\setminus w^*} L_0\,dx - \int_{w^*\setminus w} L_0\,dx\big)$. Because both regions have size $\alpha$, each of these two integrals equals $\alpha - \int_{w \cap w^*} L_0\,dx$, so the bracket is $0$. Thus the power of $w$ is at least that of $w^*$, proving $w$ is most powerful.

Applications:

  1. Construction of most powerful tests for simple hypotheses.
  2. Basis for likelihood ratio tests and uniformly most powerful (UMP) tests.
  3. Used to derive optimal critical regions for normal, binomial, Poisson, etc.

15 · short · 5 marks

Obtain the values of Type I and Type II errors if $x \ge 1$ is the critical region for testing $H_0: \theta = 2$ against the alternative hypothesis $H_1: \theta = 1$, on the basis of a single observation from the population with density $f(x, \theta) = \theta e^{-\theta x}$, $0 \le x < \infty$.

Let the critical region be $w: x \ge 1$ and the density be $f(x, \theta) = \theta e^{-\theta x}$, $x \ge 0$.

Type I error (rejecting $H_0$ when $H_0: \theta = 2$ is true):

$$\alpha = P(x \ge 1 \mid \theta = 2) = \int_1^{\infty} 2 e^{-2x}\, dx = \big[-e^{-2x}\big]_1^{\infty} = e^{-2} \approx 0.1353.$$

Type II error (accepting $H_0$ when $H_1: \theta = 1$ is true), i.e. $x < 1$ under $\theta = 1$:

$$\beta = P(x < 1 \mid \theta = 1) = \int_0^{1} e^{-x}\, dx = \big[-e^{-x}\big]_0^{1} = 1 - e^{-1} \approx 0.6321.$$

Thus the probability of a Type I error is $\alpha = e^{-2} \approx 0.135$ and the probability of a Type II error is $\beta = 1 - e^{-1} \approx 0.632$.
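
Both probabilities follow from the exponential tail $P(X \ge c) = e^{-\theta c}$, which can be evaluated directly. The function name is an assumption made for this sketch.

```python
from math import exp


def exponential_tail(c: float, theta: float) -> float:
    """P(X >= c) for the density theta * exp(-theta * x), x >= 0."""
    return exp(-theta * c)


alpha = exponential_tail(1, theta=2)      # reject H0 (x >= 1) when theta = 2
beta = 1 - exponential_tail(1, theta=1)   # fail to reject (x < 1) when theta = 1
power = 1 - beta
print(round(alpha, 4), round(beta, 4), round(power, 4))
```

The power of the test, $1 - \beta = e^{-1}$, comes out of the same calculation.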

16 · short · 5 marks

Two groups of rats, one group consisting of trained ones and another group of untrained ones, have the following number of trials to achieve a certain criterion:

| Group | Trials |
|---|---|
| Trained rats | 78, 64, 75, 45, 82 |
| Untrained rats | 110, 70, 53, 51 |

Use the Mann-Whitney U test to determine whether there is a difference between the two average numbers of trials of trained and untrained rats.

Step 1: Rank all observations (combined, smallest = rank 1):

| Value | 45 | 51 | 53 | 64 | 70 | 75 | 78 | 82 | 110 |
|---|---|---|---|---|---|---|---|---|---|
| Rank | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| Group | T | U | U | T | U | T | T | T | U |

($T$ = trained, $U$ = untrained.)

Step 2: Rank sums:

  • Trained ($n_1 = 5$): ranks 1, 4, 6, 7, 8 → $R_1 = 26$.
  • Untrained ($n_2 = 4$): ranks 2, 3, 5, 9 → $R_2 = 19$.

Step 3: Compute U statistics:

$$U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1 = 20 + 15 - 26 = 9.$$

$$U_2 = n_1 n_2 + \frac{n_2(n_2+1)}{2} - R_2 = 20 + 10 - 19 = 11.$$

Check: $U_1 + U_2 = 20 = n_1 n_2$. Take $U = \min(U_1, U_2) = 9$.

Step 4: Decision. For $n_1 = 5$, $n_2 = 4$ at the 5% level (two-tailed), the critical value of $U$ is $1$. Since the calculated $U = 9 > 1$, we do not reject $H_0$.

Conclusion: There is no significant difference between the average number of trials of trained and untrained rats.
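
The ranking and the two U statistics can be reproduced with a short sketch. The helper name is hypothetical, and the code assumes there are no tied observations, which holds for this data set.

```python
def mann_whitney_u(sample1: list[int], sample2: list[int]) -> tuple[int, int, int, int]:
    """Return (R1, R2, U1, U2) using joint ranks; assumes no tied values."""
    combined = sorted(sample1 + sample2)
    rank = {value: position + 1 for position, value in enumerate(combined)}
    r1 = sum(rank[v] for v in sample1)
    r2 = sum(rank[v] for v in sample2)
    n1, n2 = len(sample1), len(sample2)
    u1 = n1 * n2 + n1 * (n1 + 1) // 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) // 2 - r2
    return r1, r2, u1, u2


trained = [78, 64, 75, 45, 82]
untrained = [110, 70, 53, 51]
r1, r2, u1, u2 = mann_whitney_u(trained, untrained)
u = min(u1, u2)
print(r1, r2, u1, u2, u)  # 26 19 9 11 9
```

The resulting $U$ is then compared with the tabulated critical value, as in Step 4.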

Group C

Attempt ALL questions. [10 x 2 = 20]

10 questions·2 marks each
18 (a) · short · 2 marks

Find the standard error of the sample proportion.

If $p$ is the population proportion and $q = 1 - p$, then for a random sample of size $n$ the sample proportion $\hat{p}$ has standard error

$$S.E.(\hat{p}) = \sqrt{\frac{pq}{n}} = \sqrt{\frac{p(1-p)}{n}}.$$

When $p$ is unknown it is estimated by $\hat{p}$, giving $S.E.(\hat{p}) = \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}.$

18 (b) · short · 2 marks

What are one-tailed and two-tailed tests in testing of hypothesis?

One-tailed test: The critical (rejection) region lies entirely in one tail of the sampling distribution. It is used when the alternative hypothesis is directional, e.g. $H_1: \mu > \mu_0$ (right-tailed) or $H_1: \mu < \mu_0$ (left-tailed).

Two-tailed test: The critical region is split between both tails of the distribution. It is used when the alternative is non-directional, e.g. $H_1: \mu \ne \mu_0$, so that significantly large or small values lead to rejection of $H_0$.

18 (c) · short · 2 marks

Give an example for the outcome of a random experiment that is a two-dimensional random variable.

A two-dimensional (bivariate) random variable assigns a pair of real numbers to each outcome of a random experiment.

Example: Select a student at random and record $(X, Y)$ where $X$ = the student's height and $Y$ = the student's weight. Each outcome gives an ordered pair $(x, y)$, so $(X, Y)$ is a two-dimensional random variable. (Another example: tossing two dice and recording the pair of numbers $(X, Y)$ that appear.)

18 (d) · short · 2 marks

A plant produces steel sheets whose weights are normally distributed with a standard deviation of 2.4 kg. A sample of 10 had a mean weight of 31.4 kg. Find the 95% confidence limits for the population mean.

Given $\sigma = 2.4$ kg, $n = 10$, $\bar{x} = 31.4$ kg. Since $\sigma$ is known, use the $z$ value $1.96$ for 95% confidence.

Standard error: $\dfrac{\sigma}{\sqrt{n}} = \dfrac{2.4}{\sqrt{10}} = \dfrac{2.4}{3.162} = 0.759$ kg.

95% confidence limits:

$$\bar{x} \pm 1.96 \cdot \frac{\sigma}{\sqrt{n}} = 31.4 \pm 1.96 \times 0.759 = 31.4 \pm 1.488.$$

Thus the limits are approximately $(29.91, 32.89)$ kg, i.e. the 95% confidence interval for the population mean is $29.91 \text{ kg} < \mu < 32.89 \text{ kg}$.
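
The same arithmetic in code, as a minimal sketch. The function name is an assumption, and $z = 1.96$ is the usual two-sided 95% normal quantile used in the solution.

```python
from math import sqrt


def z_confidence_interval(mean: float, sigma: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Confidence limits for a normal mean when the population s.d. is known."""
    half_width = z * sigma / sqrt(n)
    return mean - half_width, mean + half_width


lower, upper = z_confidence_interval(mean=31.4, sigma=2.4, n=10)
print(round(lower, 2), round(upper, 2))  # 29.91 32.89
```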

18 (e) · short · 2 marks

What are the four main features of the $F$-distribution curve?

Four main features of the $F$-distribution curve:

  1. It is a continuous distribution defined only for non-negative values ($F \ge 0$).
  2. It is positively skewed (skewed to the right), the skewness decreasing as the degrees of freedom increase.
  3. Its shape depends on two parameters: the numerator and denominator degrees of freedom $(m, n)$.
  4. The total area under the curve is 1, and it is unimodal; as both degrees of freedom become large it approaches the normal curve.

18 (f) · short · 2 marks

Write down the mean and variance of the hypergeometric distribution.

For a hypergeometric distribution with population size $N$, $M$ successes in the population, and sample size $n$:

Mean: $E(X) = \dfrac{nM}{N} = np$, where $p = M/N$.

Variance: $\operatorname{Var}(X) = n\,\dfrac{M}{N}\left(1 - \dfrac{M}{N}\right)\dfrac{N - n}{N - 1} = npq\left(\dfrac{N - n}{N - 1}\right),$

where $q = 1 - p$ and $\dfrac{N-n}{N-1}$ is the finite population correction factor.

18 (g) · short · 2 marks

Write down the recurrence relation of the Chi-square distribution with $n$ degrees of freedom.

For the chi-square distribution with $n$ degrees of freedom, the central moments $\mu_r$ satisfy the recurrence relation

$$\mu_{r+1} = 2r\left(\mu_r + n\,\mu_{r-1}\right), \quad r \ge 1,$$

with $\mu_0 = 1$ and $\mu_1 = 0$. Using this, $\mu_2 = 2n$, $\mu_3 = 8n$, and $\mu_4 = 48n + 12n^2$. (The raw moments are $\mu_r' = n(n+2)(n+4)\cdots(n+2r-2)$.)
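
The recurrence is easy to iterate. The sketch below uses a hypothetical helper name and the illustrative value $n = 5$, then compares the results with the closed forms $2n$, $8n$ and $48n + 12n^{2}$.

```python
def chi_square_central_moments(n: int, max_order: int) -> list[int]:
    """Central moments mu_0 .. mu_max_order of chi-square with n d.f.

    Uses mu_{r+1} = 2r (mu_r + n mu_{r-1}) with mu_0 = 1 and mu_1 = 0.
    """
    mu = [1, 0]
    for r in range(1, max_order):
        mu.append(2 * r * (mu[r] + n * mu[r - 1]))
    return mu[: max_order + 1]


n = 5  # illustrative degrees of freedom
moments = chi_square_central_moments(n, 4)
print(moments)                                # [1, 0, 10, 40, 540]
print([2 * n, 8 * n, 48 * n + 12 * n**2])     # [10, 40, 540]
```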

18 (h) · short · 2 marks

Give the statement of Cramer-Rao's Inequality.

Cramer-Rao Inequality. If $T$ is an unbiased estimator of a parameter $\theta$ based on a random sample, then under regularity conditions the variance of $T$ cannot be smaller than the reciprocal of the Fisher information:

$$\operatorname{Var}(T) \ge \frac{1}{I(\theta)} = \frac{1}{E\!\left[\left(\dfrac{\partial \log L}{\partial \theta}\right)^{2}\right]} = \frac{1}{-E\!\left(\dfrac{\partial^{2} \log L}{\partial \theta^{2}}\right)}.$$

The right-hand side is the minimum variance bound (MVB). An estimator attaining this bound is the most efficient (MVB) estimator.

18 (i) · short · 2 marks

What are the characteristics of a good estimator?

The four characteristics (properties) of a good estimator are:

  1. Unbiasedness: $E(\hat{\theta}) = \theta$; on average the estimator equals the parameter.
  2. Consistency: $\hat{\theta} \to \theta$ in probability as the sample size $n \to \infty$.
  3. Efficiency: it has the smallest variance among all unbiased estimators (minimum variance).
  4. Sufficiency: it utilises all the information in the sample relevant to the parameter.

18 (j) · short · 2 marks

Give the moment generating function of the Cauchy distribution.

The Cauchy distribution does not possess a moment generating function, because the defining integral $\int_{-\infty}^{\infty} e^{tx} f(x)\,dx$ diverges for every $t \ne 0$ (its moments, including the mean, do not exist).

Instead, the Cauchy distribution is characterised by its characteristic function. For the standard Cauchy distribution with p.d.f. $f(x) = \dfrac{1}{\pi(1 + x^2)}$,

$$\phi(t) = E(e^{itX}) = e^{-|t|}.$$

For a general Cauchy with location $\mu$ and scale $\lambda$: $\phi(t) = e^{i\mu t - \lambda |t|}.$


Frequently asked questions

How many marks is the BSc CSIT (TU) B.Sc. II Year Statistics (STA201) (Model) 2075 paper?
The BSc CSIT (TU) B.Sc. II Year Statistics (STA201) (Model) 2075 paper carries 100 full marks and is meant to be completed in 180 minutes, across 26 questions.