Section A: Long Answer Questions

Attempt all / any as specified.

3 questions

Question 1 (long, 16 marks)

(a) State and prove Bayes' theorem for a sample space partitioned into mutually exclusive and exhaustive events $E_1, E_2, \dots, E_n$. (6)

(b) In an electronics assembly plant, three machines $M_1$, $M_2$ and $M_3$ produce 30%, 45% and 25% of the total output of microcontroller boards respectively. The proportion of defective boards produced by these machines is 2%, 3% and 4% respectively. A board is selected at random from the total output and is found to be defective.

(i) What is the probability that the selected board is defective?

(ii) Given that the board is defective, find the probability that it was produced by machine $M_2$.

(iii) Which machine is most likely to have produced the defective board? Justify your answer using the posterior probabilities. (10)

(a) Statement and Proof of Bayes' Theorem

Statement. Let $E_1, E_2, \dots, E_n$ be mutually exclusive and exhaustive events of a sample space $S$ (i.e. $E_i \cap E_j = \varnothing$ for $i \ne j$ and $\bigcup_{i=1}^{n} E_i = S$), each with $P(E_i) > 0$. Let $A$ be any event with $P(A) > 0$. Then for each $k$:

$$P(E_k \mid A) = \frac{P(E_k)\,P(A \mid E_k)}{\sum_{i=1}^{n} P(E_i)\,P(A \mid E_i)}$$

Proof. Since the $E_i$ partition $S$, we can write

$$A = A \cap S = A \cap \left(\bigcup_{i=1}^{n} E_i\right) = \bigcup_{i=1}^{n} (A \cap E_i).$$

The events $A \cap E_i$ are mutually exclusive, so by the axiom of additivity (theorem of total probability):

$$P(A) = \sum_{i=1}^{n} P(A \cap E_i) = \sum_{i=1}^{n} P(E_i)\,P(A \mid E_i).$$

By the definition of conditional probability and the multiplication rule:

$$P(E_k \mid A) = \frac{P(E_k \cap A)}{P(A)} = \frac{P(E_k)\,P(A \mid E_k)}{P(A)}.$$

Substituting the total-probability expression for $P(A)$ gives

$$P(E_k \mid A) = \frac{P(E_k)\,P(A \mid E_k)}{\sum_{i=1}^{n} P(E_i)\,P(A \mid E_i)}. \qquad \blacksquare$$

(b) Numerical

Let $D$ = "board is defective." Given:

$$P(M_1)=0.30,\; P(M_2)=0.45,\; P(M_3)=0.25,$$
$$P(D\mid M_1)=0.02,\; P(D\mid M_2)=0.03,\; P(D\mid M_3)=0.04.$$

(i) Probability the board is defective (total probability):

$$P(D) = (0.30)(0.02) + (0.45)(0.03) + (0.25)(0.04) = 0.0060 + 0.0135 + 0.0100 = 0.0295.$$

So $P(D) = 0.0295$ (about $2.95\%$).

(ii) Probability it came from $M_2$ (Bayes):

$$P(M_2 \mid D) = \frac{P(M_2)P(D\mid M_2)}{P(D)} = \frac{0.0135}{0.0295} \approx 0.4576.$$

(iii) Most likely source. Compute all posteriors:

$$P(M_1\mid D) = \frac{0.0060}{0.0295} \approx 0.2034, \quad P(M_2\mid D) \approx 0.4576, \quad P(M_3\mid D) = \frac{0.0100}{0.0295} \approx 0.3390.$$

Since $P(M_2 \mid D) \approx 0.458$ is the largest posterior probability, machine $M_2$ is most likely to have produced the defective board. Although $M_2$ has only a moderate defect rate, its large share of output (45%) dominates, so it contributes the most defectives overall.
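As a quick numerical check of the total-probability and Bayes steps, the following minimal Python sketch recomputes the posteriors from the figures in the question. The variable names (`priors`, `defect_rates`, `posteriors`) are illustrative choices, not part of the original solution.

```python
# Priors: each machine's share of total output (from the question).
priors = {"M1": 0.30, "M2": 0.45, "M3": 0.25}
# Likelihoods: P(defective | machine) (from the question).
defect_rates = {"M1": 0.02, "M2": 0.03, "M3": 0.04}

# Total probability: P(D) = sum over machines of P(M_i) * P(D | M_i).
joint = {m: priors[m] * defect_rates[m] for m in priors}
p_defective = sum(joint.values())

# Bayes' theorem: P(M_i | D) = P(M_i) * P(D | M_i) / P(D).
posteriors = {m: joint[m] / p_defective for m in joint}
most_likely = max(posteriors, key=posteriors.get)

print(f"P(D) = {p_defective:.4f}")
for machine, p in posteriors.items():
    print(f"P({machine} | D) = {p:.4f}")
print("Most likely source:", most_likely)
```

The posteriors sum to 1, which is a useful sanity check on any Bayes calculation over a partition.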

Tags: probability-theory, bayes-theorem

Question 2 (long, 16 marks)

(a) Define a continuous random variable and state the properties that a probability density function (pdf) $f(x)$ must satisfy. (4)

(b) The lifetime (in thousands of hours) of an electronic component is a continuous random variable $X$ with pdf

$$f(x) = \begin{cases} kx(4-x), & 0 \le x \le 4 \\ 0, & \text{otherwise} \end{cases}$$

(i) Determine the value of the constant $k$.

(ii) Find the mean and variance of the lifetime $X$.

(iii) Compute $P(1 \le X \le 3)$. (8)

(c) Distinguish between a binomial distribution and a Poisson distribution, stating one engineering situation where each is appropriate. (4)

(a) Continuous random variable and pdf properties

A continuous random variable $X$ is one that can take any value in an interval (or union of intervals) of the real line; the probability that it equals any single value is zero, and probabilities are obtained by integrating a probability density function (pdf) $f(x)$ over intervals.

A valid pdf $f(x)$ must satisfy:

  1. Non-negativity: $f(x) \ge 0$ for all $x$.
  2. Normalization (total area = 1): $\displaystyle \int_{-\infty}^{\infty} f(x)\,dx = 1.$
  3. Probability via integration: $\displaystyle P(a \le X \le b) = \int_a^b f(x)\,dx.$

(b) Numerical

$$f(x) = kx(4-x), \quad 0 \le x \le 4.$$

(i) Find $k$. Require total area $= 1$:

$$\int_0^4 kx(4-x)\,dx = k\int_0^4 (4x - x^2)\,dx = k\left[2x^2 - \tfrac{x^3}{3}\right]_0^4 = k\left(32 - \tfrac{64}{3}\right) = k\cdot\tfrac{32}{3}.$$

Setting $\tfrac{32}{3}k = 1$ gives $\boxed{k = \dfrac{3}{32}}.$

(ii) Mean and variance.

$$E[X] = \int_0^4 x\,f(x)\,dx = \tfrac{3}{32}\int_0^4 (4x^2 - x^3)\,dx = \tfrac{3}{32}\left[\tfrac{4x^3}{3} - \tfrac{x^4}{4}\right]_0^4 = \tfrac{3}{32}\left(\tfrac{256}{3} - 64\right) = \tfrac{3}{32}\cdot\tfrac{64}{3} = 2.$$

So $E[X] = 2$ (thousand hours). By symmetry of $x(4-x)$ about $x=2$, this is expected.

$$E[X^2] = \tfrac{3}{32}\int_0^4 (4x^3 - x^4)\,dx = \tfrac{3}{32}\left[x^4 - \tfrac{x^5}{5}\right]_0^4 = \tfrac{3}{32}\left(256 - \tfrac{1024}{5}\right) = \tfrac{3}{32}\cdot\tfrac{256}{5} = \tfrac{24}{5} = 4.8.$$
$$\text{Var}(X) = E[X^2] - (E[X])^2 = 4.8 - 4 = 0.8.$$

So the mean is $2$ thousand hours and the variance is $0.8$ (thousand hours)$^2$.

(iii) $P(1 \le X \le 3)$.

$$P(1\le X\le 3) = \tfrac{3}{32}\int_1^3 (4x - x^2)\,dx = \tfrac{3}{32}\left[2x^2 - \tfrac{x^3}{3}\right]_1^3.$$

At $x=3$: $18 - 9 = 9$. At $x=1$: $2 - \tfrac13 = \tfrac53$. Difference $= 9 - \tfrac53 = \tfrac{22}{3}.$

$$P = \tfrac{3}{32}\cdot\tfrac{22}{3} = \tfrac{22}{32} = \tfrac{11}{16} = 0.6875.$$
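The integrals in this part are all of polynomials, so they can be checked exactly with rational arithmetic. The sketch below is one way to do that using only the standard library; `integrate_poly` is a hypothetical helper written for this check, and the pdf shape $x(4-x) = 4x - x^2$ is taken from the question.

```python
from fractions import Fraction

def integrate_poly(coeffs, a, b):
    """Exact integral over [a, b] of sum(coeffs[i] * x**i)."""
    def antiderivative(x):
        return sum(Fraction(c) * Fraction(x) ** (i + 1) / (i + 1)
                   for i, c in enumerate(coeffs))
    return antiderivative(b) - antiderivative(a)

# x(4 - x) = 4x - x^2, stored as coefficients of x^0, x^1, x^2.
shape = [0, 4, -1]

# Normalisation: k is the reciprocal of the area under the unscaled shape.
k = 1 / integrate_poly(shape, 0, 4)
pdf = [k * c for c in shape]

# Multiplying by x (or x^2) shifts every coefficient up one (or two) powers.
mean = integrate_poly([0] + pdf, 0, 4)
second_moment = integrate_poly([0, 0] + pdf, 0, 4)
variance = second_moment - mean ** 2
prob_1_to_3 = integrate_poly(pdf, 1, 3)

print("k =", k)
print("mean =", mean, " variance =", variance)
print("P(1 <= X <= 3) =", prob_1_to_3)
```

Working in `Fraction` avoids floating-point rounding, so the results can be compared directly with the hand-derived fractions.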

(c) Binomial vs Poisson

| | Binomial | Poisson |
|---|---|---|
| Parameters | $n$ (fixed trials), $p$ | $\lambda$ (mean rate) |
| Trials | Fixed, finite number $n$ | Events over a continuous interval / large $n$ |
| Mean / Variance | $np$ / $np(1-p)$ | $\lambda$ / $\lambda$ (equal) |
| Use | Counts of successes in $n$ trials | Counts of rare events per unit time/area |

Poisson is the limiting case of binomial as $n\to\infty$, $p\to 0$ with $np=\lambda$ fixed.

  • Binomial example: number of defective chips in a batch of $n=20$ inspected chips, each independently defective with probability $p$.
  • Poisson example: number of packet arrivals at a router per second (independent arrivals occurring at a known average rate).
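The limiting relationship between the two distributions can be seen numerically. The sketch below compares the binomial probability of $k$ successes with $p = \lambda/n$ against the Poisson probability as $n$ grows; the values $\lambda = 2$ and $k = 3$ are arbitrary illustrative choices, not taken from the question.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

lam, k = 2.0, 3  # illustrative values only
target = poisson_pmf(k, lam)

# Hold n * p = lam fixed while n grows; the gap to the Poisson value shrinks.
errors = []
for n in (10, 100, 1000, 10000):
    approx = binomial_pmf(k, n, lam / n)
    errors.append(abs(approx - target))
    print(f"n={n:>5}  binomial={approx:.6f}  poisson={target:.6f}")
```

In practice this is why the Poisson model is often used in place of the binomial when $n$ is large and $p$ is small.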
Tags: probability-distributions, random-variables

Question 3 (long, 16 marks)

(a) Explain the general procedure of testing a statistical hypothesis. Clearly define the terms null hypothesis, alternative hypothesis, Type I error, Type II error, level of significance, and critical region. (6)

(b) A manufacturer claims that the mean tensile strength of a certain type of wire is at least 250 MPa. A random sample of 36 wires gives a sample mean of 244 MPa with a sample standard deviation of 18 MPa.

(i) Formulate the appropriate null and alternative hypotheses.

(ii) Test the manufacturer's claim at the 5% level of significance.

(iii) State your conclusion and explain whether the manufacturer's claim is supported by the data. (10)

(a) Procedure for testing a statistical hypothesis

General steps:

  1. Formulate hypotheses: state the null hypothesis $H_0$ and alternative hypothesis $H_1$.
  2. Choose level of significance $\alpha$ (e.g. 0.05).
  3. Select the test statistic appropriate to the parameter and sampling distribution ($z$, $t$, $\chi^2$, $F$).
  4. Determine the critical region (rejection region) from $\alpha$ and whether the test is one- or two-tailed.
  5. Compute the test statistic from the sample data.
  6. Decision: reject $H_0$ if the statistic falls in the critical region; otherwise do not reject $H_0$.
  7. Conclusion in terms of the original problem.

Definitions:

  • Null hypothesis ($H_0$): the statement of no effect / no difference, assumed true until evidence contradicts it (e.g. $\mu = \mu_0$).
  • Alternative hypothesis ($H_1$): the claim accepted if $H_0$ is rejected (e.g. $\mu \ne \mu_0$, $\mu < \mu_0$, or $\mu > \mu_0$).
  • Type I error: rejecting $H_0$ when it is actually true; its probability is $\alpha$.
  • Type II error: failing to reject $H_0$ when it is actually false; its probability is $\beta$.
  • Level of significance ($\alpha$): the maximum probability of committing a Type I error, fixed in advance.
  • Critical region: the set of values of the test statistic for which $H_0$ is rejected.

(b) Numerical

Given: claimed $\mu \ge 250$, $n=36$, $\bar{x}=244$, $s=18$, $\alpha=0.05$.

(i) Hypotheses (one-tailed, testing whether mean is less than claimed):

$$H_0: \mu \ge 250 \text{ MPa} \qquad H_1: \mu < 250 \text{ MPa}.$$

(ii) Test. Since $n=36$ is large, use the $z$-test (with $s$ as an estimate of $\sigma$):

$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{244 - 250}{18/\sqrt{36}} = \frac{-6}{18/6} = \frac{-6}{3} = -2.0.$$

For a left-tailed test at $\alpha=0.05$, the critical value is $-z_{0.05} = -1.645$. Rejection region: $z < -1.645$.

Since $z = -2.0 < -1.645$, the test statistic lies in the critical region, so we reject $H_0$.

(iii) Conclusion. At the 5% level of significance there is sufficient evidence that the true mean tensile strength is less than 250 MPa. The manufacturer's claim that the mean strength is at least 250 MPa is not supported by the sample data.
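The test can be reproduced with the standard library's `statistics.NormalDist`, which supplies the normal CDF and its inverse. This is a sketch of the large-sample $z$-test only; the p-value line is an extra not required by the question, included as a cross-check on the decision.

```python
from math import sqrt
from statistics import NormalDist

# Values from the question.
mu0, n, xbar, s, alpha = 250.0, 36, 244.0, 18.0, 0.05

# Test statistic for a large-sample test of the mean.
z = (xbar - mu0) / (s / sqrt(n))

# Left-tailed critical value: the alpha quantile of the standard normal.
z_crit = NormalDist().inv_cdf(alpha)

# p-value for the left-tailed alternative: P(Z <= z).
p_value = NormalDist().cdf(z)

reject_h0 = z < z_crit
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Comparing `p_value` with `alpha` leads to the same decision as comparing `z` with `z_crit`.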

Tags: hypothesis-testing, sampling-and-estimation

Section B: Short Answer Questions

Attempt all / any as specified.

8 questions

Question 4 (short, 7 marks)

The marks obtained by 50 students in a programming examination are summarized below:

| Marks | 10–20 | 20–30 | 30–40 | 40–50 | 50–60 | 60–70 |
|---|---|---|---|---|---|---|
| No. of students | 4 | 8 | 14 | 12 | 8 | 4 |

(a) Compute the mean and the median of the distribution.

(b) Calculate the standard deviation and the coefficient of variation, and comment on the consistency of the data.

Using class midpoints $x$ and frequencies $f$ ($N=50$), with the squared deviations taken about the mean $\bar{x} = 39.8$ found in part (a):

| Class | $x$ | $f$ | $fx$ | CF | $f(x-\bar{x})^2$ |
|---|---|---|---|---|---|
| 10–20 | 15 | 4 | 60 | 4 | $4(15-39.8)^2 = 2460.16$ |
| 20–30 | 25 | 8 | 200 | 12 | $8(25-39.8)^2 = 1752.32$ |
| 30–40 | 35 | 14 | 490 | 26 | $14(35-39.8)^2 = 322.56$ |
| 40–50 | 45 | 12 | 540 | 38 | $12(45-39.8)^2 = 324.48$ |
| 50–60 | 55 | 8 | 440 | 46 | $8(55-39.8)^2 = 1848.32$ |
| 60–70 | 65 | 4 | 260 | 50 | $4(65-39.8)^2 = 2540.16$ |
| Total | | 50 | 1990 | | 9248.00 |

(a) Mean and Median

Mean: $\bar{x} = \dfrac{\sum fx}{N} = \dfrac{1990}{50} = 39.8.$

Median: $N/2 = 25$ lies in the class 30–40 (the cumulative frequency first reaches 25 there). With $L=30$, $CF=12$ (before the median class), $f=14$, $h=10$:

$$\text{Median} = L + \frac{N/2 - CF}{f}\times h = 30 + \frac{25-12}{14}\times 10 = 30 + 9.29 = 39.29.$$

(b) Standard deviation and CV

$$\sigma = \sqrt{\frac{\sum f(x-\bar{x})^2}{N}} = \sqrt{\frac{9248}{50}} = \sqrt{184.96} = 13.6.$$
$$\text{CV} = \frac{\sigma}{\bar{x}}\times 100 = \frac{13.6}{39.8}\times 100 \approx 34.2\%.$$

Comment: A coefficient of variation of about $34\%$ is fairly high, indicating that the marks are not very consistent; there is considerable dispersion of student performance about the mean.
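Grouped-data statistics are easy to get wrong by hand, in particular when deviations are taken about a rounded mean. The sketch below recomputes the mean, median, standard deviation and coefficient of variation directly from the class limits and frequencies in the question; `grouped_stats` is a hypothetical helper name, and the population formula (dividing by $N$) is assumed.

```python
from math import sqrt

# (lower limit, upper limit, frequency) for each class, from the question.
classes = [(10, 20, 4), (20, 30, 8), (30, 40, 14),
           (40, 50, 12), (50, 60, 8), (60, 70, 4)]

def grouped_stats(classes):
    total = sum(f for _, _, f in classes)
    # Mean and variance use class midpoints as representative values.
    mean = sum(f * (lo + hi) / 2 for lo, hi, f in classes) / total
    variance = sum(f * ((lo + hi) / 2 - mean) ** 2 for lo, hi, f in classes) / total
    sd = sqrt(variance)

    # Median by linear interpolation inside the class containing N/2.
    cumulative = 0
    median = None
    for lo, hi, f in classes:
        if cumulative + f >= total / 2:
            median = lo + (total / 2 - cumulative) / f * (hi - lo)
            break
        cumulative += f

    return {"N": total, "mean": mean, "median": median,
            "sd": sd, "cv_percent": sd / mean * 100}

stats = grouped_stats(classes)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Because the deviations are taken about the unrounded mean, the sum of squares here is slightly smaller than it would be about any other centre.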

Tags: descriptive-statistics

Question 5 (short, 7 marks)

The following data show the number of hours ($x$) eight students spent practising coding problems per week and their corresponding scores ($y$) in a competitive test:

| $x$ | 2 | 4 | 5 | 6 | 8 | 9 | 10 | 12 |
|---|---|---|---|---|---|---|---|---|
| $y$ | 20 | 30 | 34 | 40 | 52 | 56 | 60 | 70 |

(a) Fit the least-squares regression line of $y$ on $x$.

(b) Estimate the test score of a student who practises for 7 hours per week.

(c) Compute the Karl Pearson coefficient of correlation and interpret it.

With $n=8$, compute the sums:

$$\sum x = 56,\quad \sum y = 362,\quad \sum xy = 2930,\quad \sum x^2 = 470,\quad \sum y^2 = 18396.$$

($\bar{x}=7,\ \bar{y}=45.25$.)

(a) Least-squares line of $y$ on $x$

$$b = \frac{n\sum xy - \sum x\sum y}{n\sum x^2 - (\sum x)^2} = \frac{8(2930) - (56)(362)}{8(470) - 56^2} = \frac{23440 - 20272}{3760 - 3136} = \frac{3168}{624} \approx 5.077.$$
$$a = \bar{y} - b\bar{x} = 45.25 - 5.077(7) = 45.25 - 35.54 = 9.71.$$

Regression line: $\hat{y} = 9.71 + 5.077x.$

(b) Estimate for $x = 7$ hours

$$\hat{y} = 9.71 + 5.077(7) = 9.71 + 35.54 = 45.25.$$

The estimated test score is about 45 marks. (Since $x = 7$ equals $\bar{x}$, the fitted value equals $\bar{y}$.)

(c) Karl Pearson correlation coefficient

$$r = \frac{n\sum xy - \sum x\sum y}{\sqrt{[n\sum x^2-(\sum x)^2][n\sum y^2-(\sum y)^2]}}.$$

Denominator pieces: $n\sum x^2-(\sum x)^2 = 624$; $n\sum y^2-(\sum y)^2 = 8(18396) - 362^2 = 147168 - 131044 = 16124.$

$$r = \frac{3168}{\sqrt{624\times 16124}} = \frac{3168}{\sqrt{10061376}} = \frac{3168}{3171.97} \approx 0.999.$$

Interpretation: $r \approx 0.999$ indicates a very strong positive linear correlation: students who practise more hours tend to score substantially higher.
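Because every later step depends on the five sums, it is worth recomputing them directly from the eight data pairs rather than trusting hand arithmetic. The following sketch does that and then applies the same formulas for $b$, $a$ and $r$; the variable names are illustrative.

```python
from math import sqrt

# Data pairs from the question: hours practised (x) and test score (y).
x = [2, 4, 5, 6, 8, 9, 10, 12]
y = [20, 30, 34, 40, 52, 56, 60, 70]
n = len(x)

# Raw sums used by the least-squares and correlation formulas.
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi * xi for xi in x)
sum_y2 = sum(yi * yi for yi in y)

# Regression of y on x: slope b and intercept a.
sxy = n * sum_xy - sum_x * sum_y
sxx = n * sum_x2 - sum_x ** 2
syy = n * sum_y2 - sum_y ** 2
b = sxy / sxx
a = sum_y / n - b * sum_x / n

# Karl Pearson correlation coefficient; must lie in [-1, 1].
r = sxy / sqrt(sxx * syy)

y_hat_7 = a + b * 7
print(f"sums: x={sum_x}, y={sum_y}, xy={sum_xy}, x2={sum_x2}, y2={sum_y2}")
print(f"y_hat = {a:.3f} + {b:.3f} x,  r = {r:.4f},  y_hat(7) = {y_hat_7:.2f}")
```

A value of $|r|$ above 1 from a hand calculation always signals an arithmetic error in one of the sums, which is what this check is designed to catch.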

Tags: regression-and-correlation

Question 6 (short, 6 marks)

The number of requests arriving at a web server follows a Poisson distribution with an average of 5 requests per minute.

(a) Find the probability that exactly 3 requests arrive in a given minute.

(b) Find the probability that at most 2 requests arrive in a given minute.

(c) Find the probability that more than 1 request arrives in a 30-second interval.

Let $X \sim \text{Poisson}(\lambda)$ with $\lambda = 5$ per minute. PMF: $P(X=k)=\dfrac{e^{-\lambda}\lambda^k}{k!}.$

(a) Exactly 3 requests in a minute ($\lambda=5$)

$$P(X=3) = \frac{e^{-5}5^3}{3!} = \frac{e^{-5}\cdot 125}{6} = 20.833\,e^{-5} \approx 20.833(0.006738) \approx 0.1404.$$

(b) At most 2 requests in a minute

$$P(X\le 2) = e^{-5}\left(1 + 5 + \frac{25}{2}\right) = e^{-5}(18.5) = 18.5(0.006738) \approx 0.1247.$$

(c) More than 1 request in a 30-second interval

For a 30-second (half-minute) interval the mean is $\lambda' = 5\times 0.5 = 2.5.$

$$P(X>1) = 1 - P(X=0) - P(X=1) = 1 - e^{-2.5}(1 + 2.5) = 1 - 3.5\,e^{-2.5}.$$

With $e^{-2.5}\approx 0.082085$: $P(X>1) = 1 - 3.5(0.082085) = 1 - 0.2873 \approx 0.7127.$

So $P(X>1) \approx 0.713.$
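The three probabilities can be checked with a few lines of Python using only `math`. This is a minimal sketch; `poisson_pmf` is a small helper defined here, and the rate scaling in part (c) assumes the arrival rate is constant over the minute.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

rate_per_minute = 5.0  # from the question

# (a) Exactly 3 requests in one minute.
p_exactly_3 = poisson_pmf(3, rate_per_minute)

# (b) At most 2 requests in one minute: sum the pmf over k = 0, 1, 2.
p_at_most_2 = sum(poisson_pmf(k, rate_per_minute) for k in range(3))

# (c) More than 1 request in 30 seconds: the mean scales with the interval.
rate_half_minute = rate_per_minute * 0.5
p_more_than_1 = 1 - poisson_pmf(0, rate_half_minute) - poisson_pmf(1, rate_half_minute)

print(f"(a) {p_exactly_3:.4f}  (b) {p_at_most_2:.4f}  (c) {p_more_than_1:.4f}")
```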

Tags: probability-distributions

Question 7 (short, 7 marks)

The diameters of ball bearings produced by a machine are normally distributed with a mean of 12.0 mm and a standard deviation of 0.04 mm. A bearing is acceptable if its diameter lies between 11.92 mm and 12.08 mm.

(a) Find the probability that a randomly selected bearing is acceptable.

(b) If 5000 bearings are produced, estimate how many will be rejected.

(c) Find the diameter exceeded by only 2.5% of the bearings.

Let $X \sim N(\mu=12.0,\ \sigma=0.04)$ mm. Standardize with $z=\dfrac{x-\mu}{\sigma}.$

(a) Probability a bearing is acceptable ($11.92 \le X \le 12.08$)

$$z_1 = \frac{11.92-12.0}{0.04} = -2,\qquad z_2 = \frac{12.08-12.0}{0.04} = +2.$$
$$P(-2 \le Z \le 2) = \Phi(2) - \Phi(-2) = 0.9772 - 0.0228 = 0.9544.$$

So about 95.44% of bearings are acceptable.

(b) Number rejected out of 5000

Probability of rejection $= 1 - 0.9544 = 0.0456.$

$$\text{Expected rejects} = 5000 \times 0.0456 = 228 \text{ bearings}.$$

(c) Diameter exceeded by only 2.5% of bearings

We need $d$ with $P(X > d) = 0.025$, i.e. $P(X \le d) = 0.975$, so $z = 1.96.$

$$d = \mu + z\sigma = 12.0 + 1.96(0.04) = 12.0 + 0.0784 = 12.0784 \approx 12.078 \text{ mm}.$$

Only about $2.5\%$ of bearings exceed $12.078$ mm.
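The same calculation can be done without tables using `statistics.NormalDist` from the Python standard library. This is a sketch; because it uses the exact normal CDF rather than four-figure table values, results may differ from the hand solution in the last digit (for example, roughly 227.5 expected rejects rather than 228).

```python
from statistics import NormalDist

# Diameter distribution from the question (mm).
diameter = NormalDist(mu=12.0, sigma=0.04)
lower, upper = 11.92, 12.08

# (a) Probability that a bearing falls inside the acceptance limits.
p_acceptable = diameter.cdf(upper) - diameter.cdf(lower)

# (b) Expected number rejected out of 5000.
expected_rejects = 5000 * (1 - p_acceptable)

# (c) Diameter exceeded by only 2.5% of bearings: the 97.5th percentile.
d_975 = diameter.inv_cdf(0.975)

print(f"P(acceptable) = {p_acceptable:.4f}")
print(f"Expected rejects = {expected_rejects:.1f}")
print(f"97.5th percentile = {d_975:.4f} mm")
```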

Tags: probability-distributions, random-variables

Question 8 (short, 6 marks)

(a) State the addition and multiplication theorems of probability for two events $A$ and $B$. (2)

(b) Two independent components $A$ and $B$ in a system have reliabilities (probabilities of functioning) 0.9 and 0.8 respectively. Find the probability that the system functions if the components are connected (i) in series, and (ii) in parallel. (4)

(a) Addition and multiplication theorems

Addition theorem (for any two events $A, B$):

$$P(A \cup B) = P(A) + P(B) - P(A \cap B).$$

If $A$ and $B$ are mutually exclusive, $P(A\cap B)=0$, so $P(A\cup B)=P(A)+P(B)$.

Multiplication theorem:

$$P(A \cap B) = P(A)\,P(B \mid A) = P(B)\,P(A \mid B).$$

If $A$ and $B$ are independent, $P(A\cap B)=P(A)\,P(B)$.

(b) System reliability

Given $P(A)=0.9$, $P(B)=0.8$, independent.

(i) Series: the system works only if both components work:

$$R_{\text{series}} = P(A)\,P(B) = 0.9 \times 0.8 = 0.72.$$

(ii) Parallel: the system works if at least one component works:

$$R_{\text{parallel}} = 1 - (1-0.9)(1-0.8) = 1 - (0.1)(0.2) = 1 - 0.02 = 0.98.$$

The parallel configuration ($0.98$) is far more reliable than the series configuration ($0.72$), illustrating redundancy.
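The series and parallel rules generalise to any number of independent components. The sketch below expresses both as small functions; the function names are hypothetical, and independence of the components is assumed throughout, as in the question.

```python
from math import prod

def series_reliability(reliabilities):
    """All components must work: multiply the individual reliabilities."""
    return prod(reliabilities)

def parallel_reliability(reliabilities):
    """At least one must work: one minus the probability that all fail."""
    return 1 - prod(1 - r for r in reliabilities)

components = [0.9, 0.8]  # reliabilities of A and B from the question

r_series = series_reliability(components)
r_parallel = parallel_reliability(components)
print(f"series = {r_series:.2f}, parallel = {r_parallel:.2f}")
```

Adding a component always lowers the series reliability and raises the parallel reliability, which is the redundancy effect noted above.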

Tags: probability-theory

Question 9 (short, 7 marks)

A random sample of 100 resistors drawn from a large production lot has a mean resistance of 102 ohms with a sample standard deviation of 8 ohms.

(a) Construct a 95% confidence interval for the true mean resistance of the lot.

(b) Interpret the meaning of this confidence interval.

(c) What sample size would be required to estimate the mean resistance within $\pm 1$ ohm at the same confidence level?

Given $n=100$, $\bar{x}=102$, $s=8$, large sample so use $z$.

(a) 95% confidence interval for $\mu$

Standard error $= \dfrac{s}{\sqrt{n}} = \dfrac{8}{\sqrt{100}} = 0.8.$ For 95% confidence, $z_{0.025}=1.96.$

$$\text{CI} = \bar{x} \pm z\cdot\frac{s}{\sqrt{n}} = 102 \pm 1.96(0.8) = 102 \pm 1.568.$$
$$\boxed{(100.43,\ 103.57) \text{ ohms}}.$$

(b) Interpretation

We are 95% confident that the true mean resistance of the production lot lies between about 100.43 and 103.57 ohms. Operationally, if many such samples were taken and an interval computed from each, about 95% of those intervals would contain the true mean resistance.

(c) Required sample size for margin $\pm 1$ ohm

Using $s = 8$ as the estimate of $\sigma$, the margin of error must satisfy $E = z\cdot\dfrac{s}{\sqrt{n}} \le 1$, so

$$n \ge \left(\frac{z\,s}{E}\right)^2 = \left(\frac{1.96 \times 8}{1}\right)^2 = (15.68)^2 = 245.86.$$

Rounding up, the required sample size is $n = 246$ resistors.
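A short Python sketch of the interval and sample-size calculations follows. It obtains the critical value from `statistics.NormalDist` instead of a table, so `z` is about 1.95996 rather than the rounded 1.96; the sample standard deviation is used as a stand-in for $\sigma$, which is an assumption of the large-sample method.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Values from the question.
n, xbar, s, confidence = 100, 102.0, 8.0, 0.95

# Two-sided critical value: the (1 - alpha/2) quantile of the standard normal.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# (a) Confidence interval for the mean.
margin = z * s / sqrt(n)
ci = (xbar - margin, xbar + margin)

# (c) Smallest n with margin of error at most 1 ohm, rounded up.
target_margin = 1.0
n_required = ceil((z * s / target_margin) ** 2)

print(f"z = {z:.4f}")
print(f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f}) ohms")
print(f"required n = {n_required}")
```

Rounding up with `ceil` matters here: rounding to the nearest integer could leave the margin slightly above the target.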

Tags: sampling-and-estimation

Question 10 (short, 6 marks)

In a survey of software developers, 240 out of 400 respondents stated that they prefer working with statically typed programming languages. Test, at the 1% level of significance, whether the proportion of developers who prefer statically typed languages differs significantly from 0.5. State the hypotheses, compute the test statistic, and give your conclusion.

This is a test of a single proportion (large sample, two-tailed).

Data: $n=400$, successes $x=240$, so sample proportion $\hat{p} = \dfrac{240}{400} = 0.6.$ Hypothesized $p_0 = 0.5$, $\alpha = 0.01.$

Hypotheses:

$$H_0: p = 0.5 \qquad H_1: p \ne 0.5 \quad (\text{two-tailed}).$$

Test statistic (under $H_0$, standard error uses $p_0$):

$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}} = \frac{0.6 - 0.5}{\sqrt{\dfrac{0.5 \times 0.5}{400}}} = \frac{0.1}{\sqrt{0.000625}} = \frac{0.1}{0.025} = 4.0.$$

Critical value: for a two-tailed test at $\alpha=0.01$, the critical values are $\pm z_{0.005} = \pm 2.576.$

Decision: $|z| = 4.0 > 2.576$, so the statistic falls in the rejection region and we reject $H_0$.

Conclusion: At the 1% level of significance, the proportion of developers preferring statically typed languages ($0.6$) differs significantly from $0.5$; the data provide strong evidence of a genuine preference for statically typed languages.
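The single-proportion test can be sketched in Python as below, again using `statistics.NormalDist` for the critical value. The two-sided p-value is an addition not asked for in the question, shown only as a cross-check on the decision.

```python
from math import sqrt
from statistics import NormalDist

# Values from the question.
n, successes, p0, alpha = 400, 240, 0.5, 0.01

p_hat = successes / n

# Under H0 the standard error is computed from p0, not from p_hat.
standard_error = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / standard_error

# Two-tailed critical value and p-value.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

reject_h0 = abs(z) > z_crit
print(f"z = {z:.3f}, critical = ±{z_crit:.3f}, p-value = {p_value:.6f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```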

Tags: hypothesis-testing

Question 11 (short, 6 marks)

(a) Define skewness and kurtosis. Explain how they describe the shape of a frequency distribution. (3)

(b) For a moderately skewed distribution, the mean is 36, the median is 34, and the standard deviation is 6. Compute the Karl Pearson coefficient of skewness and comment on the nature of the distribution. (3)

(a) Skewness and kurtosis

Skewness measures the asymmetry of a frequency distribution about its mean. If the longer tail is on the right (mean > median), skewness is positive (right-skewed); if on the left (mean < median), it is negative (left-skewed); a symmetric distribution has zero skewness.

Kurtosis measures the peakedness (and tail weight) of a distribution relative to the normal curve. A leptokurtic distribution ($\beta_2 > 3$) is more peaked with heavier tails; a platykurtic distribution ($\beta_2 < 3$) is flatter; a mesokurtic distribution ($\beta_2 = 3$) matches the normal curve.

Together they describe the shape of a distribution: skewness tells how lopsided it is, kurtosis tells how sharp/flat its peak and how heavy its tails are.

(b) Karl Pearson coefficient of skewness

Given mean $=36$, median $=34$, $\sigma = 6$. Using the median-based formula:

$$S_k = \frac{3(\text{Mean} - \text{Median})}{\sigma} = \frac{3(36 - 34)}{6} = \frac{6}{6} = 1.0.$$

Comment: Since $S_k = +1.0 > 0$, the distribution is positively (right) skewed: it has a longer tail toward the higher values, and the value $+1$ indicates a fairly strong degree of skewness.

Tags: descriptive-statistics, probability-theory

Frequently asked questions

Where can I find the BE Computer Engineering (Pokhara University) Probability and Statistics (PU, MTH 216) question paper 2079?
The full BE Computer Engineering (Pokhara University) Probability and Statistics (PU, MTH 216) 2079 (regular) question paper is available free on Kekkei. You can read every question online and attempt the paper under timed exam conditions.
How many marks is the BE Computer Engineering (Pokhara University) Probability and Statistics (PU, MTH 216) 2079 paper?
The BE Computer Engineering (Pokhara University) Probability and Statistics (PU, MTH 216) 2079 paper carries 100 full marks and is meant to be completed in 180 minutes, across 11 questions.