BSc CSIT (TU) Science Statistics II (BSc CSIT, STA210) Question Paper 2075 Nepal
This is the official BSc CSIT (TU) (Science stream) Statistics II (BSc CSIT, STA210) question paper for 2075, as set in the regular annual examination. It carries 60 full marks and a time allowance of 180 minutes, across 12 questions. On Kekkei you can attempt this Statistics II (BSc CSIT, STA210) past paper online with a timer, get instant AI feedback and step-by-step solutions, and track the topics where you lose marks — completely free. Whether you are revising for your BSc CSIT (TU) Statistics II (BSc CSIT, STA210) exam or solving previous years' question papers, this 2075 paper is a great way to practise under real exam conditions.
Section A: Long Answer Questions
Attempt any TWO questions.
What is hypothesis testing? Explain the procedure of testing of hypothesis including null and alternative hypotheses, level of significance, types of errors and the critical region.
Hypothesis Testing
Hypothesis testing is a statistical procedure that uses sample data to decide whether a claim (assumption) about a population parameter is supported by the evidence. It provides a rule for accepting or rejecting the claim with a controlled probability of error.
Procedure of Testing a Hypothesis
Step 1: Set up the Null and Alternative Hypotheses
- Null hypothesis ($H_0$): A statement of no difference / no effect that is assumed true until evidence contradicts it, e.g. $H_0: \mu = \mu_0$.
- Alternative hypothesis ($H_1$): The claim accepted if $H_0$ is rejected. It may be:
  - Two-tailed: $H_1: \mu \neq \mu_0$
  - One-tailed: $H_1: \mu > \mu_0$ or $H_1: \mu < \mu_0$
Step 2: Choose the Level of Significance ($\alpha$)
The level of significance $\alpha$ is the maximum probability of rejecting $H_0$ when it is actually true. Common values are $\alpha = 0.05$ (5%) or $\alpha = 0.01$ (1%). It is fixed before collecting data.
Step 3: Identify the Test Statistic
Select an appropriate statistic (e.g. $z$, $t$, $\chi^2$, $F$) whose sampling distribution under $H_0$ is known, e.g.
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
Step 4: Determine the Critical Region (Rejection Region)
The critical region is the set of values of the test statistic for which $H_0$ is rejected. Its boundary is the critical value, fixed so that $P(\text{statistic falls in the critical region} \mid H_0 \text{ true}) = \alpha$.
Step 5: Compute the test statistic from the sample and compare with the critical value.
Step 6: Decision. Reject $H_0$ if the computed value falls in the critical region; otherwise do not reject $H_0$. State the conclusion in the context of the problem.
Types of Errors
| Decision | $H_0$ true | $H_0$ false |
|---|---|---|
| Reject $H_0$ | Type I error ($\alpha$) | Correct (power $1 - \beta$) |
| Accept $H_0$ | Correct | Type II error ($\beta$) |
- Type I error: Rejecting a true $H_0$; probability $\alpha$.
- Type II error: Accepting a false $H_0$; probability $\beta$.
Reducing one error (for fixed $n$) tends to increase the other; increasing sample size reduces both.
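The trade-off between the two errors can be checked numerically. The following is a minimal sketch for a two-tailed $z$-test with known $\sigma$; the function name `two_tailed_z_power` and the numbers (hypothesised mean 50, true mean 52, $\sigma = 10$) are assumptions chosen for illustration and are not part of the question paper.

```python
from statistics import NormalDist

def two_tailed_z_power(mu0, mu_true, sigma, n, alpha=0.05):
    """Probability of rejecting H0: mu = mu0 when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)      # boundary of the critical region
    shift = (mu_true - mu0) / (sigma / n ** 0.5)    # standardised true difference
    # Under the true mean, z ~ N(shift, 1); reject when z < -z_crit or z > z_crit.
    return std_normal.cdf(-z_crit - shift) + 1 - std_normal.cdf(z_crit - shift)

# Illustrative (assumed) values only.
type_i = two_tailed_z_power(50, 50, 10, n=25)       # H0 true: rejection rate is alpha
for n in (25, 100):
    power = two_tailed_z_power(50, 52, 10, n)
    print(n, round(power, 3), round(1 - power, 3))  # n, power, beta
```

With $\alpha$ held fixed, the larger sample gives higher power, that is, a smaller $\beta$.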
Define correlation and regression. Explain the method of fitting two regression lines and the relationship between correlation coefficient and regression coefficients.
Correlation and Regression
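Correlation measures the degree and direction of the linear association between two variables $X$ and $Y$. It tells whether the variables move together (positive), oppositely (negative), or are unrelated. Karl Pearson's coefficient is
$$r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \sigma_y} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}}$$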
Regression is the statistical method of estimating (predicting) the value of one (dependent) variable from the known value of another (independent) variable by fitting an average mathematical relationship between them.
Fitting the Two Regression Lines
There are two regression lines because either variable may be treated as dependent.
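(a) Regression line of $Y$ on $X$ (used to estimate $Y$ from $X$):
$$y - \bar{y} = b_{yx}(x - \bar{x}), \qquad b_{yx} = r\frac{\sigma_y}{\sigma_x} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$$
The coefficient $b_{yx}$ is obtained by minimising $\sum (y - \hat{y})^2$ (least squares).
(b) Regression line of $X$ on $Y$ (used to estimate $X$ from $Y$):
$$x - \bar{x} = b_{xy}(y - \bar{y}), \qquad b_{xy} = r\frac{\sigma_x}{\sigma_y} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (y - \bar{y})^2}$$
Both lines pass through the mean point $(\bar{x}, \bar{y})$.
Relationship between $r$ and the Regression Coefficients
Multiplying the two coefficients:
$$b_{yx} \cdot b_{xy} = r\frac{\sigma_y}{\sigma_x} \cdot r\frac{\sigma_x}{\sigma_y} = r^2$$
Hence the correlation coefficient is the geometric mean of the two regression coefficients:
$$r = \pm\sqrt{b_{yx} \, b_{xy}}$$
Key points: $r$ takes the same sign as the regression coefficients; since $r^2 \le 1$, both coefficients cannot exceed 1 simultaneously; if one coefficient is $> 1$ the other must be $< 1$.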
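The relation between $r$ and the two regression coefficients can be verified on a small data set. This is a sketch only: the helper `regression_summary` and the five data pairs are assumptions made for illustration.

```python
def regression_summary(xs, ys):
    """Return (x_bar, y_bar, b_yx, b_xy, r) from paired observations."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    b_yx = sxy / sxx                  # slope of the line of Y on X
    b_xy = sxy / syy                  # slope of the line of X on Y
    r = sxy / (sxx * syy) ** 0.5      # Karl Pearson's coefficient
    return x_bar, y_bar, b_yx, b_xy, r

# Illustrative (assumed) data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
x_bar, y_bar, b_yx, b_xy, r = regression_summary(xs, ys)

print("Y on X: y - %.1f = %.2f (x - %.1f)" % (y_bar, b_yx, x_bar))
print("X on Y: x - %.1f = %.2f (y - %.1f)" % (x_bar, b_xy, y_bar))
print("b_yx * b_xy =", b_yx * b_xy, " r^2 =", r ** 2)
```

Both printed lines pass through $(\bar{x}, \bar{y})$ by construction, and the product of the slopes agrees with $r^2$ up to rounding.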
What is sampling? Explain different methods of probability and non-probability sampling with their merits and demerits.
Sampling
Sampling is the process of selecting a representative subset (sample) from a population in order to draw inferences about the whole population, saving time, cost and effort compared with a complete census.
Sampling methods are broadly classified as probability and non-probability sampling.
A. Probability Sampling
Every unit has a known, non-zero chance of selection; results can be generalized with measurable error.
- Simple Random Sampling: every unit has an equal chance of selection (lottery / random numbers).
  - Merits: unbiased, simple, sampling error measurable.
  - Demerits: needs complete frame; units may be geographically scattered.
- Systematic Sampling: select every $k$-th unit after a random start ($k = N/n$).
  - Merits: simple, fast, evenly spread.
  - Demerits: biased if there is a hidden periodicity in the list.
- Stratified Sampling: divide the population into homogeneous strata and sample from each.
  - Merits: high precision, ensures representation of subgroups.
  - Demerits: requires prior knowledge of strata; complex.
- Cluster / Multi-stage Sampling: divide into clusters, randomly select whole clusters (or sample within them in stages).
  - Merits: cheap, no full frame needed, good for wide areas.
  - Demerits: larger sampling error than other probability methods.
B. Non-Probability Sampling
Selection probability is unknown; based on judgement or convenience; generalization is limited.
- Convenience Sampling — units easiest to reach are selected. Merit: quick, cheap. Demerit: highly biased, not representative.
- Judgement (Purposive) Sampling — expert selects typical units. Merit: useful for small specialised studies. Demerit: subjective, bias-prone.
- Quota Sampling — fixed quotas filled for each category. Merit: convenient, ensures group coverage. Demerit: selection within quota is biased.
- Snowball Sampling — existing subjects refer further subjects. Merit: reaches hidden/rare populations. Demerit: strong selection bias.
Summary: Probability sampling gives unbiased, measurable results but is costlier; non-probability sampling is cheaper and faster but prone to bias.
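Three of the probability methods above can be sketched with Python's `random` module. The function names, the population of 100 numbered units, the two strata and the seed are all assumptions made for illustration.

```python
import random

def simple_random_sample(units, n, rng):
    """Every unit has the same chance of selection (without replacement)."""
    return rng.sample(units, n)

def systematic_sample(units, n, rng):
    """Every k-th unit after a random start, with k = N // n."""
    k = len(units) // n
    start = rng.randrange(k)
    return [units[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, rng):
    """Proportional allocation: the same sampling fraction in every stratum."""
    sample = []
    for members in strata.values():
        size = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, size))
    return sample

rng = random.Random(2075)                        # fixed seed so the run is repeatable
units = list(range(1, 101))                      # assumed frame of N = 100 units
strata = {"A": units[:60], "B": units[60:]}      # assumed strata of sizes 60 and 40

srs = simple_random_sample(units, 10, rng)
sys_sample = systematic_sample(units, 10, rng)
strat = stratified_sample(strata, 0.10, rng)
print(srs, sys_sample, strat, sep="\n")
```

The systematic sample is evenly spread through the list, while the stratified sample takes 6 units from stratum A and 4 from stratum B.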
Section B: Short Answer Questions
Attempt any EIGHT questions.
Define mathematical expectation. State and prove its properties.
Mathematical Expectation
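For a random variable $X$, the mathematical expectation (expected value / mean) is
$$E(X) = \sum_x x \, p(x) \ \text{(discrete)}, \qquad E(X) = \int_{-\infty}^{\infty} x \, f(x) \, dx \ \text{(continuous)}$$
provided the sum/integral converges absolutely. It is the long-run average value of $X$.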
Properties (with proofs)
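The proofs are written for the discrete case; the continuous case follows by replacing sums with integrals.
1. Expectation of a constant: $E(c) = c$. Proof: $E(c) = \sum_x c \, p(x) = c \sum_x p(x) = c$, since $\sum_x p(x) = 1$.
2. Linearity (constant multiple): $E(aX) = a \, E(X)$. Proof: $E(aX) = \sum_x a x \, p(x) = a \sum_x x \, p(x) = a \, E(X)$.
3. Addition of a constant: $E(X + b) = E(X) + b$. Proof: $E(X + b) = \sum_x (x + b) \, p(x) = \sum_x x \, p(x) + b \sum_x p(x) = E(X) + b$.
4. Addition theorem: $E(X + Y) = E(X) + E(Y)$ (always true). Proof: $E(X + Y) = \sum_x \sum_y (x + y) \, p(x, y) = \sum_x x \sum_y p(x, y) + \sum_y y \sum_x p(x, y) = \sum_x x \, p_X(x) + \sum_y y \, p_Y(y) = E(X) + E(Y)$.
5. Multiplication theorem (independence): If $X$ and $Y$ are independent, $E(XY) = E(X) \, E(Y)$. Proof: For independence $p(x, y) = p_X(x) \, p_Y(y)$, so $E(XY) = \sum_x \sum_y x y \, p_X(x) \, p_Y(y) = \left(\sum_x x \, p_X(x)\right)\left(\sum_y y \, p_Y(y)\right) = E(X) \, E(Y)$.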
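These properties can be checked exactly on a simple distribution. The sketch below assumes two independent fair dice, which are not part of the question; the helper `expectation` is likewise hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumed example: a fair six-sided die, stored as value -> probability.
die = {face: Fraction(1, 6) for face in range(1, 7)}

def expectation(pmf, g=lambda v: v):
    """E[g(V)] for a discrete distribution given as {value: probability}."""
    return sum(g(v) * p for v, p in pmf.items())

# Joint distribution of two independent dice: p(x, y) = p(x) * p(y).
joint = {(x, y): px * py
         for (x, px), (y, py) in product(die.items(), die.items())}

e_x = expectation(die)
e_linear = expectation(die, lambda x: 3 * x + 2)        # compare with 3 E(X) + 2
e_sum = expectation(joint, lambda xy: xy[0] + xy[1])    # compare with E(X) + E(Y)
e_prod = expectation(joint, lambda xy: xy[0] * xy[1])   # compare with E(X) E(Y)

print(e_x, e_linear, e_sum, e_prod)
```

Exact fractions are used so that each identity holds with equality rather than up to rounding.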
Explain the t-test for testing the significance of the difference between two sample means.
t-test for the Difference between Two Sample Means
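Used for small independent samples ($n_1, n_2 < 30$) drawn from normal populations having a common but unknown variance, to test whether their means differ.
Hypotheses: $H_0: \mu_1 = \mu_2$ vs $H_1: \mu_1 \neq \mu_2$ (or one-tailed).
Test statistic:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
where the pooled estimate of the common variance is
$$S^2 = \frac{\sum (x_{1i} - \bar{x}_1)^2 + \sum (x_{2j} - \bar{x}_2)^2}{n_1 + n_2 - 2}$$
Degrees of freedom: $n_1 + n_2 - 2$.
Decision rule: Compare $|t|$ with the table value $t_{\alpha/2}$ at $n_1 + n_2 - 2$ degrees of freedom.
- If $|t| \le t_{\alpha/2}$: do not reject $H_0$ (means not significantly different).
- If $|t| > t_{\alpha/2}$: reject $H_0$ (means significantly different).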
Assumptions: populations normal, samples independent and random, equal (homogeneous) population variances.
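A minimal sketch of the pooled-variance calculation follows. The two samples are invented for illustration, `pooled_t` is a hypothetical helper, and the critical value 2.571 is the standard two-tailed 5% table value for 5 degrees of freedom.

```python
from math import sqrt

def pooled_t(sample1, sample2):
    """Return (t, degrees of freedom, pooled variance) for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = sum(sample1) / n1, sum(sample2) / n2
    ss1 = sum((x - mean1) ** 2 for x in sample1)   # sum of squares, sample 1
    ss2 = sum((x - mean2) ** 2 for x in sample2)   # sum of squares, sample 2
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df
    t = (mean1 - mean2) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df, pooled_var

# Illustrative (assumed) data.
t, df, pooled_var = pooled_t([10, 12, 14], [7, 9, 11, 13])
t_table = 2.571                                    # table value, 5 d.f., 5% two-tailed
decision = "reject H0" if abs(t) > t_table else "do not reject H0"
print(round(t, 3), df, round(pooled_var, 2), decision)
```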
Explain the z-test for a large sample test of a single mean with an example.
z-test for a Single Mean (Large Sample)
When the sample is large ($n \ge 30$), the sampling distribution of the mean is approximately normal, so a $z$-test tests whether the sample mean $\bar{x}$ differs from a hypothesised population mean $\mu_0$.
Hypotheses: $H_0: \mu = \mu_0$ vs $H_1: \mu \neq \mu_0$.
Test statistic:
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
where $\sigma$ is the population standard deviation (the sample s.d. $s$ is used if $\sigma$ is unknown, valid for large $n$).
Decision rule (5% level, two-tailed): Reject $H_0$ if $|z| > 1.96$ (use $2.58$ at 1%). For one-tailed tests at the 5% level use $1.645$.
Example
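A machine is set to fill packets of mean weight $\mu_0 = 500$ g. A sample of $n$ packets gives a mean of $\bar{x}$ g with standard deviation $s$ g. Test at 5%.
$$z = \frac{\bar{x} - 500}{s / \sqrt{n}}$$
Since $|z| > 1.96$, we reject $H_0$: the mean filling weight differs significantly from 500 g.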
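The calculation can be sketched as follows. Only the hypothesised mean of 500 g comes from the example; the sample size, sample mean and standard deviation used here ($n = 100$, $\bar{x} = 498$, $s = 8$) are assumed values chosen for illustration, not the figures of the original question.

```python
from math import sqrt
from statistics import NormalDist

def z_test_single_mean(x_bar, mu0, sd, n, alpha=0.05):
    """Two-tailed large-sample z-test; returns (z, critical value, reject?)."""
    z = (x_bar - mu0) / (sd / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    return z, z_crit, abs(z) > z_crit

# mu0 = 500 g is from the example; the other three numbers are assumptions.
z, z_crit, reject = z_test_single_mean(x_bar=498, mu0=500, sd=8, n=100)
print(round(z, 2), round(z_crit, 2), "reject H0" if reject else "do not reject H0")
```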
Define Karl Pearson's coefficient of correlation and state its properties.
Karl Pearson's Coefficient of Correlation
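It is a numerical measure of the degree and direction of the linear relationship between two variables $X$ and $Y$, defined as the ratio of their covariance to the product of their standard deviations:
$$r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \sigma_y} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}}$$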
Properties
- Range: $-1 \le r \le +1$. $r = +1$ perfect positive, $r = -1$ perfect negative, $r = 0$ no linear correlation.
- Independent of origin and scale (units): $r$ is unchanged when each variable is transformed by $u = \dfrac{x - a}{h}$, $v = \dfrac{y - b}{k}$ with $h, k > 0$.
- Symmetric: $r_{xy} = r_{yx}$.
- It is the geometric mean of the two regression coefficients: $r = \pm\sqrt{b_{yx} \, b_{xy}}$.
- It is a pure (dimensionless) number, independent of the units of measurement.
- Measures only linear association; $r = 0$ does not imply the variables are unrelated (the relationship could be non-linear).
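The symmetry and the independence of origin and scale can be illustrated numerically. The data and the constants $a = 3$, $h = 2$, $b = 4$, $k = 10$ in this sketch are assumptions, and `pearson_r` is a hypothetical helper.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's coefficient from paired observations."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Illustrative (assumed) data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_xy = pearson_r(xs, ys)
r_yx = pearson_r(ys, xs)                  # symmetry: same value
us = [(x - 3) / 2 for x in xs]            # u = (x - a) / h with a = 3, h = 2
vs = [(y - 4) / 10 for y in ys]           # v = (y - b) / k with b = 4, k = 10
r_uv = pearson_r(us, vs)                  # unchanged by origin and scale

print(r_xy, r_yx, r_uv)
```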
What are regression coefficients? State their properties.
Regression Coefficients
The regression coefficient is the slope of a regression line; it measures the average change in the dependent variable for a unit change in the independent variable.
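- Regression coefficient of $Y$ on $X$: $b_{yx} = r\dfrac{\sigma_y}{\sigma_x} = \dfrac{\operatorname{Cov}(X, Y)}{\sigma_x^2}$
- Regression coefficient of $X$ on $Y$: $b_{xy} = r\dfrac{\sigma_x}{\sigma_y} = \dfrac{\operatorname{Cov}(X, Y)}{\sigma_y^2}$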
Properties
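- Geometric mean relation with $r$: $b_{yx} \cdot b_{xy} = r^2$, i.e. $r = \pm\sqrt{b_{yx} \, b_{xy}}$.
- Same sign: both regression coefficients (and $r$) have the same sign; if both are positive $r$ is positive, if both are negative $r$ is negative.
- Both cannot exceed unity: since $r^2 \le 1$, the product $b_{yx} \, b_{xy} \le 1$, so if one coefficient is greater than 1 in magnitude the other must be less than 1.
- Independent of change of origin but not of scale.
- Arithmetic mean of the two coefficients is greater than or equal to $r$: $\dfrac{b_{yx} + b_{xy}}{2} \ge r$ when the coefficients are positive; in general the inequality holds for the absolute values.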
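The property "independent of change of origin but not of scale" is easy to confirm with a short sketch. The data, the shifts of 100 and 50, and the scale factor 10 are assumptions; `slope_y_on_x` is a hypothetical helper.

```python
def slope_y_on_x(xs, ys):
    """Regression coefficient b_yx = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Illustrative (assumed) data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b = slope_y_on_x(xs, ys)
b_shifted = slope_y_on_x([x + 100 for x in xs], [y - 50 for y in ys])  # new origins
b_scaled = slope_y_on_x([10 * x for x in xs], ys)                      # x in new units

print(b, b_shifted, b_scaled)
```

Shifting the origin leaves the slope unchanged, whereas multiplying $x$ by 10 divides $b_{yx}$ by 10.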
Explain the concept of sampling distribution and standard error.
Sampling Distribution and Standard Error
Sampling distribution: If all possible samples of a fixed size $n$ are drawn from a population and a statistic (e.g. the mean $\bar{x}$, proportion $p$) is computed for each sample, the probability distribution of that statistic over all such samples is called its sampling distribution. For example, the sampling distribution of the mean $\bar{x}$ has its own mean $\mu$ and variance $\sigma^2 / n$, and by the Central Limit Theorem it is approximately normal for large $n$.
Standard error (S.E.): The standard deviation of the sampling distribution of a statistic is called its standard error. It measures the variability of the statistic due to sampling and is used to set confidence limits and construct test statistics. For the sample mean:
$$\text{S.E.}(\bar{x}) = \frac{\sigma}{\sqrt{n}}$$
For a sample proportion: $\text{S.E.}(p) = \sqrt{\dfrac{PQ}{n}}$, where $Q = 1 - P$.
Importance / uses:
- A smaller S.E. means the estimate is more reliable; S.E. decreases as $n$ increases.
- It provides the denominator of test statistics, e.g. $z = \dfrac{\bar{x} - \mu}{\text{S.E.}(\bar{x})}$.
- It is used to construct confidence intervals and to judge the precision of an estimate.
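The definition can be made concrete by listing every possible sample. The sketch below assumes a tiny population of four values and samples of size 2 drawn with replacement, both chosen purely for illustration.

```python
from itertools import product
from statistics import mean, pstdev, pvariance

population = [2, 4, 6, 8]     # assumed population: mu = 5, sigma^2 = 5
n = 2                         # sample size

# All N**n ordered samples drawn with replacement, and the mean of each one.
sample_means = [mean(sample) for sample in product(population, repeat=n)]

mu = mean(population)
sigma2 = pvariance(population)

print("mean of sample means:", mean(sample_means), "population mean:", mu)
print("variance of sample means:", pvariance(sample_means), "sigma^2 / n:", sigma2 / n)
print("standard error:", pstdev(sample_means))
```

The list `sample_means` is the sampling distribution of $\bar{x}$; its standard deviation is the standard error $\sigma / \sqrt{n}$.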
Explain how to construct a confidence interval for a population mean.
Confidence Interval for a Population Mean
A confidence interval gives a range of values, computed from sample data, within which the unknown population mean $\mu$ is expected to lie with a stated probability (confidence level $1 - \alpha$, e.g. 95%).
Construction
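Point estimate: the sample mean $\bar{x}$.
General form: $\text{estimate} \pm (\text{critical value}) \times (\text{standard error})$.
Case 1: Large sample ($n \ge 30$) or $\sigma$ known (use $z$):
$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
For 95% confidence, $z_{\alpha/2} = 1.96$; for 99%, $z_{\alpha/2} = 2.58$.
Case 2: Small sample ($n < 30$), $\sigma$ unknown (use $t$):
$$\bar{x} \pm t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}}$$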
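Example: for a large sample with mean $\bar{x}$ and standard deviation $\sigma$, the 95% interval is $\bar{x} \pm 1.96 \, \dfrac{\sigma}{\sqrt{n}}$.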
Interpretation: We are 95% confident that the true population mean lies between the lower and upper limits; i.e. in repeated sampling, 95% of such intervals would contain $\mu$.
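A sketch of the large-sample (Case 1) interval is given below. The helper `z_interval` and the numbers ($n = 64$, $\bar{x} = 50$, $\sigma = 8$) are assumptions for illustration and do not come from the original example.

```python
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the mean: x_bar +/- z * sigma / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)     # about 1.96 for 95%
    margin = z * sigma / n ** 0.5               # critical value times standard error
    return x_bar - margin, x_bar + margin

# Illustrative (assumed) values: standard error = 8 / sqrt(64) = 1.
low, high = z_interval(x_bar=50, sigma=8, n=64)
low99, high99 = z_interval(x_bar=50, sigma=8, n=64, confidence=0.99)
print(round(low, 2), round(high, 2))
print(round(low99, 2), round(high99, 2))
```

Raising the confidence level from 95% to 99% widens the interval, since a larger critical value multiplies the same standard error.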
Explain the F-test for the equality of two population variances.
F-test for Equality of Two Population Variances
Used to test whether two independent normal populations have equal variances (also the basis of ANOVA and testing the validity of pooling variances in the t-test).
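Hypotheses: $H_0: \sigma_1^2 = \sigma_2^2$ vs $H_1: \sigma_1^2 \neq \sigma_2^2$.
Test statistic: the ratio of the two sample variances, with the larger variance in the numerator:
$$F = \frac{S_1^2}{S_2^2}, \qquad S_1^2 > S_2^2$$
where the unbiased sample variances are
$$S_1^2 = \frac{\sum (x_{1i} - \bar{x}_1)^2}{n_1 - 1}, \qquad S_2^2 = \frac{\sum (x_{2j} - \bar{x}_2)^2}{n_2 - 1}$$
Degrees of freedom: $(\nu_1, \nu_2) = (n_1 - 1, \, n_2 - 1)$, corresponding to numerator and denominator.
Decision rule: Compare the computed $F$ with the table value of $F$ at $(\nu_1, \nu_2)$ degrees of freedom and the chosen level of significance.
- If $F_{\text{cal}} \le F_{\text{tab}}$: do not reject $H_0$; the variances are homogeneous.
- If $F_{\text{cal}} > F_{\text{tab}}$: reject $H_0$; the variances differ significantly.
Assumptions: populations normally distributed; samples independent and random. Since the larger variance is on top, $F \ge 1$ and only the upper tail is used.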
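The computation of the statistic and its degrees of freedom can be sketched as follows. The two samples are invented for illustration and `f_ratio` is a hypothetical helper; the table value still has to be looked up for the resulting degrees of freedom.

```python
from statistics import variance   # unbiased sample variance (divisor n - 1)

def f_ratio(sample1, sample2):
    """Return (F, numerator d.f., denominator d.f.) with the larger variance on top."""
    v1, v2 = variance(sample1), variance(sample2)
    n1, n2 = len(sample1), len(sample2)
    if v1 >= v2:
        return v1 / v2, n1 - 1, n2 - 1
    return v2 / v1, n2 - 1, n1 - 1

# Illustrative (assumed) data: variances 4 and 20/3.
f_value, df_num, df_den = f_ratio([10, 12, 14], [7, 9, 11, 13])
print(round(f_value, 4), (df_num, df_den))
```

Here the second sample has the larger variance, so it supplies the numerator and its $n - 1 = 3$ becomes the numerator degrees of freedom.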
Define index numbers and explain Laspeyres' and Paasche's price index methods.
Index Numbers
An index number is a statistical measure that shows the relative change in the value of a variable (or a group of variables such as prices, quantities or values) over time or place, with respect to a chosen base period taken as 100. It is a specialised average used to study, for example, changes in the cost of living or price level.
Let $p_0, q_0$ be base-year price and quantity and $p_1, q_1$ the current-year price and quantity.
Laspeyres' Price Index (base-year weights)
Uses base-year quantities ($q_0$) as weights:
$$P_{01}^{L} = \frac{\sum p_1 q_0}{\sum p_0 q_0} \times 100$$
It measures the cost of buying base-year quantities at current prices versus base prices. It tends to overstate the price rise (ignores substitution away from costlier goods).
Paasche's Price Index (current-year weights)
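Uses current-year quantities ($q_1$) as weights:
$$P_{01}^{P} = \frac{\sum p_1 q_1}{\sum p_0 q_1} \times 100$$
It tends to understate the price rise. (Fisher's ideal index is the geometric mean of the two: $P_{01}^{F} = \sqrt{P_{01}^{L} \times P_{01}^{P}}$.)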
Key difference: Laspeyres uses base-period weights (data needed only once), whereas Paasche uses current-period weights (must be recollected each period).
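The two formulas differ only in the quantities used as weights, as the following sketch shows. The two-commodity price and quantity figures are assumed for illustration, and the function names are hypothetical.

```python
from math import sqrt

def laspeyres(p0, q0, p1):
    """Base-year quantities as weights: sum(p1*q0) / sum(p0*q0) * 100."""
    return sum(a * b for a, b in zip(p1, q0)) / sum(a * b for a, b in zip(p0, q0)) * 100

def paasche(p0, p1, q1):
    """Current-year quantities as weights: sum(p1*q1) / sum(p0*q1) * 100."""
    return sum(a * b for a, b in zip(p1, q1)) / sum(a * b for a, b in zip(p0, q1)) * 100

# Illustrative (assumed) data for two commodities.
p0, q0 = [10, 20], [5, 2]     # base-year prices and quantities
p1, q1 = [12, 25], [4, 3]     # current-year prices and quantities

L = laspeyres(p0, q0, p1)
P = paasche(p0, p1, q1)
F = sqrt(L * P)               # Fisher's ideal index

print(round(L, 2), round(P, 2), round(F, 2))
```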