BSc CSIT (TU) Science Statistics II (BSc CSIT, STA210) Question Paper 2081 Nepal
This is the official BSc CSIT (TU) (Science stream) Statistics II (BSc CSIT, STA210) question paper for 2081, as set in the regular annual examination. It carries 60 full marks and a time allowance of 180 minutes, across 12 questions. On Kekkei you can attempt this Statistics II (BSc CSIT, STA210) past paper online with a timer, get instant AI feedback and step-by-step solutions, and track the topics where you lose marks — completely free. Whether you are revising for your BSc CSIT (TU) Statistics II (BSc CSIT, STA210) exam or solving previous years' question papers, this 2081 paper is a great way to practise under real exam conditions.
Section A: Long Answer Questions
Attempt any TWO questions.
What is sampling? Explain different methods of probability and non-probability sampling with their merits and demerits.
Sampling
Sampling is the process of selecting a representative subset (the sample) from a larger group (the population) so that conclusions about the whole population can be drawn by studying only the sample. It saves cost, time and effort and is often the only feasible approach when the population is large or testing is destructive.
Sampling methods are broadly classified into probability and non-probability sampling.
A. Probability Sampling
Every unit of the population has a known, non-zero chance of selection. Results can be generalised and sampling error can be estimated.
1. Simple Random Sampling (SRS) Each unit has an equal chance of selection (lottery method or random numbers).
- Merits: unbiased, simple, error measurable.
- Demerits: needs a complete sampling frame; units may be geographically scattered.
2. Stratified Sampling Population is divided into homogeneous strata and a random sample is drawn from each.
- Merits: high precision, ensures representation of every subgroup.
- Demerits: requires prior knowledge of strata; complex to organise.
3. Systematic Sampling Every $k$-th unit is selected after a random start, where $k = N/n$ ($N$ = population size, $n$ = sample size).
- Merits: simple, evenly spread over the frame.
- Demerits: biased if the list has a periodic/hidden pattern.
4. Cluster / Multistage Sampling Population is divided into clusters; some clusters are selected at random and studied (fully or in further stages).
- Merits: economical for wide geographical areas, no full frame needed.
- Demerits: lower precision (high intra-cluster correlation).
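The first three probability methods above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the frame of 100 numbered units, the two strata, the seed and the function names are all hypothetical, and $N$ is assumed to be an exact multiple of $n$ for the systematic draw.

```python
import random

def simple_random_sample(units, n, rng):
    """Draw n units without replacement; every unit is equally likely."""
    return rng.sample(units, n)

def systematic_sample(units, n, rng):
    """Random start within the first k units, then every k-th unit (k = N // n)."""
    k = len(units) // n
    start = rng.randrange(k)
    return [units[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, rng):
    """Draw the same sampling fraction at random from every stratum."""
    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample

rng = random.Random(2081)              # fixed seed so the run is repeatable
population = list(range(1, 101))       # hypothetical frame of 100 numbered units
strata = {"low": population[:60], "high": population[60:]}

srs = simple_random_sample(population, 10, rng)
sys_sample = systematic_sample(population, 10, rng)
strat = stratified_sample(strata, 0.1, rng)
```

With a 10% fraction the stratified draw takes 6 units from the first stratum and 4 from the second, so each subgroup is represented in proportion to its size.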
B. Non-Probability Sampling
Selection is based on judgement/convenience; probability of selection is unknown and sampling error cannot be measured.
1. Convenience Sampling — units that are easiest to reach are chosen. Merit: fast, cheap. Demerit: highly biased, not representative.
2. Judgement (Purposive) Sampling — expert selects typical units. Merit: useful for small/specialised studies. Demerit: depends on investigator's bias.
3. Quota Sampling — units chosen until fixed quotas per category are met. Merit: quick, ensures group coverage. Demerit: selection within quota is biased.
4. Snowball Sampling — existing subjects refer further subjects. Merit: good for rare/hidden populations. Demerit: sample not representative.
Conclusion
Probability sampling allows valid statistical inference and is preferred for scientific surveys, whereas non-probability sampling is cheaper and quicker but cannot guarantee representativeness.
What is analysis of variance (ANOVA)? Explain the procedure of one-way ANOVA with the construction of the ANOVA table.
Analysis of Variance (ANOVA)
ANOVA is a statistical technique (developed by R. A. Fisher) used to test whether the means of three or more populations are equal by partitioning the total variation in the data into components attributable to different sources. It compares the variance between groups with the variance within groups using the F-test.
Assumptions: observations are independent, drawn from normal populations, and the populations have equal variances (homoscedasticity).
One-Way ANOVA Procedure
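Used when one factor (with $k$ levels/treatments) classifies the data.
Step 1: Hypotheses
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$ (all treatment means are equal) against $H_1$: at least two of the means differ.
Step 2: Compute totals. Let $T$ = grand total of all $N$ observations and $CF = T^2/N$ (correction factor).
Step 3: Sum of squares
$$TSS = \sum_i \sum_j x_{ij}^2 - CF, \qquad SSB = \sum_i \frac{T_i^2}{n_i} - CF, \qquad SSW = TSS - SSB$$
where $T_i$ and $n_i$ are the total and size of the $i$-th group.
Step 4: Degrees of freedom: Between = $k - 1$, Within = $N - k$, Total = $N - 1$.
Step 5: Mean squares and F-ratio
$$MSB = \frac{SSB}{k - 1}, \qquad MSW = \frac{SSW}{N - k}, \qquad F = \frac{MSB}{MSW}$$
ANOVA Table
| Source of Variation | Sum of Squares | d.f. | Mean Square | F-ratio |
|---|---|---|---|---|
| Between treatments | $SSB$ | $k - 1$ | $MSB = SSB/(k - 1)$ | $F = MSB/MSW$ |
| Within (Error) | $SSW$ | $N - k$ | $MSW = SSW/(N - k)$ | |
| Total | $TSS$ | $N - 1$ | | |
Step 6: Decision. Compare the calculated $F$ with the table value $F_{\alpha}(k - 1, N - k)$. If $F_{cal} > F_{tab}$, reject $H_0$ and conclude that the treatment means differ significantly; otherwise do not reject $H_0$.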
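The computational steps above can be traced with a short Python sketch. The data are hypothetical, and the function name `one_way_anova` is an illustrative choice; it uses the correction factor $T^2/N$, the between-treatment sum of squares $\sum T_i^2/n_i - CF$, and the F-ratio of the two mean squares.

```python
def one_way_anova(groups):
    """Return (SSB, SSW, df_between, df_within, F) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_total = sum(sum(g) for g in groups)
    cf = grand_total ** 2 / n_total                        # correction factor T^2 / N
    tss = sum(x * x for g in groups for x in g) - cf       # total sum of squares
    ssb = sum(sum(g) ** 2 / len(g) for g in groups) - cf   # between treatments
    ssw = tss - ssb                                        # within (error)
    df_b, df_w = k - 1, n_total - k
    f_ratio = (ssb / df_b) / (ssw / df_w)                  # MSB / MSW
    return ssb, ssw, df_b, df_w, f_ratio

# Hypothetical observations under three treatments (illustrative data only).
groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]
ssb, ssw, df_b, df_w, f_ratio = one_way_anova(groups)
```

For these data $SSB = 38$, $SSW = 6$ and $F = 19$ on $(2, 6)$ degrees of freedom. The 5% table value for $(2, 6)$ d.f. is about 5.14, so the hypothesis of equal means would be rejected.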
Explain the theory of estimation. Differentiate between point estimation and interval estimation and explain the properties of a good estimator.
Theory of Estimation
Estimation is the branch of statistical inference concerned with using sample data to estimate the unknown value of a population parameter (such as $\mu$, $\sigma^2$ or $P$). The sample statistic used for this purpose is called an estimator and a particular numerical value of it is an estimate. Estimation is of two types: point estimation and interval estimation.
Point Estimation vs Interval Estimation
| Basis | Point Estimation | Interval Estimation |
|---|---|---|
| Meaning | Gives a single value as the estimate of the parameter | Gives a range (interval) within which the parameter is expected to lie |
| Example | $\bar{x}$ estimates $\mu$; $s^2$ estimates $\sigma^2$ | $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$ for $\mu$ |
| Reliability | Does not indicate the error/precision | Carries a confidence level (e.g. 95%) and shows precision |
| Form | Number | Interval |
A point estimate by itself tells nothing about how close it is to the true value, whereas an interval estimate quantifies the uncertainty.
Properties of a Good Estimator
1. Unbiasedness: The expected value of the estimator equals the parameter: $E(\hat{\theta}) = \theta$. (e.g. $E(\bar{x}) = \mu$.)
2. Consistency: As the sample size $n \to \infty$, the estimator converges in probability to the true parameter: $\hat{\theta} \to \theta$.
3. Efficiency: Among all unbiased estimators, the efficient one has the smallest variance, giving the most precise estimate.
4. Sufficiency: The estimator uses all the information in the sample about the parameter, so no other statistic can add more information.
An estimator that is unbiased, consistent, efficient and sufficient is regarded as the best (ideal) estimator.
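Unbiasedness can be checked exactly on a toy case by listing every possible sample. The sketch below assumes a hypothetical population of three values and samples of size 2 drawn with replacement; it compares the sample variance with divisor $n - 1$ against the version with divisor $n$.

```python
from itertools import product
from statistics import mean

population = [2, 4, 6]                               # hypothetical tiny population
mu = mean(population)
sigma2 = mean((x - mu) ** 2 for x in population)     # population variance

def sample_variance(sample, divisor):
    """Sum of squared deviations from the sample mean, over the given divisor."""
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample) / divisor

n = 2
samples = list(product(population, repeat=n))        # all 9 equally likely samples

expected_mean = mean(mean(s) for s in samples)
expected_s2_unbiased = mean(sample_variance(s, n - 1) for s in samples)
expected_s2_biased = mean(sample_variance(s, n) for s in samples)
```

Averaged over all samples, the sample mean equals $\mu = 4$ and the divisor-$(n - 1)$ variance equals $\sigma^2 = 8/3$, while the divisor-$n$ version averages only $\sigma^2 (n - 1)/n = 4/3$, which is why it is called biased.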
Section B: Short Answer Questions
Attempt any EIGHT questions.
Define Karl Pearson's coefficient of correlation and state its properties.
Karl Pearson's Coefficient of Correlation
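Karl Pearson's coefficient of correlation, denoted $r$, is a numerical measure of the degree and direction of linear relationship between two quantitative variables $X$ and $Y$. It is defined as the ratio of the covariance of the variables to the product of their standard deviations:
$$r = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$$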
Properties
- Range: $-1 \le r \le +1$. $r = +1$ means perfect positive, $r = -1$ perfect negative, and $r = 0$ no linear correlation.
- Sign: indicates the direction: positive ($X$ and $Y$ move together), negative (they move oppositely).
- Independent of origin and scale (unit-free): $r$ is unchanged if a constant is added to each value or each value is multiplied by a positive constant.
- Symmetric: $r_{xy} = r_{yx}$.
- Geometric mean of regression coefficients: $r = \pm\sqrt{b_{yx} \cdot b_{xy}}$, with the sign of the regression coefficients.
- It measures only linear association; $r = 0$ does not imply the variables are unrelated (a non-linear relation may exist).
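A short Python sketch makes several of these properties concrete. The data and the function name `pearson_r` are hypothetical; the function applies the deviation form of the formula, $r = \sum (x - \bar{x})(y - \bar{y}) / \sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}$.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's r from deviations about the two means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]                                  # hypothetical paired data
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_shifted = pearson_r([10 * v + 3 for v in x], y)    # change of origin and scale
r_swapped = pearson_r(y, x)                          # symmetry check
r_nonlinear = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])   # y = x^2
```

Here `r` is about 0.775 and is unchanged by the shift, the rescaling and the swap. The last pair follows $y = x^2$ exactly, yet its coefficient is 0, which illustrates that $r$ captures only linear association.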
What are regression coefficients? State their properties.
Regression Coefficients
In the two linear regression equations of $Y$ on $X$ and $X$ on $Y$, the slopes are called the regression coefficients. They measure the average change in one variable for a unit change in the other.
- Regression coefficient of $Y$ on $X$: $b_{yx} = r\,\dfrac{\sigma_y}{\sigma_x} = \dfrac{\mathrm{Cov}(X, Y)}{\sigma_x^2}$
- Regression coefficient of $X$ on $Y$: $b_{xy} = r\,\dfrac{\sigma_x}{\sigma_y} = \dfrac{\mathrm{Cov}(X, Y)}{\sigma_y^2}$
Properties
- Geometric mean gives correlation: $r = \pm\sqrt{b_{yx} \cdot b_{xy}}$, the sign being that of the regression coefficients.
- Same sign: both $b_{yx}$ and $b_{xy}$ have the same sign, which is also the sign of $r$.
- Product $\le 1$: since $b_{yx} \cdot b_{xy} = r^2 \le 1$, the product of the two regression coefficients cannot exceed unity; hence if one is greater than 1 the other must be less than 1.
- Independent of origin but not of scale: adding a constant does not change them, but changing the scale does.
- The arithmetic mean of the two regression coefficients is greater than or equal to the correlation coefficient: $\frac{1}{2}(b_{yx} + b_{xy}) \ge r$ (when $r > 0$).
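The properties listed above can be verified numerically. The sketch below is illustrative only: the paired data and the function name `regression_coefficients` are hypothetical, and each slope is computed as the sum of cross-deviations divided by the sum of squared deviations of the explanatory variable.

```python
from math import sqrt

def regression_coefficients(xs, ys):
    """Return (b_yx, b_xy): the slope of Y on X and the slope of X on Y."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sxx, sxy / syy

x = [1, 2, 3, 4, 5]                      # hypothetical paired data
y = [2, 4, 5, 4, 5]

b_yx, b_xy = regression_coefficients(x, y)
r = sqrt(b_yx * b_xy)                    # positive root: both slopes are positive
arithmetic_mean = (b_yx + b_xy) / 2
```

For these data the two slopes are 0.6 and 1.0, their product 0.6 does not exceed 1, the geometric mean gives $r \approx 0.775$, and the arithmetic mean 0.8 is at least as large as $r$.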
Explain the concept of sampling distribution and standard error.
Sampling Distribution
If all possible samples of a fixed size $n$ are drawn from a population and a statistic (e.g. the sample mean $\bar{x}$) is computed for each, the probability distribution of that statistic over all such samples is called its sampling distribution. For example, the sampling distribution of the mean has its own mean (equal to the population mean $\mu$) and describes how sample means vary from sample to sample. It forms the theoretical basis of statistical inference (estimation and testing of hypotheses).
Standard Error
The standard error (S.E.) is the standard deviation of the sampling distribution of a statistic. It measures the magnitude of the sampling fluctuation, i.e. how much the statistic is expected to vary from the true parameter due to chance.
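For the sample mean (population s.d. $\sigma$, sample size $n$):
$$S.E.(\bar{x}) = \frac{\sigma}{\sqrt{n}}$$
For a sample proportion $p$: $S.E.(p) = \sqrt{\dfrac{PQ}{n}}$, where $P$ is the population proportion and $Q = 1 - P$.
Uses of standard error:
- Smaller S.E. means a more precise/reliable estimate; it decreases as $n$ increases.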
- It is used to set up confidence intervals and to compute test statistics in hypothesis testing.
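The link between the sampling distribution and the standard error can be seen by enumerating every sample from a small population. The sketch below assumes a hypothetical population of six values and samples of size 2 drawn with replacement, and compares the standard deviation of all sample means with the formula $\sigma/\sqrt{n}$.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [1, 2, 3, 4, 5, 6]          # hypothetical population
n = 2

# Sampling distribution of the mean: one mean per possible sample (36 in all).
sample_means = [sum(s) / n for s in product(population, repeat=n)]

mean_of_means = mean(sample_means)
se_enumerated = pstdev(sample_means)      # s.d. of the sampling distribution
se_formula = pstdev(population) / sqrt(n) # sigma / sqrt(n)
```

The two standard errors agree (about 1.208), and the mean of the sample means equals the population mean 3.5.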
Explain how to construct a confidence interval for a population mean.
Confidence Interval for a Population Mean
A confidence interval (CI) is a range of values, computed from a sample, that is expected to contain the true population mean $\mu$ with a stated probability called the confidence level (e.g. 95%).
Case 1: Population variance $\sigma^2$ known (or large sample, $n \ge 30$). The statistic $Z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ follows the standard normal distribution, so the CI is
$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
where $z_{\alpha/2} = 1.96$ for 95% and $2.58$ for 99%. (If $\sigma$ is unknown in a large sample, replace it by the sample s.d. $s$.)
Case 2: $\sigma$ unknown and small sample ($n < 30$). Use the $t$-distribution with $n - 1$ degrees of freedom:
$$\bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$$
Steps: (1) compute $\bar{x}$ and the standard error; (2) choose the confidence level and find $z_{\alpha/2}$ or $t_{\alpha/2,\,n-1}$; (3) form the limits $\bar{x} \pm (\text{critical value}) \times S.E.$
Interpretation: we are 95% confident that the interval contains $\mu$; over repeated sampling, 95% of such intervals would capture the true mean.
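For the known-variance case, the interval can be computed with Python's standard library. This is a minimal sketch: the function name `z_interval` and the input figures (sample mean 50, population s.d. 10, sample size 100) are hypothetical, and the critical value is taken from `statistics.NormalDist` rather than a printed table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, level=0.95):
    """Confidence limits xbar -/+ z * sigma / sqrt(n) for a population mean."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # about 1.96 for 95%
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

low, high = z_interval(xbar=50, sigma=10, n=100)
low99, high99 = z_interval(xbar=50, sigma=10, n=100, level=0.99)
```

The 95% limits come out at roughly 48.04 and 51.96, and the 99% interval is wider. For the small-sample case the same structure applies with the critical value replaced by the tabulated $t$ value on $n - 1$ degrees of freedom and $\sigma$ replaced by $s$.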
Explain the F-test for the equality of two population variances.
F-Test for Equality of Two Population Variances
The F-test is used to test whether two independent normal populations have equal variances. It compares the ratio of two sample variances.
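Step 1: Hypotheses
$H_0: \sigma_1^2 = \sigma_2^2$ (the population variances are equal) against $H_1: \sigma_1^2 \ne \sigma_2^2$.
Step 2: Test statistic. From samples of sizes $n_1, n_2$ with unbiased sample variances $S_1^2$ and $S_2^2$, where $S_i^2 = \dfrac{\sum (x_i - \bar{x}_i)^2}{n_i - 1}$, the statistic is
$$F = \frac{S_1^2}{S_2^2}, \qquad S_1^2 > S_2^2$$
The larger variance is always placed in the numerator so that $F \ge 1$.
Step 3: Degrees of freedom: numerator $\nu_1 = n_1 - 1$, denominator $\nu_2 = n_2 - 1$, where sample 1 is the one with the larger variance.
Step 4: Decision. Compare $F_{cal}$ with the table value $F_{\alpha}(\nu_1, \nu_2)$.
- If $F_{cal} > F_{tab}$: reject $H_0$; the population variances differ significantly.
- If $F_{cal} \le F_{tab}$: do not reject $H_0$; the variances may be regarded as equal.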
Assumptions: both samples are random and independent and drawn from normal populations. (This equality-of-variance test is also a prerequisite for the t-test of two means.)
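The test statistic and its degrees of freedom can be computed as follows. The two samples and the function name `f_statistic` are hypothetical; `statistics.variance` uses the divisor $n - 1$, so it gives the unbiased sample variance the test requires.

```python
from statistics import variance

def f_statistic(sample1, sample2):
    """Return (F, df_numerator, df_denominator) with the larger variance on top."""
    v1, v2 = variance(sample1), variance(sample2)   # unbiased, divisor n - 1
    if v1 >= v2:
        return v1 / v2, len(sample1) - 1, len(sample2) - 1
    return v2 / v1, len(sample2) - 1, len(sample1) - 1

a = [10, 12, 14, 16, 18]                 # hypothetical sample 1
b = [11, 12, 13, 14, 15, 16, 17]         # hypothetical sample 2

f_cal, df_num, df_den = f_statistic(a, b)
```

Here the sample variances are 10 and about 4.67, so $F \approx 2.14$ on $(4, 6)$ degrees of freedom. The 5% table value for $(4, 6)$ d.f. is about 4.53, so equality of variances would not be rejected for these data.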
Define index numbers and explain Laspeyres' and Paasche's price index methods.
Index Numbers
An index number is a statistical measure that expresses the relative change in the level of a variable (price, quantity, value, production, etc.) over time or place, compared with a chosen base period taken as 100. Price index numbers are widely used to measure inflation and changes in the cost of living.
Notation: $p_0, q_0$ = price and quantity in the base year; $p_1, q_1$ = price and quantity in the current year.
Laspeyres' Price Index
Uses base-year quantities as weights:
$$P_{01}^{L} = \frac{\sum p_1 q_0}{\sum p_0 q_0} \times 100$$
Merit: requires only one set of weights (base-year), so easy to compute over time. Demerit: tends to overstate the price rise (consumers shift away from costlier goods).
Paasche's Price Index
Uses current-year quantities as weights:
$$P_{01}^{P} = \frac{\sum p_1 q_1}{\sum p_0 q_1} \times 100$$
Merit: reflects the current consumption pattern. Demerit: current-year weights must be collected every period (costly); tends to understate the price rise.
Note: Fisher's ideal index is the geometric mean of the two: $P_{01}^{F} = \sqrt{P_{01}^{L} \times P_{01}^{P}}$.
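A small worked computation shows how the two weighting schemes differ. The prices and quantities below are hypothetical figures for three commodities, and the function names are illustrative; Laspeyres' index weights both price vectors by base-year quantities, Paasche's by current-year quantities.

```python
from math import sqrt

def laspeyres(p0, p1, q0):
    """Sum(p1*q0) / Sum(p0*q0) * 100, using base-year quantities as weights."""
    return sum(p * q for p, q in zip(p1, q0)) / sum(p * q for p, q in zip(p0, q0)) * 100

def paasche(p0, p1, q1):
    """Sum(p1*q1) / Sum(p0*q1) * 100, using current-year quantities as weights."""
    return sum(p * q for p, q in zip(p1, q1)) / sum(p * q for p, q in zip(p0, q1)) * 100

# Hypothetical data for three commodities.
p0, q0 = [10, 20, 5], [5, 2, 10]         # base-year prices and quantities
p1, q1 = [12, 25, 6], [4, 1, 9]          # current-year prices and quantities

L = laspeyres(p0, p1, q0)
P = paasche(p0, p1, q1)
F = sqrt(L * P)                          # Fisher's ideal index
```

For these figures Laspeyres' index is about 121.4 and Paasche's about 121.0, with Fisher's index lying between them. The second commodity has the largest price rise and its quantity falls, which is why the base-weighted index comes out higher here.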
State and explain the addition and multiplication theorems of probability with examples.
Addition and Multiplication Theorems of Probability
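1. Addition Theorem (probability of $A$ OR $B$)
For any two events $A$ and $B$:
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
If $A$ and $B$ are mutually exclusive (cannot occur together, $P(A \cap B) = 0$):
$$P(A \cup B) = P(A) + P(B)$$
Example: Drawing one card from a pack, $P(\text{King or Queen}) = \frac{4}{52} + \frac{4}{52} = \frac{2}{13}$ (mutually exclusive). For a King or a Heart (not exclusive): $P = \frac{4}{52} + \frac{13}{52} - \frac{1}{52} = \frac{16}{52} = \frac{4}{13}$.
2. Multiplication Theorem (probability of $A$ AND $B$)
For any two events:
$$P(A \cap B) = P(A) \cdot P(B \mid A)$$
where $P(B \mid A)$ is the conditional probability of $B$ given $A$. If $A$ and $B$ are independent, $P(B \mid A) = P(B)$, so:
$$P(A \cap B) = P(A) \cdot P(B)$$
Example: Two cards drawn one after another without replacement; probability both are aces:
$$P = \frac{4}{52} \times \frac{3}{51} = \frac{1}{221}$$
If tossing two fair coins (independent), $P(\text{both heads}) = \frac{1}{2} \times \frac{1}{2} = \frac{1}{4}$.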
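Both theorems can be checked by counting on a standard 52-card pack with exact fractions. This is an illustrative sketch; the helper `prob` and the rank and suit labels are hypothetical names chosen for the example.

```python
from fractions import Fraction

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]

def prob(event):
    """Classical probability: favourable cards over total cards."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_king(card):
    return card[0] == "K"

def is_heart(card):
    return card[1] == "hearts"

# Addition theorem for events that are not mutually exclusive.
p_addition = prob(is_king) + prob(is_heart) - prob(lambda c: is_king(c) and is_heart(c))
p_direct = prob(lambda c: is_king(c) or is_heart(c))

# Multiplication theorem: two aces in succession without replacement.
p_two_aces = Fraction(4, 52) * Fraction(3, 51)
```

The addition theorem and the direct count both give $4/13$ for a King or a Heart, and the conditional product gives $1/221$ for two aces.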
Explain the Poisson distribution with its mean and variance and state its applications.
Poisson Distribution
The Poisson distribution is a discrete probability distribution that gives the probability of a given number of independent events occurring in a fixed interval of time, space or volume, when these events occur with a known constant average rate and rarely (large $n$, small $p$). The probability mass function is
$$P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \qquad x = 0, 1, 2, \dots$$
where $\lambda$ is the average number of occurrences. It is the limiting form of the binomial distribution as $n \to \infty$, $p \to 0$ with $np = \lambda$ finite.
Mean and Variance
A distinctive feature is that the mean and variance are equal:
$$\text{Mean} = E(X) = \lambda, \qquad \text{Variance} = V(X) = \lambda$$
Applications
- Number of telephone calls arriving at an exchange per minute.
- Number of printing/typing errors per page of a book.
- Number of accidents on a highway or defects per unit length of cloth/wire.
- Number of radioactive decays per second; arrivals in queueing (servers, network packets).
- Number of customers arriving at a counter per hour.
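The equality of mean and variance can be confirmed numerically from the probability mass function $P(X = x) = e^{-\lambda} \lambda^x / x!$. The sketch below assumes a hypothetical rate of $\lambda = 3$ (say, three calls per minute) and truncates the infinite sum at 60 terms, beyond which the probabilities are negligible for this rate.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with mean lam."""
    return exp(-lam) * lam ** x / factorial(x)

lam = 3.0                                 # hypothetical average rate
support = range(60)                       # truncated support; the tail is negligible
probs = [poisson_pmf(x, lam) for x in support]

total = sum(probs)
mean_x = sum(x * p for x, p in zip(support, probs))
var_x = sum((x - mean_x) ** 2 * p for x, p in zip(support, probs))
p_no_event = poisson_pmf(0, lam)          # probability of no occurrence, e^{-3}
```

The probabilities sum to 1 and both the mean and the variance come out equal to $\lambda = 3$, up to rounding error.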
Define a random variable. Differentiate between discrete and continuous random variables with examples.
Random Variable
A random variable is a real-valued function that assigns a numerical value to each outcome of a random experiment (sample space). It is usually denoted by capital letters $X, Y, Z$. For example, in tossing two coins, the number of heads $X$ can take the values $0, 1, 2$.
Random variables are of two types: discrete and continuous.
Discrete vs Continuous Random Variables
| Basis | Discrete Random Variable | Continuous Random Variable |
|---|---|---|
| Values taken | Countable / isolated values (finite or countably infinite) | Any value within an interval (uncountable) |
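| Probability described by | Probability mass function $p(x) = P(X = x)$ | Probability density function $f(x)$ |
| Probability of a single point | Can be non-zero | Always zero; only $P(a \le X \le b)$ is meaningful |
| Summation/Integration | $\sum_x p(x) = 1$ | $\int_{-\infty}^{\infty} f(x)\,dx = 1$ |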
| Examples | Number of heads in coin tosses, number of defective items, number of calls | Height, weight, temperature, time taken |
Examples:
- Discrete: the number of students present in a class (0, 1, 2, …).
- Continuous: the height of a student (e.g. any value between 150 cm and 180 cm).
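The contrast between the two types can be illustrated in Python. The discrete part enumerates the two-coin experiment exactly; the continuous part assumes, purely for illustration, that height is uniformly distributed between 150 cm and 180 cm, so that probabilities are areas under a flat density. The function name `uniform_prob` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Discrete: X = number of heads in two tosses of a fair coin.
outcomes = list(product("HT", repeat=2))             # HH, HT, TH, TT
counts = Counter(outcome.count("H") for outcome in outcomes)
pmf = {x: Fraction(c, len(outcomes)) for x, c in counts.items()}

# Continuous: X uniform on [150, 180] (an assumed model, not real height data).
def uniform_prob(a, b, low=150.0, high=180.0):
    """P(a <= X <= b) as the area under a flat density on [low, high]."""
    a, b = max(a, low), min(b, high)
    return max(0.0, b - a) / (high - low)

p_interval = uniform_prob(160, 170)
p_single_point = uniform_prob(165, 165)
```

The probability mass function assigns $1/4$, $1/2$ and $1/4$ to the values 0, 1 and 2 and sums to 1, whereas in the continuous model an interval has positive probability ($1/3$ here) and any single point has probability zero.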