BSc CSIT (TU) Science Statistics II (BSc CSIT, STA210) Question Paper 2077 Nepal
This is the official BSc CSIT (TU) (Science stream) Statistics II (BSc CSIT, STA210) question paper for 2077, as set in the regular annual examination. It carries 60 full marks and a time allowance of 180 minutes, across 12 questions. On Kekkei you can attempt this Statistics II (BSc CSIT, STA210) past paper online with a timer, get instant AI feedback and step-by-step solutions, and track the topics where you lose marks — completely free. Whether you are revising for your BSc CSIT (TU) Statistics II (BSc CSIT, STA210) exam or solving previous years' question papers, this 2077 paper is a great way to practise under real exam conditions.
Section A: Long Answer Questions
Attempt any TWO questions.
What is sampling? Explain different methods of probability and non-probability sampling with their merits and demerits.
Sampling
Sampling is the statistical process of selecting a subset (a sample) of individuals or items from a larger group (the population) in order to estimate characteristics of the whole population. It is used because studying the entire population (a census) is often costly, time-consuming, or practically impossible.
Sampling methods are broadly divided into probability and non-probability sampling.
A. Probability Sampling
Every unit of the population has a known, non-zero chance of selection. Results can be generalized and sampling error can be estimated.
1. Simple Random Sampling
Every unit has an equal chance of selection (lottery method or random numbers).
- Merits: Unbiased; easy to analyse; sampling error measurable.
- Demerits: Needs a complete sampling frame; may not represent small subgroups; expensive for widely scattered populations.
2. Stratified Random Sampling
Population divided into homogeneous strata, then random samples drawn from each.
- Merits: Greater precision; ensures representation of every subgroup.
- Demerits: Requires prior knowledge of strata; complex; faulty stratification reduces efficiency.
3. Systematic Sampling
Every $k$-th unit is selected after a random start, where $k = N/n$ is the sampling interval ($N$ = population size, $n$ = sample size).
- Merits: Simple, quick, evenly spread over the frame.
- Demerits: Biased if the list has a hidden periodic pattern.
4. Cluster / Multistage Sampling
Population divided into clusters; some clusters are selected and all (or sampled) units within them studied.
- Merits: Economical; no full frame of units needed; suited to geographically spread populations.
- Demerits: Higher sampling error; less precise than other methods.
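The first three probability methods above can be sketched in a few lines of Python. This is only an illustration: the frame of 100 unit labels, the two strata names, the fixed seed and the 10% sampling fraction are all assumptions made for the example, not part of the question.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Draw n units without replacement; every unit is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=0):
    """Take every k-th unit after a random start (assumes N is divisible by n)."""
    k = len(population) // n                  # sampling interval k = N / n
    start = random.Random(seed).randrange(k)  # random start among the first k units
    return population[start::k][:n]

def stratified_sample(strata, fraction, seed=0):
    """Draw the same fraction at random from every stratum."""
    rng = random.Random(seed)
    sample = []
    for units in strata.values():
        size = max(1, round(len(units) * fraction))
        sample.extend(rng.sample(units, size))
    return sample

# Hypothetical sampling frame: unit labels 1..100, split into two strata.
population = list(range(1, 101))
strata = {"urban": population[:60], "rural": population[60:]}

srs = simple_random_sample(population, 10)
sys_sample = systematic_sample(population, 10)
strat = stratified_sample(strata, 0.10)
print(srs, sys_sample, strat, sep="\n")
```

With these assumed inputs the systematic sample is evenly spaced 10 units apart, and the stratified sample always contains 6 urban and 4 rural units, which is the guaranteed subgroup representation mentioned among the merits.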
B. Non-Probability Sampling
Units are selected on a non-random basis; probability of selection is unknown, so sampling error cannot be measured.
1. Convenience Sampling
Units chosen because they are easy to reach.
- Merits: Fast and cheap.
- Demerits: Highly biased, not generalizable.
2. Judgement (Purposive) Sampling
Expert chooses units believed to be representative.
- Merits: Useful for small/specialized studies.
- Demerits: Subjective; depends on investigator's judgement.
3. Quota Sampling
Units selected to fill fixed quotas for sub-groups.
- Merits: Quick; ensures representation of groups.
- Demerits: Selection within quota is biased.
4. Snowball Sampling
Existing respondents recruit further respondents.
- Merits: Good for hidden/rare populations.
- Demerits: Strong selection bias.
Conclusion
Probability sampling is preferred when accuracy and generalization are required, whereas non-probability sampling is used when speed, cost, or accessibility dominate.
What is analysis of variance (ANOVA)? Explain the procedure of one-way ANOVA with the construction of the ANOVA table.
Analysis of Variance (ANOVA)
ANOVA, developed by R. A. Fisher, is a technique used to test the equality of means of three or more populations simultaneously by partitioning the total variation in the data into components attributable to different sources. It compares the variance between groups with the variance within groups using the $F$-statistic.
Assumptions: observations are independent, drawn from normal populations, and the populations have equal variances (homogeneity).
One-Way ANOVA Procedure
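A single factor with $k$ treatments (groups) and $N$ total observations is studied. Let $x_{ij}$ denote the $j$-th observation in the $i$-th group.
Step 1: Hypotheses
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k \qquad \text{against} \qquad H_1: \text{at least two of the means differ}$$
Step 2: Grand total and correction factor
$$G = \sum_i \sum_j x_{ij}, \qquad CF = \frac{G^2}{N}$$
Step 3: Sum of squares
$$TSS = \sum_i \sum_j x_{ij}^2 - CF, \qquad SSB = \sum_i \frac{T_i^2}{n_i} - CF, \qquad SSE = TSS - SSB$$
where $T_i$ is the total of the $i$-th group having $n_i$ observations.
Step 4: Degrees of freedom. Between $k-1$, Error $N-k$, Total $N-1$.
Step 5: Mean squares and F-ratio
$$MSB = \frac{SSB}{k-1}, \qquad MSE = \frac{SSE}{N-k}, \qquad F = \frac{MSB}{MSE}$$
ANOVA Table
| Source of Variation | SS | d.f. | Mean Square | F-ratio |
|---|---|---|---|---|
| Between treatments | $SSB$ | $k-1$ | $MSB = SSB/(k-1)$ | $F = MSB/MSE$ |
| Within (Error) | $SSE$ | $N-k$ | $MSE = SSE/(N-k)$ | |
| Total | $TSS$ | $N-1$ | | |
Step 6: Decision. Compare the calculated $F$ with the table value $F_{\alpha;\,(k-1,\,N-k)}$. If $F_{\text{cal}} > F_{\text{tab}}$, reject $H_0$ and conclude that the treatment means differ significantly.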
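As a rough check of the procedure, the correction-factor method can be coded directly. The three groups below are made-up numbers chosen so the arithmetic is easy to follow by hand. The names `SSB`, `SSE` and `TSS` are labels assumed here for the between, error and total sums of squares.

```python
def one_way_anova(groups):
    """One-way ANOVA quantities via the correction-factor method."""
    k = len(groups)                                    # number of treatments
    N = sum(len(g) for g in groups)                    # total observations
    G = sum(sum(g) for g in groups)                    # grand total
    CF = G ** 2 / N                                    # correction factor
    TSS = sum(x ** 2 for g in groups for x in g) - CF  # total sum of squares
    SSB = sum(sum(g) ** 2 / len(g) for g in groups) - CF  # between treatments
    SSE = TSS - SSB                                    # within (error)
    MSB = SSB / (k - 1)
    MSE = SSE / (N - k)
    return {"SSB": SSB, "SSE": SSE, "TSS": TSS,
            "df": (k - 1, N - k), "MSB": MSB, "MSE": MSE, "F": MSB / MSE}

# Made-up data: three treatments with three observations each.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
table = one_way_anova(groups)
print(table)
```

For this data the group totals are 6, 9 and 18, so $CF = 33^2/9 = 121$, $TSS = 32$, $SSB = 26$, $SSE = 6$ and $F = 13/1 = 13$ on $(2, 6)$ degrees of freedom. The standard 5% table value for $(2, 6)$ d.f. is 5.14, so $H_0$ would be rejected for these illustrative numbers.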
Explain the theory of estimation. Differentiate between point estimation and interval estimation and explain the properties of a good estimator.
Theory of Estimation
Estimation is the branch of statistical inference concerned with using sample data to assign numerical values (estimates) to the unknown parameters of a population (e.g., mean $\mu$, variance $\sigma^2$, proportion $P$). A sample statistic used for this purpose is called an estimator, and a particular numerical value it takes is an estimate.
Estimation is of two types: point estimation and interval estimation.
Point Estimation vs Interval Estimation
| Basis | Point Estimation | Interval Estimation |
|---|---|---|
| Result | A single value as the estimate of the parameter | A range (interval) within which the parameter is expected to lie |
| Example | $\bar{x}$ estimates $\mu$ | $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ as an interval for $\mu$ |
| Probability statement | No measure of reliability attached | Attached with a confidence level (e.g., 95%) |
| Error | Probability of being exactly correct is essentially zero | Accounts for sampling error via the confidence coefficient |
| Information given | Less informative | More informative and realistic |
In interval estimation the interval is called a confidence interval and the probability that it contains the parameter is the confidence coefficient.
Properties of a Good Estimator
- Unbiasedness: The expected value of the estimator equals the parameter, $E(\hat{\theta}) = \theta$. (e.g., $E(\bar{x}) = \mu$.)
- Consistency: As the sample size $n \to \infty$, the estimator converges in probability to the true parameter value.
- Efficiency: Among unbiased estimators, the one with the smallest variance is the most efficient.
- Sufficiency: A sufficient estimator uses all the information in the sample relevant to the parameter, leaving nothing more to be gained from the data.
An ideal estimator is unbiased, consistent, efficient, and sufficient.
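Unbiasedness can be illustrated exactly on a toy population by listing every possible sample. The sketch below assumes a three-value population and samples of size 2 drawn with replacement. Both choices are hypothetical and serve only to keep the enumeration small.

```python
from fractions import Fraction
from itertools import product

population = [2, 4, 6]   # hypothetical population
n = 2                    # sample size, sampling with replacement

mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

samples = list(product(population, repeat=n))   # all 9 ordered samples

def mean(xs):
    return Fraction(sum(xs), len(xs))

def variance(xs, divisor):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / divisor

# Average of each estimator over all equally likely samples = its expectation.
expected_mean = sum(mean(s) for s in samples) / len(samples)
expected_s2_unbiased = sum(variance(s, n - 1) for s in samples) / len(samples)
expected_s2_biased = sum(variance(s, n) for s in samples) / len(samples)

print(mu, expected_mean)                 # both equal 4
print(sigma2, expected_s2_unbiased)      # both equal 8/3
print(expected_s2_biased)                # 4/3, which underestimates sigma^2
```

The sample mean and the sample variance with divisor $n-1$ average out to the population values, while the divisor-$n$ version falls short by the factor $(n-1)/n$.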
Section B: Short Answer Questions
Attempt any EIGHT questions.
Define Karl Pearson's coefficient of correlation and state its properties.
Karl Pearson's Coefficient of Correlation
Karl Pearson's coefficient of correlation, denoted $r$, measures the degree and direction of the linear relationship between two quantitative variables $X$ and $Y$. It is defined as the ratio of the covariance of the variables to the product of their standard deviations:
$$r = \frac{\operatorname{Cov}(X, Y)}{\sigma_X\,\sigma_Y} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}}$$
Properties
- Range: $r$ always lies between $-1$ and $+1$, i.e. $-1 \le r \le 1$.
- Direction: $r > 0$ indicates positive correlation, $r < 0$ negative correlation, and $r = 0$ no linear correlation.
- Unit-free: $r$ is a pure number, independent of the units of measurement.
- Independent of change of origin and scale: correlation is unaffected by adding/subtracting a constant or multiplying/dividing by a positive constant.
- Symmetric: $r_{xy} = r_{yx}$.
- It is the geometric mean of the two regression coefficients: $r = \pm\sqrt{b_{yx}\, b_{xy}}$, taking the sign of the regression coefficients.
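A minimal sketch of the calculation, using made-up paired data, also lets a few of the listed properties be checked numerically. The function name `pearson_r` and the data are assumptions for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Karl Pearson's r from deviations about the means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Made-up paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_swapped = pearson_r(y, x)                          # symmetry
r_shifted = pearson_r([2 * a + 3 for a in x], y)     # change of origin and scale
print(round(r, 4), round(r_swapped, 4), round(r_shifted, 4))
```

Here $\sum (X-\bar{X})(Y-\bar{Y}) = 6$, $\sum (X-\bar{X})^2 = 10$ and $\sum (Y-\bar{Y})^2 = 6$, so $r = 6/\sqrt{60} \approx 0.7746$, and the value is unchanged when the variables are swapped or when $X$ is replaced by $2X + 3$.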
What are regression coefficients? State their properties.
Regression Coefficients
In a linear regression between two variables $X$ and $Y$, the regression coefficient is the slope of the regression line and measures the average change in the dependent variable for a unit change in the independent variable.
- Regression coefficient of $Y$ on $X$: $b_{yx} = r\,\dfrac{\sigma_y}{\sigma_x} = \dfrac{\operatorname{Cov}(X, Y)}{\sigma_x^2}$
- Regression coefficient of $X$ on $Y$: $b_{xy} = r\,\dfrac{\sigma_x}{\sigma_y} = \dfrac{\operatorname{Cov}(X, Y)}{\sigma_y^2}$
Properties
- The correlation coefficient is the geometric mean of the two regression coefficients: $r = \pm\sqrt{b_{yx}\, b_{xy}}$.
- Both regression coefficients have the same sign, which is also the sign of $r$.
- The product of the two regression coefficients cannot exceed 1: $b_{yx}\, b_{xy} = r^2 \le 1$.
- If one regression coefficient is greater than 1, the other must be less than 1.
- Regression coefficients are independent of change of origin but not of change of scale.
- The arithmetic mean of the two regression coefficients is greater than or equal to $r$ (when $r > 0$): $\dfrac{b_{yx} + b_{xy}}{2} \ge r$.
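The properties can be verified on a small made-up data set. The sketch below computes both coefficients from sums of deviations. The helper name `regression_coefficients` and the data are assumptions for illustration.

```python
from math import sqrt

def regression_coefficients(x, y):
    """Return (b_yx, b_xy, r) computed from deviations about the means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    b_yx = sxy / sxx            # slope of the line of Y on X
    b_xy = sxy / syy            # slope of the line of X on Y
    r = sxy / sqrt(sxx * syy)   # correlation coefficient
    return b_yx, b_xy, r

# Made-up paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b_yx, b_xy, r = regression_coefficients(x, y)
print(b_yx, b_xy, round(r, 4))
```

For this data $b_{yx} = 0.6$ and $b_{xy} = 1.0$, so their product is $0.6 = r^2$, both share the sign of $r$, and their arithmetic mean $0.8$ exceeds $r \approx 0.7746$.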
Explain the concept of sampling distribution and standard error.
Sampling Distribution
If all possible samples of a fixed size $n$ are drawn from a population and a statistic (such as the mean $\bar{x}$, proportion, or variance) is computed for each sample, the probability distribution of that statistic over all such samples is called the sampling distribution of the statistic.
For example, the sampling distribution of the mean describes how sample means vary from sample to sample. By the Central Limit Theorem, for large $n$ this distribution is approximately normal with
$$E(\bar{x}) = \mu, \qquad \operatorname{Var}(\bar{x}) = \frac{\sigma^2}{n}$$
where $\mu$ and $\sigma^2$ are the population mean and variance.
Standard Error (S.E.)
The standard error is the standard deviation of the sampling distribution of a statistic. It measures the variability of the statistic due to sampling and is a key indicator of the precision/reliability of an estimate.
- S.E. of the mean: $\text{S.E.}(\bar{x}) = \dfrac{\sigma}{\sqrt{n}}$
- S.E. of a proportion: $\text{S.E.}(p) = \sqrt{\dfrac{PQ}{n}}$, where $Q = 1 - P$
Uses of S.E.: it is used to construct confidence intervals, to test hypotheses (test statistic = (estimate − parameter)/S.E.), and to judge accuracy. A smaller standard error (larger $n$) indicates a more reliable estimate.
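To see that the standard error really is the standard deviation of the sampling distribution, one can build that distribution explicitly for a tiny population. The sketch assumes a hypothetical four-value population and samples of size 2 drawn with replacement, and the helper `se_proportion` is a name introduced here.

```python
from itertools import product
from math import isclose, sqrt

population = [1, 2, 3, 4]   # hypothetical population values
n = 2                       # sample size, sampling with replacement

N = len(population)
mu = sum(population) / N
sigma = sqrt(sum((x - mu) ** 2 for x in population) / N)

# Sampling distribution of the mean: one mean per possible ordered sample.
sample_means = [sum(s) / n for s in product(population, repeat=n)]
mean_of_means = sum(sample_means) / len(sample_means)
se_empirical = sqrt(
    sum((m - mean_of_means) ** 2 for m in sample_means) / len(sample_means)
)
se_formula = sigma / sqrt(n)

def se_proportion(P, n):
    """Standard error of a sample proportion, sqrt(PQ/n) with Q = 1 - P."""
    return sqrt(P * (1 - P) / n)

print(se_empirical, se_formula, se_proportion(0.5, 100))
```

The standard deviation of the 16 sample means agrees with $\sigma/\sqrt{n}$, and the mean of the sample means equals the population mean.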
Explain how to construct a confidence interval for a population mean.
Confidence Interval for a Population Mean
A confidence interval (CI) is a range of values, computed from a sample, that is expected to contain the unknown population mean $\mu$ with a stated probability $1 - \alpha$, called the confidence level (e.g., 95%).
Case 1: Population variance known (or large sample, $n \ge 30$)
Use the standard normal ($Z$) distribution:
$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
where $\bar{x}$ is the sample mean and $z_{\alpha/2}$ is the critical value (e.g., 1.96 for 95%, 2.58 for 99%).
Case 2: Population variance unknown and small sample ($n < 30$)
Replace $\sigma$ by the sample standard deviation $s$ and use the $t$-distribution with $n - 1$ degrees of freedom:
$$\bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$$
Steps
- Compute the sample mean $\bar{x}$ (and $s$ if needed).
- Fix the confidence level $1 - \alpha$ and obtain the critical value $z_{\alpha/2}$ or $t_{\alpha/2,\,n-1}$.
- Compute the standard error $\sigma/\sqrt{n}$ (or $s/\sqrt{n}$).
- Compute the margin of error $E = \text{critical value} \times \text{S.E.}$
- The interval is $(\bar{x} - E,\ \bar{x} + E)$.
Interpretation: A 95% CI means that if the sampling were repeated many times, about 95% of such intervals would contain the true mean $\mu$.
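The steps for the known-variance case can be sketched with Python's standard library, which provides the normal quantile through `statistics.NormalDist`. The sample figures (mean 50, $\sigma = 10$, $n = 100$) are assumed purely for illustration. The small-sample case would need a $t$ table or an external library and is not covered here.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """CI for the mean when sigma is known (or n is large)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value z_{alpha/2}
    standard_error = sigma / sqrt(n)
    margin = z * standard_error               # margin of error
    return xbar - margin, xbar + margin

# Assumed sample summary: mean 50, known sigma 10, sample size 100.
low, high = z_confidence_interval(xbar=50, sigma=10, n=100)
print(round(low, 2), round(high, 2))
```

With these inputs the standard error is 1, the critical value is about 1.96, and the 95% interval is roughly (48.04, 51.96).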
Explain the F-test for the equality of two population variances.
F-test for Equality of Two Population Variances
The F-test is used to test whether two normal populations have equal variances, based on the ratio of two independent sample variances. It is the basis for ANOVA and for testing the homogeneity assumption.
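Hypotheses
$$H_0: \sigma_1^2 = \sigma_2^2 \qquad \text{against} \qquad H_1: \sigma_1^2 \ne \sigma_2^2$$
Test Statistic
Given two independent samples of sizes $n_1$ and $n_2$ with unbiased sample variances
$$S_1^2 = \frac{\sum (x_{1i} - \bar{x}_1)^2}{n_1 - 1}, \qquad S_2^2 = \frac{\sum (x_{2i} - \bar{x}_2)^2}{n_2 - 1}$$
the statistic is
$$F = \frac{S_1^2}{S_2^2} \qquad (\text{larger variance in the numerator, so } F \ge 1)$$
which follows the $F$-distribution with $(n_1 - 1,\ n_2 - 1)$ degrees of freedom.
Decision Rule
Compare $F_{\text{cal}}$ with the table value $F_{\alpha;\,(n_1 - 1,\ n_2 - 1)}$.
- If $F_{\text{cal}} > F_{\text{tab}}$ → reject $H_0$ (variances differ significantly).
- If $F_{\text{cal}} \le F_{\text{tab}}$ → accept $H_0$ (no significant difference).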
Assumptions: both samples are random, independent, and drawn from normal populations.
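A short sketch of the test statistic, using two made-up samples, is given below. It follows the common convention of placing the larger sample variance in the numerator. The critical value still has to be read from an $F$ table, since the Python standard library has no $F$ distribution.

```python
from statistics import variance   # unbiased sample variance (divisor n - 1)

def f_statistic(sample1, sample2):
    """Return (F, (numerator d.f., denominator d.f.)) with F >= 1."""
    s1, s2 = variance(sample1), variance(sample2)
    if s1 >= s2:
        return s1 / s2, (len(sample1) - 1, len(sample2) - 1)
    return s2 / s1, (len(sample2) - 1, len(sample1) - 1)

# Made-up samples.
a = [10, 12, 14, 16, 18]          # sample variance 10
b = [11, 12, 13, 14, 15, 16]      # sample variance 3.5

F, df = f_statistic(a, b)
print(round(F, 3), df)
```

Here $F = 10/3.5 \approx 2.857$ with $(4, 5)$ degrees of freedom, which would then be compared with the tabulated value at the chosen level of significance.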
Define index numbers and explain Laspeyres' and Paasche's price index methods.
Index Numbers
An index number is a statistical measure that expresses the relative change in the level of a variable (or group of variables) such as price, quantity, or value, over time or between places, with respect to a fixed base period (taken as 100). They are often called economic barometers.
A price index measures the relative change in the prices of a basket of commodities between the base year (prices $p_0$, quantities $q_0$) and the current year (prices $p_1$, quantities $q_1$).
Laspeyres' Price Index
Uses base-year quantities ($q_0$) as weights:
$$P_{01}^{L} = \frac{\sum p_1 q_0}{\sum p_0 q_0} \times 100$$
- Merit: Requires only base-year weights, so easy to compute over time.
- Demerit: Ignores changes in consumption pattern; tends to overestimate the rise in prices.
Paasche's Price Index
Uses current-year quantities ($q_1$) as weights:
$$P_{01}^{P} = \frac{\sum p_1 q_1}{\sum p_0 q_1} \times 100$$
- Merit: Reflects current consumption pattern.
- Demerit: Current-year weights must be collected every period (costly); tends to underestimate the rise in prices.
Note: Fisher's ideal index is the geometric mean of the two, $P_{01}^{F} = \sqrt{P_{01}^{L} \times P_{01}^{P}}$.
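The three indices can be computed side by side for a small basket. The two-commodity prices and quantities below are invented for illustration. They are chosen so that consumption shifts away from the item whose price rises more, which is the usual situation in which Laspeyres' index exceeds Paasche's.

```python
from math import sqrt

def laspeyres(p0, q0, p1):
    """Price index with base-year quantities as weights."""
    return sum(a * b for a, b in zip(p1, q0)) / sum(a * b for a, b in zip(p0, q0)) * 100

def paasche(p0, p1, q1):
    """Price index with current-year quantities as weights."""
    return sum(a * b for a, b in zip(p1, q1)) / sum(a * b for a, b in zip(p0, q1)) * 100

def fisher(p0, q0, p1, q1):
    """Geometric mean of the Laspeyres and Paasche indices."""
    return sqrt(laspeyres(p0, q0, p1) * paasche(p0, p1, q1))

# Invented basket of two commodities.
p0, q0 = [10, 20], [5, 2]   # base-year prices and quantities
p1, q1 = [12, 25], [6, 1]   # current-year prices and quantities

L = laspeyres(p0, q0, p1)
P = paasche(p0, p1, q1)
F = fisher(p0, q0, p1, q1)
print(round(L, 2), round(P, 2), round(F, 2))
```

For this basket $\sum p_1 q_0 = 110$, $\sum p_0 q_0 = 90$, $\sum p_1 q_1 = 97$ and $\sum p_0 q_1 = 80$, giving a Laspeyres index of about 122.22, a Paasche index of 121.25, and a Fisher index lying between the two.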
State and explain the addition and multiplication theorems of probability with examples.
Addition and Multiplication Theorems of Probability
Addition Theorem
The addition theorem gives the probability of the union of events (occurrence of at least one event).
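For any two events $A$ and $B$:
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
If $A$ and $B$ are mutually exclusive ($A \cap B = \emptyset$):
$$P(A \cup B) = P(A) + P(B)$$
Example: Drawing one card from a deck of 52, $P(\text{king or queen}) = \frac{4}{52} + \frac{4}{52} = \frac{2}{13}$ (mutually exclusive).
Multiplication Theorem
The multiplication theorem gives the probability of the joint occurrence (intersection) of events.
For any two events:
$$P(A \cap B) = P(A)\,P(B \mid A) = P(B)\,P(A \mid B)$$
If $A$ and $B$ are independent:
$$P(A \cap B) = P(A)\,P(B)$$
Example: Tossing two fair coins, $P(\text{head on both}) = \frac{1}{2} \times \frac{1}{2} = \frac{1}{4}$ (independent events).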
Summary: the addition theorem deals with "OR" (union) of events, while the multiplication theorem deals with "AND" (intersection) of events.
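Both theorems can be checked by brute-force counting over a sample space. The sketch below enumerates a standard 52-card deck for a case that is not mutually exclusive (king or heart) and two coin tosses for the independent case. The helper `prob` is a name introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

def prob(event, space):
    """Classical probability: favourable outcomes / total outcomes."""
    return Fraction(sum(1 for outcome in space if event(outcome)), len(space))

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]

def is_king(card):
    return card[0] == "K"

def is_heart(card):
    return card[1] == "hearts"

# Addition theorem: P(K or H) = P(K) + P(H) - P(K and H).
p_union = prob(lambda c: is_king(c) or is_heart(c), deck)
p_by_theorem = (prob(is_king, deck) + prob(is_heart, deck)
                - prob(lambda c: is_king(c) and is_heart(c), deck))

# Multiplication theorem for independent events: two fair coin tosses.
coins = list(product("HT", repeat=2))
p_both_heads = prob(lambda o: o == ("H", "H"), coins)
p_product = prob(lambda o: o[0] == "H", coins) * prob(lambda o: o[1] == "H", coins)

print(p_union, p_by_theorem, p_both_heads, p_product)
```

Direct counting gives $16/52 = 4/13$ for "king or heart", matching $4/52 + 13/52 - 1/52$, and $1/4$ for two heads, matching $1/2 \times 1/2$.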
Explain the Poisson distribution with its mean and variance and state its applications.
Poisson Distribution
The Poisson distribution is a discrete probability distribution that models the number of occurrences of a rare event in a fixed interval of time, space, or area, when the events occur independently and at a constant average rate $\lambda$.
A random variable $X$ follows a Poisson distribution if its probability mass function is
$$P(X = x) = \frac{e^{-\lambda}\,\lambda^{x}}{x!}, \qquad x = 0, 1, 2, \dots$$
where $\lambda > 0$ is the average number of occurrences (the parameter) and $e \approx 2.71828$.
It is obtained as a limiting case of the binomial distribution when $n \to \infty$, $p \to 0$, with $np = \lambda$ finite.
Mean and Variance
A distinctive property is that the mean equals the variance:
$$E(X) = \operatorname{Var}(X) = \lambda$$
Applications
- Number of telephone calls received at an exchange per minute.
- Number of printing/typing errors per page of a book.
- Number of accidents at a junction per day.
- Number of defective items in a large batch (rare defects).
- Number of customers arriving at a counter in a given time (queueing theory).
- Number of radioactive particle emissions per unit time.
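The equality of mean and variance can be checked numerically from the probability mass function. The rate $\lambda = 2$ and the truncation of the infinite sum at 60 terms are assumptions made for this sketch. The neglected tail is negligible at this rate.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with mean lam."""
    return exp(-lam) * lam ** x / factorial(x)

lam = 2.0                      # assumed average rate
support = range(60)            # truncated support; the tail beyond is negligible
probs = [poisson_pmf(x, lam) for x in support]

total = sum(probs)
mean = sum(x * p for x, p in zip(support, probs))
var = sum((x - mean) ** 2 * p for x, p in zip(support, probs))

print(round(total, 6), round(mean, 6), round(var, 6))
print(round(poisson_pmf(0, lam), 4))   # probability of no occurrence, e^{-2}
```

The probabilities sum to 1 and both the mean and the variance come out equal to $\lambda = 2$, while $P(X = 0) = e^{-2} \approx 0.1353$.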
Define a random variable. Differentiate between discrete and continuous random variables with examples.
Random Variable
A random variable is a real-valued function that assigns a numerical value to each outcome of a random experiment (each point in the sample space). It is usually denoted by capital letters such as $X$, $Y$, $Z$.
Example: In tossing two coins, if $X$ = number of heads, then $X$ takes the values $0, 1, 2$.
Random variables are of two types: discrete and continuous.
Discrete vs Continuous Random Variables
| Basis | Discrete Random Variable | Continuous Random Variable |
|---|---|---|
| Values | Takes only countable (isolated) values | Takes any value within an interval (uncountable) |
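| Distribution function | Probability mass function $p(x)$ | Probability density function $f(x)$ |
| Probability of a point | $P(X = x)$ can be positive | $P(X = x) = 0$; only $P(a \le X \le b)$ is meaningful |
| Total probability | $\sum_x p(x) = 1$ | $\int_{-\infty}^{\infty} f(x)\,dx = 1$ |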
| Example | Number of heads in coin tosses; number of defective items | Height, weight, temperature, time taken |
Examples:
- Discrete: number of children in a family, $X = 0, 1, 2, \dots$
- Continuous: the exact height of a student (e.g., any value such as 165.3 cm).
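The contrast can be made concrete with a small sketch. The discrete part builds the probability mass function of the number of heads in two coin tosses. The continuous part uses a uniform variable on $[0, 1]$, an assumed example, to show that only intervals carry positive probability.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Discrete: X = number of heads in two tosses of a fair coin.
outcomes = list(product("HT", repeat=2))            # HH, HT, TH, TT
counts = Counter(outcome.count("H") for outcome in outcomes)
pmf = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

# Continuous: X uniform on [low, high]; probability is area under the density.
def uniform_prob(a, b, low=0.0, high=1.0):
    """P(a <= X <= b) for a uniform random variable on [low, high]."""
    overlap = max(0.0, min(b, high) - max(a, low))
    return overlap / (high - low)

print(pmf)                        # isolated values with positive probability
print(uniform_prob(0.2, 0.5))     # an interval has positive probability
print(uniform_prob(0.5, 0.5))     # a single point has probability zero
```

The discrete variable puts probabilities $1/4$, $1/2$, $1/4$ on the isolated values $0, 1, 2$, whereas for the continuous variable a single point has probability zero and the whole interval has probability one.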