BSc CSIT (TU) Science Statistics I (BSc CSIT, STA164) Question Paper 2074 Nepal
This is the official BSc CSIT (TU) (Science stream) Statistics I (BSc CSIT, STA164) question paper for 2074, as set in the regular annual examination. It carries 60 full marks and a time allowance of 180 minutes, across 12 questions. On Kekkei you can attempt this Statistics I (BSc CSIT, STA164) past paper online with a timer, get instant AI feedback and step-by-step solutions, and track the topics where you lose marks — completely free. Whether you are revising for your BSc CSIT (TU) Statistics I (BSc CSIT, STA164) exam or solving previous years' question papers, this 2074 paper is a great way to practise under real exam conditions.
Section A: Long Answer Questions
Attempt any TWO questions.
Define statistics. Explain the importance and limitations of statistics. Describe the various measures of central tendency (mean, median, mode) with their merits and demerits.
Statistics
Definition. Statistics is the branch of science that deals with the collection, organization, presentation, analysis and interpretation of numerical data to aid rational decision-making under uncertainty. In the plural sense it means the data themselves; in the singular sense it means the scientific methods used to handle such data.
Importance of Statistics
- Planning and policy: Governments and businesses use statistical data for planning, budgeting and forecasting.
- Decision-making under uncertainty: Provides tools (estimation, testing) to draw conclusions from samples.
- Comparison: Averages and measures of dispersion allow comparison of different groups.
- Relationship study: Correlation and regression reveal relationships between variables.
- Forecasting: Time-series and trend analysis help predict future values.
- Applications: Indispensable in economics, business, biology, computer science (data mining, ML), and research.
Limitations of Statistics
- Studies only aggregates, not individuals.
- Deals only with quantitative (or quantifiable) data; qualitative facts must be coded.
- Results are true on average, not in every individual case.
- Liable to misuse by unscrupulous persons; can be misleading if methods are wrong.
- Requires expertise; conclusions are probabilistic, not certain.
Measures of Central Tendency
A measure of central tendency (an average) is a single central value that represents the whole data set.
1. Arithmetic Mean
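The sum of all observations divided by their number: $\bar{x} = \frac{\sum x}{n}$; for a frequency distribution, $\bar{x} = \frac{\sum f x}{\sum f}$.
Merits: rigidly defined, based on all observations, suitable for algebraic treatment, least affected by sampling fluctuation. Demerits: highly affected by extreme values (outliers); cannot be found for open-end classes; may give an impossible value (e.g. 2.5 children).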
2. Median
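The middle value when the data are arranged in order of magnitude; it is a positional average. For $n$ ordered observations it is the value of the $\left(\frac{n+1}{2}\right)$-th item (the average of the two middle values when $n$ is even).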
Merits: not affected by extreme values; can be found for open-end classes; can be located graphically (ogive). Demerits: not based on all observations; needs arranging data; less suitable for further algebraic treatment.
3. Mode
The value that occurs most frequently.
Merits: easy to understand; not affected by extreme values; the most typical value; useful for qualitative data. Demerits: ill-defined when data are multimodal or have no repetition; not based on all observations; not suitable for algebraic treatment.
Empirical relation: $\text{Mode} = 3\,\text{Median} - 2\,\text{Mean}$ for a moderately skewed distribution.
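The three averages can be compared on a small data set. The sketch below uses Python's standard `statistics` module; the list `marks` is a hypothetical sample chosen only for illustration, with one extreme value included to show how the mean reacts to an outlier while the median and mode do not.

```python
from statistics import mean, median, multimode

# Hypothetical sample (not from the exam paper); 30 is a deliberate outlier.
marks = [2, 3, 3, 4, 5, 5, 5, 6, 30]

avg = mean(marks)          # uses every observation, so the outlier pulls it up
mid = median(marks)        # middle value of the sorted data
modes = multimode(marks)   # list of the most frequent value(s)

print("mean:", avg)
print("median:", mid)
print("mode(s):", modes)
```

Here the mean is 7 although eight of the nine values are 6 or less, while the median and the mode both stay at 5.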
Define correlation. Calculate the Karl Pearson coefficient of correlation for a given bivariate data set and interpret the result. Distinguish between correlation and regression.
Correlation
Definition. Correlation is the statistical technique that measures the degree and direction of linear relationship between two quantitative variables $x$ and $y$. If both increase together it is positive; if one increases while the other decreases it is negative; the coefficient $r$ lies in $[-1, 1]$.
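Karl Pearson's Coefficient of Correlation
$$r = \frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y} = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$
Worked example
Let the bivariate data be:
| $x$ | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| $y$ | 2 | 4 | 5 | 4 | 5 |

The required sums are obtained from the following table:
| $x$ | $y$ | $x^2$ | $y^2$ | $xy$ |
|---|---|---|---|---|
| 1 | 2 | 1 | 4 | 2 |
| 2 | 4 | 4 | 16 | 8 |
| 3 | 5 | 9 | 25 | 15 |
| 4 | 4 | 16 | 16 | 16 |
| 5 | 5 | 25 | 25 | 25 |
| $\sum x = 15$ | $\sum y = 20$ | $\sum x^2 = 55$ | $\sum y^2 = 86$ | $\sum xy = 66$ |

Here $n = 5$, $\sum x = 15$, $\sum y = 20$, $\sum x^2 = 55$, $\sum y^2 = 86$, $\sum xy = 66$.
$$r = \frac{5(66) - (15)(20)}{\sqrt{\left[5(55) - 15^2\right]\left[5(86) - 20^2\right]}} = \frac{30}{\sqrt{50 \times 30}} = \frac{30}{38.73} \approx 0.77$$
Interpretation. $r \approx 0.77$ indicates a fairly strong positive linear correlation: as $x$ increases, $y$ tends to increase. (With actual exam data, substitute the given values into the same formula.)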
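The arithmetic of the worked example can be checked with a short script. This is a minimal sketch of the raw-sums form of Karl Pearson's formula applied to the data $x = 1, 2, 3, 4, 5$ and $y = 2, 4, 5, 4, 5$; the function name `pearson_r` is an illustrative choice, not part of any library.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's r computed from raw sums (computational formula)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)                # sum of x^2
    syy = sum(y * y for y in ys)                # sum of y^2
    sxy = sum(x * y for x, y in zip(xs, ys))    # sum of xy
    numerator = n * sxy - sx * sy
    denominator = sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return numerator / denominator

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))  # 0.7746
```

The result agrees with the hand calculation: the numerator is 30 and the denominator is $\sqrt{1500} \approx 38.73$.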
Correlation vs Regression
| Correlation | Regression |
|---|---|
| Measures degree/strength of relationship | Measures the nature/form of dependence (predicts one from another) |
| Symmetric: $r_{xy} = r_{yx}$ | Asymmetric: $b_{yx} \neq b_{xy}$ in general |
| Value lies in $[-1, 1]$, a pure number | Coefficient has units; line $y = a + bx$ |
| Does not imply cause and effect | Used for estimation/prediction of dependent variable |
| No distinction between dependent/independent variable | Clear dependent and independent variables |
Relation: $r = \pm\sqrt{b_{yx}\, b_{xy}}$, where $r$ takes the common sign of the two regression coefficients.
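The link between regression and correlation, $r^2 = b_{yx}\, b_{xy}$, can be verified numerically. The sketch below reuses the data $x = 1, \ldots, 5$, $y = 2, 4, 5, 4, 5$ from the correlation example; `regression_coefficients` is a hypothetical helper name introduced here for illustration.

```python
from math import sqrt, copysign

def regression_coefficients(xs, ys):
    """Return (b_yx, b_xy): slope of y on x and slope of x on y."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    common = n * sxy - sx * sy              # shared numerator
    b_yx = common / (n * sxx - sx ** 2)     # regression of y on x
    b_xy = common / (n * syy - sy ** 2)     # regression of x on y
    return b_yx, b_xy

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b_yx, b_xy = regression_coefficients(x, y)

# r is the geometric mean of the two coefficients, carrying their common sign.
r = copysign(sqrt(b_yx * b_xy), b_yx)
print(b_yx, b_xy, round(r, 4))
```

For this data $b_{yx} = 0.6$ and $b_{xy} = 1.0$, so the two coefficients differ while their geometric mean reproduces $r \approx 0.77$.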
Define probability. State and explain the addition and multiplication theorems of probability. State Bayes' theorem and solve a related problem.
Probability
Definition (classical). If a random experiment has $n$ equally likely, mutually exclusive and exhaustive outcomes, of which $m$ are favourable to event $A$, then
$$P(A) = \frac{m}{n}$$
Addition Theorem
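For any two events $A$ and $B$:
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
If $A$ and $B$ are mutually exclusive ($A \cap B = \varnothing$), then
$$P(A \cup B) = P(A) + P(B)$$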
It gives the probability that at least one of the events occurs.
Multiplication Theorem
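For any two events $A$ and $B$ with $P(A) > 0$:
$$P(A \cap B) = P(A)\, P(B \mid A)$$
If $A$ and $B$ are independent, $P(B \mid A) = P(B)$, so
$$P(A \cap B) = P(A)\, P(B)$$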
It gives the probability that both events occur simultaneously.
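Both theorems can be checked by direct enumeration on a small sample space. The sketch below assumes one roll of a fair die; the events `A` (even number) and `B` (number greater than 3) are illustrative choices, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

sample_space = set(range(1, 7))  # outcomes of one roll of a fair die

def prob(event):
    """Classical probability: favourable outcomes / total outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}   # even number
B = {4, 5, 6}   # number greater than 3

# Addition theorem: P(A or B) = P(A) + P(B) - P(A and B)
lhs = prob(A | B)
rhs = prob(A) + prob(B) - prob(A & B)
print(lhs, rhs)

# Multiplication theorem: P(A and B) = P(A) * P(B | A)
p_b_given_a = Fraction(len(A & B), len(A))  # restrict the sample space to A
print(prob(A & B), prob(A) * p_b_given_a)
```

Both sides of the addition theorem equal $\frac{2}{3}$, and both sides of the multiplication theorem equal $\frac{1}{3}$. Since $P(B \mid A) = \frac{2}{3} \neq P(B) = \frac{1}{2}$, these two events are not independent.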
Bayes' Theorem
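If $E_1, E_2, \ldots, E_n$ are mutually exclusive and exhaustive events with $P(E_i) > 0$, and $A$ is any event with $P(A) > 0$, then for each $i$:
$$P(E_i \mid A) = \frac{P(E_i)\, P(A \mid E_i)}{\sum_{j=1}^{n} P(E_j)\, P(A \mid E_j)}$$
The $P(E_i)$ are prior probabilities and the $P(E_i \mid A)$ are posterior probabilities.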
Worked problem
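Three machines $M_1$, $M_2$ and $M_3$ produce 50%, 30% and 20% of the items, with defective rates 3%, 4% and 5% respectively. An item drawn at random is defective. Find the probability it came from machine $M_3$.
Given: $P(M_1) = 0.50$, $P(M_2) = 0.30$, $P(M_3) = 0.20$; $P(D \mid M_1) = 0.03$, $P(D \mid M_2) = 0.04$, $P(D \mid M_3) = 0.05$, where $D$ is the event that the item is defective.
Total probability of a defective item:
$$P(D) = (0.50)(0.03) + (0.30)(0.04) + (0.20)(0.05) = 0.015 + 0.012 + 0.010 = 0.037$$
By Bayes' theorem:
$$P(M_3 \mid D) = \frac{P(M_3)\, P(D \mid M_3)}{P(D)} = \frac{0.010}{0.037} \approx 0.27$$
Result: there is about a 27% chance that the defective item was produced by machine $M_3$.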
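The same computation can be sketched in code, which also gives the posterior probabilities for the other two machines. The function name `posteriors` is an assumption made for this illustration; the production shares (50%, 30%, 20%) and defective rates (3%, 4%, 5%) are those of the problem.

```python
from fractions import Fraction

# Prior probabilities: share of output for the first, second and third machine.
priors = [Fraction(50, 100), Fraction(30, 100), Fraction(20, 100)]
# Likelihoods: probability that an item is defective, given its machine.
defect_rates = [Fraction(3, 100), Fraction(4, 100), Fraction(5, 100)]

def posteriors(priors, likelihoods):
    """Return (total probability, list of posterior probabilities)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)                       # total probability theorem
    return total, [j / total for j in joint]

p_defective, post = posteriors(priors, defect_rates)
print(float(p_defective))            # 0.037
print([str(p) for p in post])        # ['15/37', '12/37', '10/37']
print(round(float(post[2]), 2))      # 0.27
```

The third machine has the highest defective rate but the smallest share of output, so its posterior probability ($\frac{10}{37}$) is the lowest of the three.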
Section B: Short Answer Questions
Attempt any EIGHT questions.
Define mean, median and mode.
- Mean (arithmetic mean): the sum of all observations divided by their number, $\bar{x} = \frac{\sum x}{n}$. It is the most common average and uses every value.
- Median: the middle value of the data arranged in ascending (or descending) order; it divides the data into two equal halves and is unaffected by extreme values.
- Mode: the value that occurs most frequently in the data set; a distribution may be unimodal, bimodal or multimodal.
What is the difference between primary and secondary data?
Primary data are data collected originally by the investigator for the first time for a specific purpose (e.g. through surveys, interviews, questionnaires, direct observation or experiments). They are original, more reliable and accurate but costly and time-consuming.
Secondary data are data that have already been collected by someone else and are used by the investigator second-hand (e.g. from published reports, government records, journals, websites). They are cheaper and quicker to obtain but may not exactly fit the purpose and need careful checking for accuracy.
| Basis | Primary data | Secondary data |
|---|---|---|
| Originality | Original, first-hand | Second-hand |
| Collected by | The investigator | Someone else |
| Cost/time | High | Low |
| Reliability | Higher (if collected well) | Depends on source |
Define standard deviation and variance.
Variance is the mean of the squared deviations of observations from their arithmetic mean: $\sigma^2 = \frac{\sum (x - \bar{x})^2}{n}$. It measures how spread out the data are.
Standard deviation is the positive square root of the variance, $\sigma = \sqrt{\sigma^2}$; it is the most reliable measure of dispersion and is expressed in the same units as the data.
A larger standard deviation/variance means greater scatter of the values about the mean.
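As an illustration, the definitions can be applied step by step and compared with Python's standard `statistics` module. The data set below is hypothetical, and the sketch assumes the population form (dividing by $n$), which corresponds to `pvariance` and `pstdev`.

```python
from math import sqrt
from statistics import pvariance, pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

def population_variance(values):
    """Mean of the squared deviations from the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

var = population_variance(data)   # squared deviations sum to 32, so 32 / 8
sd = sqrt(var)                    # positive square root of the variance

print(var, sd)                          # 4.0 2.0
print(pvariance(data), pstdev(data))    # library results for comparison
```

For a sample rather than a whole population, `statistics.variance` and `statistics.stdev` divide by $n - 1$ instead.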
State the classical definition of probability.
Classical (a priori) definition of probability. If a random experiment results in $n$ exhaustive, mutually exclusive and equally likely outcomes, of which $m$ are favourable to the occurrence of an event $A$, then the probability of $A$ is
$$P(A) = \frac{m}{n}$$
Here $0 \le P(A) \le 1$; $P(A) = 0$ for an impossible event and $P(A) = 1$ for a certain event. The probability of non-occurrence is $P(\bar{A}) = 1 - P(A) = \frac{n - m}{n}$.
Limitation: it fails when outcomes are not equally likely or when $n$ is infinite.
What is a frequency distribution?
A frequency distribution is a tabular arrangement of data that shows how the observations are distributed among different values or class intervals, together with the number of times each value or class occurs (its frequency).
- A discrete (ungrouped) frequency distribution lists individual values with their frequencies.
- A continuous (grouped) frequency distribution groups data into class intervals (e.g. 0–10, 10–20) with corresponding frequencies.
It condenses raw data into a compact form, making patterns, the most common values and the overall shape of the data easy to study, and forms the basis for graphs such as histograms and for computing averages and dispersion.
Example:
| Marks | 0–10 | 10–20 | 20–30 | 30–40 |
|---|---|---|---|---|
| No. of students | 5 | 12 | 8 | 3 |
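A grouped frequency distribution can be built from raw observations by counting how many fall in each class. The sketch below is one possible approach; the raw `marks` are hypothetical, the helper name `grouped_frequencies` is an assumption, and classes are taken as exclusive of their upper limit (a mark of 10 falls in 10–20).

```python
from collections import Counter

def grouped_frequencies(values, width, start=0):
    """Count values in classes [start, start+width), [start+width, ...), ...

    Classes containing no observation are omitted from the result.
    """
    counts = Counter((v - start) // width for v in values)
    return {
        (start + k * width, start + (k + 1) * width): counts[k]
        for k in sorted(counts)
    }

marks = [3, 7, 12, 15, 18, 19, 21, 25, 28, 34]  # hypothetical raw data
table = grouped_frequencies(marks, width=10)

for (lower, upper), freq in table.items():
    print(f"{lower}-{upper}: {freq}")
```

The frequencies always add up to the number of observations, which is a quick check on any hand-made table.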
Define correlation coefficient.
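The correlation coefficient is a numerical measure of the degree and direction of the linear relationship between two variables $x$ and $y$. Karl Pearson's coefficient is defined as
$$r = \frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$$
Properties:
- $-1 \le r \le 1$.
- $r = +1$: perfect positive correlation; $r = -1$: perfect negative correlation; $r = 0$: no linear correlation.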
- It is a pure number, independent of units and of change of origin and scale.
What is a histogram?
A histogram is a graphical representation of a continuous (grouped) frequency distribution using a set of adjacent rectangles. Each rectangle is drawn over a class interval on the $x$-axis, and its area is proportional to the frequency of that class.
- For equal class widths, the height of each bar equals the class frequency.
- For unequal widths, the height is taken as the frequency density so that area stays proportional to frequency.
- The bars touch one another (no gaps), reflecting the continuity of data, which distinguishes a histogram from a bar diagram.
It is used to study the shape, central tendency and spread of a distribution, and the mode can be located graphically from it.
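The bar heights for unequal class widths follow from frequency density = frequency / class width. The short sketch below uses hypothetical classes, the last of which is twice as wide as the others, and confirms that each bar's area recovers the class frequency.

```python
# Hypothetical classes as (lower limit, upper limit, frequency).
classes = [(0, 10, 5), (10, 20, 12), (20, 40, 8)]

# Bar height = frequency density = frequency / class width.
densities = [freq / (upper - lower) for lower, upper, freq in classes]

# Bar area = height * width, which should equal the class frequency.
areas = [d * (upper - lower) for d, (lower, upper, _) in zip(densities, classes)]

for (lower, upper, freq), d in zip(classes, densities):
    print(f"{lower}-{upper}: frequency {freq}, height {d}")
```

The 20–40 class has frequency 8 but a height of only 0.4, lower than the 0–10 class with frequency 5, because its frequency is spread over a wider interval.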
Define mutually exclusive events.
Two (or more) events are said to be mutually exclusive (or disjoint) if the occurrence of one prevents the occurrence of the other in the same trial, i.e. they cannot happen simultaneously. In set terms their intersection is empty: $A \cap B = \varnothing$, so $P(A \cap B) = 0$.
For such events the addition rule simplifies to $P(A \cup B) = P(A) + P(B)$.
Example: in a single toss of a coin, getting a head and getting a tail are mutually exclusive; in rolling a die, the events {even number} and {odd number} are mutually exclusive.
What is the coefficient of variation?
The coefficient of variation (CV) is a relative measure of dispersion that expresses the standard deviation as a percentage of the mean: $\text{CV} = \frac{\sigma}{\bar{x}} \times 100\%$.
Because it is a unitless quantity, it is used to compare the variability (consistency) of two or more series that have different units or very different means.
- A higher CV means greater variability and less consistency/uniformity.
- A lower CV means less variability and more consistency/stability.
Example: comparing two batsmen, the one with the smaller CV of runs is the more consistent player.
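The batsmen comparison can be sketched as follows. The scores are hypothetical and chosen so that both players have the same mean, and the population standard deviation (`pstdev`) is assumed; the function name `coefficient_of_variation` is introduced here for illustration.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Standard deviation expressed as a percentage of the mean."""
    return pstdev(values) / mean(values) * 100

# Hypothetical runs in five innings; both batsmen average 50.
batsman_a = [40, 50, 60, 50, 50]
batsman_b = [10, 90, 50, 20, 80]

cv_a = coefficient_of_variation(batsman_a)
cv_b = coefficient_of_variation(batsman_b)

print(f"CV of A: {cv_a:.2f}%")
print(f"CV of B: {cv_b:.2f}%")
print("More consistent:", "A" if cv_a < cv_b else "B")
```

Although the averages are identical, batsman A has the much smaller CV (about 12.65% against about 63.25%) and is therefore the more consistent player.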