Master in Data Science (SMS, TU) Statistical Computing with R Question Paper 2080 Nepal
This is the official Master in Data Science (SMS, TU) Statistical Computing with R question paper for 2080, as set in the Board examination. It carries 45 full marks and a time allowance of 120 minutes, across 10 questions.
| Level | Master in Data Science (SMS, TU) |
|---|---|
| Subject | Statistical Computing with R |
| Year | 2080 BS |
| Exam session | Board |
| Full marks | 45 |
| Time allowed | 120 minutes |
| Questions | 10, all with step-by-step solutions |
Group A
Explain these terms with examples for R:
a) Getting multi-way table with array b) Creating class intervals of continuous variable c) Missingness vs nothingness
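The three terms above can be illustrated with a short base R sketch. It is only one possible reading of the question: the use of `mtcars`, the `age` vector, and the break points are assumptions chosen for illustration, not part of the paper.

```r
# a) Multi-way table: table() on three variables returns a 3-dimensional array
tab3 <- table(cyl = mtcars$cyl, gear = mtcars$gear, am = mtcars$am)
dim(tab3)        # one dimension per classifying variable
ftable(tab3)     # flattened view of the same array

# b) Class intervals of a continuous variable with cut()
age <- c(18, 25, 37, 52, 64, 99)                     # hypothetical ages
age_group <- cut(age, breaks = c(18, 40, 60, 100), right = FALSE)
table(age_group)

# c) Missingness (NA) vs nothingness (NULL)
x <- c(1, NA, 3)    # NA is a placeholder for a missing value, it keeps its slot
y <- c(1, NULL, 3)  # NULL is the absence of an object, it is dropped
length(x)
length(y)
mean(x)               # NA, because one value is missing
mean(x, na.rm = TRUE) # mean of the observed values
```

Here `right = FALSE` makes the intervals closed on the left, so the minimum age of 18 falls inside the first class.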
Explain the following concepts with focus on R software:
a) Raw data b) Data wrangling c) Tidy data
Explain the following with examples for R:
a) Reference range based on mean b) Reference range based on median c) Outliers and extreme values
Explain the following concepts with focus on R software:
a) Test of normality b) Parametric tests c) Residual analysis
Describe decision tree classification model with focus on:
a) Bagging b) Improved bagging c) Boosting
Group B
Do the following in R Studio with R script:
a) Create a dataset with the following variables: age (18-99 years), sex (male/female), educational level (No education/Primary/Secondary/Beyond secondary), socio-economic status (Low, Middle, High) and body mass index (14 – 38), with 150 random cases of each variable. Your exam roll number must be used to set the random seed.
b) Show a sub-divided bar diagram of the body mass index variable by the sex and socio-economic status variables separately, with interpretations.
c) Show a multiple bar diagram of the age variable with the sex and educational level variables and interpret it carefully.
d) Show boxplots of the age and body mass index variables separately and interpret the results carefully.
e) Create histograms of the age and body mass index variables separately and interpret the results carefully.
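A minimal sketch of part (a), with the boxplots and histograms of parts (d) and (e), might look like the following. The seed value `123` stands in for a roll number, and the uniform sampling and the column names (`education`, `ses`, `bmi`) are assumptions; the paper does not prescribe a distribution.

```r
roll_number <- 123            # hypothetical roll number used as the seed
set.seed(roll_number)
n <- 150

df <- data.frame(
  age = sample(18:99, n, replace = TRUE),
  sex = factor(sample(c("male", "female"), n, replace = TRUE)),
  education = factor(
    sample(c("No education", "Primary", "Secondary", "Beyond secondary"),
           n, replace = TRUE),
    levels = c("No education", "Primary", "Secondary", "Beyond secondary")
  ),
  ses = factor(sample(c("Low", "Middle", "High"), n, replace = TRUE),
               levels = c("Low", "Middle", "High")),
  bmi = round(runif(n, min = 14, max = 38), 1)
)
str(df)

# Boxplots of age and body mass index, drawn separately
boxplot(df$age, main = "Age", ylab = "Years")
boxplot(df$bmi, main = "Body mass index", ylab = "BMI")

# Histograms of age and body mass index, drawn separately
hist(df$age, main = "Age", xlab = "Years")
hist(df$bmi, main = "Body mass index", xlab = "BMI")
```

Because the values are drawn uniformly, the histograms should look roughly flat and the boxplots should show few or no outliers; a different sampling choice would change that interpretation.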
Do the following in R Studio with an R script, and knit the output to HTML:
a) Define an object "rating" with the values 9, 2, 5, 8, 6, 1, 3, 2, 8, 4, 6, 8, 7, 1, 2, 6, 10, 5, 6, 9, 6, 2, 4, 7.
b) Replicate in R the given table, obtained from SPSS software, for the rating object.
rating
| Valid | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|
| 1 | 2 | 8.3 | 8.3 | 8.3 |
| 2 | 4 | 16.7 | 16.7 | 25.0 |
| 3 | 1 | 4.2 | 4.2 | 29.2 |
| 4 | 2 | 8.3 | 8.3 | 37.5 |
| 5 | 2 | 8.3 | 8.3 | 45.8 |
| 6 | 5 | 20.8 | 20.8 | 66.7 |
| 7 | 2 | 8.3 | 8.3 | 75.0 |
| 8 | 3 | 12.5 | 12.5 | 87.5 |
| 9 | 2 | 8.3 | 8.3 | 95.8 |
| 10 | 1 | 4.2 | 4.2 | 100.0 |
| Total | 24 | 100.0 | 100.0 | |
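One way to reproduce the SPSS-style table above in base R is sketched below. The object names `freq_table` and `spss_like` are illustrative, and the sketch assumes there are no missing values, so Percent and Valid Percent coincide.

```r
rating <- c(9, 2, 5, 8, 6, 1, 3, 2, 8, 4, 6, 8,
            7, 1, 2, 6, 10, 5, 6, 9, 6, 2, 4, 7)

freq <- table(rating)                       # counts per distinct rating
pct  <- as.numeric(prop.table(freq)) * 100  # percentages of all 24 cases

freq_table <- data.frame(
  Valid              = names(freq),
  Frequency          = as.integer(freq),
  Percent            = round(pct, 1),
  Valid.Percent      = round(pct, 1),       # same as Percent: no missing values
  Cumulative.Percent = round(cumsum(pct), 1)
)

# Append the Total row; the cumulative column is left blank (NA) as in SPSS
spss_like <- rbind(
  freq_table,
  data.frame(Valid = "Total", Frequency = sum(freq_table$Frequency),
             Percent = 100, Valid.Percent = 100, Cumulative.Percent = NA)
)
print(spss_like, row.names = FALSE)
```

The cumulative percentages are computed from the unrounded values and rounded afterwards, which avoids the drift that comes from summing already rounded percentages.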
Do the following in R Studio with R script:
a) Create a dataset with the following variables: age (18-99 years), sex (male/female), educational level (No education/Primary/Secondary/Beyond secondary), socio-economic status (Low, Middle, High) and body mass index (14 – 38), with 250 random cases of each variable. Your exam roll number must be used to set the random seed.
b) Create a scatterplot of the age and body mass index variables and interpret it carefully.
c) Which correlation coefficient must be used based on the interpretation of the scatterplot? Why?
d) Compute the best correlation coefficient identified from the scatterplot and interpret it carefully.
e) Test whether this correlation coefficient is statistically valid or not and justify its value.
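The scatterplot and correlation steps could be sketched as follows. The seed `123` is a placeholder for the roll number, and the choice of Spearman's coefficient is an assumption that fits uniformly sampled, non-normal data; the scatterplot from a real attempt should drive that choice.

```r
set.seed(123)                               # hypothetical roll number
n <- 250
age <- sample(18:99, n, replace = TRUE)
bmi <- round(runif(n, min = 14, max = 38), 1)

# b) Scatterplot of age against body mass index
plot(age, bmi, xlab = "Age (years)", ylab = "Body mass index",
     main = "Age vs body mass index")

# d) and e) Rank-based correlation with its significance test;
# exact = FALSE requests the asymptotic p-value, since the data contain ties
spearman <- cor.test(age, bmi, method = "spearman", exact = FALSE)
spearman$estimate   # Spearman's rho
spearman$p.value    # p-value for H0: rho = 0
```

Since the two variables are generated independently here, a coefficient near zero with a non-significant p-value would be the expected pattern, though any single seed can deviate.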
OR
Do the following in R Studio with R script:
a) Create a dataset with the following variables: age (18-99 years), sex (male/female), educational level (No education/Primary/Secondary/Beyond secondary), socio-economic status (Low, Middle, High) and body mass index (14 – 38), with 250 random cases of each variable. Your exam roll number must be used to set the random seed.
b) Check whether the body mass index variable follows a normal distribution using a suggestive plot and confirmatory tests and interpret the results carefully.
c) Check whether the body mass index variable has equal variance across the sex variable using a suggestive plot and a confirmatory test and interpret the results carefully.
d) Which independent sample t-test must be used to compare body mass index by sex? Why?
e) Perform the independent sample t-test identified above and interpret it carefully.
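A base R sketch of the normality check, the variance check, and the resulting t-test is given below. The seed is a placeholder, the 0.05 cut-off is an assumed significance level, and `var.test()` (an F test) is used only because it ships with base R; Levene's test from an add-on package is a common alternative.

```r
set.seed(123)                               # hypothetical roll number
n <- 250
bmi <- round(runif(n, min = 14, max = 38), 1)
sex <- factor(sample(c("male", "female"), n, replace = TRUE))

# b) Suggestive plot and confirmatory test for normality
qqnorm(bmi); qqline(bmi)
normality <- shapiro.test(bmi)
normality$p.value          # a small p-value suggests departure from normality

# c) Suggestive plot and confirmatory test for equal variance by sex
boxplot(bmi ~ sex, ylab = "Body mass index")
variance_check <- var.test(bmi ~ sex)       # F test of equal variances

# d) and e) Student's t-test if variances look equal, Welch's otherwise
equal_var <- variance_check$p.value > 0.05
t_result <- t.test(bmi ~ sex, var.equal = equal_var)
t_result
```

If the normality test rejects, as is likely for uniformly generated values, a rank-based alternative such as `wilcox.test()` is worth mentioning in the interpretation.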
Do the following in R Studio using "mtcars" dataset with R script:
a) Divide the mtcars data into train and test datasets with a 70:30 random split.
b) Fit supervised logistic regression and naïve Bayes classification models on the train data with transmission (am) as the dependent variable and miles per gallon (mpg), displacement (disp), horsepower (hp) and weight (wt) as independent variables.
c) Predict the transmission (am) variable in the test data for both models and interpret the results carefully.
d) Get the confusion matrix, sensitivity and specificity of both models using the predicted transmission variable on the test data and interpret them carefully.
e) Which supervised classification model is the best for doing prediction? Why?
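The split, fit, and evaluation steps might be sketched as below. The seed, the 0.5 probability cut-off, and treating `am = 1` (manual) as the positive class are assumptions. The naïve Bayes part relies on the `e1071` package, which is not part of base R, so it is guarded and skipped when the package is not installed.

```r
set.seed(123)                                # hypothetical seed
idx   <- sample(seq_len(nrow(mtcars)), size = round(0.7 * nrow(mtcars)))
train <- mtcars[idx, ]
test  <- mtcars[-idx, ]

# Logistic regression; with so few rows glm() may warn about separation
logit_fit <- glm(am ~ mpg + disp + hp + wt, data = train, family = binomial)
prob   <- predict(logit_fit, newdata = test, type = "response")
pred   <- factor(as.integer(prob > 0.5), levels = c(0, 1))
actual <- factor(test$am, levels = c(0, 1))

# Confusion matrix with fixed levels so it is always 2 x 2
cm <- table(Predicted = pred, Actual = actual)
sensitivity <- cm["1", "1"] / sum(cm[, "1"])  # true positives / actual positives
specificity <- cm["0", "0"] / sum(cm[, "0"])  # true negatives / actual negatives

# Naive Bayes, only if e1071 is available (assumed add-on package)
if (requireNamespace("e1071", quietly = TRUE)) {
  train$am_f <- factor(train$am, levels = c(0, 1))
  nb_fit  <- e1071::naiveBayes(am_f ~ mpg + disp + hp + wt, data = train)
  nb_pred <- predict(nb_fit, newdata = test)
  nb_cm   <- table(Predicted = nb_pred, Actual = actual)
  print(nb_cm)
}
print(cm)
```

With only about 10 test rows, sensitivity and specificity move in large steps, so any comparison between the two models should be read cautiously.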
Do as follows using the given dataset of 10 US cities in R Studio with R script:
| City | Atlanta | Chicago | Denver | Houston | Los Angeles | Miami | New York | San Francisco | Seattle | Washington D.C |
|---|---|---|---|---|---|---|---|---|---|---|
| Atlanta | 0 | 587 | 1212 | 701 | 1936 | 604 | 748 | 2139 | 2182 | 543 |
| Chicago | 587 | 0 | 920 | 940 | 1745 | 1188 | 713 | 1858 | 1737 | 597 |
| Denver | 1212 | 920 | 0 | 879 | 831 | 1726 | 1631 | 949 | 1021 | 1494 |
| Houston | 701 | 940 | 879 | 0 | 1374 | 968 | 1420 | 1645 | 1891 | 1220 |
| Los Angeles | 1936 | 1745 | 831 | 1374 | 0 | 2339 | 2451 | 347 | 959 | 2300 |
| Miami | 604 | 1188 | 1726 | 968 | 2339 | 0 | 1092 | 2594 | 2734 | 923 |
| New York | 748 | 713 | 1631 | 1420 | 2451 | 1092 | 0 | 2571 | 2408 | 205 |
| San Francisco | 2139 | 1858 | 949 | 1645 | 347 | 2594 | 2571 | 0 | 678 | 2442 |
| Seattle | 2182 | 1737 | 1021 | 1891 | 959 | 2734 | 2408 | 678 | 0 | 2329 |
| Washington D.C | 543 | 597 | 1494 | 1220 | 2300 | 923 | 205 | 2442 | 2329 | 0 |
a) Get this data in R and compute the dissimilarity distance as a city.dissimilarity object.
b) Fit a classical multidimensional scaling model using the city.dissimilarity object.
c) Get the summary of the model and interpret it carefully.
d) Get the bi-plot of the model and interpret it carefully.
e) Compare this model with the first two components from a principal component analysis model on this data.
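Parts (a), (b) and (d) can be sketched with base R's `cmdscale()`. The intermediate names `cities`, `d` and `mds` are illustrative; only `city.dissimilarity` comes from the question. The sketch treats the table as an already computed distance matrix and converts it with `as.dist()`, which is one reading of "compute dissimilarity distance".

```r
cities <- c("Atlanta", "Chicago", "Denver", "Houston", "Los Angeles", "Miami",
            "New York", "San Francisco", "Seattle", "Washington D.C")

d <- matrix(c(
     0,  587, 1212,  701, 1936,  604,  748, 2139, 2182,  543,
   587,    0,  920,  940, 1745, 1188,  713, 1858, 1737,  597,
  1212,  920,    0,  879,  831, 1726, 1631,  949, 1021, 1494,
   701,  940,  879,    0, 1374,  968, 1420, 1645, 1891, 1220,
  1936, 1745,  831, 1374,    0, 2339, 2451,  347,  959, 2300,
   604, 1188, 1726,  968, 2339,    0, 1092, 2594, 2734,  923,
   748,  713, 1631, 1420, 2451, 1092,    0, 2571, 2408,  205,
  2139, 1858,  949, 1645,  347, 2594, 2571,    0,  678, 2442,
  2182, 1737, 1021, 1891,  959, 2734, 2408,  678,    0, 2329,
   543,  597, 1494, 1220, 2300,  923,  205, 2442, 2329,    0
), nrow = 10, byrow = TRUE, dimnames = list(cities, cities))

# a) Dissimilarity object built from the symmetric distance matrix
city.dissimilarity <- as.dist(d)

# b) Classical (metric) multidimensional scaling in two dimensions
mds <- cmdscale(city.dissimilarity, k = 2, eig = TRUE)
mds$GOF                       # goodness-of-fit of the 2-dimensional solution

# d) Map of the cities on the two MDS coordinates
plot(mds$points, type = "n", xlab = "Coordinate 1", ylab = "Coordinate 2")
text(mds$points, labels = cities)
```

The orientation of the recovered map is arbitrary, so the configuration may appear mirrored or rotated relative to a geographic map.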
OR
Use the first four variables of the "iris" dataset in R Studio and do as follows with R script:
a) Fit a k-means clustering model on the data with k=2 and k=3.
b) Plot the clusters formed with k=3 in a single graph and interpret them carefully.
c) Add cluster centers to the plot of clusters formed with k=3 above and interpret it carefully.
d) Compare the k=3 cluster variable with the Species variable of the iris data using a confusion matrix and interpret the result carefully.
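A short base R sketch of the four parts follows. The seed, `nstart = 25`, and plotting on the two petal variables are assumptions made for illustration; any pair of the four variables could be plotted.

```r
set.seed(123)                         # hypothetical seed for reproducible starts
x <- iris[, 1:4]                      # first four (numeric) variables

# a) k-means with k = 2 and k = 3; nstart tries several random starts
km2 <- kmeans(x, centers = 2, nstart = 25)
km3 <- kmeans(x, centers = 3, nstart = 25)

# b) Clusters for k = 3, coloured by cluster membership
plot(x$Petal.Length, x$Petal.Width, col = km3$cluster, pch = 19,
     xlab = "Petal length", ylab = "Petal width", main = "k-means, k = 3")

# c) Cluster centers added to the same plot
points(km3$centers[, c("Petal.Length", "Petal.Width")],
       pch = 8, cex = 2, col = 1:3)

# d) Cross-tabulation of cluster labels against the known species
comparison <- table(Cluster = km3$cluster, Species = iris$Species)
comparison
```

Cluster numbers are arbitrary labels, so the table should be read by matching each cluster to the species it mostly contains rather than by its diagonal.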