Master in Data Science (SMS, TU) Statistical Computing with R Question Paper 2082 (Second Assessment; Set pages 13-14) Nepal
This is the official Master in Data Science (SMS, TU) Statistical Computing with R question paper for 2082 BS (Set pages 13-14), as set in the Second Assessment examination. It carries 45 full marks, with a time allowance of 120 minutes, across 10 questions.
| Level | Master in Data Science (SMS, TU) |
|---|---|
| Subject | Statistical Computing with R |
| Year | 2082 BS |
| Exam session | Second Assessment · Set pages 13-14 |
| Full marks | 45 |
| Time allowed | 120 minutes |
| Questions | 10, all with step-by-step solutions |
Group A
Describe the following concepts with examples: a) Wilkinson's approach to the grammar of graphics b) The ggplot2 approach to the grammar of graphics
Discuss the following concepts with clear examples: a) BLUE in linear regression b) LINE in linear regression
Explain the following concepts with clear examples: a) Spearman's rank correlation b) Mann-Whitney U test
Explain supervised learning classification and regression models with reference to: a) Multinomial logistic regression b) Random forest model
Discuss supervised linear regression models with a focus on: a) Support vector machine b) Multilayer perceptron
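Both concepts in the question above are rank-based, so a short R sketch may help make them concrete. The built-in `mtcars` dataset and the chosen variables are assumptions for illustration only and are not part of the paper.

```r
# Illustrative data only: mtcars is an assumption, not part of the paper.
data(mtcars)

# Spearman's rank correlation between fuel economy and weight.
# exact = FALSE avoids the warning about exact p-values with ties.
sp <- cor.test(mtcars$mpg, mtcars$wt, method = "spearman", exact = FALSE)
print(sp$estimate)  # rho: the Pearson correlation of the ranks

# Mann-Whitney U test (called the Wilcoxon rank-sum test in R),
# comparing mpg between automatic (am = 0) and manual (am = 1) cars.
mw <- wilcox.test(mpg ~ am, data = mtcars, exact = FALSE)
print(mw$statistic)
print(mw$p.value)
```

Spearman's rho measures monotonic association between two variables, while the Mann-Whitney test compares the location of one variable across two independent groups without assuming normality.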
Group B
Do the following in RStudio using the ggplot2 and dplyr packages and knit the output as a PDF file: a) Create a dataset with the following variables: age (20-59 years), height (110-190 centimeters), weight (40-90 kg), with 150 random cases of each variable. Your roll number must be used to set the random seed. b) Compute the body mass index (BMI) variable as: BMI = [(weight in kg) / (height in meters, squared)] c) Create body mass index categories: <18, 18-24, 25-30, 30+ and label them as "underweight", "normal", "overweight" and "obese" respectively using the dplyr package d) Show the percentage distribution of the labelled BMI variable with a pie chart using the ggplot2 package
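One possible approach to parts (a) to (d) above is sketched below. The seed value 25 stands in for a roll number, the variables are drawn from uniform distributions, and the category boundaries are taken as cut points at 18, 25 and 30 with left-closed intervals; all of these are assumptions, since the question leaves them open.

```r
library(dplyr)
library(ggplot2)

set.seed(25)  # hypothetical roll number; replace with your own

# (a) 150 random cases per variable (uniform sampling is an assumption)
n_cases <- 150
people <- data.frame(
  age    = sample(20:59, n_cases, replace = TRUE),
  height = runif(n_cases, min = 110, max = 190),  # centimeters
  weight = runif(n_cases, min = 40, max = 90)     # kilograms
)

# (b) BMI and (c) labelled categories, using dplyr
people <- people %>%
  mutate(
    bmi = weight / (height / 100)^2,  # height converted to meters
    bmi_cat = cut(
      bmi,
      breaks = c(-Inf, 18, 25, 30, Inf),  # assumed cut points
      labels = c("underweight", "normal", "overweight", "obese"),
      right = FALSE
    )
  )

# (d) percentage distribution and pie chart
bmi_pct <- people %>%
  count(bmi_cat) %>%
  mutate(pct = 100 * n / sum(n))

pie <- ggplot(bmi_pct, aes(x = "", y = pct, fill = bmi_cat)) +
  geom_col(width = 1) +
  coord_polar(theta = "y") +
  geom_text(aes(label = sprintf("%.1f%%", pct)),
            position = position_stack(vjust = 0.5)) +
  labs(fill = "BMI category",
       title = "Percentage distribution of BMI categories") +
  theme_void()
print(pie)
```

ggplot2 has no dedicated pie geom, so the sketch draws a single stacked bar and wraps it with `coord_polar(theta = "y")`.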
Do the following in RStudio using the "airquality" dataset of R, with an R script to knit PDF output: a) Perform a test of normality of the "Temp" variable by each category of the "Month" variable and interpret all the results carefully b) Perform a test of equality of variance on the "Temp" variable by each category of the "Month" variable and interpret the result carefully c) Compare Temp by Month using the best statistical test for this data and interpret the result carefully d) Perform the post-hoc test and interpret the results carefully
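The workflow in parts (a) to (d) above could be sketched in base R as follows. The choice of Shapiro-Wilk and Bartlett tests, the 0.05 threshold, and the Bonferroni adjustment are assumptions; `car::leveneTest` is a common alternative for part (b) if the car package is installed.

```r
data(airquality)
aq <- transform(airquality, Month = factor(Month))

# (a) Shapiro-Wilk normality test of Temp within each month
normality_p <- tapply(aq$Temp, aq$Month,
                      function(x) shapiro.test(x)$p.value)
print(round(normality_p, 4))

# (b) Equality of variances across months (Bartlett's test)
var_test <- bartlett.test(Temp ~ Month, data = aq)
print(var_test)

# (c) and (d) Pick the overall test and its post-hoc test from (a) and (b)
if (all(normality_p > 0.05) && var_test$p.value > 0.05) {
  # Assumptions look tenable: one-way ANOVA with Tukey HSD
  fit <- aov(Temp ~ Month, data = aq)
  overall <- summary(fit)
  posthoc <- TukeyHSD(fit)
} else {
  # Otherwise: Kruskal-Wallis with pairwise Wilcoxon tests
  overall <- kruskal.test(Temp ~ Month, data = aq)
  posthoc <- pairwise.wilcox.test(aq$Temp, aq$Month,
                                  p.adjust.method = "bonferroni",
                                  exact = FALSE)
}
print(overall)
print(posthoc)
```

The branch is deliberately simple. If normality holds but the variances differ, Welch's `oneway.test(Temp ~ Month, data = aq)` would be another reasonable choice.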
Do the following in RStudio with an R script to knit PDF output: a) Create a dataset with 200 random cases, one random binary (1 and 0) variable and four random non-binary (categorical and continuous) variables, with your roll number as the random seed b) Divide it into train and test datasets with a 70:30 random split c) Fit supervised logistic regression and decision tree classification models on the train data with the binary variable as the dependent variable and all other four variables as independent variables d) Predict the dependent variable in the test dataset with both models and interpret the results carefully
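A minimal sketch of parts (a) to (d) above follows. The seed 25 is a placeholder roll number, the variable names and distributions are invented for illustration, and the decision tree uses the rpart package (one of several tree implementations in R). The 0.5 probability cut-off is also an assumption.

```r
library(rpart)

set.seed(25)  # hypothetical roll number; replace with your own

# (a) 200 cases: one binary outcome, two continuous and two categorical predictors
n_cases <- 200
dat <- data.frame(
  y  = rbinom(n_cases, size = 1, prob = 0.5),
  x1 = rnorm(n_cases, mean = 50, sd = 10),
  x2 = runif(n_cases, min = 0, max = 100),
  x3 = factor(sample(c("low", "medium", "high"), n_cases, replace = TRUE)),
  x4 = factor(sample(c("urban", "rural"), n_cases, replace = TRUE))
)

# (b) 70:30 random split
train_idx <- sample(seq_len(n_cases), size = round(0.7 * n_cases))
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

# (c) Logistic regression and classification tree on the training data
logit_fit <- glm(y ~ x1 + x2 + x3 + x4, data = train, family = binomial)
tree_fit  <- rpart(y ~ x1 + x2 + x3 + x4, data = train, method = "class")

# (d) Predict on the test data and compare with the observed outcome
logit_prob <- predict(logit_fit, newdata = test, type = "response")
logit_pred <- ifelse(logit_prob > 0.5, 1, 0)
tree_pred  <- predict(tree_fit, newdata = test, type = "class")

logit_cm <- table(actual = test$y, predicted = logit_pred)
tree_cm  <- table(actual = test$y, predicted = tree_pred)
logit_acc <- mean(logit_pred == test$y)
tree_acc  <- mean(as.character(tree_pred) == as.character(test$y))

print(logit_cm)
print(tree_cm)
print(c(logistic = logit_acc, tree = tree_acc))
```

Because every variable here is generated independently, neither model should be expected to beat chance by much; that observation is itself part of a careful interpretation.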
Do as follows using the built-in "USArrests" dataset, with an R script to knit PDF output: a) Create a "criminality scale" from the four variables of this dataset using Principal Component Analysis b) Compute the eigenvalues and interpret the PCA result carefully using Kaiser's criterion c) Show the scree plot and decide on the number of components to retain with careful interpretation d) Revise the criminality scale using VARIMAX rotation and interpret the result carefully
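The PCA steps in parts (a) to (d) above can be sketched with base R as below. Using the first component's scores as the "criminality scale" and rotating the first two components are assumptions made for illustration; the number of components to rotate should follow from the answer to parts (b) and (c).

```r
data(USArrests)

# (a) PCA on standardized variables
pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)
crime_scale <- pca$x[, 1]  # assumed scale: scores on the first component

# (b) Eigenvalues of the correlation matrix and Kaiser's criterion
eigenvalues <- pca$sdev^2
print(round(eigenvalues, 3))
n_kaiser <- sum(eigenvalues > 1)  # components with eigenvalue above 1
print(n_kaiser)

# (c) Scree plot
screeplot(pca, type = "lines", main = "Scree plot: USArrests")

# (d) Varimax rotation of the loadings of the first k components
k <- 2  # assumption for illustration; varimax needs at least two columns
loadings_k <- pca$rotation[, 1:k] %*% diag(pca$sdev[1:k])
rotated <- varimax(loadings_k)
print(rotated$loadings)
```

Because the variables are standardized, the eigenvalues sum to the number of variables (four), which gives a quick sanity check on the output.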
OR
Do as follows using the given dataset of distances between 10 US cities in RStudio with an R script:
| City | Atlanta | Chicago | Denver | Houston | Los Angeles | Miami | New York | San Francisco | Seattle | Washington D.C |
|---|---|---|---|---|---|---|---|---|---|---|
| Atlanta | 0 | 587 | 1212 | 701 | 1936 | 604 | 748 | 2139 | 2182 | 543 |
| Chicago | 587 | 0 | 920 | 940 | 1745 | 1188 | 713 | 1858 | 1737 | 597 |
| Denver | 1212 | 920 | 0 | 879 | 831 | 1726 | 1631 | 949 | 1021 | 1494 |
| Houston | 701 | 940 | 879 | 0 | 1374 | 968 | 1420 | 1645 | 1891 | 1220 |
| Los Angeles | 1936 | 1745 | 831 | 1374 | 0 | 2339 | 2451 | 347 | 959 | 2300 |
| Miami | 604 | 1188 | 1726 | 968 | 2339 | 0 | 1092 | 2594 | 2734 | 923 |
| New York | 748 | 713 | 1631 | 1420 | 2451 | 1092 | 0 | 2571 | 2408 | 205 |
| San Francisco | 2139 | 1858 | 949 | 1645 | 347 | 2594 | 2571 | 0 | 678 | 2442 |
| Seattle | 2182 | 1737 | 1021 | 1891 | 959 | 2734 | 2408 | 678 | 0 | 2329 |
| Washington D.C | 543 | 597 | 1494 | 1220 | 2300 | 923 | 205 | 2442 | 2329 | 0 |
a) Get the dissimilarity distances as a city.dissimilarity object b) Fit a classical multidimensional scaling model using the city.dissimilarity object c) Get the summary of the model and interpret it carefully d) Get the biplot of the model and interpret it carefully
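A possible sketch of parts (a) to (d) above uses `cmdscale` from base R on the distance table given in the question. Treating the labelled two-dimensional configuration plot as the requested biplot, and the eigenvalues plus goodness of fit as the model summary, are assumptions.

```r
cities <- c("Atlanta", "Chicago", "Denver", "Houston", "Los Angeles",
            "Miami", "New York", "San Francisco", "Seattle", "Washington D.C")

# Distances copied row by row from the table in the question
dist_values <- c(
     0,  587, 1212,  701, 1936,  604,  748, 2139, 2182,  543,
   587,    0,  920,  940, 1745, 1188,  713, 1858, 1737,  597,
  1212,  920,    0,  879,  831, 1726, 1631,  949, 1021, 1494,
   701,  940,  879,    0, 1374,  968, 1420, 1645, 1891, 1220,
  1936, 1745,  831, 1374,    0, 2339, 2451,  347,  959, 2300,
   604, 1188, 1726,  968, 2339,    0, 1092, 2594, 2734,  923,
   748,  713, 1631, 1420, 2451, 1092,    0, 2571, 2408,  205,
  2139, 1858,  949, 1645,  347, 2594, 2571,    0,  678, 2442,
  2182, 1737, 1021, 1891,  959, 2734, 2408,  678,    0, 2329,
   543,  597, 1494, 1220, 2300,  923,  205, 2442, 2329,    0
)
city_matrix <- matrix(dist_values, nrow = 10, byrow = TRUE,
                      dimnames = list(cities, cities))

# (a) Dissimilarity object
city.dissimilarity <- as.dist(city_matrix)

# (b) Classical (metric) multidimensional scaling in two dimensions
mds <- cmdscale(city.dissimilarity, k = 2, eig = TRUE)

# (c) Summary: eigenvalues and goodness of fit
print(round(mds$eig, 1))
print(mds$GOF)

# (d) Plot of the two-dimensional configuration with city labels
plot(mds$points, type = "n", xlab = "Dimension 1", ylab = "Dimension 2",
     main = "Classical MDS of 10 US cities")
text(mds$points, labels = cities, cex = 0.8)
```

The orientation of an MDS solution is arbitrary, so the axes may appear mirrored relative to a geographic map.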
Do the following in RStudio with an R script to knit PDF output: a) Create a dataset with 200 random cases and five random variables with your roll number as the random seed b) Fit a hierarchical clustering model using single linkage and get the dendrogram for this model c) Fit a hierarchical clustering model using complete linkage and get the dendrogram for this model d) Fit a hierarchical clustering model using average linkage and get the dendrogram for this model e) Find the best hierarchical clustering model for this data and locate the number of clusters for it
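Parts (a) to (e) above might be approached as in the sketch below. The seed 25 is a placeholder roll number, the variables are standard normal draws, and the cophenetic correlation is used as one possible yardstick for the "best" linkage; the question does not prescribe a criterion. The value `k = 3` in `cutree` is purely illustrative.

```r
set.seed(25)  # hypothetical roll number; replace with your own

# (a) 200 random cases, five random variables
dat <- as.data.frame(matrix(rnorm(200 * 5), ncol = 5))
d <- dist(scale(dat))  # Euclidean distances on standardized data

# (b)-(d) Single, complete and average linkage, with dendrograms
methods <- c("single", "complete", "average")
fits <- lapply(methods, function(m) hclust(d, method = m))
names(fits) <- methods
for (m in methods) {
  plot(fits[[m]], labels = FALSE, main = paste(m, "linkage"))
}

# (e) Compare linkages by cophenetic correlation (assumed criterion)
coph <- sapply(fits, function(f) cor(d, cophenetic(f)))
print(round(coph, 3))
best <- names(which.max(coph))
print(best)

# Cut the chosen tree; k = 3 is an illustrative value only
clusters <- cutree(fits[[best]], k = 3)
print(table(clusters))
```

In practice the number of clusters would be read off the dendrogram of the chosen model, for example at the largest gap between successive merge heights.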
OR
Load the four variables of the "USArrests" dataset into RStudio and do as follows, with an R script to knit PDF output: a) Fit a k-means clustering model to the data with k=2, plot it in a single graph and interpret it carefully b) Add cluster centers to the plot of clusters formed with k=2 and interpret it carefully c) Fit a k-means clustering model to the data with k=3, plot it in a single graph and interpret it carefully d) Add cluster centers to the plot of clusters formed with k=3 and interpret it carefully
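One way to sketch parts (a) to (d) above is to cluster the standardized variables and draw the result on the first two principal components, so that all four variables appear in a single graph. The helper `plot_kmeans`, the seed value, the standardization and the PCA projection are all assumptions for illustration.

```r
data(USArrests)
X <- scale(USArrests)  # standardize so no variable dominates the distances
pca <- prcomp(X)       # used only to project the clusters to two dimensions

# Hypothetical helper: fit k-means, plot clusters and mark their centers
plot_kmeans <- function(k) {
  km <- kmeans(X, centers = k, nstart = 25)
  plot(pca$x[, 1:2], col = km$cluster, pch = 19,
       xlab = "PC1", ylab = "PC2", main = paste("k-means with k =", k))
  # Project the cluster centers into the same principal component space
  centers_pc <- predict(pca, newdata = km$centers)
  points(centers_pc[, 1:2, drop = FALSE], pch = 8, cex = 2, lwd = 2)
  invisible(km)
}

set.seed(25)  # hypothetical seed for reproducible cluster assignments
km2 <- plot_kmeans(2)  # parts (a) and (b)
km3 <- plot_kmeans(3)  # parts (c) and (d)

print(km2$size)
print(km3$size)
```

Comparing `km2$tot.withinss` with `km3$tot.withinss` gives a simple numerical basis for discussing how much the third cluster tightens the fit.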