Level: Master in Data Science (SMS, TU)
Subject: Statistical Computing with R
Year: 2080 BS
Exam session: Fa Reassessment · Set: First Re-assessment 2080
Full marks: 45
Time allowed: 120 minutes
Questions: 10, all with step-by-step solutions

Group A

5 questions · 3 marks each

1. Short answer (3 marks)

Explain how to import these types of data in R using the "dplyr" package:

a) Tab-separated values text file
b) Comma-separated values text file
c) SPSS data file

Tags: data-import, dplyr
2. Short answer (3 marks)

Explain the following data types in R with examples:

a) An integer variable is different from a numeric variable
b) A categorical variable is different from a factor variable
c) A Date variable is different from a date-time variable

Tags: data-types, r
3. Short answer (3 marks)

Explain these terms with examples in R:

a) Getting a multi-way table with an array
b) Creating class intervals of a continuous variable
c) Missingness vs nothingness

Tags: arrays, class-intervals, missingness
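
The three terms in question 3 can be illustrated with base R alone. This is a minimal sketch, not a model answer: the data frame `df`, its columns and the break points are hypothetical, and "nothingness" is read here as `NULL` (an assumption about the intended meaning).

```r
# Hypothetical data: sex, smoking status, study group and age of 8 people
df <- data.frame(
  sex    = c("M", "F", "M", "F", "M", "F", "M", "F"),
  smoker = c("yes", "no", "no", "yes", "yes", "no", "no", "no"),
  group  = c("a", "a", "b", "b", "a", "a", "b", "b"),
  age    = c(23, 35, 41, 58, 62, 19, 47, 30)
)

# a) Multi-way table: table() on three variables returns a 3-dimensional array
tab3 <- table(df$sex, df$smoker, df$group)
is.array(tab3)   # TRUE
dim(tab3)        # one dimension per variable

# b) Class intervals: cut() converts a continuous variable into a factor
#    (break points chosen for illustration only)
age_class <- cut(df$age, breaks = c(0, 20, 40, 60, 80))
table(age_class)

# c) Missingness (NA) keeps its place in a vector; nothingness (NULL) does not
with_na   <- c(1, NA, 3)
with_null <- c(1, NULL, 3)
length(with_na)     # NA still counts as an element
length(with_null)   # NULL is dropped
mean(with_na)                 # NA, because one value is unknown
mean(with_na, na.rm = TRUE)   # mean of the observed values
```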
4. Short answer (3 marks)

Explain the following with examples in R:

a) Reference range based on mean
b) Reference range based on median
c) Outliers and extreme values

Tags: reference-range, outliers
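
A small base R sketch of the ideas in question 4. The vector `x` is hypothetical, and the cut-offs are common conventions rather than the course's stated definitions: mean ± 2 SD for the mean-based range, Tukey's 1.5 × IQR fences around the quartiles for the median-based range, and 3 × IQR fences for extreme values.

```r
# Hypothetical measurements with one unusually large value
x <- c(12, 15, 14, 13, 16, 15, 14, 13, 15, 40)

# a) Reference range based on the mean: mean +/- 2 standard deviations
rr_mean <- mean(x) + c(-2, 2) * sd(x)

# b) Reference range based on the median: quartiles +/- 1.5 * IQR
q <- unname(quantile(x, c(0.25, 0.75)))
rr_median <- c(q[1] - 1.5 * IQR(x), q[2] + 1.5 * IQR(x))

# c) Outliers fall outside a reference range;
#    extreme values fall outside the wider 3 * IQR fences
out_mean <- x[x < rr_mean[1]   | x > rr_mean[2]]
out_iqr  <- x[x < rr_median[1] | x > rr_median[2]]
extreme  <- x[x < q[1] - 3 * IQR(x) | x > q[2] + 3 * IQR(x)]

rr_mean
rr_median
out_mean
out_iqr
extreme
```

The mean-based range is itself pulled upward by the large value, which is one reason the median-based range is often preferred for skewed data.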
5. Short answer (3 marks)

Explain the following in R with examples:

a) Nodes and edges
b) Diameter
c) Edge density

Tags: graph-theory, sna

Group B

5 questions · 6 marks each

6. Long answer (6 marks)

Open RStudio and do the following with an R script, then knit the HTML output:

a) What happens when 4L is multiplied by 3.2?
b) What happens when 4L is multiplied by 2L?
c) Define blood with O, O, A, A, B, B and check its type and attributes, with your comments.
d) Define x with 1, 2, NA, 8, 3, NA, 3 and get its mean with and without pipes.
e) Get the first and sixth elements of x using subsetting code, and explain the code.

Tags: r-data-types, subsetting
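
One way the steps of question 6 might look in an R script, offered as a sketch rather than the official solution. It assumes R 4.1 or later for the native pipe `|>`, and it shows `blood` both as a character vector and as a factor because the question does not say which is intended.

```r
# a) integer * double: the integer is coerced, so the result is a double
a <- 4L * 3.2
a
class(a)

# b) integer * integer: the result stays an integer
b <- 4L * 2L
b
class(b)

# c) blood as a character vector has no attributes;
#    as a factor it is stored as integer codes with levels and a class
blood <- c("O", "O", "A", "A", "B", "B")
typeof(blood)
attributes(blood)
blood_f <- factor(blood)
typeof(blood_f)
attributes(blood_f)

# d) mean of x, ignoring missing values, without and with a pipe
x <- c(1, 2, NA, 8, 3, NA, 3)
m_plain <- mean(x, na.rm = TRUE)
m_pipe  <- x |> mean(na.rm = TRUE)   # native pipe, needs R >= 4.1

# e) positive integer subsetting: positions 1 and 6
#    (position 6 holds one of the NA values)
x[c(1, 6)]
```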
7. Long answer (6 marks)

Do the following in RStudio with an R script, then knit the HTML output:

a) Define an object "rating" with 9, 2, 5, 8, 6, 1, 3, 2, 8, 4, 6, 8, 7, 1, 2, 6, 10, 5, 6, 9, 6, 2, 4, 7
b) Replicate in R the table below, obtained from SPSS for the rating object

rating

            Frequency   Percent   Valid Percent   Cumulative Percent
Valid   1           2       8.3             8.3                  8.3
        2           4      16.7            16.7                 25.0
        3           1       4.2             4.2                 29.2
        4           2       8.3             8.3                 37.5
        5           2       8.3             8.3                 45.8
        6           5      20.8            20.8                 66.7
        7           2       8.3             8.3                 75.0
        8           3      12.5            12.5                 87.5
        9           2       8.3             8.3                 95.8
       10           1       4.2             4.2                100.0
    Total          24     100.0           100.0

Tags: frequency-table, spss, r
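
The SPSS table above can be approximated in base R roughly as follows. This is a sketch under the assumption that `rating` has no missing values, so "Valid Percent" equals "Percent"; the column names of `out` are chosen here for illustration.

```r
rating <- c(9, 2, 5, 8, 6, 1, 3, 2, 8, 4, 6, 8, 7, 1, 2, 6,
            10, 5, 6, 9, 6, 2, 4, 7)

freq <- table(rating)                        # counts per distinct rating
pct  <- as.vector(prop.table(freq)) * 100    # unrounded percentages

out <- data.frame(
  Rating             = names(freq),
  Frequency          = as.vector(freq),
  Percent            = round(pct, 1),
  Valid_Percent      = round(pct, 1),          # same as Percent: no NAs
  Cumulative_Percent = round(cumsum(pct), 1)   # cumulate before rounding
)

# Append the Total row; SPSS leaves its cumulative percent blank
out <- rbind(
  out,
  data.frame(Rating = "Total", Frequency = sum(freq), Percent = 100,
             Valid_Percent = 100, Cumulative_Percent = NA)
)

print(out, row.names = FALSE)
```

Cumulating the unrounded percentages and rounding at the end avoids the drift that would come from summing already-rounded values.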
8. Long answer (6 marks)

Use the "air quality" data as AQ to do the following in RStudio with an R script, then knit the HTML output:

a) Replace missing values of the Ozone variable with the best measure of central tendency
b) Create a Date variable in AQ using the Month and Day variables for the year 2022
c) Create a line plot of the "Ozone" variable with "Date" as the row index and interpret it carefully
d) Get class intervals of the cleaned Ozone variable using range, its square root and zero rounding
e) Get the frequency distribution (n and %) of the Ozone class intervals and interpret it carefully

Tags: air-quality, missing-values, class-intervals
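
A possible base R outline for question 8, using the built-in `airquality` data set. Several choices here are assumptions, not the examiner's stated method: the median is taken as the "best" measure because Ozone is right-skewed, and part d) is read as the square-root rule (number of classes ≈ √n, class width = range / classes, both rounded to zero decimals). Other readings of part d) are possible.

```r
AQ <- airquality

# a) Impute missing Ozone values with the median (assumed choice for skewed data)
AQ$Ozone[is.na(AQ$Ozone)] <- median(AQ$Ozone, na.rm = TRUE)

# b) Build a Date from Month and Day for the year 2022
AQ$Date <- as.Date(paste(2022, AQ$Month, AQ$Day, sep = "-"),
                   format = "%Y-%m-%d")

# c) Line plot of Ozone over time
plot(AQ$Date, AQ$Ozone, type = "l", xlab = "Date", ylab = "Ozone (ppb)")

# d) Class intervals via the square-root rule (one interpretation)
k <- round(sqrt(nrow(AQ)))               # number of classes
w <- round(diff(range(AQ$Ozone)) / k)    # class width, zero decimals
breaks <- seq(min(AQ$Ozone), by = w, length.out = k + 1)
AQ$Ozone_class <- cut(AQ$Ozone, breaks = breaks, include.lowest = TRUE)

# e) Frequency distribution: n and %
n <- table(AQ$Ozone_class)
freq_tab <- data.frame(
  class   = names(n),
  n       = as.vector(n),
  percent = round(100 * as.vector(prop.table(n)), 1)
)
print(freq_tab, row.names = FALSE)
```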
9. Long answer (6 marks)

Do the following in RStudio with the tidyverse package using an R script, then knit the HTML output:

a) Define a tibble having country, year, cases and population variables with 100 random values each
b) Transform the cases variable as the log of cases (LnCase) and the population variable as the log of population (LnPop)
c) Create scatterplots of (1) Cases and population, (2) LnCase and population, (3) Cases and LnPop and (4) LnCase and LnPop in a single graph window with base R plot code and interpret them carefully

OR

Load the "igraph" package in RStudio and do the basic SNA as follows with an R script and HTML output:

a) Define g as a graph object with (1,2,2,3,3,4,4,1) as its elements
b) Plot g with node color green, node size 30, link color red and link size 5, and interpret it
c) Plot g with the undirected argument and interpret it carefully
d) Plot g with seven nodes and interpret it carefully
e) Get the degree, closeness and betweenness of g and interpret them carefully

Tags: tidyverse, igraph, sna
10. Long answer (6 marks)

Use the cleaned "AQ" file in RStudio and do as follows with R scripts and HTML outputs:

a) Get the reference range of the "Ozone" variable using the mean and standard deviation
b) Plot a histogram of the "Ozone" variable and show the outliers of "Ozone" with the reference range limits
c) Get the reference range of the "Ozone" variable using the median and inter-quartile range
d) Plot a boxplot of the "Ozone" variable and show the outliers of "Ozone" with the reference range limits
e) Write a summary of the results obtained from the histogram and boxplot

OR

Do as follows in RStudio with an R script and HTML outputs:

a) Open R, go to Help, then "Manuals (in PDF)", and open the "An Introduction to R" file
b) Import this PDF file into R using the "pdftools" package
c) Perform pre-processing and then create a corpus
d) Find the most frequent terms and create a histogram of the most frequent terms
e) Create a word cloud of the corpus and color it using rainbow or the RColorBrewer package
f) Perform topic modelling and interpret the result carefully

Tags: reference-range, outliers, text-mining
