Level: Master in Data Science (SMS, TU)
Subject: Fundamentals of Data Science
Year: 2081 BS
Exam session: Fa
Full marks: 45
Time allowed: 120 minutes
Questions: 10, all with step-by-step solutions

Group A

5 questions · 3 marks each
1. Short answer (3 marks)

Why is Data Science often regarded as a role with blurry or ambiguous boundaries? Provide rationale to support your explanation.

data-science, role-boundaries
2. Short answer (3 marks)

Compare and contrast feature generation and feature selection algorithms.

feature-engineering, feature-selection
3. Short answer (3 marks)

Discuss machine learning and its types.

machine-learning
4. Short answer (3 marks)

Discuss common methods of data validation that can be applied to ensure the quality and integrity of the dataset.

data-validation, data-quality
5. Short answer (3 marks)

Briefly explain the ETL and ELT processes of data migration.

etl, elt, data-migration

Group B

5 questions · 6 marks each
6. Long answer (6 marks)

Elaborate on TDSP (Team Data Science Process) as a framework for the data science lifecycle.

OR

Discuss CRISP-DM (Cross-Industry Standard Process for Data Mining) as an agile approach to the data science lifecycle.

tdsp, crisp-dm, data-science-lifecycle
7. Long answer (6 marks)

Consider a dataset representing whether students passed an exam based on three features: Study Hours (Low, Medium, High), Previous Grades (Low, Medium, High), and Tutoring (Yes or No). The target variable is Exam Result (Pass or Fail).

Study Hours | Previous Grades | Tutoring | Exam Result
Low         | Low             | Yes      | Fail
Low         | Medium          | No       | Fail
Medium      | High            | Yes      | Pass
High        | Low             | No       | Fail
Medium      | Medium          | Yes      | Pass
High        | High            | Yes      | Pass
High        | High            | No       | Pass
Low         | Low             | No       | Fail

Using the ID3 algorithm, calculate the information gain for each feature (Study Hours, Previous Grades, Tutoring) and determine which feature should be chosen as the root node for the decision tree.
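One way to check a hand calculation for this question is a short script. The sketch below is not the official solution. It assumes the standard ID3 definitions (base-2 Shannon entropy, and information gain as parent entropy minus the weighted entropy of the child groups). The names `ROWS`, `FEATURES`, `entropy`, and `information_gain` are illustrative, not part of the paper.

```python
from collections import Counter
from math import log2

# The eight rows from the question: (Study Hours, Previous Grades, Tutoring, Exam Result)
ROWS = [
    ("Low", "Low", "Yes", "Fail"),
    ("Low", "Medium", "No", "Fail"),
    ("Medium", "High", "Yes", "Pass"),
    ("High", "Low", "No", "Fail"),
    ("Medium", "Medium", "Yes", "Pass"),
    ("High", "High", "Yes", "Pass"),
    ("High", "High", "No", "Pass"),
    ("Low", "Low", "No", "Fail"),
]
FEATURES = ["Study Hours", "Previous Grades", "Tutoring"]


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, col):
    """Parent entropy minus the size-weighted entropy after splitting on column `col`."""
    labels = [row[-1] for row in rows]
    groups = {}
    for row in rows:
        groups.setdefault(row[col], []).append(row[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


gains = {name: information_gain(ROWS, i) for i, name in enumerate(FEATURES)}
root = max(gains, key=gains.get)  # ID3 picks the feature with the largest gain

for name, gain in gains.items():
    print(f"{name}: {gain:.4f}")
print("Root:", root)
```

Under these definitions the parent entropy is 1.0 (4 Pass, 4 Fail). The gains come out to roughly 0.656 for Study Hours, 0.750 for Previous Grades, and 0.189 for Tutoring, so Previous Grades would be chosen as the root.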

OR

Consider a dataset containing the coordinates of 8 points in a two-dimensional space:

  • Point 1: (2, 3)
  • Point 2: (3, 4)
  • Point 3: (3, 5)
  • Point 4: (4, 6)
  • Point 5: (7, 8)
  • Point 6: (8, 7)
  • Point 7: (9, 8)
  • Point 8: (10, 9)

Apply the K-Means algorithm to cluster these points into 3 clusters.
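The question does not state the initial centroids, so any worked answer has to pick them. The sketch below assumes Points 1, 4, and 7 as the starting centroids and uses squared Euclidean distance. Both are assumptions made for illustration, and a different initialization can give different clusters. The function names are hypothetical.

```python
POINTS = [(2, 3), (3, 4), (3, 5), (4, 6), (7, 8), (8, 7), (9, 8), (10, 9)]

# Assumption: the paper gives no initial centroids; Points 1, 4 and 7 are used here.
INITIAL_CENTROIDS = [POINTS[0], POINTS[3], POINTS[6]]


def squared_distance(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def assign(points, centroids):
    """Assignment step: index of the nearest centroid for each point."""
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
        for p in points
    ]


def update(points, labels, centroids):
    """Update step: move each centroid to the mean of its assigned points."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:  # keep an empty cluster's centroid where it was
            new_centroids.append(old)
            continue
        mean_x = sum(x for x, _ in members) / len(members)
        mean_y = sum(y for _, y in members) / len(members)
        new_centroids.append((mean_x, mean_y))
    return new_centroids


def k_means(points, centroids, max_iter=100):
    """Alternate assignment and update until the assignments stop changing."""
    centroids = [(float(x), float(y)) for x, y in centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
        centroids = update(points, labels, centroids)
    return labels, centroids


labels, centroids = k_means(POINTS, INITIAL_CENTROIDS)
print(labels)
print(centroids)
```

With this particular initialization the assignments stabilize after the first update: Points 1 and 2 form one cluster with centroid (2.5, 3.5), Points 3 and 4 a second with centroid (3.5, 5.5), and Points 5 to 8 a third with centroid (8.5, 8.0).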

id3, decision-tree, k-means
8. Long answer (6 marks)

You are analyzing a dataset containing information about customer orders for an e-commerce platform. However, upon initial inspection, you notice several data quality issues that may impact the reliability of your analysis.

Describe three common data quality issues that you may have identified in the dataset, providing specific examples for each issue. Explain the potential consequences of these issues on your analysis and propose strategies to address them effectively.

data-quality, data-cleaning
9. Long answer (6 marks)

Explain the linear regression algorithm with an appropriate example.

linear-regression
10. Long answer (6 marks)

Consider a dataset containing monthly sales data for a retail store over a period of two years. The dataset consists of the following columns: Date (representing the month), Sales (the total sales for that month), and Profit. Using this dataset, answer the following questions:

a) Define what a time series is and explain its importance in data analysis.
b) Identify and describe the different types of time series patterns that may exist in the sales data.

time-series, patterns
