NPTEL Introduction to Machine Learning - IITKGP Week 2 Assignment Answers 2024 (July-October)

Navigating the complex world of machine learning can be daunting, but NPTEL's course on "Introduction to Machine Learning" by IITKGP provides a structured approach to understanding key concepts. In Week 2, students are challenged with a series of questions that test their grasp of entropy, bias, decision trees, linear regression, and more. This article provides comprehensive answers to the Week 2 assignment, ensuring you understand both the solutions and the reasoning behind them.


Question 1:

Q: In a binary classification problem, out of 30 data points, 10 belong to class I and 20 belong to class II. What is the entropy of the data set?

  • A. 0.97
  • B. 0.91
  • C. 0.50
  • D. 0.67

A: B. 0.91

Reasoning: The entropy H of a dataset in a binary classification problem is H = -p_1 \log_2 p_1 - p_2 \log_2 p_2, where p_1 and p_2 are the proportions of the two classes. Here p_1 = 10/30 = 1/3 and p_2 = 20/30 = 2/3, so H = -\left( \frac{1}{3} \log_2 \frac{1}{3} + \frac{2}{3} \log_2 \frac{2}{3} \right) \approx 0.918.
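For a quick numerical check, here is a minimal Python sketch (the binary_entropy helper is our own naming, not part of the course material):

```python
import math

def binary_entropy(n1, n2):
    """Entropy of a two-class dataset given the two class counts."""
    total = n1 + n2
    p1, p2 = n1 / total, n2 / total
    return -(p1 * math.log2(p1) + p2 * math.log2(p2))

print(binary_entropy(10, 20))  # ~0.918, closest to option B (0.91)
```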

Question 2:

Q: Which of the following is false?

  • A. Bias is the true error of the best classifier in the concept class
  • B. Bias is high if the concept class cannot model the true data distribution well
  • C. High bias leads to overfitting

A: C. High bias leads to overfitting

Reasoning: High bias typically leads to underfitting, not overfitting. Overfitting is generally caused by low bias and high variance.

Question 3:

Q: Decision trees can be used for problems where

    1. the attributes are categorical.
    2. the attributes are numeric valued.
    3. the attributes are discrete valued.
  • A. 1 only
  • B. 1 and 2 only
  • C. 1 and 3 only
  • D. 1, 2 and 3

A: D. 1, 2 and 3

Reasoning: Decision trees can handle categorical, numeric, and discrete attributes.

Question 4:

Q: In linear regression, our hypothesis is h_\theta(x) = \theta_0 + \theta_1 x, and the training data is given in the table below. If the cost function is J(\theta) = \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x_i) - y_i)^2, where m is the number of training data points, what is the value of J(\theta) when \theta = (1, 1)?

x     y
7     8
5     4
11    10
2     3
  • A. 0
  • B. 2
  • C. 1
  • D. 0.25

A: C. 1

Reasoning: With \theta = (1, 1), the hypothesis is h_\theta(x) = 1 + x, so h_\theta(7) = 8, h_\theta(5) = 6, h_\theta(11) = 12 and h_\theta(2) = 3. The squared errors are 0, 4, 4 and 0, giving \sum_{i=1}^{4} (h_\theta(x_i) - y_i)^2 = 8. With the cost function exactly as written, J(\theta) = \frac{1}{4} \cdot 8 = 2; the stated answer of 1 corresponds to the equally common convention J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} (h_\theta(x_i) - y_i)^2, so the factor of 1/2 appears to account for the discrepancy in the problem setup.
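A short sketch that evaluates both conventions on the four training points (purely illustrative, not course-provided code):

```python
# Evaluate the linear-regression cost for theta = (1, 1) on the four training points.
xs = [7, 5, 11, 2]
ys = [8, 4, 10, 3]
theta0, theta1 = 1, 1

sq_errors = [((theta0 + theta1 * x) - y) ** 2 for x, y in zip(xs, ys)]
m = len(xs)

print(sum(sq_errors) / m)        # 2.0 with J = (1/m) * sum
print(sum(sq_errors) / (2 * m))  # 1.0 with J = (1/(2m)) * sum
```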

Question 5:

Q: The value of information gain in the following decision tree is:

Decision tree with entropies:

  • Root entropy = 0.946 (30 examples)

  • Left child entropy = 0.787 (17 examples)

  • Right child entropy = 0.391 (13 examples)

  • A. 0.380

  • B. 0.620

  • C. 0.190

  • D. 0.477

A: D. 0.477

Reasoning: Information gain (IG) is the root entropy minus the size-weighted average of the child entropies: IG = H_{root} - \left( \frac{17}{30} \cdot H_{left} + \frac{13}{30} \cdot H_{right} \right) = 0.946 - \left( \frac{17}{30} \cdot 0.787 + \frac{13}{30} \cdot 0.391 \right) \approx 0.477.
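As a reference for how this weighted-average calculation works in general, here is a small Python sketch (our own helper, not course code) that takes the parent entropy together with each branch's example count and entropy:

```python
def information_gain(parent_entropy, children):
    """Information gain of a split.

    children: list of (example_count, entropy) pairs, one per branch.
    """
    total = sum(count for count, _ in children)
    weighted_child_entropy = sum(count / total * h for count, h in children)
    return parent_entropy - weighted_child_entropy

# Plug in the root entropy and the (count, entropy) pairs read off the tree
# in the assignment figure, e.g.
# information_gain(h_root, [(n_left, h_left), (n_right, h_right)])
```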

Question 6:

Q: What is true for Stochastic Gradient Descent?

  • A. In every iteration, model parameters are updated based on multiple training samples.
  • B. In every iteration, model parameters are updated based on one training sample.
  • C. In every iteration, model parameters are updated based on all training samples.
  • D. None of the above

A: B. In every iteration, model parameters are updated based on one training sample.

Reasoning: Stochastic Gradient Descent updates the model parameters using the gradient of the loss on a single training sample per iteration; batch gradient descent, by contrast, uses all training samples for each update.
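To make this concrete, here is a toy Python sketch (our own illustration, not from the lecture) that fits the linear hypothesis from Question 4 using exactly one randomly chosen sample per update:

```python
import random

# Minimal SGD sketch for the linear hypothesis h(x) = theta0 + theta1 * x.
xs = [7, 5, 11, 2]
ys = [8, 4, 10, 3]
theta0, theta1 = 0.0, 0.0
lr = 0.01

for step in range(1000):
    i = random.randrange(len(xs))        # pick a single training sample
    x, y = xs[i], ys[i]
    error = (theta0 + theta1 * x) - y    # prediction error on that one sample
    theta0 -= lr * error                 # gradient step w.r.t. theta0
    theta1 -= lr * error * x             # gradient step w.r.t. theta1

print(theta0, theta1)  # drifts toward values that fit the four points
```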

Question 7:

Q: The entropy of the entire dataset is:

Species   Green   Legs   Height   Smelly
M         N       3      T        N
M         Y       2      T        N
M         Y       3      T        Y
M         N       3      T        N
M         N       3      T        Y
H         Y       2      T        N
H         N       2      T        Y
H         Y       2      T        N
H         Y       2      T        N
H         N       2      T        Y
  • A. 0.5
  • B. 1
  • C. 0
  • D. 0.1

A: B. 1

Reasoning: The dataset has an equal number of Martians (M) and Humans (H), 5 of each out of 10 examples. Hence the entropy is H = -0.5 \log_2 0.5 - 0.5 \log_2 0.5 = 1.
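The same value can be checked directly from the Species column (a minimal sketch, not course code):

```python
from collections import Counter
from math import log2

labels = ["M"] * 5 + ["H"] * 5   # species column of the table above

counts = Counter(labels)
total = len(labels)
entropy = -sum((c / total) * log2(c / total) for c in counts.values())
print(entropy)  # 1.0
```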

Question 8:

Q: Which attribute will be the root of the decision tree (if information gain is used to create the decision tree) and what is the information gain due to that attribute?

  • A. Green, 0.45
  • B. Legs, 0.4
  • C. Height, 0.8
  • D. Smelly, 0.7

A: C. Height, 0.8

Reasoning: The attribute that yields the highest information gain over the full dataset is chosen as the root; here that attribute is Height, with an information gain of 0.8.

Question 9:

Q: In Linear Regression the output is:

  • A. Discrete
  • B. Continuous and always lies in a finite range
  • C. Continuous
  • D. May be discrete or continuous

A: C. Continuous

Reasoning: Linear Regression predicts a continuous output.

Question 10:

Q: Identify whether the following statement is true or false: "Overfitting is more likely when the set of training data is small."

  • A. True
  • B. False

A: A. True

Reasoning: With a smaller training dataset, the model might capture noise and peculiarities of the dataset, leading to overfitting.

