
NPTEL Introduction to Machine Learning Week 3 Assignment Answers 2024 (July-October)

The NPTEL Introduction to Machine Learning course for the July-October 2024 session covers critical topics in Week 3, including linear classification, logistic regression, and Linear Discriminant Analysis (LDA). This week's assignment tests understanding of these concepts through questions designed to challenge and reinforce learning.

Question 1

For a two-class problem using discriminant functions ($d_i$ is the discriminant function for class $i$), where is the separating hyperplane located?

Given:

  • $d_1(\mathbf{x}) = \mathbf{x}^T \mathbf{w}_1 + w_{10}$
  • $d_2(\mathbf{x}) = \mathbf{x}^T \mathbf{w}_2 + w_{20}$
  • The separating hyperplane is where $d_1(\mathbf{x}) = d_2(\mathbf{x})$.

Setting $d_1(\mathbf{x}) = d_2(\mathbf{x})$ gives: $\mathbf{x}^T \mathbf{w}_1 + w_{10} = \mathbf{x}^T \mathbf{w}_2 + w_{20}$, which rearranges to $\mathbf{x}^T (\mathbf{w}_1 - \mathbf{w}_2) = w_{20} - w_{10}$.

Therefore, the separating hyperplane is defined by: $\mathbf{x}^T (\mathbf{w}_1 - \mathbf{w}_2) = w_{20} - w_{10}$

Answer: $\mathbf{x}^T (\mathbf{w}_1 - \mathbf{w}_2) = w_{20} - w_{10}$
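As a quick sanity check (with hypothetical weights chosen for illustration only, not taken from the assignment), any point satisfying $\mathbf{x}^T (\mathbf{w}_1 - \mathbf{w}_2) = w_{20} - w_{10}$ gives equal discriminant values:

```python
import numpy as np

# Hypothetical weights for illustration only (not from the assignment)
w1, w10 = np.array([2.0, 1.0]), 0.5
w2, w20 = np.array([1.0, 3.0]), 1.5

# w1 - w2 = [1, -2] and w20 - w10 = 1, so x = [1, 0] lies on the hyperplane
x = np.array([1.0, 0.0])

d1 = x @ w1 + w10  # discriminant for class 1
d2 = x @ w2 + w20  # discriminant for class 2
print(d1, d2)      # both print 2.5, confirming d1(x) = d2(x) on the hyperplane
```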

Question 2

Given the following dataset consisting of two classes, $A$ and $B$, calculate the prior probability of each class.

| Feature 1 | Class |
| --- | --- |
| 2.3 | A |
| 1.8 | A |
| 3.2 | A |
| 1.2 | A |
| 2.1 | A |
| 1.9 | B |
| 2.4 | B |

Calculate $P(A)$ and $P(B)$:

  • Number of samples in class A: $n_A = 5$
  • Number of samples in class B: $n_B = 2$
  • Total number of samples: $n = 7$

Prior probabilities: $P(A) = \frac{n_A}{n} = \frac{5}{7} \approx 0.714$ and $P(B) = \frac{n_B}{n} = \frac{2}{7} \approx 0.286$

Answer: $P(A) \approx 0.714$, $P(B) \approx 0.286$
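These priors are simply the class frequencies in the table; a minimal sketch to reproduce them:

```python
from collections import Counter

# Class column from the table above
labels = ["A", "A", "A", "A", "A", "B", "B"]

counts = Counter(labels)
n = len(labels)
priors = {c: counts[c] / n for c in counts}
print(priors)  # {'A': 0.714..., 'B': 0.285...}
```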

Question 3

In a 3-class classification problem using linear regression, the output vectors for three data points are $(0.8, 0.3, -0.1)$, $(0.2, 0.6, 0.2)$, and $(0.1, 0.4, 0.4)$. To which classes would these points be assigned?

Assignment is based on the highest output value for each data point:

  • Data point $(0.8, 0.3, -0.1)$ -> Class 1 (0.8 is the highest)
  • Data point $(0.2, 0.6, 0.2)$ -> Class 2 (0.6 is the highest)
  • Data point $(0.1, 0.4, 0.4)$ -> Class 2 (0.4 is the highest; classes 2 and 3 are tied, and the tie is broken in favour of the lower-indexed class)

Answer:

  • $(0.8, 0.3, -0.1)$ -> Class 1
  • $(0.2, 0.6, 0.2)$ -> Class 2
  • $(0.1, 0.4, 0.4)$ -> Class 2
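A minimal sketch of this assignment rule using NumPy's argmax (which breaks ties in favour of the lower-indexed class):

```python
import numpy as np

# Linear-regression outputs, one row per data point
outputs = np.array([
    [0.8, 0.3, -0.1],
    [0.2, 0.6,  0.2],
    [0.1, 0.4,  0.4],
])

# argmax returns the first maximal index, so the tie in the last row goes to class 2
classes = outputs.argmax(axis=1) + 1  # +1 for 1-based class labels
print(classes)  # [1 2 2]
```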

Question 4

If you have a 5-class classification problem and want to avoid masking using polynomial regression, what is the minimum degree of the polynomial you should use?

For a $k$-class problem, a polynomial of degree $k - 1$ is needed to avoid masking.

For 5 classes, $k = 5$, so the minimum polynomial degree is $k - 1 = 5 - 1 = 4$.

Answer: 4
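As an illustrative sketch only (scikit-learn and the random inputs are assumptions, not part of the assignment), degree-4 polynomial features for such a model could be generated like this:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

k = 5           # number of classes
degree = k - 1  # minimum degree needed to avoid masking

X = np.random.rand(10, 2)  # hypothetical 2-D inputs
X_poly = PolynomialFeatures(degree=degree).fit_transform(X)
print(degree, X_poly.shape)  # degree 4; feature matrix expanded with polynomial terms
```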

Question 5

Consider a logistic regression model where the predicted probability for a given data point is 0.4. If the actual label for this data point is 1, what is the contribution of this data point to the log-likelihood?

The log-likelihood contribution of a single data point in logistic regression is: $\text{LL} = y \log(p) + (1 - y) \log(1 - p)$, where $y$ is the actual label and $p$ is the predicted probability.

Given: $y = 1$, $p = 0.4$

Contribution to the log-likelihood: $\text{LL} = 1 \cdot \log(0.4) + (1 - 1) \cdot \log(1 - 0.4) = \log(0.4) \approx -0.9163$

Answer: $-0.9163$
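A one-line check of this value (natural logarithm assumed, as is standard for log-likelihood):

```python
import math

y, p = 1, 0.4
ll = y * math.log(p) + (1 - y) * math.log(1 - p)
print(round(ll, 4))  # -0.9163
```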

Question 6

What additional assumption does LDA make about the covariance matrix in comparison to the basic assumption of Gaussian class conditional density?

Linear Discriminant Analysis (LDA) assumes that the covariance matrix is the same for all classes.

Answer: The covariance matrix is the same for all classes.

Question 7

What is the shape of the decision boundary in LDA?

In LDA, the decision boundary is linear.

Answer: Linear

Question 8

For two classes $C_1$ and $C_2$ with within-class variances $\sigma_1^2 = 1$ and $\sigma_2^2 = 4$ respectively, if the projected means are $\mu_1 = 1$ and $\mu_2 = 3$, what is the Fisher criterion $J(w)$?

The Fisher criterion is given by: $J(w) = \frac{(\mu_1 - \mu_2)^2}{\sigma_1^2 + \sigma_2^2}$

Given: $\mu_1 = 1$, $\mu_2 = 3$, $\sigma_1^2 = 1$, $\sigma_2^2 = 4$

Calculating: $J(w) = \frac{(1 - 3)^2}{1 + 4} = \frac{4}{5} = 0.8$

Answer: 0.8
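The same computation as a short sketch:

```python
mu1, mu2 = 1.0, 3.0      # projected class means
s1_sq, s2_sq = 1.0, 4.0  # within-class variances

J = (mu1 - mu2) ** 2 / (s1_sq + s2_sq)
print(J)  # 0.8
```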

Question 9

Given two classes $C_1$ and $C_2$ with means $\mu_1 = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$ and $\mu_2 = \begin{bmatrix} 5 \\ 7 \end{bmatrix}$ respectively, what is the direction vector for LDA when the within-class covariance matrix $S_W$ is the identity matrix $I$?

For LDA, the direction vector $w$ is given by: $w = S_W^{-1} (\mu_1 - \mu_2)$

Given: $\mu_1 = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$, $\mu_2 = \begin{bmatrix} 5 \\ 7 \end{bmatrix}$, $S_W = I$

Calculate $\mu_1 - \mu_2$: $\mu_1 - \mu_2 = \begin{bmatrix} 2 - 5 \\ 3 - 7 \end{bmatrix} = \begin{bmatrix} -3 \\ -4 \end{bmatrix}$

Since $S_W = I$, the direction vector is: $w = I^{-1} \begin{bmatrix} -3 \\ -4 \end{bmatrix} = \begin{bmatrix} -3 \\ -4 \end{bmatrix}$

Answer: $\begin{bmatrix} -3 \\ -4 \end{bmatrix}$
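The same result computed with NumPy (using a linear solve, which reduces to the mean difference when $S_W = I$):

```python
import numpy as np

mu1 = np.array([2.0, 3.0])
mu2 = np.array([5.0, 7.0])
S_W = np.eye(2)  # within-class covariance matrix (identity)

# w = S_W^{-1} (mu1 - mu2)
w = np.linalg.solve(S_W, mu1 - mu2)
print(w)  # [-3. -4.]
```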
