The NPTEL Introduction to Machine Learning course (July–October 2024 session) covers linear classification, logistic regression, and Linear Discriminant Analysis (LDA) in Week 3. This week's assignment tests understanding of these concepts through questions that walk through the key calculations.
Question 1
For a two-class problem using discriminant functions (di - discriminant function for class i), where is the separating hyperplane located?
Given:
- d1(x)=xTw1+w10
- d2(x)=xTw2+w20
- The separating hyperplane is where d1(x)=d2(x).
Since d1(x)=d2(x), we get:
xTw1+w10=xTw2+w20
xT(w1−w2)=w20−w10
Therefore, the separating hyperplane is defined by:
xT(w1−w2)=w20−w10
Answer: xT(w1−w2)=w20−w10
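As a quick sanity check, the short NumPy sketch below uses made-up weight vectors w1, w2 and biases w10, w20 (not values from the assignment), constructs a point on the hyperplane xT(w1−w2)=w20−w10, and confirms that both discriminant functions agree there.

```python
import numpy as np

# Illustrative values only (not from the assignment)
w1, w10 = np.array([1.0, 2.0]), 0.5
w2, w20 = np.array([3.0, -1.0]), 2.0

d1 = lambda x: x @ w1 + w10   # d1(x) = xT w1 + w10
d2 = lambda x: x @ w2 + w20   # d2(x) = xT w2 + w20

# Any x with x.(w1 - w2) = w20 - w10 lies on the separating hyperplane.
# Fix x[1] = 0 and solve for x[0] to get one such point.
diff = w1 - w2
x_boundary = np.array([(w20 - w10) / diff[0], 0.0])

print(d1(x_boundary), d2(x_boundary))  # both -0.25, so d1(x) = d2(x) on the hyperplane
```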
Question 2
Given the following dataset consisting of two classes, A and B, calculate the prior probability of each class.
| Feature 1 | Class |
|---|---|
| 2.3 | A |
| 1.8 | A |
| 3.2 | A |
| 1.2 | A |
| 2.1 | A |
| 1.9 | B |
| 2.4 | B |
Calculate P(A) and P(B):
- Number of samples for class A, nA=5
- Number of samples for class B, nB=2
- Total number of samples, n=7
Prior probabilities:
P(A)=nA/n=5/7≈0.714
P(B)=nB/n=2/7≈0.286
Answer:
P(A)≈0.714, P(B)≈0.286
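These priors can be verified with a few lines of Python; the label list below simply re-encodes the Class column of the table above.

```python
from collections import Counter

# Class labels taken from the table
labels = ["A", "A", "A", "A", "A", "B", "B"]

counts = Counter(labels)
n = len(labels)
priors = {c: counts[c] / n for c in counts}
print(priors)  # {'A': 0.714..., 'B': 0.285...}
```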
Question 3
In a 3-class classification problem using linear regression, the output vectors for three data points are (0.8,0.3,−0.1), (0.2,0.6,0.2), and (0.1,0.4,0.4). To which classes would these points be assigned?
Assignment is based on the highest output value for each data point:
- Data point (0.8,0.3,−0.1) -> Class 1 (0.8 is the highest)
- Data point (0.2,0.6,0.2) -> Class 2 (0.6 is the highest)
- Data point (0.1,0.4,0.4) -> Class 2 (0.4 is the highest value, with a tie between Class 2 and Class 3; the tie is resolved in favour of Class 2)
Answer:
- (0.8,0.3,−0.1) -> Class 1
- (0.2,0.6,0.2) -> Class 2
- (0.1,0.4,0.4) -> Class 2
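A minimal NumPy sketch of this argmax rule, using the three output vectors from the question (np.argmax breaks the tie in the last row in favour of the lower-indexed class, matching the answer above):

```python
import numpy as np

# Linear-regression outputs, one row per data point
outputs = np.array([[0.8, 0.3, -0.1],
                    [0.2, 0.6,  0.2],
                    [0.1, 0.4,  0.4]])

# Assign each point to the class with the largest output (1-based labels)
assigned = outputs.argmax(axis=1) + 1
print(assigned)  # [1 2 2]
```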
Question 4
If you have a 5-class classification problem and want to avoid masking using polynomial regression, what is the minimum degree of the polynomial you should use?
For a k-class problem, to avoid masking, we need to use a polynomial of degree k−1.
For 5 classes:
k=5
Minimum degree of the polynomial:
k−1=5−1=4
Answer: 4
Question 5
Consider a logistic regression model where the predicted probability for a given data point is 0.4. If the actual label for this data point is 1, what is the contribution of this data point to the log-likelihood?
Log-likelihood contribution for logistic regression is given by:
LL=ylog(p)+(1−y)log(1−p)
Where y is the actual label and p is the predicted probability.
Given:
y=1,p=0.4
Contribution to log-likelihood:
LL=1⋅log(0.4)+(1−1)⋅log(1−0.4)
LL=log(0.4)
LL≈−0.9163
Answer: −0.9163
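A one-line check of this value in Python, using the natural logarithm as above:

```python
import math

y, p = 1, 0.4

# Per-example log-likelihood for logistic regression
ll = y * math.log(p) + (1 - y) * math.log(1 - p)
print(ll)  # approximately -0.9163
```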
Question 6
What additional assumption does LDA make about the covariance matrix in comparison to the basic assumption of Gaussian class conditional density?
Linear Discriminant Analysis (LDA) assumes that the covariance matrix is the same for all classes.
Answer: The covariance matrix is the same for all classes.
Question 7
What is the shape of the decision boundary in LDA?
In LDA, the decision boundary is linear.
Answer: Linear
Question 8
For two classes C1 and C2 with within-class variances σ1²=1 and σ2²=4 respectively, if the projected means are μ1=1 and μ2=3, what is the Fisher criterion J(w)?
The Fisher criterion is given by:
J(w)=(μ1−μ2)²/(σ1²+σ2²)
Given:
μ1=1, μ2=3
σ1²=1, σ2²=4
Calculate J(w):
J(w)=(1−3)²/(1+4)
J(w)=4/5
J(w)=0.8
Answer: 0.8
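The same arithmetic as a short Python check:

```python
mu1, mu2 = 1.0, 3.0      # projected class means
var1, var2 = 1.0, 4.0    # within-class variances

# Fisher criterion: squared distance between means over total within-class variance
J = (mu1 - mu2) ** 2 / (var1 + var2)
print(J)  # 0.8
```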
Question 9
Given two classes C1 and C2 with means μ1=[2, 3]ᵀ and μ2=[5, 7]ᵀ respectively, what is the direction vector for LDA when the within-class covariance matrix SW is the identity matrix I?
For LDA, the direction vector w is given by:
w=SW⁻¹(μ1−μ2)
Given:
μ1=[2, 3]ᵀ, μ2=[5, 7]ᵀ
SW=I
Calculate μ1−μ2:
μ1−μ2=[2, 3]ᵀ−[5, 7]ᵀ
μ1−μ2=[2−5, 3−7]ᵀ
μ1−μ2=[−3, −4]ᵀ
Since SW=I, the direction vector w is:
w=I⁻¹[−3, −4]ᵀ
w=[−3, −4]ᵀ
Answer: w=[−3, −4]ᵀ
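A short NumPy check of this result; np.linalg.solve is used in place of an explicit inverse, which gives the same answer here since SW is the identity:

```python
import numpy as np

mu1 = np.array([2.0, 3.0])
mu2 = np.array([5.0, 7.0])
S_w = np.eye(2)          # within-class covariance matrix (identity in this question)

# LDA direction: w = SW^{-1} (mu1 - mu2)
w = np.linalg.solve(S_w, mu1 - mu2)
print(w)  # [-3. -4.]
```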