Machine Learning Tutoring | Bayes Classifier

Machine Learning. Homework 1. Due Oct 18 (in
class, no late HW please!).
Problem 1. The feature space consists of three possible points (events)
A, B, C, which occur with probability 0.2, 0.3, 0.5, respectively. For each
event there are two possible labels, +1 or −1; the label +1 occurs with probability
0.9, 0.3 and 0.8, respectively (that is, P(+1|A) = 0.9, P(+1|B) = 0.3, P(+1|C) =
0.8). Determine the Bayes optimal classifier. What is the expected loss of
the Bayes optimal classifier?
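For checking a hand computation, the following is a minimal sketch of how a Bayes classifier and its expected loss can be evaluated on a finite feature space. It assumes 0-1 loss (the problem does not name the loss) and breaks ties toward +1; the names p_x, p_pos, bayes_classifier and bayes_risk are hypothetical and not part of the assignment.

```python
# Probabilities taken from Problem 1; variable names are illustrative only.
p_x = {"A": 0.2, "B": 0.3, "C": 0.5}    # P(x)
p_pos = {"A": 0.9, "B": 0.3, "C": 0.8}  # P(+1 | x)


def bayes_classifier(p_pos):
    """Pick the more probable label at each point (ties go to +1 by assumption)."""
    return {x: (1 if q >= 0.5 else -1) for x, q in p_pos.items()}


def bayes_risk(p_x, p_pos):
    """Expected 0-1 loss: at each point the rule errs with probability min(q, 1 - q)."""
    return sum(p_x[x] * min(q, 1.0 - q) for x, q in p_pos.items())


rule = bayes_classifier(p_pos)
risk = bayes_risk(p_x, p_pos)
print(rule, risk)
```

The same two functions work for any finite feature space, since the rule only needs the conditional probability P(+1 | x) at each point.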
Problem 2. A probability distribution on the real line is a mixture
of two classes +1 and −1 with density N(1, 2) (normal distribution with
mean 1 and variance 2) and N(4, 1), with prior probabilities 0.3 and 0.7
respectively. What is the Bayes decision rule? Give an estimate for the
Bayes risk.
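One way to sanity-check an estimate of the Bayes risk is numerical integration: under 0-1 loss (an assumption, since the problem does not name the loss), the risk is the integral of the smaller of the two prior-weighted class densities. The sketch below uses a plain Riemann sum with NumPy; the grid limits, grid size, and the names normal_pdf and bayes_risk_grid are illustrative choices, not part of the assignment.

```python
import numpy as np


def normal_pdf(x, mean, var):
    """Density of N(mean, var), where var is the variance as in the problem text."""
    return np.exp(-(x - mean) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)


def bayes_risk_grid(lo=-20.0, hi=25.0, n=200001):
    """Approximate the Bayes rule and its risk on a uniform grid (assumed limits)."""
    x = np.linspace(lo, hi, n)
    w_pos = 0.3 * normal_pdf(x, 1.0, 2.0)  # prior * density for class +1
    w_neg = 0.7 * normal_pdf(x, 4.0, 1.0)  # prior * density for class -1
    predict_pos = w_pos > w_neg            # Bayes rule: pick the larger weighted density
    dx = x[1] - x[0]
    risk = float(np.sum(np.minimum(w_pos, w_neg)) * dx)  # Riemann sum of the overlap
    return x, predict_pos, risk


x, predict_pos, risk = bayes_risk_grid()
# Grid points where the decision flips approximate the decision boundaries.
flips = x[1:][predict_pos[1:] != predict_pos[:-1]]
print(flips, risk)
```

Because the two classes have different variances, the log-likelihood ratio is quadratic in x, so the decision region for one class can be a bounded interval rather than a half-line; the list of flip points makes this visible.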
Problem 3. Consider a k-NN classifier for a 2-class problem. What
is its expected (classification) loss and how does it compare to the Bayes
optimal, when k = 3, assuming you have sufficiently many data points?
How does the empirical loss of 3-NN compare to the Bayes optimal? (Recall
that the empirical loss of 1-NN is zero).
Problem 4. Generate 2000 points from two equally weighted spherical
Gaussians N(0, I), N((3, 0, ..., 0), I) in R^p, p = 1, 11, 21, ..., 101 (note, you
have to first flip a coin to decide from which Gaussian to sample), where
I is the identity matrix and the centers of Gaussians are distance 3 apart.
Implement 1-NN and 3-NN classifiers. Test the resulting classifiers on a
separately generated dataset with 1000 points. Plot the error rate as a function
of p. Observations?
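The experiment above could be set up along the following lines. This is a sketch under stated assumptions, not a reference solution: it uses NumPy, labels the two Gaussians 0 and 1 (the problem does not assign labels), uses Euclidean distance with brute-force search, and the names sample_mixture, knn_predict and error_rates are hypothetical. Plotting is left out.

```python
import numpy as np


def sample_mixture(n, p, rng):
    """Draw n points in R^p: flip a fair coin per point, then sample that Gaussian."""
    labels = rng.integers(0, 2, size=n)   # 0 -> N(0, I), 1 -> N((3, 0, ..., 0), I)
    points = rng.standard_normal((n, p))  # N(0, I) samples
    points[:, 0] += 3.0 * labels          # shift the first coordinate for component 1
    return points, labels


def knn_predict(train_x, train_y, test_x, k):
    """Brute-force k-NN with majority vote over 0/1 labels (k assumed odd)."""
    # Squared Euclidean distances via |a - b|^2 = |a|^2 - 2 a.b + |b|^2.
    d2 = (np.sum(test_x ** 2, axis=1)[:, None]
          - 2.0 * test_x @ train_x.T
          + np.sum(train_x ** 2, axis=1)[None, :])
    nearest = np.argpartition(d2, k - 1, axis=1)[:, :k]  # indices of the k closest
    votes = train_y[nearest].sum(axis=1)                 # number of neighbours labelled 1
    return (2 * votes > k).astype(int)


def error_rates(dims=range(1, 102, 10), n_train=2000, n_test=1000, seed=0):
    """Return {p: (1-NN test error, 3-NN test error)} for each dimension p."""
    rng = np.random.default_rng(seed)
    results = {}
    for p in dims:
        train_x, train_y = sample_mixture(n_train, p, rng)
        test_x, test_y = sample_mixture(n_test, p, rng)
        results[p] = tuple(
            float(np.mean(knn_predict(train_x, train_y, test_x, k) != test_y))
            for k in (1, 3)
        )
    return results
```

Calling error_rates() with its defaults covers p = 1, 11, ..., 101 with 2000 training and 1000 test points; the returned dictionary can be passed to any plotting tool to draw the two error curves against p.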
Problem 5. What is the VC-dimension of the set of indicator functions
of disks in R^2 (i.e., functions which are 1 inside a circle, −1 outside (but not
the other way around!))? What about the indicator functions of rectangular
boxes with sides parallel to the axes? You need to explain why.