Machine Learning Homework Help | CS 6316 Machine Learning Homework 1: Learning Theory and Linear Predictors

This is a machine learning homework assignment from the United States on learning theory and linear predictors.

Submission Instructions

Questions (20 points)

Consider the Bayes optimal predictor $f_D$ for a binary classification problem, defined as

$$f_D(x) = \begin{cases} +1 & \text{if } P[y = +1 \mid x] > \frac{1}{2} \\ -1 & \text{if } P[y = -1 \mid x] > \frac{1}{2} \end{cases} \tag{1}$$

Note that $P[y = +1 \mid x] + P[y = -1 \mid x] = 1$. Please show that this is the optimal predictor. In other words, for any predictor $h$, we have

$$L_D(f_D) \le L_D(h) \tag{2}$$

Now consider a data distribution $D$ defined as a mixture of two labelled Gaussian components:

$$D = \frac{1}{2}\,\underbrace{\mathcal{N}(x;\, 0,\, 1)}_{y = -1} + \frac{1}{2}\,\underbrace{\mathcal{N}\!\left(x;\, \tfrac{2}{3}\pi,\, 0.5\right)}_{y = +1} \tag{3}$$
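For concreteness, the two labelled components of Equation 3 can be written out as code. The following is a minimal sketch, assuming the second parameter of $\mathcal{N}(x; \mu, \cdot)$ is the variance and using scipy.stats; neither of these is stated in the assignment.

```python
# Sketch of the mixture in Equation (3): two Gaussian components, prior weight 1/2 each.
import numpy as np
from scipy.stats import norm

MU_NEG, VAR_NEG = 0.0, 1.0                 # component labelled y = -1
MU_POS, VAR_POS = 2.0 * np.pi / 3.0, 0.5   # component labelled y = +1 (0.5 treated as variance)

def joint_density(x, y):
    """p(x, y) under D: mixture weight 1/2 times the component density."""
    if y == -1:
        return 0.5 * norm.pdf(x, loc=MU_NEG, scale=np.sqrt(VAR_NEG))
    return 0.5 * norm.pdf(x, loc=MU_POS, scale=np.sqrt(VAR_POS))

def posterior_pos(x):
    """P[y = +1 | x] = p(x, +1) / (p(x, +1) + p(x, -1))."""
    p_pos, p_neg = joint_density(x, +1), joint_density(x, -1)
    return p_pos / (p_pos + p_neg)
```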

Please answer the following questions using this new data distribution.

(a) (2 points) What is the decision boundary $b_{\mathrm{Bayes}}$ of the Bayes predictor? That is, the Bayes predictor can be defined as

$$f_D(x) = \begin{cases} +1 & x > b_{\mathrm{Bayes}} \\ -1 & x < b_{\mathrm{Bayes}} \end{cases} \tag{4}$$

(b) (1 point) What is the true error of the Bayes predictor, $L_D(f_D)$?

(c) (2 points) With the following hypothesis space $\mathcal{H}$ and the data distribution in Equation 3, please find the best hypothesis $h \in \mathcal{H}$ and report the corresponding decision boundary $b$:

$$\mathcal{H} = \{\, h_i : i \in [1200] \,\} \tag{5}$$

(d) (1 point) What is the true error of $h$, $L_D(h)$?

(e) (1 point) Following a data generation procedure similar to the one in the demo code, sample 100 data points from each component and label them accordingly. Then, with the same hypothesis space $\mathcal{H}$ in Equation 5 and these 200 training examples, please find the best hypothesis $h_S$ that minimizes the empirical error and report the corresponding decision boundary $b_S$ (a minimal sampling sketch is given after this list).

(f) (1 point) What is the true error of $h_S$, $L_D(h_S)$?
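For part (e), one possible data-generation and empirical-risk-minimization sketch (analogous to, but not necessarily identical to, the course demo code) is shown below. The random seed, the use of numpy, and the candidate-threshold grid are all assumptions; the grid is only a placeholder for whatever Equation 5 actually specifies.

```python
# Hypothetical sketch for part (e): 100 samples per mixture component,
# labelled -1 for N(0, 1) and +1 for N(2*pi/3, 0.5) (0.5 treated as the variance).
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility (assumption)

x_neg = rng.normal(loc=0.0, scale=1.0, size=100)                          # y = -1
x_pos = rng.normal(loc=2.0 * np.pi / 3.0, scale=np.sqrt(0.5), size=100)   # y = +1

xs = np.concatenate([x_neg, x_pos])
ys = np.concatenate([-np.ones(100), np.ones(100)])

def empirical_error(b):
    """Fraction of training examples misclassified by the threshold rule x > b -> +1."""
    preds = np.where(xs > b, 1.0, -1.0)
    return np.mean(preds != ys)

# Placeholder grid of candidate boundaries; substitute the thresholds defined by Eq. (5).
candidate_bs = np.linspace(-3.0, 5.0, 1200)
b_S = min(candidate_bs, key=empirical_error)
```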

The data you need for the implementation is in the file data.txt, which is released together with the assignment. Compared to the pseudocode in our lecture, $T$ was removed from line 3. That is because in practice we do not know the actual value of $T$. However, we can monitor the predictions on all data points and stop the algorithm when the classifier makes correct predictions on all examples.

1: Input: S = {(x_1, y_1), . . . , (x_m, y_m)}
2: Initialize w^(0) = (0, . . . , 0)
3: for t = 1, 2, · · · do
4:     i ← t mod m
5:     if y_i ⟨w^(t), x_i⟩ ≤ 0 then
6:         w^(t+1) ← w^(t) + y_i x_i
7:     end if
8: end for
9: Output: the final w^(t)
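The pseudocode above can be turned into a short Python implementation. The following is a minimal sketch, assuming the data are already loaded as numpy arrays and are linearly separable (otherwise the loop never terminates).

```python
import numpy as np

def perceptron(X, y):
    """Perceptron without a fixed horizon T: stop once a full pass over the data
    produces no mistakes, i.e. the classifier is correct on every example.

    X: (m, d) array of feature vectors; y: (m,) array of labels in {-1, +1}.
    """
    m, d = X.shape
    w = np.zeros(d)                              # line 2: w^(0) = (0, ..., 0)
    while True:
        mistakes = 0
        for i in range(m):                       # cycle through examples, as in line 4
            if y[i] * np.dot(w, X[i]) <= 0:      # line 5: margin check
                w = w + y[i] * X[i]              # line 6: update rule
                mistakes += 1
        if mistakes == 0:                        # stopping rule described above
            return w
```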

Next, consider the following (logistic) loss on the training set $S$:

$$L(h_w, S) = \frac{1}{m} \sum_{i=1}^{m} \log\left(1 + \exp\left(-y_i \langle w, x_i \rangle\right)\right) \tag{6}$$

Please show that the gradient of $L(h_w, S)$ with respect to $w$ is

$$\frac{dL(h_w, S)}{dw} = \frac{1}{m} \sum_{i=1}^{m} \frac{\exp\left(-y_i \langle w, x_i \rangle\right)}{1 + \exp\left(-y_i \langle w, x_i \rangle\right)} \left(-y_i x_i\right) \tag{7}$$
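One quick way to sanity-check Equation 7 is to compare it against a finite-difference approximation of Equation 6. The sketch below uses randomly generated data and numpy; all names and values are illustrative only.

```python
import numpy as np

def logistic_loss(w, X, y):
    """Equation (6): (1/m) * sum_i log(1 + exp(-y_i <w, x_i>))."""
    margins = y * (X @ w)
    return np.mean(np.log1p(np.exp(-margins)))

def logistic_grad(w, X, y):
    """Equation (7): (1/m) * sum_i [exp(-y_i <w, x_i>) / (1 + exp(-y_i <w, x_i>))] * (-y_i x_i)."""
    margins = y * (X @ w)
    coeffs = np.exp(-margins) / (1.0 + np.exp(-margins))   # per-example scalar weights
    return (X * (-y * coeffs)[:, None]).mean(axis=0)

# Finite-difference check on random data (illustrative values only).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
y = rng.choice([-1.0, 1.0], size=5)
w = rng.normal(size=3)

eps = 1e-6
num_grad = np.array([
    (logistic_loss(w + eps * e, X, y) - logistic_loss(w - eps * e, X, y)) / (2 * eps)
    for e in np.eye(3)
])
assert np.allclose(num_grad, logistic_grad(w, X, y), atol=1e-6)
```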

Now consider the following regularized least-squares objective:

$$L_2(h_w, S) = \sum_{i=1}^{m} \left(h_w(x_i) - y_i\right)^2 + \lambda \|w\|_2^2 \tag{8}$$

Please show that the solution of this problem, when $A + \lambda I$ is invertible, is

$$w = (A + \lambda I)^{-1} b \tag{9}$$

where $I$ is the identity matrix, and $A$ and $b$ are defined as

$$A = \sum_{i=1}^{m} x_i x_i^T \qquad b = \sum_{i=1}^{m} y_i x_i \tag{10}$$

Note that $\{x_i\}$ are column vectors.
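As an illustrative numerical check of Equations 9 and 10, the closed-form solution can be verified by confirming that the gradient of Equation 8 vanishes at $w$. The sketch below assumes $h_w(x_i) = \langle w, x_i \rangle$ and uses synthetic data with numpy; both are assumptions beyond the assignment text.

```python
import numpy as np

def ridge_closed_form(X, y, lam):
    """w = (A + lam*I)^{-1} b with A = sum_i x_i x_i^T and b = sum_i y_i x_i (Eqs. 9-10)."""
    m, d = X.shape
    A = X.T @ X                      # equals sum_i x_i x_i^T when the x_i are rows of X
    b = X.T @ y                      # equals sum_i y_i x_i
    return np.linalg.solve(A + lam * np.eye(d), b)

# Illustrative check: the gradient of Equation (8) should vanish at the solution.
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 4))
y = rng.normal(size=20)
lam = 0.1

w = ridge_closed_form(X, y, lam)
grad = 2 * X.T @ (X @ w - y) + 2 * lam * w   # gradient of Eq. (8) w.r.t. w
assert np.allclose(grad, 0.0, atol=1e-8)
```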

Please report your answers in the homework submission and also submit your code with the file name [ComputingID]-hw02.py or [ComputingID]-hw02.ipynb. Without a code submission, you will receive a 50% deduction on the total points you have on this problem.