# Machine Learning Assignment Help | COMPSCI 3314 Introduction to Statistical Machine Learning

This assignment consists of test questions on statistical machine learning.

COMPSCI 3314

Introduction to Statistical Machine Learning

## Question 1

(a) Cross-validation is a method to (Choose the best single answer from multiple choices):

(A) Remove the curse of dimensionality

(B) Assess how the results of a machine learning model will generalise to an unseen data set

(C) Remove noises or outliers from a data set

[2 marks]
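For intuition, k-fold cross-validation can be sketched in a few lines of code. The sketch below is purely illustrative and not part of the question: all function names and the toy data are our own, and the "model" is a trivial majority-label classifier so that the focus stays on how the data are split into folds and how the held-out accuracy is averaged.

```python
# Minimal k-fold cross-validation sketch (illustrative; all names and
# toy data are our own, not part of the exam question).

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds."""
    fold_size, rem = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        end = start + fold_size + (1 if i < rem else 0)
        folds.append(list(range(start, end)))
        start = end
    return folds

def cross_validate(X, y, k=5):
    """Average held-out accuracy of a majority-label classifier."""
    folds = k_fold_indices(len(X), k)
    accs = []
    for test_idx in folds:
        # Train on all folds except the held-out one.
        train_idx = [j for f in folds if f is not test_idx for j in f]
        train_labels = [y[j] for j in train_idx]
        majority = max(set(train_labels), key=train_labels.count)
        correct = sum(1 for j in test_idx if y[j] == majority)
        accs.append(correct / len(test_idx))
    return sum(accs) / k

X = list(range(10))
y = [1, 1, 1, -1, 1, 1, -1, 1, 1, 1]   # mostly +1
print(cross_validate(X, y, k=5))       # average accuracy over 5 held-out folds
```

Each fold plays the role of an unseen data set exactly once, which is what makes the averaged score an estimate of generalisation performance.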

(b) Kernel Principal Component Analysis is a method for (Choose the best single answer from multiple choices):

(A) Classification

(B) Dimensionality reduction

(C) Probability estimation

(D) Regression

[2 marks]

(c) Which of the following statements is best practice in Machine Learning for building a real system? (Choose the best single answer from multiple choices)

(A) Use all the data available for training to obtain optimal performance

(B) Use all the data available for testing the performance of your algorithm

(C) Split the training data into two separate sets. Use the first subset for training and perform cross-validation solely on the second subset

(D) Perform cross-validation on training, validation and testing sets

[3 marks]

(d) Which of the following statements about Machine Learning is False? (Choose the best single answer from multiple choices)

(A) Machine learning algorithms often suffer from the curse of dimensionality

(B) Machine learning algorithms cannot generalise to data that are not observed during training of the algorithm

(C) Machine learning algorithms are typically sensitive to noise

(D) Machine learning algorithms typically perform better in terms of testing accuracy when more training data become available

[3 marks]

(e) Which of the following statements is (are) true? (Select all the correct ones)

(A) Gaussian mixture model (GMM) is a supervised learning method.

## Question 2

Let $\{(x_i, y_i)\}_{i=1}^{n}$ be the training data for a binary classification problem, where $x_i \in \mathbb{R}^d$ and $y_i \in \{-1, 1\}$. Let $w \in \mathbb{R}^d$ be the parameter vector, $b \in \mathbb{R}$ be the offset, and $\xi_i$ be the slack variable for $i = 1, \dots, n$.

Here the notation $\langle p, q \rangle = p \cdot q$ denotes the inner product of two vectors.

(a) What is wrong with the following primal form of the soft margin SVMs?

$$
\min_{w, b, \xi} \quad \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i, \qquad
\text{s.t.} \quad y_i(\langle x_i, w \rangle + b) \ge 1 - \xi_i, \quad i = 1, \cdots, n.
$$

[2 marks]

(b) After fixing the problem in the above form, what is the estimated $w$ if $C = 0$?

[2 marks]
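As a concrete illustration of the soft-margin primal objective, the sketch below minimises $\frac{1}{2}\|w\|^2 + C\sum_i \max(0,\, 1 - y_i(\langle x_i, w \rangle + b))$ by subgradient descent on a toy 1-D data set. Everything here (function names, data, hyperparameters) is our own illustrative choice, not part of the exam, and the hinge-loss form handles the slack variables implicitly.

```python
# Subgradient descent on the soft-margin SVM primal (hinge-loss form).
# Illustrative sketch only; names, data and hyperparameters are our own.

def train_svm(X, y, C=1.0, lr=0.01, epochs=2000):
    d = len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = list(w), 0.0            # gradient of 0.5*||w||^2 is w itself
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:               # hinge loss is active for this point
                for j in range(d):
                    gw[j] -= C * yi * xi[j]
                gb -= C * yi
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b

# Two linearly separable 1-D points.
X = [[2.0], [-2.0]]
y = [1, -1]
w, b = train_svm(X, y, C=10.0)
print(w, b)   # a positive w[0] separates the two points correctly
```

With a large $C$ the hinge term dominates and the learned hyperplane classifies both training points correctly; shrinking $C$ weakens that pressure, which is the trade-off the slack variables encode.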

(c) The dual form of the soft margin SVMs is given below. How can it be modified (slightly) to become the dual form of the hard margin SVMs?

$$
\max_{\alpha} \quad \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \langle x_i, x_j \rangle
$$

$$
\text{s.t.} \quad 0 \le \alpha_i \le C, \quad i = 1, \cdots, n; \qquad \sum_{i=1}^{n} \alpha_i y_i = 0
$$

[2 marks]

(d) Express $b$ using the dual variables and the training data.

[3 marks]

(e) An RBF kernel corresponds to lifting to a feature space with how many dimensions?

[3 marks]
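For context, an RBF kernel value is evaluated directly from the two input vectors; no explicit feature vector is ever constructed in code. A minimal sketch with our own toy values (the function name and the choice of $\gamma$ are ours, not the exam's):

```python
import math

# Gaussian RBF kernel K(x, z) = exp(-gamma * ||x - z||^2), evaluated
# directly from x and z without forming any explicit feature vector.
# Illustrative sketch; names and values are our own.

def rbf(x, z, gamma=0.5):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

print(rbf([1.0, 2.0], [1.0, 2.0]))   # identical points give 1.0
print(rbf([0.0, 0.0], [3.0, 4.0]))   # distant points give a value near 0
```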

(f) Let $u = [w; b]$ and $z = [x; 1]$. We can rewrite $(\langle w, x \rangle + b)$ as $\langle u, z \rangle$. This means that if we augment the training data $\{(x_i, y_i)\}_{i=1}^{n}$ to $\{(z_i, y_i)\}_{i=1}^{n}$, where $z_i = [x_i; 1]$, we only need to learn one parameter $u$ instead of two parameters $w$ and $b$.

1. Please write down the primal form of the soft margin SVMs using the decision function $\mathrm{sign}[\langle u, z \rangle]$.

2. Is the new primal form equivalent to the old primal form? In other words, if we train two SVMs (the standard SVM and this new re-parameterised SVM), will we, in general, obtain exactly the same classification function?

3. Please prove your answer to the above question (i.e., use a derivation to show why they are or are not equivalent).

[6 marks]
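The identity behind the re-parameterisation in part (f), namely $\langle w, x \rangle + b = \langle u, z \rangle$ with $u = [w; b]$ and $z = [x; 1]$, can be checked numerically. The sketch below uses our own toy values and verifies only the identity itself, not the equivalence question the exam asks about.

```python
# Numerical check of the augmentation trick: <w, x> + b == <u, z>
# where u = [w; b] and z = [x; 1]. Toy values are our own.

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

w = [0.5, -1.25, 2.0]
b = 0.75
x = [3.0, 4.0, -2.0]

u = w + [b]        # augmented parameter vector [w; b]
z = x + [1.0]      # augmented data point [x; 1]

print(dot(w, x) + b, dot(u, z))   # the two scores are identical
```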

(g) Suppose that we have a kernel $K(\cdot, \cdot)$ such that there is an implicit high-dimensional feature map $\Phi : \mathbb{R}^d \to \mathbb{R}^D$ that satisfies, for all $x, z \in \mathbb{R}^d$,

$$
K(x, z) = \langle \Phi(x), \Phi(z) \rangle = \Phi(x)^{\top} \Phi(z) = \sum_{i=1}^{D} \Phi(x)_i \, \Phi(z)_i,
$$

the inner product in the $D$-dimensional space.

Show how to compute the squared $\ell_2$ distance in the $D$-dimensional space:

$$
\|\Phi(x) - \Phi(z)\|^2 = \sum_{i=1}^{D} \left( \Phi(x)_i - \Phi(z)_i \right)^2
$$

without explicitly calculating the values in the $D$-dimensional vectors. You are asked to provide a formal proof.

[6 marks]
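A numerical sanity check (not the requested formal proof) can be run with a kernel whose feature map is known in closed form. The sketch below uses the homogeneous quadratic kernel $K(x, z) = \langle x, z \rangle^2$, whose feature map for $d = 2$ is $\Phi(x) = [x_1^2,\ \sqrt{2}\,x_1 x_2,\ x_2^2]$, and compares the explicit $D$-dimensional distance against an expansion written purely in kernel evaluations. All names and values are our own.

```python
import math

# Sanity check: squared distance in feature space computed two ways.
# Kernel: K(x, z) = (<x, z>)^2 with explicit feature map
# phi(x) = [x1^2, sqrt(2)*x1*x2, x2^2] for d = 2. Toy values are ours.

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def K(x, z):
    return dot(x, z) ** 2

def phi(x):
    x1, x2 = x
    return [x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2]

x, z = [1.0, 2.0], [3.0, -1.0]

# Explicit computation in the D-dimensional space.
explicit = sum((a - b) ** 2 for a, b in zip(phi(x), phi(z)))

# Same quantity from kernel evaluations alone.
via_kernel = K(x, x) - 2 * K(x, z) + K(z, z)

print(explicit, via_kernel)   # the two values agree
```

Expanding the squared norm and replacing each inner product by a kernel evaluation is the idea the formal proof should make precise.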