COMPSCI 3314/7314 Introduction to Statistical Machine Learning

A timed test on the fundamentals of statistical machine learning.

Basic Concepts of Machine Learning, etc.

Question 1

(a) Identify which of the following classifiers are nonlinear classifiers:

(1) RBF kernel SVM (2) 1-nearest neighbour classifier (3) 3-nearest
neighbour classifier

Write down your choice (choices).

(b) Write down the probability density function of a Gaussian Mixture
Model (GMM), that is, the likelihood of a sample x under a GMM
(2 points). Explain the relationship between estimating the
parameters of a GMM and clustering data with a GMM. In other words,
if we have already learned the parameters of a GMM, how can we
calculate the membership (or, equivalently, the posterior
probability) of a sample belonging to a cluster (4 points)?
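
As a rough illustration of the quantities part (b) refers to, the sketch below evaluates a mixture density as a weighted sum of Gaussian components and obtains the cluster membership of a sample with Bayes' rule. It is only a sketch under stated assumptions: it is restricted to one-dimensional data for brevity, and the function names (gaussian_pdf, gmm_density, gmm_posterior) and the two-component parameters are hypothetical, not taken from the course material.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def gmm_density(x, weights, means, variances):
    """Mixture density: weighted sum of the component densities."""
    return sum(w * gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))

def gmm_posterior(x, weights, means, variances):
    """Posterior probability of each component given x (Bayes' rule)."""
    joint = [w * gaussian_pdf(x, m, v)
             for w, m, v in zip(weights, means, variances)]
    total = sum(joint)  # equals gmm_density(x, ...)
    return [j / total for j in joint]

# Hypothetical, already-learned parameters of a two-component mixture.
weights = [0.5, 0.5]
means = [0.0, 4.0]
variances = [1.0, 1.0]

# Membership of a sample near the first component.
resp = gmm_posterior(0.5, weights, means, variances)
```

Here resp holds one posterior probability per component and sums to one; assigning the sample to the component with the largest entry gives a hard clustering.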

(c) As shown in the following figure, there are 8 data points in the
2-dimensional space. The squares denote training samples belonging
to class “1” and the circles denote training samples belonging to
class “2”. The cross “+” denotes a test sample. What is the
predicted category of the test sample if we use a 1-nearest
neighbour classifier? What is the prediction if we use a 3-nearest
neighbour classifier? Explain how we could choose k for a k-nearest
neighbour classifier. (1 point, 2 points and 2 points)
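
The sketch below shows, on made-up data, how a 1-nearest neighbour and a 3-nearest neighbour prediction can differ, and one common way to pick k: compare held-out accuracy for several candidate values, here with leave-one-out cross-validation. The coordinates are hypothetical and are not the points in the exam figure; knn_predict and loo_accuracy are illustrative helper names.

```python
from collections import Counter
import math

def knn_predict(train, query, k):
    """Majority vote among the k training points closest to query."""
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

def loo_accuracy(data, k):
    """Leave-one-out accuracy: predict each point from all the others."""
    correct = 0
    for i, (point, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if knn_predict(rest, point, k) == label:
            correct += 1
    return correct / len(data)

# Hypothetical training set: four class-1 points, four class-2 points,
# one of which (0.5, 0.5) sits inside the class-1 region as an outlier.
data = [((0.0, 0.0), 1), ((0.0, 1.0), 1), ((1.0, 0.0), 1), ((1.0, 1.0), 1),
        ((0.5, 0.5), 2), ((3.0, 3.0), 2), ((3.0, 4.0), 2), ((4.0, 3.0), 2)]

query = (0.6, 0.6)
pred_1nn = knn_predict(data, query, 1)  # follows the single closest point
pred_3nn = knn_predict(data, query, 3)  # majority vote of three neighbours

# Compare candidate values of k on held-out points, never on the test set.
scores = {k: loo_accuracy(data, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
```

Odd values of k are used so that a two-class vote cannot tie. With more data, a separate validation split or k-fold cross-validation would serve the same purpose as the leave-one-out loop.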

(d) True or False: A classifier with a lower training error on the
training set always performs better on the test set.
Briefly explain why.

True or False: One way to avoid overfitting is to use less training
data.
Briefly explain why.

True or False: It is good practice to choose the optimal
hyperparameter of a model on the test set.
Briefly explain why.