Homework Assignment 1 CSE 151A: Introduction to Machine Learning


Instructions: Please answer the questions below, attach your code in the document, and insert figures to create a single PDF file. You may search for information online, but you must write the code and work out the solutions to the questions yourself.
Grade: out of 100 points

In this question, you are provided with several scenarios. For each scenario, identify whether it is better formulated as a classification task or a clustering task, and provide a reason supporting your choice.
1. Scenario 1: Assume there are 100 graded answer sheets for a homework assignment (scores range from 0 to 100). We would like to split them into several groups where each group has similar scores.
Choice:                                       task
Reason:
2. Scenario 2: Assume there are 100 graded answer sheets for a homework assignment (scores range from 0 to 100).
We would like to split them into several groups where each group represents a letter grade (A, B, C, D) following the criteria: A (90-100), B (75-90), C (60-75), D (0-60).
Choice:                                       task
Reason:

2.1 (20 points) Derivatives with Scalars

2.2 (20 points) Derivatives with Vectors

Several particular vector derivatives are useful for this course. For a matrix A ∈ R^(M×M) and column vectors x ∈ R^M and a ∈ R^M, we have
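The list of rules did not survive extraction here. In denominator-layout notation, the standard identities for the quantities named above (a matrix A, vectors x and a) are most likely the following; check them against the lecture notes:

```latex
\[
\frac{\partial\, (a^\top x)}{\partial x} = a, \qquad
\frac{\partial\, (x^\top A x)}{\partial x} = (A + A^\top)\,x, \qquad
\frac{\partial\, (A x)}{\partial x} = A^\top .
\]
```

Note that in denominator layout the derivative of a scalar with respect to a column vector is itself a column vector, which is why no transposes appear on the first two results.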

The above rules adopt denominator-layout notation. For more rules, you can refer to the Wikipedia page on matrix calculus.

Please apply the above rules and calculate the following derivatives:

In machine learning, we have many metrics to evaluate the performance of our model. For example, consider a binary classification task with a dataset S = {(x_i, y_i)}, i = 1, ..., N, where each data point (x, y) contains a feature vector x ∈ R^M and a ground-truth label y ∈ {0, 1}. We have obtained a classifier f : R^M → {0, 1} that predicts the label ŷ of a feature vector x: ŷ = f(x). Assume N = 200 and we have the following confusion matrix representing the result of classifier f on dataset S.

Please follow the lecture notes to compute the metrics below:

Hint: You may refer to other metrics you have computed.
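The requested metrics all follow directly from the four confusion-matrix counts. As a sketch, the computation looks like the code below; the TP/FP/FN/TN values here are placeholders (the real values come from the confusion matrix in the assignment) and were only chosen to sum to N = 200:

```python
# Hypothetical confusion-matrix counts -- substitute the values from the
# assignment's table. They must sum to N = 200.
TP, FP, FN, TN = 80, 10, 20, 90

accuracy = (TP + TN) / (TP + FP + FN + TN)   # fraction of correct predictions
precision = TP / (TP + FP)                   # of predicted positives, how many are real
recall = TP / (TP + FN)                      # of real positives, how many were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(accuracy, precision, recall, f1)
```

The F1 score illustrates the hint above: it is computed entirely from two metrics (precision and recall) that you will already have calculated.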

We will be using the UCI Wine dataset for this problem and Question 5. The description of the dataset can be found at https://archive.ics.uci.edu/ml/datasets/wine. You can load the dataset using the code below (recommended), or you can download the dataset yourself and load it manually. You may refer to the Jupyter notebook HW1-Q4-Q5.ipynb for some skeleton code.

import matplotlib.pyplot as plt

from sklearn import datasets

wine = datasets.load_wine()

X = wine.data

Y = wine.target

Report your code and the scatter plot in Gradescope submission.
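The assignment does not specify which features to plot, so the sketch below plots the first two columns of X (alcohol and malic acid) colored by class label Y; the feature choice and filename are assumptions:

```python
import matplotlib

matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt
from sklearn import datasets

wine = datasets.load_wine()
X = wine.data    # shape (178, 13)
Y = wine.target  # class labels 0, 1, 2

# Scatter plot of the first two features, colored by class label.
plt.scatter(X[:, 0], X[:, 1], c=Y)
plt.xlabel(wine.feature_names[0])
plt.ylabel(wine.feature_names[1])
plt.title("Wine dataset: first two features")
plt.savefig("wine_scatter.png")
```

Passing the label array Y to the `c` argument of `plt.scatter` colors each point by its class, which makes the three cultivars visually separable even in two dimensions.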

We have already had a glimpse of the Wine dataset in Question 4. In this question, we will continue to use the Wine dataset. You can see that the array X has shape (178, 13) by running X.shape, which means it contains 178 data points with 13 features per data point. You may refer to the Jupyter notebook HW1-Q4-Q5.ipynb for some skeleton code. Here, we will calculate some measures of the array X and perform some basic data manipulation:

Hint: You may use np.random.randint().
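Following the hint, `np.random.randint` can draw random row indices, which can then be used to select a random subset of rows from X. The sketch below uses a small stand-in array (and a fixed seed for reproducibility, which is an assumption); with the real data you would apply the same calls to the Wine array X:

```python
import numpy as np

# Stand-in for the Wine array X (real shape is (178, 13)).
X_demo = np.arange(30, dtype=float).reshape(10, 3)

np.random.seed(0)  # fixed seed so the sample is reproducible
idx = np.random.randint(0, X_demo.shape[0], size=5)  # 5 random row indices
sample = X_demo[idx]                                 # fancy-index those rows

# Per-feature summary measures (computed column-wise with axis=0).
col_means = X_demo.mean(axis=0)
col_stds = X_demo.std(axis=0)

print(sample.shape)      # (5, 3)
print(col_means.shape)   # (3,)
```

Note that `np.random.randint` samples with replacement, so the same row may appear more than once; use `np.random.choice(n, size=5, replace=False)` if distinct rows are required.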

print(X[0])       # Print the first row of array X.

print(X[:, 0])    # Print the first column of array X; ':' here means all rows and 0 means column 0.

print(X[3:5, 1:3])  # Print the 4th and 5th rows, 2nd and 3rd columns.

print(X[:3, :2])    # Print the first 3 rows, first 2 columns.

For a refresher on Python and NumPy indexing, see http://cs231n.github.io/python-numpy-tutorial/

Report your code and the results of data manipulation in Gradescope submission.