Python Machine Learning Assignment Help | CMPSC 448: Machine Learning and AI HW1


CMPSC 448: Machine Learning and AI

Homework 1 (Due 02/14/2021 11:59 PM)

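Problem 1 [20 points] In this problem, you are given two matrices $A, B \in \mathbb{R}^{2 \times 2}$ and a vector $x \in \mathbb{R}^2$

$$A = \begin{pmatrix} 1 & 2 \\ \ldots & \ldots \end{pmatrix}, \quad B = \begin{pmatrix} 1 & 2 \\ \ldots & \ldots \end{pmatrix}, \quad x = \begin{pmatrix} 2 \\ \ldots \end{pmatrix}$$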


and asked to answer the following questions about them.

Problem 2 [10 points] For this problem, we use the following notation for random variables:
• $X \sim N(\mu, \sigma^2)$: $X$ is a Gaussian random variable with mean $\mu$ and variance $\sigma^2$.
• $X \sim \mathrm{Bern}(p)$: $X$ is a $\{0, 1\}$-valued Bernoulli random variable with expectation $p$.
• $E[X]$: the expected value of random variable $X$.
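The notation above can be made concrete with a small sampling sketch. This is only an illustration using Python's standard library: the helper names `sample_gaussian` and `sample_bernoulli`, the sample size, and the value p = 0.3 are assumptions chosen for the example. One point worth noting is that the second argument of N(μ, σ²) is a variance, while `random.gauss` expects a standard deviation.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible


def sample_gaussian(mu, variance):
    """Draw one sample from N(mu, variance); random.gauss takes the standard deviation."""
    return random.gauss(mu, math.sqrt(variance))


def sample_bernoulli(p):
    """Draw one {0, 1}-valued sample that equals 1 with probability p."""
    return 1 if random.random() < p else 0


n = 200_000  # hypothetical sample size

# The empirical mean of many samples approximates the expected value E[X].
gaussian_mean = sum(sample_gaussian(1.0, 2.0) for _ in range(n)) / n
bernoulli_mean = sum(sample_bernoulli(0.3) for _ in range(n)) / n

print(f"empirical mean of N(1, 2) samples:   {gaussian_mean:.3f}")
print(f"empirical mean of Bern(0.3) samples: {bernoulli_mean:.3f}")
```

The empirical means should land close to μ and p respectively, which matches the definitions in the list.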

(a) If $X \sim N(1, 2)$, then what is $E[X]$? What is $(E[X])^2 - E[X^2]$?

Problem 3 [5 points] What is the rank of the following matrix and why?

1 2 1 1 0 3 112

Problem 4 [5 points] Use either numpy.linalg or scipy.linalg to find the eigendecomposition of the following matrix:

3 1 1 X=2 4 2

−1 −1 1 Problem5[5points]Forthefunctionf(x)=ln􏰀1+e−2x􏰁,whatisitderivativef′(x)=df(x) =?.
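For Problem 4, one possible approach with `numpy.linalg.eig` is sketched here. It assumes the matrix X is read row by row as (3, 1, 1), (2, 4, 2), (−1, −1, 1); the residual check at the end is an addition for sanity checking and is not required by the problem statement.

```python
import numpy as np

# Matrix from Problem 4, read row by row (assumed reading of the source layout).
X = np.array([[3.0, 1.0, 1.0],
              [2.0, 4.0, 2.0],
              [-1.0, -1.0, 1.0]])

# eigenvalues[i] pairs with the column eigenvectors[:, i].
eigenvalues, eigenvectors = np.linalg.eig(X)
print("eigenvalues:", eigenvalues)
print("eigenvectors (as columns):")
print(eigenvectors)

# Sanity check: X V = V diag(w) must hold column by column.
# Broadcasting scales column i of eigenvectors by eigenvalues[i].
residual = X @ eigenvectors - eigenvectors * eigenvalues
max_residual = np.abs(residual).max()
print("max residual:", max_residual)
```

`scipy.linalg.eig` could be used the same way; note that NumPy does not sort the eigenvalues, so the order of the columns should not be relied on.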

Problem 6 [10 points] Let $x \in \mathbb{R}^d$ be a vector in $d$-dimensional space and define the real-valued function $f : \mathbb{R}^d \to \mathbb{R}$ by

$$f(x) = \frac{1}{2} x^\top A x + b^\top x,$$

where $A \in \mathbb{R}^{d \times d}$ is a symmetric matrix and $b \in \mathbb{R}^d$ is a fixed vector. Using the definition of the gradient, show that

$$\nabla f(x) = Ax + b.$$
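The identity in Problem 6 can be checked numerically before writing the proof. The sketch below is only a sanity check under stated assumptions: the dimension d = 4, the random A, b, and x, the step size, and the helper `numerical_gradient` are all chosen for illustration, and a finite-difference comparison is not a substitute for the derivation the problem asks for.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # hypothetical dimension for the check

# Build a random symmetric matrix A, plus random vectors b and x.
S = rng.standard_normal((d, d))
A = (S + S.T) / 2
b = rng.standard_normal(d)
x = rng.standard_normal(d)


def f(v):
    """f(v) = 1/2 v^T A v + b^T v."""
    return 0.5 * v @ A @ v + b @ v


def numerical_gradient(func, v, h=1e-6):
    """Approximate each partial derivative with a central difference."""
    grad = np.zeros_like(v)
    for i in range(v.size):
        step = np.zeros_like(v)
        step[i] = h
        grad[i] = (func(v + step) - func(v - step)) / (2 * h)
    return grad


numeric = numerical_gradient(f, x)
analytic = A @ x + b
print("max abs difference:", np.max(np.abs(numeric - analytic)))
```

Replacing `A` with a non-symmetric matrix makes the two vectors disagree, which shows where the symmetry assumption enters.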

Problem 7 [5 points]

(a) What is the maximizer of $g : [-4, 4] \to \mathbb{R}$ given by $g(x) = \frac{1}{2}x^3 - \frac{1}{2}x^2 - 6x + \frac{27}{2}$?

(b) What is $\int_0^1 g(x)\,dx$ for $g$ defined above?

Exploratory Data Analysis with pandas

Problem 8 [40 points] The goal of this problem is to do basic data analysis on a simple data set using the pandas package in Python (no machine learning for now). As emphasized in the lectures, we need a good understanding of the data before training a machine learning model. In this assignment, you are asked to analyze the UCI Adult data set. The Adult data set is a standard machine learning data set that contains demographic information about US residents. This data was extracted from the census bureau database found at:

The data set contains 32561 instances and 15 features (please check the notebook for the possible values of each feature) of different types (categorical and continuous).

The data is provided as a CSV file and can be loaded into a pandas DataFrame object as shown: data = pd.read_csv('')
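A minimal sketch of loading and inspecting a CSV with pandas follows. The file name is not given here, so the example reads a tiny in-memory stand-in through `io.StringIO`; its three rows and three column names are made up for illustration and are not taken from the homework data file.

```python
import io

import pandas as pd

# Hypothetical stand-in for the homework CSV: made-up rows and column names.
csv_text = """age,education,hours-per-week
39,Bachelors,40
50,HS-grad,13
38,HS-grad,40
"""

# read_csv accepts a path or any file-like object.
data = pd.read_csv(io.StringIO(csv_text))

print(data.shape)   # (number of rows, number of columns)
print(data.dtypes)  # numeric vs. string-like (categorical) columns
print(data.head())  # first few rows
```

With the real file, the same three calls give a quick first look at the number of instances, the feature types, and a few sample rows.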

You are asked to answer the following questions about this data set. Please note that you need to use pandas functionality to answer these questions, rather than implementing them in pure Python code.
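To illustrate what "pandas functionality rather than pure Python" might look like, here is a short sketch on a toy DataFrame. The column names and values are hypothetical and only serve the example; the actual questions are in the notebook.

```python
import pandas as pd

# Hypothetical toy frame; column names and values are illustrative only.
data = pd.DataFrame({
    "education": ["Bachelors", "HS-grad", "HS-grad", "Masters"],
    "age": [39, 50, 38, 45],
})

# Count rows per category without writing a loop.
counts = data["education"].value_counts()

# Mean of a numeric column within each category.
mean_age = data.groupby("education")["age"].mean()

# Filter rows with a boolean mask instead of an explicit for loop.
over_40 = data[data["age"] > 40]

print(counts)
print(mean_age)
print(len(over_40))
```

Each of these replaces a hand-written loop with a single vectorized call, which is usually shorter and less error-prone.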

To answer these questions, you are provided with a Jupyter notebook containing the questions. Please complete the notebook with your code to answer them. You are encouraged to install the Anaconda distribution of Python to run the Jupyter notebook, or to use JupyterLab directly, to complete this problem.


This homework comes with a data file, and a Jupyter notebook. You are asked to: