# ECON 570 Problem Set 1


Due: October 4, 2019

## 1 Computational Complexity

A. Write custom functions for

1. Calculating the inner product of two vectors.

2. Calculating the product of two matrices.
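One possible shape for these two functions is sketched below in plain Python, with a matrix stored as a list of rows. The names `inner_product` and `matrix_product` are illustrative choices, not names prescribed by the assignment.

```python
def inner_product(u, v):
    """Sum of element-wise products of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    total = 0.0
    for a, b in zip(u, v):
        total += a * b
    return total


def matrix_product(A, B):
    """Product of A (m x n) and B (n x p), both given as lists of rows."""
    if any(len(row) != len(B) for row in A):
        raise ValueError("inner dimensions do not match")
    columns = list(zip(*B))  # transpose B so each column is a tuple
    # Entry (i, j) is the inner product of row i of A and column j of B.
    return [[inner_product(row, col) for col in columns] for row in A]


print(inner_product([1, 2, 3], [4, 5, 6]))
print(matrix_product([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Building the matrix product on top of the inner product keeps the three nested loops explicit, which makes the operation count easy to reason about in the later parts.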

B. Generate a random m × n matrix X and a random n × p matrix Y, with m = 50, n = 100, p = 200. Compute:

1. The amount of time it takes to multiply X and Y using the custom code you wrote.

2. The amount of time it takes to multiply X and Y using built-in NumPy methods.
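A minimal timing sketch for this comparison follows. It assumes `time.perf_counter` is an acceptable clock and that the matrices are drawn from a standard normal distribution, which the assignment does not specify. The helpers `matmul_loops` and `timed` are hypothetical names.

```python
import time

import numpy as np


def matmul_loops(A, B):
    """Triple-loop matrix product, written out to expose the m*n*p work."""
    m, n = A.shape
    n_rows_b, p = B.shape
    if n != n_rows_b:
        raise ValueError("inner dimensions do not match")
    C = np.zeros((m, p))
    for i in range(m):
        for j in range(p):
            total = 0.0
            for k in range(n):
                total += A[i, k] * B[k, j]
            C[i, j] = total
    return C


def timed(fn, *args):
    """Return (elapsed seconds, result) for a single call of fn(*args)."""
    start = time.perf_counter()
    result = fn(*args)
    return time.perf_counter() - start, result


rng = np.random.default_rng(0)  # seed fixed only for reproducibility
m, n, p = 50, 100, 200
X = rng.standard_normal((m, n))
Y = rng.standard_normal((n, p))

t_custom, C_custom = timed(matmul_loops, X, Y)
t_numpy, C_numpy = timed(np.matmul, X, Y)
print(f"custom loops: {t_custom:.4f} s, NumPy: {t_numpy:.6f} s")
```

A single call can be noisy, especially for the fast NumPy path, so repeating the measurement several times and taking the minimum or median is a reasonable refinement.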

C. Now increase m, n, p each by a factor of 10.

1. How long do you expect it would take to multiply X and Y using your custom code? How long does it actually take?

2. How long do you expect it would take to multiply X and Y using built-in NumPy methods? How long does it actually take?
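One way to form the expectation is to count scalar multiply-adds: a straightforward product of an m × n and an n × p matrix needs m·n·p of them. The sketch below assumes run time is proportional to that count, which is a simplification. The function names and the 0.5 second baseline are made up for illustration.

```python
def multiply_adds(m, n, p):
    """Scalar multiply-add count for a naive (m x n) by (n x p) product."""
    return m * n * p


def predicted_seconds(measured_seconds, growth_factor):
    """Extrapolate a measured time, assuming time scales with the operation count."""
    return measured_seconds * growth_factor


base = multiply_adds(50, 100, 200)
scaled = multiply_adds(500, 1000, 2000)
growth = scaled // base  # each of the three dimensions contributes a factor of 10

print(f"operation count grows by a factor of {growth}")
# Hypothetical baseline: if the small problem took 0.5 s with the custom code
print(f"predicted time: {predicted_seconds(0.5, growth):.0f} s")
```

This proportionality is more plausible for the pure-Python loops than for NumPy, whose measured times can also reflect fixed call overhead, multithreading, and whether the matrices fit in cache.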

D. Increase m, n, p each by a factor of 10 again, and repeat the above but using NumPy's built-in methods only. How long does it take to multiply X and Y? Did it increase by the same factor as it did before when all the dimensions were increased by a factor of 10? Why or why not?

E. Generate an n × p matrix X and an n-vector y.

1. Set n = 5000, p = 200. How long does it take to regress y on X?


2. Set n = 50000, p = 200. How long do you expect the same regression would take? How long does it actually take?

3. Set n = 5000, p = 2000. How long do you expect the same regression would take? How long does it actually take?
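A sketch of the regression timing is given below, assuming ordinary least squares solved with `np.linalg.lstsq`; the assignment does not name a solver. The data-generating process (Gaussian X, a known coefficient vector, noise scale 0.1) and the helper `time_ols` are assumptions made for illustration. For dense solvers with n ≥ p, the work grows roughly like n·p², which gives a starting point for the expected times.

```python
import time

import numpy as np

rng = np.random.default_rng(0)  # seed fixed only for reproducibility


def time_ols(n, p):
    """Simulate y = X @ beta + noise, then time the least-squares fit."""
    X = rng.standard_normal((n, p))
    beta = rng.standard_normal(p)
    y = X @ beta + 0.1 * rng.standard_normal(n)

    start = time.perf_counter()
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    elapsed = time.perf_counter() - start

    # Largest absolute coefficient error, as a sanity check on the fit
    return elapsed, float(np.max(np.abs(beta_hat - beta)))


elapsed, max_err = time_ols(5000, 200)
print(f"n=5000, p=200: {elapsed:.4f} s, max coefficient error {max_err:.4f}")
```

Calling `time_ols(50000, 200)` and `time_ols(5000, 2000)` gives the other two measurements to compare against the n·p² rule of thumb.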

## 2 Breast Cancer Data

Use the breast cancer data from sklearn to perform the following exercises.

A. Load the breast cancer data with the `load_breast_cancer` method from the module `sklearn.datasets`.

B. Standardize each feature in the data set.
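Loading and standardizing could look like the sketch below. It assumes that "standardize" means zero mean and unit variance per feature, and uses scikit-learn's `StandardScaler` for that.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler

# X holds the features (one row per sample), y the 0/1 diagnosis targets
X, y = load_breast_cancer(return_X_y=True)

# Subtract each column's mean and divide by its standard deviation
X_std = StandardScaler().fit_transform(X)

print(X.shape)
print(np.abs(X_std.mean(axis=0)).max(), X_std.std(axis=0).min())
```

The same result can be obtained by hand with `(X - X.mean(axis=0)) / X.std(axis=0)`, since `StandardScaler` uses the population standard deviation.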

C. Perform PCA on the standardized features. How many principal components must we keep to explain 90% of the total variance? How much variance is explained if we keep 2?
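Both questions can be read off the cumulative explained-variance ratios. The sketch below assumes scikit-learn's `PCA` fitted with all components. The variable names `k_90` and `share_2` are illustrative.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_breast_cancer(return_X_y=True)
X_std = StandardScaler().fit_transform(X)

# Fit with all components so the ratios sum to 1
pca = PCA().fit(X_std)
cumulative = np.cumsum(pca.explained_variance_ratio_)

# Smallest number of components whose cumulative share reaches 90%
k_90 = int(np.argmax(cumulative >= 0.90)) + 1
# Share of total variance captured by the first two components
share_2 = float(cumulative[1])

print(f"components needed for 90%: {k_90}")
print(f"variance explained by 2 components: {share_2:.3f}")
```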

D. Perform k-means with k = 2 on the full set of features, and on the first 2 principal components only. Compare how well the clusters found by k-means in each case match the true targets of the data set.
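One way to run this comparison is sketched below. It assumes "the full set of features" refers to the standardized features, and it scores agreement in two ways: accuracy after allowing for the arbitrary 0/1 labelling of clusters, and the adjusted Rand index. The helper `cluster_agreement` and the fixed `random_state` are choices made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_std = StandardScaler().fit_transform(X)
X_pc2 = PCA(n_components=2).fit_transform(X_std)  # first two principal components


def cluster_agreement(features, targets):
    """Fit 2-means and compare the cluster labels with the true targets."""
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
    match = float(np.mean(labels == targets))
    # Cluster ids are arbitrary, so take the better of the two labelings
    accuracy = max(match, 1.0 - match)
    return accuracy, adjusted_rand_score(targets, labels)


acc_full, ari_full = cluster_agreement(X_std, y)
acc_pc2, ari_pc2 = cluster_agreement(X_pc2, y)
print(f"all features: accuracy {acc_full:.3f}, ARI {ari_full:.3f}")
print(f"first 2 PCs:  accuracy {acc_pc2:.3f}, ARI {ari_pc2:.3f}")
```

The adjusted Rand index is invariant to relabelling, so it avoids the label-swap step entirely and is a convenient second check.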

## 3 Olivetti Faces

Use the Olivetti faces data set available through sklearn to do the following.

A. Fetch and load the data with the `fetch_olivetti_faces` method from the module `sklearn.datasets`.

B. Demean each face in the data set (there is no need to divide by the standard deviation, since every dimension is a pixel value within a fixed range).

C. Compute and display the first 9 eigenfaces.
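The sketch below computes eigenfaces as the right singular vectors of the demeaned data. It interprets "demean" as subtracting the average face, which is an assumption about the intended preprocessing. To stay runnable without a download, it uses a random stand-in array. With the real data, `faces` would be `fetch_olivetti_faces().data`, an array with one flattened 64 × 64 image per row. The function name `compute_eigenfaces` is hypothetical.

```python
import numpy as np


def compute_eigenfaces(faces, k):
    """Return the mean face and the first k eigenfaces (one per row)."""
    mean_face = faces.mean(axis=0)
    centered = faces - mean_face  # demean: subtract the average face
    # Rows of Vt are orthonormal principal directions in pixel space,
    # ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return mean_face, Vt[:k]


# Stand-in data: 120 random "faces" of 16 x 16 pixels, NOT the Olivetti set
rng = np.random.default_rng(0)
faces = rng.random((120, 256))

mean_face, eigenfaces = compute_eigenfaces(faces, 9)
print(eigenfaces.shape)
```

To display them, each row can be reshaped to the image dimensions (64 × 64 for the Olivetti data) and drawn with matplotlib's `imshow` using a gray colormap, for example in a 3 × 3 grid of subplots.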

D. In class we showed that any given face in the data set can be represented as a linear combination of the eigenfaces. For any face in the data set, show how its reconstruction progresses as we combine 1, 51, 101, ... eigenfaces, until the full image is recovered.
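The progression can be sketched as a projection onto the first k eigenfaces, added back to the mean face. As above, the example assumes "demean" means subtracting the average face and uses a random stand-in array in place of `fetch_olivetti_faces().data`. The helper `reconstruct` is a hypothetical name.

```python
import numpy as np

# Stand-in data: 120 random "faces" of 16 x 16 pixels, NOT the Olivetti set
rng = np.random.default_rng(0)
faces = rng.random((120, 256))

mean_face = faces.mean(axis=0)
centered = faces - mean_face
_, _, Vt = np.linalg.svd(centered, full_matrices=False)  # rows of Vt: eigenfaces


def reconstruct(face, k):
    """Approximate a face using the mean face plus its first k eigenface terms."""
    basis = Vt[:k]
    weights = basis @ (face - mean_face)  # coordinates in the eigenface basis
    return mean_face + weights @ basis


face = faces[0]
# 1, 51, 101, ... and finally all available eigenfaces
counts = list(range(1, Vt.shape[0], 50)) + [Vt.shape[0]]
errors = [float(np.linalg.norm(face - reconstruct(face, k))) for k in counts]

for k, err in zip(counts, errors):
    print(f"{k:4d} eigenfaces: reconstruction error {err:.6f}")
```

Because the chosen face belongs to the data set, the error shrinks as eigenfaces are added and vanishes up to rounding once all of them are used. Plotting `reconstruct(face, k)` reshaped to the image dimensions for each k shows the same progression visually.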
