# Python Assignment Help | Data 102 Assignment 4

This is a Python assignment on artificial intelligence topics from a US course.

**Overview**

Submit your writeup, including all code and plots, as a PDF via Gradescope. We recommend reading through the entire homework beforehand and carefully using functions for testing procedures, plotting, and running experiments. Taking the time to test, maintain, and reuse code will help in the long run!

Data science is a collaborative activity. While you may talk with others about the homework, please write up your solutions individually. If you discuss the homework with your peers, please include their names on your submission. Please make sure any handwritten answers are legible, as we may deduct points otherwise.

Please note that this homework is slightly shorter than usual, to give you time to start working on your project.

**1 Observational Data on Infant Health**

The Infant Health and Development Program (IHDP) was an experiment treating low birth-weight, premature infants with intensive high-quality childcare from a trained provider. The goal is to estimate the causal effect of this treatment on the child’s cognitive test scores. The data does not represent a randomized trial with randomly allocated treatment, so there may be confounders between treatment and outcome. In this problem, we devise a propensity score model to control for observed confounders.
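To make the idea of a propensity score concrete, here is a minimal sketch on synthetic data, not the IHDP data. Everything in it is an assumption made for illustration: the single confounder `x`, the coefficients, the true effect of 2.0, and the helper names `simulate_confounded_data` and `estimate_propensity`. It uses scikit-learn's `LogisticRegression` with its default settings (which include L2 regularization), which may differ from the configuration the assignment intends.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def simulate_confounded_data(n=5000, seed=0):
    """Synthetic stand-in data (NOT the IHDP data).

    A single confounder x raises both the chance of treatment and the outcome.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, 1))
    # True propensity: probability of treatment given x (assumed logistic form).
    true_propensity = 1.0 / (1.0 + np.exp(-1.5 * x[:, 0]))
    z = rng.binomial(1, true_propensity)
    # Outcome: assumed true treatment effect of 2.0, plus a confounder effect.
    y = 2.0 * z + 3.0 * x[:, 0] + rng.normal(size=n)
    return x, z, y, true_propensity


def estimate_propensity(x, z):
    """Fit a logistic regression of z on x and return P(z = 1 | x) per row."""
    model = LogisticRegression()  # default settings, an assumption of this sketch
    model.fit(x, z)
    # Column 1 of predict_proba corresponds to the class z = 1.
    return model.predict_proba(x)[:, 1]


x, z, y, true_propensity = simulate_confounded_data()
e_hat = estimate_propensity(x, z)

# The naive difference in means is biased because treated units tend to
# have larger x, and larger x also raises the outcome.
naive_difference = y[z == 1].mean() - y[z == 0].mean()

print("naive difference in means:", naive_difference)
print("mean |e_hat - true propensity|:", np.abs(e_hat - true_propensity).mean())
```

In this simulation the naive difference in means overstates the assumed effect, because treated units differ systematically from untreated ones. The estimated propensity scores summarize that difference in a single number per unit, which is what later adjustment steps build on.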

(a) (2 points) The CSV file ihdp.csv has 27 columns:

In this part, you’ll estimate $\hat{e}(x)$ (the predicted probability that $z_i = 1$) by fitting a logistic regression model that predicts $z_i$ from $x_i$. Specifically:

1. Read the data in ihdp.csv (e.g., using the csv package in Python) into three arrays: $Z \in \{0, 1\}^n$ containing the treatments, $Y \in \mathbb{R}^n$ containing the outcomes, and $X \in \mathbb{R}^{n \times 25}$ containing the features.

2. To fit a logistic regression model, use the scikit-learn package in Python, which is imported as sklearn. Start with the following two lines: