CSCI-561 – Fall 2022 – Foundations of Artificial Intelligence Homework 3

In this homework assignment, you will implement a multi-layer perceptron (MLP) neural network and use it to classify data from four different datasets, shown in Figure 1. Your implementation will be made from scratch, using no external libraries other than NumPy; machine learning libraries are NOT allowed (e.g., SciPy, TensorFlow, Caffe, PyTorch, Torch, MXNet, etc.).

You will train and test your neural network implementation on four datasets inspired by the TensorFlow Neural Network Playground (https://playground.tensorflow.org). We encourage you to visit this site and experiment with various model settings and datasets. There are 4 files associated with each dataset. The files have the following naming scheme:

<name>_train_data.csv, <name>_train_label.csv, <name>_test_data.csv, <name>_test_label.csv

where <name> is one of the 4 dataset names: spiral, circle, xor, or gaussian. As a result, there are a total of 16 data files, all of which can be found in HW3->resource->asnlib->public.
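For example, loading one dataset's public files with NumPy might look like the following sketch (the load_dataset helper and its directory argument are illustrative, not part of the assignment):

import numpy as np

def load_dataset(name, directory="."):
    """Load the four public CSV files for one dataset, e.g. name="spiral"."""
    train_X = np.genfromtxt(f"{directory}/{name}_train_data.csv", delimiter=",")
    train_y = np.genfromtxt(f"{directory}/{name}_train_label.csv", delimiter=",")
    test_X = np.genfromtxt(f"{directory}/{name}_test_data.csv", delimiter=",")
    test_y = np.genfromtxt(f"{directory}/{name}_test_label.csv", delimiter=",")
    return train_X, train_y, test_X, test_y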

Below is a brief description of each dataset; see Figure 1 for a visual representation of each.

Spiral

Both classes are interwoven in a spiral pattern.

Data points are subject to noise along their respective spiral and thus may overlap.

Gaussian

Data points are generated and classified according to two Gaussian distributions. The distributions have different means, but samples may overlap as pictured.

XOR

Data points are classified according to the XOR function. Noise may push data points across the XOR “boundaries”, as seen in Figure 1.

Circle

Data points are generated and classified according to two annuli (rings) sharing a common center. Although the annuli do not overlap, noise may push data points across the gap between them.

The train and test files for each dataset represent an 80/20 train/test split. You are welcome to aggregate the data from each set and re-split to your liking. All datasets have 2-dimensional data points (the x, y coordinates of the point in R^2), along with binary labels (either 0 or 1).
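If you do choose to aggregate and re-split, a minimal sketch using NumPy might be (the resplit helper, the default split fraction, and the seed are all illustrative):

import numpy as np

def resplit(X, y, train_frac=0.8, seed=0):
    """Shuffle aggregated points and labels, then re-split (80/20 by default)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    cut = int(train_frac * len(X))
    return X[idx[:cut]], y[idx[:cut]], X[idx[cut:]], y[idx[cut:]]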

Your task is to implement a multi-hidden-layer neural network learner (see the model description below for additional details). For a given dataset, your program will take three input files (provided as paths through command-line arguments) and produce one output file as follows:

run your_program train_data.csv train_label.csv test_data.csv

⇒ test_predictions.csv

For example,

python3 NeuralNetwork3.py train_data.csv train_label.csv test_data.csv

⇒ test_predictions.csv

In other words, your algorithm file NeuralNetwork.* will take training data, training labels, and testing data as inputs, and output classification predictions on the testing data. Note that your neural network implementation should not depend on which of the four datasets is provided during a given execution; your script will only receive the training data/labels and test data for a single dataset type at a time.
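A minimal sketch of the required argument handling in Python (the usage message is illustrative):

import sys

if len(sys.argv) != 4:
    sys.exit("usage: python3 NeuralNetwork3.py train_data.csv train_label.csv test_data.csv")
# The three paths arrive positionally, exactly as in the invocation above.
train_data_path, train_label_path, test_data_path = sys.argv[1], sys.argv[2], sys.argv[3]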

As mentioned in the overview, NumPy is the only external library you can use in your implementation (or an equivalent numerical-computing-only library in non-Python languages). By external we mean outside the standard library (e.g., in Python, random, os, etc. are fine to use). No component of the neural network implementation can leverage a call to an external ML library; you must implement the algorithm yourself, from scratch. (You will receive no credit for this assignment if this rule is not adhered to.)

The format of *_data.csv looks like:

x1^(1), x2^(1)
x1^(2), x2^(2)
...

where x1^(n), x2^(n) are the coordinates of the n-th data point. The *_label.csv files and your output test_predictions.csv will look like:

y^(1)
y^(2)
...

where y^(n) is either 0 or 1, corresponding to the label for the n-th data point x^(n) = [x1^(n), x2^(n)]. Thus, there is a single column indicating the predicted class label for each unlabeled sample in the input test file.

The format of your test_predictions.csv file is crucial. Your output file must have this name and format so that it can be parsed correctly to compare with true labels by the auto-grading scripts. This file should be written to your working path.
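For example, assuming preds is a hypothetical NumPy array of 0/1 integer predictions, one per test sample, the file can be written with:

import numpy as np

preds = np.array([0, 1, 1, 0])  # hypothetical predictions, one per test sample
# Writes one integer label per line, no header, to the working directory.
np.savetxt("test_predictions.csv", preds, fmt="%d")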

When we grade your submission, we will use hidden training data and hidden testing data for each dataset instead of the public data you are provided. That is, for each of the four datasets (spiral, circle, xor, or gaussian), your NN submission will be trained from scratch on hidden training data and evaluated on hidden test data. The handling of arguments in your program, along with the name/format of your output prediction file must match the above specifications to ensure your submission is auto-graded correctly.

The maximum running time to train and test a model is 2 minutes for each dataset. This means training/testing across all datasets can take at most 8 minutes, with a 2-minute limit applied per dataset (i.e., time does not carry over if a dataset is “finished” before the 2-minute mark).
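One way to stay under this cap is a wall-clock guard around the training loop; below is a sketch in which TIME_BUDGET, the epoch cap, and train_one_epoch are hypothetical placeholders:

import time

TIME_BUDGET = 110  # seconds; leaves headroom under the 2-minute per-dataset cap

def train_one_epoch():
    """Hypothetical placeholder for one pass of gradient descent."""
    pass

start = time.time()
epoch = 0
# Stop when either the epoch cap or the time budget is reached.
while epoch < 10000 and time.time() - start < TIME_BUDGET:
    train_one_epoch()
    epoch += 1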

The model you will implement is a vanilla feed-forward neural network, possibly with many hidden layers (see Figure 2 for a generic depiction). Your network should have 2 input nodes and output a single value.
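As a concrete but non-prescriptive illustration, a forward pass for a network with one 16-unit hidden layer, tanh activations, and a sigmoid output could look like the following sketch (the layer size and activation choices are assumptions, not requirements):

import numpy as np

rng = np.random.default_rng(0)
W1, b1 = 0.5 * rng.standard_normal((2, 16)), np.zeros(16)  # 2 inputs -> 16 hidden units
W2, b2 = 0.5 * rng.standard_normal((16, 1)), np.zeros(1)   # 16 hidden units -> 1 output

def forward(X):
    """X has shape (n_samples, 2); returns class-1 probabilities in (0, 1)."""
    h = np.tanh(X @ W1 + b1)                      # hidden activations
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))   # sigmoid output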

Beyond this, there are no constraints on your model’s structure; it is up to you to decide what activation function, number of hidden layers, number of nodes per hidden layer, etc. your model should use. It’s worth noting that you should use cross-entropy as your loss function (each dataset presents a binary classification task). Depending on your implementation, you may also need to employ the softmax function on your last-layer outputs and select a single value for your final output.
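For a single sigmoid output, binary cross-entropy can be applied directly (this is equivalent to a two-class softmax). Below is a self-contained sketch of one full-batch gradient-descent step for a one-hidden-layer network; the layer size, tanh activation, and learning rate are illustrative choices, not the required design:

import numpy as np

rng = np.random.default_rng(0)
W1, b1 = 0.5 * rng.standard_normal((2, 16)), np.zeros(16)
W2, b2 = 0.5 * rng.standard_normal((16, 1)), np.zeros(1)

def train_step(X, y, lr=0.1):
    """One full-batch gradient-descent step on binary cross-entropy."""
    global W1, b1, W2, b2
    n = X.shape[0]
    y = y.reshape(-1, 1)
    h = np.tanh(X @ W1 + b1)                    # hidden activations, (n, 16)
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))    # sigmoid probabilities, (n, 1)
    eps = 1e-12                                 # avoid log(0)
    loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    # For sigmoid + cross-entropy, dLoss/dlogits simplifies to (p - y) / n.
    dz2 = (p - y) / n
    dW2, db2 = h.T @ dz2, dz2.sum(axis=0)
    dz1 = (dz2 @ W2.T) * (1.0 - h ** 2)         # tanh'(a) = 1 - tanh(a)^2
    dW1, db1 = X.T @ dz1, dz1.sum(axis=0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
    return loss

At prediction time, a data point would then be labeled 1 when the output probability is at least 0.5, and 0 otherwise.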