# Python辅导 | CMPT310 Summer 2019 Assignment 4

Python实现蒙特卡罗树搜索要求使用A*图算法

As explained below, this assignment requires that you submit two files on Canvas:

One of the intriguing features of the AlphaZero game-playing program is that it learned to play chess extremely well given only the rules of chess, and no special knowledge about how to make good moves. Here, we want you to write a program that does the same thing but for a much simpler game: Tic-Tac-Toe.

In this part, your task is to implement (in Python) a Tic-Tac-Toe playing program where a human can play against the computer, and the computer makes all its moves using pure Monte Carlo tree search.

Importantly, you cannot give the computer any knowledge of how to play Tic-Tac-Toe beyond the basic rules of how to play it, how to determine legal moves, and how to recognize if the game is a win, loss, or draw.

Your program must play using pure Monte Carlo Tree Search (pMCTS).

The idea is as follows. When it’s the computers turn to make a move, it makes a list of all legals moves. Then for each of these moves it does some number of random playouts. A random playout is when the computer simulates playing the game — using randomly chosen moves — until it is over, i.e. a win, loss, or draw is reached. It records the result (a win, loss, or draw), and then does some more random playouts. It does random playouts for every possible move, and when they’re done it choses the move that resulted in the greatest number of wins (or least number of losses, or most number of wins + draws, … — whatever you think is the best way to make a decision based on the win/loss/draw statistics).

With enough random playouts, this approach will play essentially perfectly. Do enough random playouts so that your program never loses against a smart player.

The exact number of random playouts that the computer should do is an interesting question, and we want you to determine how many should be done. You should aim to do the fewest number of playouts that results in your program never losing. You may need to experiment a bit with this to find a good value.

Remember: Do not add any knowledge how how to choose a good move in Tic-Tac-Toe to your program! Your program must make its move decisions entirely by random playouts.

Your game should have a neat and easy-to-use interface in plain text.

You shouldn’t use any code from the textbook, or any special libraries for this: stick to regular Python, and code imported from its standard library.

When you are ready, put all your code in a single file named a4.py on Canvas. Please set up the file so that running the command “python3 a4.py” will launch a new game, e.g.:

Assignment 5 is a project that you can work on either by yourself, or with a partner (teams larger than 2 are not permitted).

So, in a file named project_info.txt, tell us this information:

The name of everyone on your team. Teams may have 1 or 2 people (teams of more than 3 are strictly not allowed).

If you are on a team of 2, you will need to talk to a TA about signing up in a team on Canvas.

Please note that we will expect a little more work from teams of 2, and we expect both members to do an equal amount of work. Teams of 2 that don’t follow these rules may have their marks adjusted appropriately.

The project (see below) that you’ve chosen to work on.

The main programming language you plan to use, plus a list of any libraries you plan to use.

Please note that general-purpose helper libraries of code are permitted, but no libraries that implement key AI techniques are permitted: the purpose of this assignment is for you to implement some AI algorithms on your own, and so calling existing versions of the algorithms is not what we want to see.

Implement a program that can optimally solve random instances of the 15-puzzle (and also the 25-puzzle, if you can).

The markers will be looking for the following:

Implement an efficient version of both WalkSAT, and a propositional theorem prover based on resolution. They should both use the minisat input format. Using the same set of test examples, compare the performance of your two algorithms against the Python version of WalkSAT (in the textbook code in logic.py) and minisat. Use randomly generated k-CNF sentences as input (these are described in the textbook in the first full paragraph of page 264).

The markers will be looking for the following:

As explained in this video, the peaceable queens problem asks how many black and white queens can be placed on an n-by-n chess board so that there are an equal number of white queens and black queens, and no two queens of different colors attack each other.

Information about current results can be found here: https://oeis.org/A250000. For example, since the video was published, apparently the value for n=15 has been found to be 32. Starting at n=1, here is the currently known sequence of values:

Solve this problem in at least two different ways: using minisat, and using min-conflicts customized for peaceable queens. The min-conflicts approach is intriguing because customized version of min-conflicts can efficiently find solutions to the regular n-queens problem for n in the millions in less than a minute.

The markers will be looking for the following:

A CNF encoding of the peaceable queens problem for any board size n > 1.

The details of how to do this are up to you: try to work it out from first principles.

An implementation of a customized min-conflicts style algorithm. Efficiency is important, so you should do this in a language more efficient than Python (e.g. C++).

The exact details of the algorithm are up to you: experiment, and try out some different ideas.

Note that there is no guarantee that a min-conflicts style algorithm will work as well for this problem as it does for the regular n-Queens!

For n=1 up to n=15 (or as big a value of n your program can handle) as you can handle, verify the values given in the video and listed at https://oeis.org/A250000.

Make an attempt find the solution for n=16 (or higher). If you think you’ve proven a value for n=16 (or higher), let us (and others!) know!

A written report of at least one page that explains your CNF encoding, the techniques in your min-conflicts style algorithm, the results of your programs, etc. This should include graphs and tables of relevant data.

Re-do part 1 of this assignment, but instead of Tic Tac Toe use either the game Connect 4 (on a board with 7 columns and 6 row), or Reversi (on an 8-by-8 board).

It may be harder to make perfect players for these games, but make them play as well as you can.

The markers will be looking for the following:

Use of a language more efficient than Python. Doing lots of random playouts as quickly as possible is the key to success. You should keep an estimate of how many playouts per second your program can do as a measure of its performance.

You program should play as well as possible, although perhaps not perfectly. Note that it has been proven that the first player can always force a win in Connect 4.

So that games don’t go on too long, you should put a limit on the maximum amount of time the computer can use (e.g. 5 seconds at most per move).

In addition to the pure Monte Carlo Tree Search version of your program, created a second modified version of your program that uses heuristics — and any other additions you think would help — to make (hopefully) better-than-random moves during the playouts.

Make this version of your program play against the other version to see which one is better.

A written report of at least one page that explains your implementations, discusses how well they play, the extra techniques used by your second version, a comparison of which program is stronger, etc. This should include graphs and tables of relevant data.