Python Assignment Help | Problem 1: The Bandit Game

This US assignment-help task is a Python assignment on game and NP algorithms.

This game involves a player P and a casino C. The casino offers s slots (where s is determined on game day). One of them, the winning slot, wins with probability 0.6; each of the other s-1 slots wins with probability 0.47. C chooses which slot is the winning slot and may change that choice up to k times over the 100 * s pulls of the game (k is also determined on game day and is below 20).
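To make the two probabilities concrete, the short sketch below computes the expected change in wealth per token bet on each kind of slot. The function and constant names are illustrative and do not come from the assignment; the only inputs taken from the text are 0.6 and 0.47, and the even-money payout follows from the betting rule that a win or loss moves P's wealth by the amount bet.

```python
WIN_PROB_GOOD = 0.6   # winning slot, from the problem statement
WIN_PROB_BAD = 0.47   # every other slot, from the problem statement


def expected_gain_per_token(p_win: float) -> float:
    """Expected wealth change per token bet at even money.

    A win adds one token per token bet, a loss removes one,
    so the expectation is p_win * (+1) + (1 - p_win) * (-1).
    """
    return p_win * 1.0 + (1.0 - p_win) * (-1.0)


# Roughly +0.20 tokens per token bet on the winning slot,
# roughly -0.06 tokens per token bet on any other slot.
print(round(expected_gain_per_token(WIN_PROB_GOOD), 2))
print(round(expected_gain_per_token(WIN_PROB_BAD), 2))
```

Under these numbers P profits on average only while betting on the current winning slot, which is why C's limited budget of switches matters.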

P begins with 1000 tokens. Before each pull, P chooses a slot and bets one, two, or three tokens. If P wins, P’s wealth increases by the number of tokens bet. If P loses, P’s wealth decreases by that same number of tokens. C is told P’s wealth after each pull.

In a competition between teams X and Y, X will play as P and Y as C for one game. Then they will switch roles for the second game.

You will be given the number of switches k, the number of slots s, and C’s initial choice of the winning slot. For each pull, (i) check whether C wants to switch the winning slot and perform the switch unless C has used up its budget of k switches; (ii) accept P’s choice of slot and bet amount (provided P has at least that many tokens left); (iii) determine whether P wins or loses based on the winning probability (0.6 for the winning slot, 0.47 for every other slot); (iv) add the winnings to or subtract the losses from P’s total wealth (which starts at 1000 and can never go below 0); and (v) inform C of P’s new wealth.
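Steps (i) through (v) can be sketched as a single loop. This is a minimal illustration under several assumptions that the assignment does not specify: slots are numbered from 0, the two players are plain callables (`player_move` and `casino_move` are hypothetical names), an invalid move by P raises an error, and C learns P's wealth by receiving it as an argument the next time it is asked to move. The stopping rule (wealth reaches 0 or 100 * s pulls are done) is taken from the problem statement.

```python
import random

WIN_PROB_GOOD = 0.6   # winning slot (from the problem statement)
WIN_PROB_BAD = 0.47   # every other slot (from the problem statement)
START_WEALTH = 1000


def play_game(s, k, initial_winning_slot, player_move, casino_move, rng=None):
    """Run one game and return the list of P's wealth after each pull.

    player_move(wealth) -> (slot, bet)              # hypothetical interface
    casino_move(wealth, current_slot) -> slot/None  # None means "no switch"
    """
    if rng is None:
        rng = random.Random()
    wealth = START_WEALTH
    winning_slot = initial_winning_slot
    switches_left = k
    history = [wealth]

    for _ in range(100 * s):
        # (i) Let C request a switch; honor it only while budget remains.
        #     Passing `wealth` here is also how C is informed, step (v).
        requested = casino_move(wealth, winning_slot)
        if (requested is not None and requested != winning_slot
                and 0 <= requested < s and switches_left > 0):
            winning_slot = requested
            switches_left -= 1

        # (ii) Accept P's slot and bet; the bet must be 1..3 and affordable.
        slot, bet = player_move(wealth)
        if not (0 <= slot < s) or bet not in (1, 2, 3) or bet > wealth:
            raise ValueError(f"invalid move: slot={slot}, bet={bet}")

        # (iii) Decide the outcome with the slot's winning probability.
        p_win = WIN_PROB_GOOD if slot == winning_slot else WIN_PROB_BAD
        won = rng.random() < p_win

        # (iv) Update wealth; it cannot go below 0 because bet <= wealth.
        wealth += bet if won else -bet
        history.append(wealth)

        # The game stops early if P is out of tokens.
        if wealth == 0:
            break

    return history


# Example: P always bets 1 token on slot 0 and C never switches.
history = play_game(
    s=3, k=2, initial_winning_slot=0,
    player_move=lambda wealth: (0, 1),
    casino_move=lambda wealth, current: None,
    rng=random.Random(0),
)
print(len(history) - 1, "pulls played; final wealth:", history[-1])
```

Returning the full wealth history rather than only the final value keeps the data needed for plotting P's wealth over time.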

If P’s wealth ever reaches 0 or if the 100*s pulls are done, the game stops.

As usual, you will keep track of the time used by each player and display P’s wealth graphically.
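One possible way to track each player's time is to wrap every call to a player's decision function in a timer. The sketch below uses only the standard library; `PlayerClock` and `timed_call` are hypothetical names, and the assignment does not say how time should be measured or what the limit is.

```python
import time
from collections import defaultdict


class PlayerClock:
    """Accumulates wall-clock time spent inside each player's calls."""

    def __init__(self):
        # Maps a player label (for example "P" or "C") to seconds used.
        self.elapsed = defaultdict(float)

    def timed_call(self, name, func, *args):
        """Call func(*args), charge the elapsed time to `name`, return the result."""
        start = time.perf_counter()
        try:
            return func(*args)
        finally:
            self.elapsed[name] += time.perf_counter() - start


# Example: time one decision by a trivial player strategy.
clock = PlayerClock()
move = clock.timed_call("P", lambda wealth: (0, 1), 1000)
print(move, f"P has used {clock.elapsed['P']:.6f} s")
```

For the graphical part, the per-pull wealth values could be collected in a list and drawn with any plotting library, for example matplotlib's `pyplot.plot`.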