# Represent the state of the board as a string.
# Each distinct board string identifies one box (a MENACE matchbox).
# The MENACE system is a dictionary that maps each board string to its box.
# Pick a move arbitrarily (at random) from the box for the current board.
# Store every (board, move) pair in a history and apply the reward to those
# boxes at the end of each game.
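
# A minimal sketch of the notes above, not a definitive implementation.
# Assumptions (none of these come from the notes themselves): the board is a
# 9-character string with " " for an empty cell, each box is a dict of
# {move index: bead count}, every box starts with 3 beads per legal move, and
# "pick arbitrary" is read as a bead-weighted random choice. The names START,
# legal_moves, play, box_for, pick_move and reward are hypothetical.

```python
import random

EMPTY = " "
START = EMPTY * 9  # assumed layout: 9 cells, row by row


def legal_moves(board):
    """Return the indices of all empty cells in the board string."""
    return [i for i, cell in enumerate(board) if cell == EMPTY]


def play(board, index, mark):
    """Return a new board string with `mark` placed at `index`."""
    return board[:index] + mark + board[index + 1:]


# The MENACE system: board string -> box, where a box is {move: bead count}.
menace = {}


def box_for(board, initial_beads=3):
    """Fetch the box for a board, creating it on first sight (assumed default)."""
    if board not in menace:
        menace[board] = {m: initial_beads for m in legal_moves(board)}
    return menace[board]


def pick_move(board, rng=random):
    """Pick a move at random, weighted by bead count.

    Falls back to a uniform choice when every bead count is zero.
    Must not be called on a full board (there is nothing to pick).
    """
    box = box_for(board)
    moves = list(box)
    weights = [box[m] for m in moves]
    if sum(weights) == 0:
        return rng.choice(moves)
    return rng.choices(moves, weights=weights, k=1)[0]


def reward(history, delta):
    """Add `delta` beads to every (board, move) in the history, floored at 0."""
    for board, move in history:
        box = box_for(board)
        box[move] = max(0, box[move] + delta)


# Usage: record each move during the game, then reward once at the end.
history = []
board = START
move = pick_move(board)
history.append((board, move))
board = play(board, move, "X")
reward(history, 1)  # the delta is chosen by the caller, e.g. positive for a win
```

# Keeping the history as (board, move) pairs means the reward step needs no
# knowledge of the game rules: it only revisits the boxes that were used.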