I had a chat with my friend Devang about AI and related topics. He was working on a Pac-Man solver, and he mentioned minimax solvers and how he had used them to solve tic-tac-toe before. This piqued my interest, since I had previously attempted to build a tic-tac-toe solver with TD reinforcement-learning neural networks, which turned out to be biting off more than I could chew. So I listened to his description of the problem, then sat down and coded this up.

The algorithm works by keeping a score associated with every possible board state. Two computer agents play against one another, placing moves at random. As they play, a record is kept of the board states reached during the game. Once one side wins, a value is added to the score of every state recorded for that game. Eventually the normalized scores converge toward the probability of that side winning from the given state, assuming the opponent also plays in the probabilistically best locations.
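The original code isn't shown, so here is a minimal sketch of how I read the approach: play many fully random games, record every state seen, and credit a state whenever X wins a game that passed through it. The normalized score for a state is then the fraction of recorded games through it that X won. All names here (`winner`, `estimate_scores`, etc.) are my own, not from the original.

```python
import random

# The eight winning lines of a 3x3 board, indexed 0..8 row-major.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def play_random_game(rng):
    """Play one uniformly random game; return (states seen, winner or None)."""
    board = [' '] * 9
    states = []
    player = 'X'
    for _ in range(9):
        move = rng.choice([i for i in range(9) if board[i] == ' '])
        board[move] = player
        states.append(tuple(board))
        w = winner(board)
        if w:
            return states, w
        player = 'O' if player == 'X' else 'X'
    return states, None  # draw

def estimate_scores(n_games, seed=0):
    """Estimate, for each visited state, the fraction of games X won from it."""
    wins, visits = {}, {}
    rng = random.Random(seed)
    for _ in range(n_games):
        states, w = play_random_game(rng)
        for s in states:
            visits[s] = visits.get(s, 0) + 1
            if w == 'X':
                wins[s] = wins.get(s, 0) + 1
    return {s: wins.get(s, 0) / visits[s] for s in visits}
```

Note this is plain Monte Carlo scoring under random play; the numbers it produces won't exactly match the grid below (which appears to use a different normalization), but the center-beats-edge ordering comes out the same.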

Here are the probabilities of X winning for each of its possible initial moves. I let it run for a million iterations in the hope that that would allow it to fully settle out. As you can see, however, the corners have slightly different probabilities when, being isomorphic states, they should all be the same; likewise for the sides. Still, as common sense suggests, the center has the highest score for your first move.

0.129454 0.076044 0.129082
0.074750 0.187690 0.074530
0.127845 0.073732 0.126872
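As a quick sanity check on the isomorphism claim, rotating the board a quarter turn maps each corner opening onto the next, so the four corner values (0.126872 through 0.129454 above) must be equal in the limit and the spread is just sampling noise. A small helper, my own addition, makes the rotation explicit:

```python
def rotate90(board):
    """Rotate a 3x3 board (tuple of 9 cells, row-major) a quarter turn clockwise.

    New cell (r, c) takes its value from old cell (2 - c, r):
    new index i = 3*r + c maps to old index 6 - 3*(i % 3) + i // 3.
    """
    return tuple(board[6 - 3 * (i % 3) + i // 3] for i in range(9))

# The opening move in the top-left corner rotates onto the top-right corner,
# and four rotations bring the board back to itself.
corner_open = ('X',) + (' ',) * 8
```

Averaging the scores over the four rotations (and the reflections) of each state would wring this residual asymmetry out without needing more iterations.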