Capstone Game Post Mortem:
Playing Go with Mobile Neural Networks
Bryan O’Malley
MGMS Program
Full Sail University
22/8/2016
Dedication
To my wife, Stephanie, for putting up with me, my crazy schedule, and my
horrible mismanagement of it.
Playing Go with Mobile Neural Networks
Research on modern AI techniques for mobile platforms.
Android was used for testing.
The initial test prototypes were done in MATLAB. C++ with ArrayFire for GPGPU
math was used to train the network weights, while the final test application
was made in Unity with OpenCV for math processing.
The intended audience for this project is computer scientists, software
engineers, and game programmers at the graduate level.
Following the history-making work of Google DeepMind’s AlphaGo project,
I wanted to determine if similar techniques could be utilized to make a
stronger, more modern AI for devices less powerful than a supercomputer.
Most current consumer-level Go playing AIs are underwhelming for
skilled amateurs to play against, providing little real challenge. Looking into
their methods, the majority of these utilize optimized Monte Carlo algorithms
and are expensive and slow to run even at moderate difficulty levels. My idea
was to attempt to utilize more modern methods of AI to approach the problem
from another angle, attempting to create a lighter-weight AI, even if the
playing strength couldn’t be increased. Google’s AlphaGo project provided an
interesting approach to try, and their research showed that a properly set up
AI of this type, even with modest parameters, should be able to outperform
current consumer-level Go AIs. With this in mind, the remaining question was
whether or not these scaled down versions would be possible to run on a mobile
platform. My experiment sought to address that question.
My motivation for this project came from my own attempt to learn to play Go
and to find a Go AI for mobile to practice against. The selection was very
limited, and most apps’ reviews are plagued with negative remarks regarding
their level of difficulty. Additionally, my inquiries to the admins of
Online-Go.com indicated that their Go-playing AI bots require large amounts of
server processing to run. I wondered if I could do something to address both
problems.
The scope of my capstone, after some revision by my advisors and
myself, was to determine if I could run a reasonably challenging (compared to
current competition, such as Fuego or GnuGo) Go AI on a mobile platform by
means of utilizing a neural network AI.
With unlimited time, effort, and money, my ideal positive outcome for
my experiment would have been a convolutional neural network-based AI capable
of running on a mobile device with reasonable turn times (<10s) that could
provide a competitive experience even for players into the low amateur Dan
ranks of Go.
Generally speaking, I was able to cram
a lot more work and development time into my schedule than I thought I’d be
able to. I think the final push to get the test application done went very
well, also. Most of the phases of the project ran into one technical glitch or
hang-up or another, but I was able to push through each in turn.
I was able to create, after my initial
phase of testing network architecture in MATLAB and researching different math
libraries, a fairly robust and modular neural network training utility using
C++ and ArrayFire. This came together fairly quickly and I was very soon able
to start testing and developing the final networks.
Training of the neural networks came together in a big way early on. I found a
very good guide on conjugate gradient descent, and applying it gave me some
real gains in the efficiency of my training algorithms. After optimization with
ArrayFire on the GPU, some training tasks ended up taking much less time than I
initially estimated, even though that time was eventually consumed by other
tasks.
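As an illustration of the technique named above, and not the code used in this project, the following is a minimal sketch of nonlinear conjugate gradient descent. Several things here are assumptions made for the example: the Polak-Ribière update with periodic restarts, the simple backtracking line search, and the toy two-parameter `loss` that stands in for a real network cost function. All function names are hypothetical.

```python
import math

def loss(w):
    """Toy stand-in for a network cost: an elongated quadratic bowl."""
    return (w[0] - 3.0) ** 2 + 10.0 * (w[1] + 1.0) ** 2

def grad(w):
    """Analytic gradient of the toy loss above."""
    return [2.0 * (w[0] - 3.0), 20.0 * (w[1] + 1.0)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def backtrack(f, w, d, g, step=1.0, shrink=0.5, c=0.25, max_halvings=50):
    """Halve the step until the Armijo sufficient-decrease test passes."""
    f0, slope = f(w), dot(g, d)
    for _ in range(max_halvings):
        trial = [wi + step * di for wi, di in zip(w, d)]
        if f(trial) <= f0 + c * step * slope:
            return step
        step *= shrink
    return 0.0  # no acceptable step found, so do not move

def conjugate_gradient(f, df, w, tol=1e-6, max_iters=2000):
    g = df(w)
    d = [-gi for gi in g]  # first direction is plain steepest descent
    for k in range(max_iters):
        if math.sqrt(dot(g, g)) < tol:
            break
        if dot(g, d) >= 0.0:  # not a descent direction: restart
            d = [-gi for gi in g]
        t = backtrack(f, w, d, g)
        w = [wi + t * di for wi, di in zip(w, d)]
        g_new = df(w)
        # Polak-Ribiere coefficient, clipped at zero
        diff = [a - b for a, b in zip(g_new, g)]
        beta = max(0.0, dot(g_new, diff) / dot(g, g))
        if (k + 1) % len(w) == 0:  # periodic restart every len(w) steps
            beta = 0.0
        d = [-a + beta * b for a, b in zip(g_new, d)]
        g = g_new
    return w

w_star = conjugate_gradient(loss, grad, [0.0, 0.0])
```

The appeal over plain gradient descent is that each new search direction reuses information from the previous one, which tends to help on stretched, valley-shaped cost surfaces like the toy one here. A real training utility would replace `loss` and `grad` with the network's cost and backpropagated gradient.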
I was able to put my app in several people’s hands, at least two of whom
actually had some background in playing Go, and they seemed very happy with the
application. While the AI didn’t really provide much of a challenge, one tester
with much more background in Go than me told me that it produced very
interesting or convincing-looking moves at first. The other testers, while
perhaps not as able to provide specific input on the AI’s play level, seemed to
quite like the interface I’d come up with and were impressed by the AI’s speed,
which was, in a way, the main purpose of the experiment.
The results of the experiment are very positive overall. While my final neural
network is a bit trimmed down from the AlphaGo example I was emulating for
training purposes, the especially snappy play times show that, in practical
use, it is definitely a viable option.
A huge net positive for me was that I was able to learn quite a bit over the
length of this process in terms of cross-platform development, dealing with
large data sets, GPGPU math processing, neural networks, conjugate gradient
descent, and more. I think a lot of this learning is hugely beneficial, not
just for my own personal use, but because I think I may be able to reshape it
in a way that may, perhaps, be a bit more approachable for others.
The scope of project management on my project was not as much as I’m used to
dealing with from work or prior school experience, because the project was
entirely solo work. There were no standups or sprint planning with a group, and
there were no discussions of roles or what others were doing, because I already
knew what I’d done and what I was doing. I think that, because of this, I at
times let myself sort of ‘sit’ in a given task for a bit too long. For example,
the MATLAB prototype and learning phase of the project probably stretched on a
bit too long and robbed me of time for researching and testing things with CNNs
later in development that could have been helpful.
While I was very quickly able to get my networks training with ArrayFire, and
the modular design I came up with let me experiment with a lot of different
training strategies and meta-variables, the training itself was a bit of a
disaster. The data sets I was working with were very large, and the solutions I
found for working with them proved to be far too slow on my hardware. Perhaps
with additional time or more money to invest in hardware, I could have overcome
this, but the final training pushes on the test networks simply didn’t bear the
fruit I’d hoped for. With over 40 million entries in my database, I didn’t
really end up training the network on more than a few hundred thousand of these
data points. The software, on my hardware, simply didn’t have the speed or the
optimization to handle this sort of data throughput. So, though the experiment
was mainly a success, in that it proved the final trained network could easily
run on the test device (NVIDIA Shield Tablet), the final trained competency of
the AI wasn’t really testable, as the training was never truly finalized.
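One common way to train on a data set too large to hold in memory is to stream it in shuffled mini-batches. The sketch below is a generic illustration of that idea under stated assumptions, not the approach taken in this project: `minibatches`, its parameters, and the shuffle-buffer strategy are all hypothetical, and `records` is any iterable, such as a cursor over a key-value store.

```python
import random

def minibatches(records, batch_size, buffer_size, seed=0):
    """Yield shuffled mini-batches from any iterable without loading it whole.

    Only `buffer_size` records are held in memory at once; each outgoing
    record is drawn at random from that buffer, so the shuffle is local
    rather than a true shuffle of the full data set.
    """
    rng = random.Random(seed)
    buffer, batch = [], []

    def drain_one():
        # Swap a random element to the end, then pop it in O(1).
        i = rng.randrange(len(buffer))
        buffer[i], buffer[-1] = buffer[-1], buffer[i]
        return buffer.pop()

    for record in records:
        buffer.append(record)
        if len(buffer) >= buffer_size:
            batch.append(drain_one())
            if len(batch) == batch_size:
                yield batch
                batch = []

    # Input exhausted: empty whatever is left in the buffer.
    while buffer:
        batch.append(drain_one())
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly short, batch

# Small stand-in for a large database cursor.
batches = list(minibatches(range(10), batch_size=4, buffer_size=5))
```

The trade-off is between memory and shuffle quality: a larger buffer mixes records more thoroughly but costs more RAM, while the training loop only ever sees one batch at a time.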
There were a variety of math solutions, ready-made neural network training
libraries and frameworks, and data storage solutions that I found. The problem
was that, for the hardware and OS setup I had at my disposal, the practical
options were extremely limited. On top of that, attempting to stick with free
or low-cost solutions to my problems also shut a lot of doors for me. In some
cases, this turned out not to be a significant problem; for example, ArrayFire
solved my math problems fairly handily. However, it’s possible I could have
produced a more optimized training tool if I’d gone with a pre-existing neural
network framework, such as Caffe or TensorFlow. Also, my storage solution for
my dataset (Kyoto Cabinet) turned out to be very problematic in terms of
performance, though this may be a configuration problem, as I didn’t have a lot
of time to test different configurations of the database.
Testing and recording of test results during development of the project was
fairly lacking. In retrospect, I should have been collecting a lot more data on
training speed and turn times, and perhaps generating some alternative accuracy
and cost scores for the networks I was training, in order to better describe my
results in follow-on documentation, in the hopes of publishing my results. With
this being so lacking, I will likely need to go back and revisit several of
these steps to record this data. Luckily, I took enough notes on my process
that this shouldn’t be too difficult a hurdle to overcome.
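The kind of record-keeping described above can be very lightweight. The following is a minimal sketch, assuming a CSV file is an acceptable format; the `RunLog` class, the `timed` helper, and the column names are hypothetical and were not part of the project, and the values in the usage lines are placeholders rather than measured results.

```python
import csv
import io
import time

class RunLog:
    """Append one CSV row per measurement so results can be reported later."""

    FIELDS = ["step", "elapsed_s", "cost", "accuracy", "turn_time_s"]

    def __init__(self, stream, clock=time.perf_counter):
        self._clock = clock
        self._start = clock()
        self._writer = csv.DictWriter(stream, fieldnames=self.FIELDS)
        self._writer.writeheader()

    def record(self, step, cost, accuracy, turn_time_s):
        self._writer.writerow({
            "step": step,
            "elapsed_s": round(self._clock() - self._start, 3),
            "cost": cost,
            "accuracy": accuracy,
            "turn_time_s": turn_time_s,
        })

def timed(fn, *args, clock=time.perf_counter):
    """Return (result, seconds) for one call, e.g. one AI move."""
    start = clock()
    result = fn(*args)
    return result, clock() - start

# Placeholder usage: the numbers below are made up for illustration only.
buffer = io.StringIO()  # a real run would pass an open file instead
log = RunLog(buffer)
move, seconds = timed(lambda board: "pass", None)
log.record(step=1, cost=0.5, accuracy=0.25, turn_time_s=round(seconds, 6))
```

Writing a row at every evaluation point during training, and one per AI move during play testing, would produce the training-speed and turn-time data directly, without having to reconstruct it from notes afterwards.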
While my main goal of testing the time performance of the neural network AI was
successful, there were other performance benchmarks I should have considered,
such as battery drain. Also, with the incomplete training of the final CNN AI,
I was unable to get an accurate picture of the AI’s play strength, and even if
the training had been completed, I didn’t have a practical way of testing this
lined up. I had hypothetical ideas on how to test this, but I likely wouldn’t
have had the time or resources to follow through on any really decent test of
the AI’s skill level.
One of my main goals in going through this research process was to learn how to
create a publishable white paper by doing. I believe at this point I’m well on
my way to that goal, but I have not managed to cross that finish line, which is
disappointing. The level of effort still required to move this research to a
publishable state is daunting, though I do plan to continue with it to
completion.
In full retrospect, my project probably didn’t accomplish quite as much as my
ideal goal set out to accomplish. But looking back at the work that was done,
the work that didn’t get done, and the scope of what was attempted, I don’t
know if the ideal would actually have been achievable on the timetable I had,
given the knowledge I had at the start of the project.
I think, had I the time to go back and start again, I could possibly achieve
the ideal in the time allotted, provided there is some hardware or software
solution to my problems with training times that would make the network’s
training achievable on my timetable. Even that, though, imposes a sort of hard
limit on what I could accomplish by myself.
Overall, I think the scope of what I attempted may well have been a bit
out of my reach, though with a somewhat more limited ideal or a slightly
revised definition of success, I could perhaps fit my ideal to the scope of
my work. I think what I did manage to achieve is at a good level based on the
time, resources, and skills that I had. Without having more available to me in
any of these categories, I doubt the end result could have been much different.
In the end, then, I am happy with the final outcome of the work, and I hope to
be able to, in the months to come, turn this into something I’m okay with
publishing. Perhaps this work, though it falls short of the lofty goals I set
for myself, may still provide some merit to the game development community,
both in showing the ease with which these sorts of AI can be employed and in
outlining the struggles that someone may have in doing so.
· AccelerEyes. (2016). ArrayFire | Faster Code. Retrieved July 31, 2016, from http://arrayfire.com/
· Enox Software. (2016, June 28). OpenCV for Unity. Retrieved July 31, 2016, from https://www.assetstore.unity3d.com/en/#!/content/21088
· FAL Labs. (2011, March 4). Kyoto Cabinet: a straightforward implementation of DBM. Retrieved July 31, 2016, from http://fallabs.com/kyotocabinet/
· Google Brain. (2016). TensorFlow [Machine Learning Library]. Mountain View, CA.
· Jia, Y., Shelhamer, E., & Berkeley Vision and Learning Center. (2016). Caffe [Deep Learning Framework]. Berkeley, CA.
· O’Malley, B. (2016). Unity Go Player [Unity Android Game]. Orlando, FL: Student Project.