Installing TensorFlow on AWS

Curious about Google's newly released TensorFlow? I don't have a beefy GPU machine, so I spent some time getting it to run on EC2. The steps to reproduce it are pretty brutal, and I wouldn't recommend going through them unless you want to waste five hours of your life.

Instead, I recommend just using the AMI that I built (ami-cf5028a5). Choose g2.2xlarge and you should have a box with TensorFlow running in a minute or two! Note that it's only available in us-east-1 (Virginia) so far.
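If you'd rather script the launch than click through the console, here's a rough sketch using boto3's EC2 client. The AMI ID, region, and instance type are the ones from this post; the `key_name` argument and the helper names `build_launch_request` and `launch` are my own for illustration, and it assumes you already have AWS credentials and an EC2 key pair set up. Nothing gets launched unless you actually call `launch()`.

```python
# Sketch only: builds the request for boto3's EC2 client. Nothing is launched
# unless launch() is called with valid AWS credentials configured.

AMI_ID = "ami-cf5028a5"        # the AMI from this post
REGION = "us-east-1"           # the only region the AMI is available in
INSTANCE_TYPE = "g2.2xlarge"   # single-GPU instance type


def build_launch_request(key_name):
    """Return keyword arguments for the EC2 run_instances call.

    key_name is the name of an existing EC2 key pair (an assumption:
    you need one to SSH into the box).
    """
    return {
        "ImageId": AMI_ID,
        "InstanceType": INSTANCE_TYPE,
        "KeyName": key_name,
        "MinCount": 1,
        "MaxCount": 1,
    }


def launch(key_name):
    """Launch one on-demand instance and return its instance ID."""
    import boto3  # imported lazily so the rest of the sketch runs without it

    ec2 = boto3.client("ec2", region_name=REGION)
    response = ec2.run_instances(**build_launch_request(key_name))
    return response["Instances"][0]["InstanceId"]
```

Calling `launch("my-key")` (with a hypothetical key pair name) would start a regular on-demand instance; spot requests go through a different API call and aren't covered here.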

If you haven't used AWS, here's a tutorial on how to set up an instance from an AMI. I usually use spot instances since they are much cheaper, but they carry some risk of getting killed unexpectedly. (Interestingly, that seems rarer now; I wonder if it's because the Bitcoin price is so much lower.)

There are some known issues with TensorFlow on AWS. In particular, I wasn't able to get better performance from g2.8xlarge than from g2.2xlarge, which sucks, since one of the cool features of TensorFlow is that it should distribute work across GPUs. See this thread for some more info. Looking forward to seeing these issues resolved.

What is TensorFlow?

It seems like there's a lot of misunderstanding about TensorFlow. It's not some crazy flow-based graphical tool for building neural nets. It's kind of boring, really. It's just a marginally better version of Theano with much faster compilation times and the capability to distribute work over multiple GPUs and machines. Theano completely blew my mind when I first discovered it. Its approach was super innovative, but it's pretty rough around the edges, and I think in open source the pioneers die with arrows in their backs.
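To make the Theano comparison a bit more concrete: both libraries have you describe a computation as a symbolic graph first and evaluate it later with actual values. Here's a toy sketch of that idea in plain Python. It is not TensorFlow's or Theano's API; `Node`, `placeholder`, and `run` are made-up names purely for illustration, and it only handles addition and multiplication.

```python
class Node:
    """A node in a symbolic expression graph; nothing is computed on creation."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op          # "input", "add", or "mul"
        self.inputs = inputs  # child nodes (empty for inputs)
        self.name = name      # only used by input nodes

    def __add__(self, other):
        return Node("add", (self, other))

    def __mul__(self, other):
        return Node("mul", (self, other))


def placeholder(name):
    """Create a named input whose value is supplied at run time."""
    return Node("input", name=name)


def run(node, feed):
    """Evaluate the graph rooted at node, given values for its inputs."""
    if node.op == "input":
        return feed[node.name]
    left, right = (run(child, feed) for child in node.inputs)
    return left + right if node.op == "add" else left * right


x = placeholder("x")
w = placeholder("w")
b = placeholder("b")
y = x * w + b  # builds a graph; no arithmetic happens here

result = run(y, {"x": 3.0, "w": 2.0, "b": 1.0})
print(result)
```

The real libraries do a lot more with the graph than this (compiling it, differentiating it, placing pieces of it on GPUs), but the build-then-run split is the core of it.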

I expect TensorFlow (or maybe CGT or something else) to grow more popular. But in practice I don't think people will use any of those straight up for machine learning – higher level libraries like Keras will be the preferred way to do most deep learning tasks.
