
My issue with GPU-accelerated deep learning

I’ve been spending several hundred bucks renting GPU instances on AWS over the last year. The speedup from a GPU is awesome and hard to deny, and GPUs have taken over the field. Perhaps following in the footsteps of Bitcoin mining, there’s also some research on using FPGAs (I know very little about this).

I don’t think it’s a coincidence that GPUs, which were built for graphics, turn out to be great for image classification using convolutional neural networks. When you are dealing with pixel data packed into 2D arrays, all operations can be parallelized very efficiently.

My issue is that the complexity of each minibatch is O(n), where n is the number of parameters. The larger the models you are dealing with, the bigger this issue becomes.
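To make the linear cost concrete: a standard output layer evaluates a full softmax, which touches every class (and hence every output-layer parameter) for every single example. A minimal pure-Python sketch with toy scores, not a real model:

```python
import math

def full_softmax(scores):
    # A full softmax touches every output: the work per example is
    # linear in the number of classes (and output-layer parameters).
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

# Doubling the output size doubles the work per example.
small = full_softmax([0.1 * i for i in range(1000)])
large = full_softmax([0.1 * i for i in range(2000)])
assert abs(sum(small) - 1.0) < 1e-9
assert abs(sum(large) - 1.0) < 1e-9
```

A GPU hides this by doing the linear amount of work in parallel, but the total work still grows with the model.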

Word2vec uses a clever technique called hierarchical softmax to achieve O(log n) complexity (more details here). I have no idea how to implement this on a GPU, and I suspect it’s impossible. Here’s where the CPU shows its strength: traversing a logarithmic data structure takes a lot of branching and can’t be expressed as a batch operation.
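A rough sketch of the idea, with a made-up four-word vocabulary and arbitrary node weights (this is illustrative, not word2vec’s actual code): each word is a leaf in a binary tree, and its probability is a product of sigmoid branch decisions along the root-to-leaf path, so only O(log V) nodes are touched per word.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy balanced tree over a 4-word vocabulary. Each word is encoded by its
# root-to-leaf path: a list of (internal node id, branch bit) pairs.
# Paths and weights here are made up for illustration.
paths = {
    "cat": [(0, 0), (1, 0)],
    "dog": [(0, 0), (1, 1)],
    "car": [(0, 1), (2, 0)],
    "bus": [(0, 1), (2, 1)],
}
node_weights = {0: 0.3, 1: -0.5, 2: 0.8}  # one scalar weight per node (1-D toy)

def p_word(word, h):
    """P(word | context h): product of branch decisions down the tree.
    Each step depends on the previous one -- branchy, sequential work
    that a CPU handles well and a GPU does not."""
    p = 1.0
    for node, bit in paths[word]:
        s = sigmoid(node_weights[node] * h)
        p *= s if bit == 1 else (1.0 - s)
    return p

# The leaf probabilities of a proper binary tree sum to 1.
total = sum(p_word(w, 0.7) for w in paths)
```

The point is the control flow: which node you visit next depends on where you are now, so the traversal can’t be flattened into one big matrix multiply.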

Logarithmic data structures happen to be a field I’m pretty excited about, particularly for vector models and multi-class prediction problems. I’m the author of Annoy, which is a library for high-dimensional nearest neighbor queries, so it’s something I’ve spent some time thinking about.
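To illustrate what a logarithmic structure over vectors looks like, here’s a toy random-projection tree, in the spirit of (but much simpler than) what Annoy does; all names and parameters below are made up for the sketch:

```python
import random

def build(points, leaf_size=8, dim=5):
    # Split points by a random hyperplane through the origin; recurse
    # until leaves are small. Depth is roughly log(n).
    if len(points) <= leaf_size:
        return ("leaf", points)
    normal = [random.gauss(0, 1) for _ in range(dim)]
    left = [p for p in points if sum(a * b for a, b in zip(p, normal)) < 0]
    right = [p for p in points if sum(a * b for a, b in zip(p, normal)) >= 0]
    if not left or not right:            # degenerate split: stop here
        return ("leaf", points)
    return ("node", normal, build(left, leaf_size, dim), build(right, leaf_size, dim))

def query(tree, q):
    # Each step is one dot product followed by a branch -- only ~log(n)
    # steps, but data-dependent branching that is hard to batch on a GPU.
    while tree[0] == "node":
        _, normal, left, right = tree
        tree = left if sum(a * b for a, b in zip(q, normal)) < 0 else right
    return tree[1]  # candidate neighbors (Annoy unions many such trees)

random.seed(0)
pts = [[random.gauss(0, 1) for _ in range(5)] for _ in range(1000)]
candidates = query(build(pts), pts[0])
```

A single tree only returns rough candidates; the real trick (which Annoy uses) is building a forest of such trees and merging their leaves.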

For collaborative filtering and natural language processing, GPU architectures are highly constraining. I suspect that once you hit a billion parameters or so, more specialized networks that use logarithmic data structures will outperform for NLP and CF: the speedup from the brute-force GPU approach will be offset by the smarter data structures that a CPU can handle. I haven’t seen any research on this, but it seems to me like a huge opportunity. In particular, I would love to see hybrid architectures that use a GPU for the “dense” parts of a network and a CPU for the “sparse” parts.


Erik Bernhardsson

... is the CTO at Better, which is a startup changing how mortgages are done. I write a lot of code, some of which ends up being open sourced, such as Luigi and Annoy. I also co-organize the NYC Machine Learning meetup. You can follow me on Twitter or see some more facts about me.