Erik Bernhardsson    About

Annoying blog post

I spent a couple of hours this weekend going through some pull requests and issues to Annoy, which is an open source C++/Python library for Approximate Nearest Neighbor search. I set up Travis-CI integration and spent some time on one o...

Detecting corporate fraud using Benford’s law

Note: This is a silly application. Don’t take anything seriously. Benford’s law describes a phenomenon where numbers in any data series will exhibit patterns in their first digit. For instance, if you took a list of the 1,000 longest ri...

Running Theano on EC2

Inspired by Sander Dieleman’s internship at Spotify, I’ve been playing around with deep learning using Theano. Theano is this Python package that lets you define symbolic expressions (cool), does automatic differentiation (really cool), ...

Recurrent Neural Networks for Collaborative Filtering

I’ve been spending quite some time lately playing around with RNN’s for collaborative filtering. RNN’s are models that predict a sequence of something. The beauty is that this something can be anything really – as long as you can design ...

Where do locals go in NYC?

One obvious thing to anyone living in NYC is how tourists cluster in certain areas. I was curious about the larger patterns around this, so I spent some time looking at data. The thing I wanted to understand is: what areas are dominated ...