This is a blog post rewritten from a presentation at NYC Machine Learning on Sep 17. It covers Annoy, a library I built that helps you do nearest neighbor queries in high dimensional spaces. In the first part, I went thro...

This is a blog post rewritten from a presentation at NYC Machine Learning last week. It covers Annoy, a library I built that helps you do (approximate) nearest neighbor queries in high dimensional spaces. I will be splitt...

A couple of people on my old team have been giving talks about how Spotify does music recommendations, and they have put together some quite good presentations.
The first one is Neville Li’s presentation about Scala Data Pipelines @ Spotify:
The se...

I was playing around with D3 last night and built a silly visualization of antipodes and how our intuitive understanding of the world sometimes doesn’t make sense. Check out the visualization at bl.ocks.org!
Basically the idea is if you...

Every once in a while when talking to smart people the topic of automation comes up. Technology has made lots of occupations redundant, so what’s next?
[Image: Switchboard operator, a long time ago]
What about software engineers? Every year t...

Here’s a problem that I used to give to candidates. I stopped using it seriously a long time ago since I don’t believe in puzzles, but I think it’s kind of fun.
Let’s say you have a function that simulates a random coin flip. It retu...

Annoy is a library written by me that supports fast approximate nearest neighbor queries. Say you have a high (1-1000) dimensional space with points in it, and you want to find the nearest neighbors to some point. Annoy gives you a way t...