Delivering Music Recommendations
I’ve turned into a lazy bastard and I’m just posting presentations on this blog, but here’s one from Rohan Singh at Spotify talking about the backend infrastructure of the Discover page.
Read more…
Read more…
I was just at the NYC Predictive Analytics meetup talking about how we build machine learning algorithms using Hadoop to power music recommendations.
Great meetup, where we had two speakers, me and Blake Shaw from Foursquare. Blake talked about how they use machine learning at Foursquare, using Hadoop (and Luigi), and he uploaded his slides here!
Read more…
I thought this article about the company culture at HubSpot is kind of funny. “HubSpot’s Awesome Presentation Shows how to Create a 21st Century Culture”.

Just FYI: You’re not different. You’re a bunch of white hipsters aged 25-30 dressed up in the same theme. That’s not being different.
Read more…
I was in Portland, OR for a few days hanging out at OSCON. Was fun. I also talked a bit about Luigi:
Next week I’m presenting at the NYC Predictive Analytics meetup together with Blake Shaw from Foursquare. The topic is ML + Hadoop. Will be fun!
Read more…
Sometimes you have to maximize some function $$ f(w_1, w_2, \ldots, w_n) $$ where $$ w_1 + w_2 + \ldots + w_n = 1 $$ and $$ 0 \le w_i \le 1 $$ . Usually, $$ f $$ is concave and differentiable, so there’s a unique global maximum and you can find it by applying gradient ascent. The presence of the constraint makes it a little tricky, but we can solve it using the method of Lagrange multipliers. In particular, since the surface $$ w_1 + w_2 + \ldots + w_n = 1 $$ has the normal $$ (1, 1, \ldots, 1) $$ , the following optimization procedure works:
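The excerpt cuts off before the procedure itself, so here is only a minimal sketch of one way to do it, not necessarily the one from the full post. It assumes plain projected gradient ascent: subtract the mean of the gradient (that removes the component along the normal $$ (1, 1, \ldots, 1) $$ ), take a step, then project back onto the simplex to keep every $$ w_i $$ non-negative. The names `project_to_simplex`, `maximize_on_simplex`, `grad_f` and `target`, the step size, and the toy objective are all made up for illustration.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : sum(w) = 1, w_i >= 0}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        candidate = (cumulative - 1.0) / j
        # Keep the shift from the last index where the entry stays positive.
        if u_j - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def maximize_on_simplex(grad, n, step=0.1, iterations=200):
    """Projected gradient ascent; grad(w) returns the gradient of f at w."""
    w = [1.0 / n] * n  # start at the uniform point
    for _ in range(iterations):
        g = grad(w)
        mean_g = sum(g) / n
        # Subtracting the mean drops the gradient component along (1, ..., 1),
        # so the step stays on the plane w_1 + ... + w_n = 1.
        moved = [w_i + step * (g_i - mean_g) for w_i, g_i in zip(w, g)]
        # The projection handles the bounds 0 <= w_i <= 1.
        w = project_to_simplex(moved)
    return w


# Hypothetical concave objective: f(w) = -sum((w_i - target_i)^2).
target = [0.5, 0.3, 0.2]


def grad_f(w):
    return [-2.0 * (w_i - t_i) for w_i, t_i in zip(w, target)]


w_star = maximize_on_simplex(grad_f, n=3)
```

For this toy objective the maximum sits at `target`, which is already on the simplex, so `w_star` should end up very close to it. The fixed step size is a guess that happens to work here; a real objective probably needs a line search or a decaying step.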
Read more…
Continuing in the same spirit of shameless self-promotion, here’s some recent Luigi press:
Read more…
Just open sourced hdfs2cass, a Hadoop job (written in Java) that does efficient Cassandra bulk loading. The nice thing is that it queries Cassandra for its topology and uses that to partition the data so that each reducer can upload data directly to a Cassandra node. It also builds SSTables locally etc. I’m not an expert on Cassandra so I’ll stop describing those parts before I embarrass myself.
Read more…
We had an unconference at Spotify last Thursday and I added a semi-trolling semi-serious topic about abolishing documentation. Or NoDoc, as I’m going to call this movement. This was meant to be mostly a thought experiment, but I don’t see it as complete madness.
Read more…
I’ve been obsessed with Wikipedia for the past ten years. Occasionally I find some good articles worth sharing and that’s why I created the wikiphilia Twitter handle. Just a long stream of stuff that for one reason or another may be interesting.
Read more…
The Discovery page, the new start page in Spotify, is finally out to a fairly significant percentage of all users. Really happy since we have worked on it for the past six months. Here’s a screenshot:
Read more…