Open source

I like building software. Below are some open source projects I built during my time at Better and Spotify, along with a few things I built in my spare time. Most of them are not on my personal GitHub, so I've compiled them here:

Annoy

Annoy is a C++/Python library to index and retrieve vectors in high-dimensional spaces. These types of approximate nearest neighbor queries are very useful for certain types of machine learning models that I used at Spotify to power the recommender system. Annoy is widely used outside of Spotify, for instance by Instacart and Uber.
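To make the query type concrete, here is a minimal sketch of the exact nearest neighbor search that libraries like Annoy approximate. It is not Annoy's API or algorithm: it is a brute-force scan in plain Python, and the function names and toy vectors are made up for illustration. An approximate index trades a little accuracy to avoid comparing the query against every vector.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def nearest_neighbors(vectors, query, k):
    """Exact k nearest neighbors: scan every vector, so O(n) work per query."""
    ranked = sorted(
        range(len(vectors)),
        key=lambda i: cosine_distance(vectors[i], query),
    )
    return ranked[:k]

# Hypothetical toy data: four 3-dimensional vectors.
track_vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.1],
]

# Indices of the two vectors closest to the query.
result = nearest_neighbors(track_vectors, [1.0, 0.05, 0.0], k=2)
print(result)
```

With millions of vectors and hundreds of dimensions, the full scan becomes the bottleneck, which is the case approximate indexes are built for.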

Luigi

I built Luigi at Spotify to handle the very complex dependency graph of jobs arising in the recommender system. It has been widely used by many companies outside of Spotify. In the last few years, I have not had time to support Luigi, and it's probably past its days of glory.
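The core problem is easy to sketch: given jobs and the jobs they depend on, find an order in which each job runs only after its dependencies. The snippet below is not Luigi's API. It uses Python's standard `graphlib` module (Python 3.9+), and the job names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical job names. Each job maps to the set of jobs it depends on.
dependencies = {
    "aggregate_plays": {"ingest_logs"},
    "train_model": {"aggregate_plays"},
    "build_recommendations": {"train_model", "aggregate_plays"},
}

def execution_order(deps):
    """Return the jobs in an order where each job follows all of its dependencies."""
    return list(TopologicalSorter(deps).static_order())

order = execution_order(dependencies)
print(order)
```

A real workflow system also has to track which outputs already exist, retry failures, and run independent jobs in parallel, none of which this sketch attempts.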

ANN-Benchmarks

ANN-Benchmarks provides a benchmark suite for approximate nearest neighbor systems such as Annoy. It features a number of pregenerated data sets that are used by researchers building new algorithms.

Git of Theseus

Git of Theseus came out of a post on this blog analyzing the half-life of code in repositories and how different authors contribute to them. It's a tool that can analyze any git repository and output a number of interesting charts.

Convoys

Convoys fits a number of statistical models for analyzing conversion rates. There's a bunch of fun statistics/math going into it, and it has helped Better improve its user acquisition and conversion substantially. I've written a post about it on Better's engineering blog.

JSONSchema2DB

JSONSchema2DB introspects JSON Schemas, creates Postgres/Redshift tables, and ingests data into them.
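As a rough illustration of the idea, the sketch below turns a flat JSON Schema into a `CREATE TABLE` statement. It is not JSONSchema2DB's API: the type mapping, the function name, and the example schema are all assumptions made for this example, and it ignores nested objects, arrays, and identifier escaping.

```python
# Assumed mapping from JSON Schema types to Postgres column types.
TYPE_MAP = {
    "string": "text",
    "integer": "bigint",
    "number": "double precision",
    "boolean": "boolean",
}

def create_table_sql(table_name, schema):
    """Build a CREATE TABLE statement from a flat JSON Schema object."""
    required = set(schema.get("required", []))
    columns = []
    for name, spec in schema["properties"].items():
        column = f'"{name}" {TYPE_MAP[spec["type"]]}'
        if name in required:
            # Required properties become NOT NULL columns.
            column += " NOT NULL"
        columns.append(column)
    return f'CREATE TABLE "{table_name}" (' + ", ".join(columns) + ")"

# Hypothetical schema for illustration.
user_schema = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "email": {"type": "string"},
        "score": {"type": "number"},
    },
    "required": ["id"],
}

sql = create_table_sql("users", user_schema)
print(sql)
```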

Synchronicity

Synchronicity does a bunch of magic wrapping to take a library written using Python's async syntax and turn it into something that can be used in both a synchronous and an asynchronous context.
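One way to get this dual behavior, sketched here with only the standard library, is to run an event loop on a background thread and submit coroutines to it from synchronous code. This is not Synchronicity's API or necessarily how it works internally. The `blocking` decorator, the `.aio` attribute, and `fetch_length` are hypothetical names for this example.

```python
import asyncio
import functools
import threading

# Assumed design for this sketch: one event loop running forever on a daemon thread.
_loop = asyncio.new_event_loop()
threading.Thread(target=_loop.run_forever, daemon=True).start()

def blocking(async_fn):
    """Wrap a coroutine function so synchronous code can call it and get the result."""
    @functools.wraps(async_fn)
    def wrapper(*args, **kwargs):
        future = asyncio.run_coroutine_threadsafe(async_fn(*args, **kwargs), _loop)
        # Block the calling thread until the coroutine finishes on the loop thread.
        return future.result()

    # Keep the original coroutine function for callers that are already async.
    wrapper.aio = async_fn
    return wrapper

@blocking
async def fetch_length(text):
    await asyncio.sleep(0.01)  # stand-in for real asynchronous I/O
    return len(text)

# Synchronous use: a plain function call.
sync_result = fetch_length("hello")

# Asynchronous use: await the original coroutine function.
async_result = asyncio.run(fetch_length.aio("hello"))
print(sync_result, async_result)
```

The synchronous wrapper must not be called from a coroutine running on the background loop itself, since waiting on the result there would deadlock.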