Optimizing over multinomial distributions

Sometimes you have to maximize some function $$ f(w_1, w_2, ldots, w_n) $$ where $$ w_1 + w_2 + ldots + w_n = 1 $$ and $$ 0 le w_i le 1 $$ . Usually, $$ f $$ is concave and differentiable, so there’s one unique global maximum and you can solve it by applying gradient ascent. The presence of the constraint makes it a little tricky, but we can solve it using the method of Lagrange multipliers. In particular, since the surface $$ w_1 + w_2 + ldots + w_n $$ has the normal $$ (1, 1, ldots, 1) $$ , the following optimization procedure works:

Read more…

hdfs2cass

Just open sourced hdfs2cass which is a Hadoop job (written in Java) to do efficient Cassandra bulkloading. The nice thing is that it queries Cassandra for its topology and uses that to partition the data so that each reducer can upload data directly to a Cassandra node. It also builds SSTables locally etc. Not an expert at Cassandra so I’ll stop describing those parts before I embarrass myself.

Read more…

NoDoc

We had an unconference at Spotify last Thursday and I added a semi-trolling semi-serious topic about abolishing documentation. Or NoDoc, as I’m going to call this movement. This was meant to be mostly a thought experiment, but I don’t see it as complete madness.

Read more…