Erik Bernhardsson    About

A neat little trick with time decay

Something that pops up pretty frequently is to implement time decay, especially where you have recursive chains of jobs. For instance, say you want to keep track of a popularity score. You calculate today’s output by reading yesterday’s output, discounting it by and then adding some hit count for today. Typically you choose so that for a day or something like that. We do this to generate popularity scores for every track at Spotify.

There is another approach that doesn’t require you to do the discounting and gives you a bit more flexibility. If you think about it, essentially what you want to calculate is where is the current time and the sum is over all hits since we started keeping track of the popularity. This is a single score that takes time decay into account. You can add new hits by just discounting the existing sum and adding .

The problem is that you have to keep track of the timestamps together with the scores, or else you can’t do proper discounting when you add numbers that were calculated at different points in the time. There is a trick to get around this. Notice that the current time only introduces a constant factor so let’s just factor it out. You get which means you don’t have to keep track of what time it is. You can just apply the discount factor when you need it! In practice this means that you don’t have to keep track of any timestamp or anything.

Now, you need to keep track of the following thing instead: . This is a **ridiculously large number, so in principle you need to store the logarithm of it instead.

But if you store the logarithm of it, how do you add a new term? You basically want to calculate something like where represents a new term. This isn’t possible by the naive method because the intermediate sum will overflow, but it turns out that there is a simple trick to calculate this that makes the computer happy. This identity can be derived by some simple substitution and some logarithm identities:

Furthermore, say you want to evaluate the score at some point in $$ T $$ in time later. This is equivalent to $$ Sexp(-lambda T) = exp(s - lambda T) $$ and it can be evaluated without any overflow problems.

Erik Bernhardsson

... is the CTO at Better, which is a startup changing how mortgages are done. I write a lot of code, some of which ends up being open sourced, such as Luigi and Annoy. I also co-organize NYC Machine Learning meetup. You can follow me on Twitter or see some more facts about me.