We live in a year of about 350,000 amateur epidemiologists, and I have no desire to join that “club”. But I read something about COVID-19 deaths that I thought was interesting and wanted to see if I could replicate it through data.
Let's consider a toy model where you're hiring for two things, and those two things are equally valuable. It's not very important what they are, so let's just call them “thing A” and “thing B” for now.
My company has a buffet every Friday, and the lines grow to epic proportions when the food arrives. I've suspected for years that the “classic” buffet line system is a deeply flawed and inefficient method, and every time I'm stuck in the line, I get a little more convinced.
This is a blog post originally featured on the Better engineering blog. If you want to link to this article or share it, please go to the original post URL! Separately, I'm sorry it's been so long with no posts on this blog.
Anyone who has built software for a while knows that estimating how long something is going to take is hard. It's hard to come up with an unbiased estimate of how long something will take when, fundamentally, the work itself is about solving something.
It started with a tweet:
New years resolution: every plot I make during 2018 will contain uncertainty estimates
— Erik Bernhardsson (@bernhardsson) January 7, 2018

Why? Because I've been sitting in 100,000,000 meetings where people endlessly debate whether the monthly number of widgets is going up or down, or whether widget method X is more productive than widget method Y.
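To make that resolution concrete, here's a minimal sketch of one way to do it (just one option, not necessarily what I ended up using): bootstrap a confidence interval for each month's mean and shade the band. The numbers below are made up, and "widgets per day" is just a stand-in metric.

```python
# Sketch: a monthly metric with bootstrapped 95% confidence bands.
# All data here is fake; the point is the shaded uncertainty band.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
months = ['2018-01', '2018-02', '2018-03', '2018-04']
# Fake daily widget counts for each month
daily_counts = [rng.poisson(lam, 30) for lam in (95, 100, 98, 104)]

means, lows, highs = [], [], []
for counts in daily_counts:
    # Bootstrap the monthly mean: resample the days with replacement
    boot_means = [rng.choice(counts, size=len(counts), replace=True).mean()
                  for _ in range(1000)]
    means.append(counts.mean())
    lows.append(np.percentile(boot_means, 2.5))
    highs.append(np.percentile(boot_means, 97.5))

x = np.arange(len(months))
plt.plot(x, means, marker='o')
plt.fill_between(x, lows, highs, alpha=0.3, label='95% bootstrap interval')
plt.xticks(x, months)
plt.ylabel('widgets per day')
plt.legend()
plt.show()
```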
I've been reading up on operations research lately, including queueing theory. It started out as a way to understand the very complex mortgage process (I work at a mortgage startup) but it's turned into my little hammer and now I see nails everywhere.
I had an interesting idea a few weeks ago, best explained through an example. Let's say you're running an e-commerce site (I kind of do) and you want to optimize the number of purchases.
Let's also say we try to learn as much as we can from users, both through A/B tests and through basic slicing and dicing of the data.
As a project evolves, does the new code just add on top of the old code? Or does it replace the old code slowly over time? In order to understand this, I built a little thing to analyze Git projects, with help from the formidable GitPython project.
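Roughly, the analysis can be sketched like this (a simplified version, assuming a local clone at a placeholder path; the actual tool may differ): sample a handful of commits along the history, run git blame over the files at each snapshot, and bucket the surviving lines by the year they were last touched.

```python
# Sketch: for a few snapshots along a repo's history, count how many lines
# in the tree were last modified in each calendar year (via git blame).
# REPO_PATH is a placeholder; this is slow on large repositories.
import collections
import datetime
import git  # pip install GitPython

REPO_PATH = '/path/to/some/repo'
repo = git.Repo(REPO_PATH)

commits = list(repo.iter_commits('master'))[::-1]  # oldest -> newest; branch may be 'main'
step = max(1, len(commits) // 10)                  # roughly ten snapshots
for commit in commits[::step]:
    year_counts = collections.Counter()
    for blob in commit.tree.traverse():
        if blob.type != 'blob' or not blob.path.endswith('.py'):
            continue
        # repo.blame(rev, path) returns chunks of (origin commit, lines)
        for origin_commit, lines in repo.blame(commit.hexsha, blob.path):
            year = datetime.datetime.fromtimestamp(origin_commit.committed_date).year
            year_counts[year] += len(lines)
    snapshot = datetime.datetime.fromtimestamp(commit.committed_date).date()
    print(snapshot, dict(sorted(year_counts.items())))
```

Plotting the per-year line counts as a stacked area over the snapshot dates then shows whether old code survives or gets replaced over time.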
Why does it suck to wait for things? In a previous post I analyzed a NYC subway dataset and found that at some point, quite early, it's worth just giving up.
This isn't a proof that the subway doesn't run on time – in fact, it might actually prove that the subway runs really well.
Apparently the MTA (the agency that runs the NYC subway) has a real-time API. My fascination with the subway takes autistic proportions, so obviously I had to analyze some of the data. The documentation is somewhat terrible, but here's some relevant code for how to use the API:
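(What follows is a minimal sketch, assuming the gtfs-realtime-bindings package for parsing the protobuf feed; the endpoint URL and the key/feed_id parameters are my best guesses and may have changed since.)

```python
# Sketch: fetch one of the MTA's GTFS-realtime subway feeds and print
# upcoming arrivals. Requires: pip install requests gtfs-realtime-bindings
# The URL and request parameters below are assumptions, not gospel.
import datetime
import requests
from google.transit import gtfs_realtime_pb2

API_KEY = 'your-mta-api-key'  # placeholder
URL = 'http://datamine.mta.info/mta_esi.php'

response = requests.get(URL, params={'key': API_KEY, 'feed_id': 1})
feed = gtfs_realtime_pb2.FeedMessage()
feed.ParseFromString(response.content)

for entity in feed.entity:
    if not entity.HasField('trip_update'):
        continue
    trip = entity.trip_update.trip
    for update in entity.trip_update.stop_time_update:
        if update.HasField('arrival'):
            arrival = datetime.datetime.fromtimestamp(update.arrival.time)
            print(trip.route_id, update.stop_id, arrival)
```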
I've been obsessed lately with how to iterate quickly based on small-scale feedback. One awesome website I encountered is Usability Hub, which lets you run 5-second tests: users see your site for 5 seconds, and you can ask them free-form questions afterwards.
The other day I was looking at marketing spend broken down by channel and wanted to compute some simple uncertainty estimates. I have data like this:
| Channel | Total spend | Transactions |
| --- | --- | --- |
| Channel A | 2292.… | … |
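One simple way to get uncertainty intervals out of numbers like these (a sketch under my own assumptions, not necessarily the approach I went with) is to treat each channel's transaction count as a Poisson outcome, sample the implied rate, and look at the spread of cost per transaction:

```python
# Sketch: uncertainty intervals for cost per transaction, per channel.
# The spend/transaction figures are placeholders, not the real data.
import numpy as np

rng = np.random.default_rng(0)
channels = {
    # channel: (total spend, transactions) -- made-up figures
    'Channel A': (2292.0, 40),
    'Channel B': (1276.0, 11),
}

for name, (spend, transactions) in channels.items():
    # With a flat prior, the Poisson rate behind `transactions` observed
    # events has a Gamma(transactions + 1, 1) posterior.
    rates = rng.gamma(transactions + 1, 1.0, size=10000)
    cost_per_transaction = spend / rates
    lo, mid, hi = np.percentile(cost_per_transaction, [2.5, 50, 97.5])
    print(f'{name}: cost/transaction ~ {mid:.0f} (95% interval {lo:.0f} to {hi:.0f})')
```

Channels with few transactions end up with very wide intervals, which is exactly the thing that's easy to miss when you only look at point estimates.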
As noted in multiple tweets, my previous post describes a phenomenon known as Berkson's paradox.
Here's another example: Why Are Handsome Men Such Jerks?