Here's a theory I have about cloud vendors (AWS, Azure, GCP):
Cloud vendors1 will increasingly focus on the lowest layers in the stack: basically leasing capacity in their data centers through an API. Other pure-software providers will build all the stuff on top of it.
I guess I should really call this a parable.
The backdrop is: you have been brought in to grow a tiny data team (~4 people) at a mid-stage startup (~$10M annual revenue), although this story could take place at many different types of companies.
Software infrastructure (by which I include everything ending with *aaS, or anything remotely similar to it) is an exciting field, in particular because (despite what the neo-luddites may say) it keeps getting better every year! I love working with something that moves so quickly.
Hanlon's razor is a classic aphorism I'm sure you have heard before: Never attribute to malice that which can be adequately explained by stupidity.
I've found that neither malice nor stupidity is the most common reason when you don't understand why something is in a certain way.
My company has a buffet every Friday, and the lines grow to epic proportions when the food arrives. I've suspected for years that the “classic” buffet line system is a deeply flawed and inefficient method, and every time I'm stuck in the line has made me more convinced.
Anyone who built software for a while knows that estimating how long something is going to take is hard. It's hard to come up with an unbiased estimate of how long something will take, when fundamentally the work in itself is about solving something.
It started with a tweet:
New years resolution: every plot I make during 2018 will contain uncertainty estimates
— Erik Bernhardsson (@bernhardsson) January 7, 2018 Why? Because I've been sitting in 100,000,000 meetings where people endlessly debate whether the monthly number of widgets is going up or down, or whether widget method X is more productive than widget method Y.
This is a bit of a rant but I really don't like software that invents its own query language. There's a trillion different ORMs out there. Another trillion databases with their own query language. Another trillion SaaS products where the only way to query is to learn some random query DSL they made up.
Here's a dumb extremely accurate rule I'm postulating* for software engineering projects: *you need at least 3 examples before you solve the right problem*.
This is what I've noticed:
Don't factor out shared code between two classes.
I was reading yet another blog post titled “Why our team moved from <language X> to <language Y>” (I forgot which one) and I started wondering if you can generalize it a bit. Is it possible to generate a N * N contingency table of moving from language X to language Y?
As a project evolves, does the new code just add on top of the old code? Or does it replace the old code slowly over time? In order to understand this, I built a little thing to analyze Git projects, with help from the formidable GitPython project.
Why does it suck to wait for things? In a previous post I analyzed a NYC subway dataset and found that at some point, quite early, it's worth just giving up.
This isn't a proof that the subway doesn't run on time – in fact it might actually proves that the subway runs really well.
Apparently MTA (the company running the NYC subway) has a real-time API. My fascination for the subway takes autistic proportions and so obviously I had to analyze some of the data. The documentation is somewhat terrible, but here's some relevant code for how to use the API:
I do a lot of recruiting and have given maybe 50 offers in my career. Although many companies do, I never put a deadline on any of them. Unfortunately, I've often ended up competing with other companies who do, and I feel really bad that this usually tricks younger developers into signing offers.
For some reason I decided one night I wanted to get a bunch of fonts. A lot of them. An hour later I had a bunch of scrapy scripts pulling down fonts and a few days later I had more than 50k fonts on my computer.
The easiest way to be a 10x engineer is to make 10 other engineers 2x more efficient. Someone can be a 10x engineer if they do nothing for 364 days then convinces the team to change programming language to a 2x more productive language.