Headcount goals, feature factories, and when to hire those mythical 10x people
When I started building up a tech team for Better, I made a very conscious decision to pay at the high end to get people. I thought this made more sense: they cost a bit more money to hire, but their output usually more than compensates for it. Among fellow CTOs, some went for the other side of the spectrum. This was a mystery to me, until it all made sense.
What is output?
Before we get started, let me clarify what I mean by “output” or “productivity”. I don't mean an engineer just hammering on the keyboard shipping code at light speed. When I talk about it, I refer to a whole range of things, like helping your coworkers, introducing new frameworks, improving the process, and much more. I've written about this in the past.
You can't really measure it, of course. But all managers try, when they set the salary of Alice to $110,000 and Bob to $115,000. So on some level, managers certainly believe they have some precise idea of the relative value of each engineer.
Headcount goals
Let's dissect a classic management objective: headcount goals. In a typical engineering hiring process, a CTO (or another high-up person) figures out roughly how much they need to get done compared to how many engineers they have, then goes to the CFO and haggles a bit, then gets assigned a headcount number and a salary range for those people. That then gets distributed across the org recursively, and every hiring manager gets a target for how many people to hire.
Let's say the CTO is absolutely adamant that they need to grow the engineering team by 2x in a year. This bubbles down to a junior engineering manager. If you are running a decentralized interview process, then you now create a great agency problem where the junior manager is told their success at the company is partly measured by how well they reach their hiring goal. Of course they are going to lower their bar for who they hire!
Don't think this happens? I've seen it. I've seen how recruiting bars start slipping because of well-meaning people pushing for more resources. And how over time the average level of engineering talent slowly declines.
Solving the misalignment
The right solution to this is partly to make the interview process and decision centralized. No team should impose their own hiring standards because with aggressive headcount goals, everyone on that team will be incentivized to lower the bar.
But let's also ask on a more fundamental level: why headcount goals? This makes the underlying assumption that every engineer has roughly the same productivity. In reality, engineer productivity can be very dispersed.
So why not target a certain output level? Of course it's because engineers don't come with labels that say this one is a 2.3x engineer that costs $140,000 and that other one is a 4.5x that costs $180,000. You don't know! Let's first talk about this relationship though because I think it's important to understand.
Cost as a function of productivity
What's maybe surprising is that cost as a function of productivity seems to be a sub-linear function. A 3x or 4x engineer might cost, say, 2x more. This is clearly not a law imposed by physics with an exact formula, but I think most people who have done some serious recruiting would concede that it follows something slightly less than linear.
For instance, let's say the cost of a $$k$$x engineer is $$k^{0.6}$$. So for a 2x engineer we pay about 1.5x as much and for a 10x engineer we pay about 4x as much. The choice of the exponent is a bit arbitrary here, but the point is to reflect that the cost scales less than linearly. Any exponent less than 1 works for the purpose of this argument, and note that an exponent larger than 1 would not exist in an efficient market. No one would hire a 2x engineer at 2.1x the cost; they would simply hire two 1x engineers.
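To make the arithmetic concrete, here is a small sketch of that assumed cost curve. The function names and the default exponent of 0.6 are just illustrative choices for this post, not measured data.

```python
def cost_multiplier(k: float, exponent: float = 0.6) -> float:
    """Relative cost of a k-x engineer under the assumed power law k**exponent."""
    return k ** exponent


def cost_per_unit_output(k: float, exponent: float = 0.6) -> float:
    """What you pay per 1x-engineer's worth of output when hiring a k-x engineer."""
    return cost_multiplier(k, exponent) / k


for k in (1, 2, 4, 10):
    print(
        f"{k}x engineer: costs {cost_multiplier(k):.2f}x, "
        f"cost per unit of output {cost_per_unit_output(k):.2f}"
    )

# With an exponent above 1, a 2x engineer costs more per unit of output
# than two 1x engineers would, which is the efficient-market argument.
print(cost_per_unit_output(2, exponent=1.1) > 1.0)
```

With any exponent below 1, the cost per unit of output keeps falling as `k` grows, which is why the naive conclusion is to always hire the most productive people you can find.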
This seems like a no-brainer then. Why wouldn't everyone pay a ton more money to hire the most senior engineers? Let's throw headcount targets out the window and replace them with a total output target. Maybe we should even go as far as having a total salary dollar target, rather than headcount? Besides the challenge of convincing your CFO of this, it probably misaligns incentives even more.
Headcount targets usually come with salary bands that you agree on beforehand. This is another weird constraint if you think about it – if more expensive engineers have a higher ROI then why cap the cost (and thus the productivity)?
These are things that I've been struggling to understand. It turns out, you can formalize a simple model where it's rational to hire two 1x engineers instead of a 2x engineer even if the total cost is higher.
Feature factories and task overhead
There's one common argument for hiring “cheaper” engineering talent, which is that a ton of tasks are straightforward, unsexy, or boring. Maybe an entry-level engineer doesn't mind tweaking WordPress themes all day, but a senior engineer needs more challenges. At the extreme end of this spectrum is a type of company often derided as a feature factory, where I suspect people imagine a sweatshop of super inexperienced engineers basically updating forms in HTML or adding tracking pixels.
I'm pretty unconvinced by this argument. A senior person will find opportunities to automate and reduce repetitive parts, paying for themselves.
However, there's a slight variant of this idea that I think actually does justify hiring less experienced engineers, which has to do with task overhead. Let's consider a toy model:
Let's say we have two engineers, one called Norm the normal engineer and one called Twanda the 2x engineer. Let's say they both work at a company where Norm spends 50% of his time actually working, with the rest of the time lost as “task overhead”. Maybe a bunch of bookkeeping (going into Jira, creating GitHub pull requests, waiting for CI, etc.). This is overhead that has to happen for every task.
How much more productive is Twanda compared to Norm? 2x? No! The overhead takes Twanda just as long as it takes Norm, so she only halves the working part: each task takes her 0.5 + 0.5/2 = 0.75 of the time it takes Norm. Twanda generates 4/3 as much value! And in general, if a 1x engineer spends $$ c $$ of their time on “task overhead” items, then a $$ k $$x engineer will have output factor $$ 1 / (c + (1-c)/k) $$.
Note that spending time in meetings doesn't have the same impact. In a hypothetical company where 90% of all time is being spent in meetings, a 2x faster engineer would still get 2x more work done (in the 10% of time that isn't spent in meetings). What my model is talking about is task-related overhead.
You can see in this toy model that a lot of the productivity gains of a higher-output engineer will be diminished in an environment with high task overhead. You really benefit a lot more from more productive people if you minimize the amount of task overhead!
The cost-benefit analysis of high output engineers
Now we have a bunch of assumptions that let us calculate the output per cost of a $$k$$x engineer. We know the output factor $$ 1/(c + (1-c)/k) $$ and the cost $$ k^{0.6} $$ so the output per cost is:
$$ \frac{1}{(c + (1-c)/k)k^{0.6}} $$
For any given value of $$ c $$, we can solve for the optimal value for $$ k $$! Take the derivative with respect to $$ k $$ and set it to zero. Because I'm a lazy person, I just plugged it into Wolfram Alpha and the optimal value of $$ k $$ as a function of $$ c $$ turns out to be
$$ k = \frac{2}{3}\frac{1-c}{c} $$
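If you don't trust the calculus, here is a minimal sketch that checks it by brute force. It assumes, as in the Norm and Twanda example, that overhead eats a fraction $$ c $$ of a 1x engineer's time per task and is not sped up by a faster engineer, so a $$ k $$x engineer needs $$ c + (1-c)/k $$ as long per task. The function names and the grid-search bounds are made up for illustration.

```python
def output_factor(k: float, c: float) -> float:
    """Output of a k-x engineer relative to a 1x engineer.

    c is the fraction of a 1x engineer's time spent on per-task overhead,
    assumed to take the same wall-clock time for everyone.
    """
    return 1.0 / (c + (1.0 - c) / k)


def output_per_cost(k: float, c: float, exponent: float = 0.6) -> float:
    """Output factor divided by the assumed cost k**exponent."""
    return output_factor(k, c) / k ** exponent


def optimal_k(c: float) -> float:
    """Closed-form maximizer of output_per_cost for exponent 0.6 (needs 0 < c <= 1)."""
    return (2.0 / 3.0) * (1.0 - c) / c


def optimal_k_numeric(c: float, lo: float = 0.01, hi: float = 100.0,
                      steps: int = 20000) -> float:
    """Brute-force check: the best k on a log-spaced grid between lo and hi."""
    grid = [lo * (hi / lo) ** (i / steps) for i in range(steps + 1)]
    return max(grid, key=lambda k: output_per_cost(k, c))


# Twanda (k=2) versus Norm with 50% task overhead.
print(f"Twanda's output factor: {output_factor(2, 0.5):.3f}")

for c in (0.8, 0.4, 0.2):
    print(f"c={c}: closed form {optimal_k(c):.3f}, grid search {optimal_k_numeric(c):.3f}")
```

The grid search is deliberately dumb: it exists only to confirm that the closed form lands on the same maximum, not to be an efficient optimizer.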
Let's plot the optimal value of $$ k $$ with respect to $$ c $$. I had to plot it on the log-scale for the shape to come out nicely:
Beautiful! Let's unpack this by picking a few points on the chart:
- Extreme case: if the overhead is 100% then the best value for money is to hire 0x engineers.
- If the overhead is about 80% then the best value for money is to hire 0.2x engineers.
- If the overhead is about 40% then the best value for money is to hire 1x engineers.
- If the overhead is about 20% then the best value for money is to hire 3x engineers.
- If the overhead is about 7% then the best value for money is to hire 10x engineers.
- Extreme case: if the overhead is 0% then the best value for money is to hire ∞x engineers.
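You can also read the chart the other way round: given the kind of engineer you want to hire, how low does the overhead need to be for that to be the best value? A quick sketch, assuming the toy model with exponent 0.6 and overhead that a faster engineer can't speed up, so that the optimum is $$ k = \frac{2}{3}\frac{1-c}{c} $$ and hence $$ c = 2/(3k + 2) $$. The helper name `overhead_where_optimal` is hypothetical.

```python
def optimal_k(c: float) -> float:
    """Best-value k for overhead fraction c (toy model, exponent 0.6)."""
    return (2.0 / 3.0) * (1.0 - c) / c


def overhead_where_optimal(k: float) -> float:
    """Overhead fraction c at which k-x engineers are the best value.

    Obtained by solving k = (2/3) * (1 - c) / c for c.
    """
    return 2.0 / (3.0 * k + 2.0)


for k in (0.2, 1, 3, 10):
    c = overhead_where_optimal(k)
    print(f"{k}x engineers are the best value at about {c:.0%} overhead")
```

This gives roughly 77%, 40%, 18% and 6% overhead for 0.2x, 1x, 3x and 10x engineers, close to the approximate points read off the chart.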
So it's all about getting the overhead of work down.
Getting the most value out of your tech team
We talked a lot about the difference between engineers in terms of productivity vs cost and how to get the most value out of them. The good news is that there are really only two things that it boils down to!
- Have a centralized recruiting process with a consistent high bar
- Reduce the task overhead to a minimum
If you don't have those things, there's no point trying to hire super senior people; you are probably better off hiring average engineers. Xavier Amatriain wrote a blog post with sort of similar conclusions: don't expect that you can cherry-pick elements of the Netflix culture and drop them into your startup. You might have to start with your development process and your hiring process!
If you had asked me before I wrote this blog post why some companies pay top dollar for engineers and others don't, I probably would have said that some companies are super tech focused, and so they can truly get value out of really expensive engineers, whereas some companies are a collection of scripts using some off-the-shelf framework, and an expensive engineer wouldn't make a huge difference.
I still think this is right, but I think the exact causality has to do more with the model posited in this post. As an example, Google (known for paying a lot) has the type of challenges that engineers can work on independently for a very long time. That lowers the (amortized) task overhead, which means that they get more value out of an expensive (but more productive) engineer. Other companies have a large quantity of small projects (thus a large task overhead), meaning they rationally shouldn't pay at the top of the market.
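One way to see the amortization effect inside the toy model: suppose, purely as an illustration, that every task carries a fixed number of overhead hours no matter how big it is, plus some hours of actual work for a 1x engineer. The hour figures below are invented, and the optimum again assumes exponent 0.6 and overhead that a faster engineer can't speed up.

```python
def overhead_fraction(overhead_hours: float, work_hours: float) -> float:
    """Share of a 1x engineer's time per task that goes to fixed overhead."""
    return overhead_hours / (overhead_hours + work_hours)


def optimal_k(c: float) -> float:
    """Best-value k for overhead fraction c (toy model, exponent 0.6)."""
    return (2.0 / 3.0) * (1.0 - c) / c


OVERHEAD_HOURS = 4  # hypothetical fixed cost per task: tickets, PRs, CI, ...

for work_hours in (2, 8, 40, 200):
    c = overhead_fraction(OVERHEAD_HOURS, work_hours)
    print(
        f"{work_hours:>3}h of work per task: overhead {c:.0%}, "
        f"best value is a {optimal_k(c):.1f}x engineer"
    )
```

Under these assumptions the optimum simplifies to two thirds of the ratio of work hours to overhead hours, so the longer an engineer can work on one thing between overhead events, the more an expensive engineer is worth.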
This all definitely strikes me as kind of “obvious” in hindsight, and maybe you feel the same. At least you now have some math to back it up!