Interviewing engineers is hard because it's adversarial

2023-05-18

Hiring is one of the most important functions in any company. The people who are hired end up deciding who else is hired (etc), and this basically determines what kind of company it becomes. The same is true of software engineering careers: the perceived quality of the first few jobs can dramatically affect the trajectory of a career.

This makes interviewing high stakes for everyone involved. Most software engineers hate interviewing and correctly point out that almost all interviews in this industry are bad. They're a stressful time sink, require too many touchpoints over too long a timeline, are often opaque, test skills that barely relate to the actual job, produce seemingly arbitrary results, and, worse, are often influenced by factors unrelated to qualifications.

The terrible state of interviewing is sort of an idle obsession in our industry; there are many opinions but very little in the way of rigorous analysis or discussion. In my experience the most commonly overlooked consideration is that interviewing is necessarily adversarial: the two sides involved have different incentives and goals. Hiring orgs want to find and hire the best candidate for their given budget. Candidates want an offer - the highest one possible, regardless of fit. This is because 1) having multiple offers confers leverage, and 2) the principal-agent problem. This tension makes it hard for either side to approach the problem with a 100% collaborative mindset, which is why most naive ideas for how to make interviews "better" either fail to scale, or would never work in the first place.

So here are just a few considerations that must be accounted for when designing interviews:

Factors to consider when designing an interview

Total ordering: You might be forgiven for thinking an ideal interview has a specific and clearly delineated "bar" that a candidate needs to clear to receive an offer. They either do well enough to get an offer, or they don't. Theoretically this maps to a boolean. In practice, this needs to be an integer, because what if multiple candidates pass and there's only one open position? You need a way to rank candidates, not just pass/fail them.
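
To make the boolean-vs-integer point concrete, here's a minimal Python sketch. The names (Candidate, BAR, pick_offers) and all the scores are made up for illustration; this is not a real scoring rubric, just one way to show why a pass/fail bar alone can't choose between several passing candidates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    score: int  # hypothetical aggregate interview score


BAR = 70  # hypothetical passing threshold


def passes(candidate: Candidate) -> bool:
    """The 'boolean' view: did the candidate clear the bar?"""
    return candidate.score >= BAR


def pick_offers(candidates: list[Candidate], openings: int) -> list[Candidate]:
    """The 'integer' view: rank everyone who passed, then fill the openings."""
    qualified = [c for c in candidates if passes(c)]
    ranked = sorted(qualified, key=lambda c: c.score, reverse=True)
    return ranked[:openings]


pool = [
    Candidate("A", 82),
    Candidate("B", 91),
    Candidate("C", 64),
    Candidate("D", 75),
]

# Three candidates clear the bar, but there is only one opening.
print([c.name for c in pool if passes(c)])
print([c.name for c in pick_offers(pool, openings=1)])
```

Three of the four candidates pass, so the boolean alone leaves the decision open; only the ranking picks one. Note that sorted() is stable, so tied scores fall back to input order, which is itself an arbitrary tiebreak you'd want to replace with something deliberate.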

Cheating: Glassdoor (etc) makes it extremely easy to cheat by "studying to the test", which causes at least two problems: selectivity unintentionally drifts down over time (more candidates will pass, even though the interview as designed should've failed them), and it becomes difficult to fairly compare candidates, because some probably cheated and some didn't. It's hard to counter this; basically you have a sliding scale where you're always trying to balance "freshness of your questions to mitigate cheating" and "consistency so the questions are 1) of similar difficulty over time and 2) make relative comparisons easier".
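
The selectivity drift is easy to see with some back-of-the-envelope arithmetic. The sketch below is a toy model, not data: the function name pass_rate and both probabilities (20% of candidates pass a fresh question, 90% pass one they've already seen) are assumptions I picked purely for illustration.

```python
def pass_rate(leaked_fraction: float,
              p_pass_fresh: float = 0.2,
              p_pass_seen: float = 0.9) -> float:
    """Expected overall pass rate when some candidates have seen the question.

    All three numbers are hypothetical assumptions, not measurements.
    """
    return leaked_fraction * p_pass_seen + (1 - leaked_fraction) * p_pass_fresh


# As the question leaks to more of the pool, more candidates pass,
# even though nothing about the question or the grading changed.
for leaked in (0.0, 0.25, 0.5):
    print(f"{leaked:.0%} saw the question -> {pass_rate(leaked):.1%} pass")
```

Under these assumed numbers the pass rate climbs from 20% to 37.5% to 55% as the question spreads, which is the drift described above: the interview gets less selective without anyone deciding it should.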

Selectivity: Every interview question should give the candidate more opportunities to demonstrate they are a good fit, and this means questions should be calibrated to the expectations of the role. But in practice it's often necessary to develop a question to weed out clearly unqualified candidates - that is to say, passing the question is necessary but not sufficient for an offer. Suppose you're hiring a senior machine learning engineer. If your goal is "minimize the risk that I proceed with someone from the bottom 30% of the candidate pool which has objectively never done any machine learning" then "explain to me in your own words how a support vector machine works" might suffice as a screener. Arriving at the "correct" difficulty for screener and main interview questions is a difficult problem, and the candidate pool is not static: in general you'll find candidates magically get better at answering questions over time because of Glassdoor.

Repeatability & scalability: Interviews need to be somewhat repeatable and controlled, because that's a prerequisite for targeting a specific selectivity and getting a total ordering. Repeatability is highest when isolating variables and removing extraneous details, which is naturally in tension with conducting an interview that "mirrors a real engineer's day-to-day." The average engineer's day-to-day is driven by lots of context about their team and business and that's really hard to simulate for a "realistic" interview, and can be markedly different from one day to the next.

Trainability: A good heuristic to go on is "if you can't explain what a good answer is and why it's a good answer then you haven't thought rigorously enough about it, or it's a bad question." You need to be able to train people to give the interview question properly, answer clarifying questions, ask appropriate followups, and most importantly, grade candidates fairly. Since a good question doesn't ask for a factoid that someone can memorize and regurgitate, this can be much harder than it seems. Forcing yourself to actually put pen to paper and lay out what, exactly, is a "good/better/best" answer without accidentally encoding some flaw into it (like implied knowledge of some random factoid) can be surprisingly difficult.

Selection bias / pain tolerance: Given how high stakes hiring is, it's safe to predict that companies would take much more time to interview candidates if they had unlimited time, money, and candidate patience. But sought-after engineers have lots of options, so they get snapped up quickly. The engineers who are *most* willing to jump through 8-round interviews in a good economy are (statistically speaking) going to be the ones who are mediocre-to-bad. Joel Spolsky famously complained about this almost two decades ago (until he switched to complaining about how much Facebook paid). It's the "market for lemons" principle: this person must not be very good, because they're interviewing! This results in an equilibrium where most companies know they have, at best, 3-4 hours of interviewing to decide whether to make an offer, because longer processes will repel good candidates and attract only bad ones, overwhelming any benefit of higher-confidence hiring decisions.

"I would just do x"

Most people who think there is a Better Way to interview software engineers have an interview process in mind that fails at least one of these considerations. Again, interviewing is adversarial, so you can't assume candidates won't try to game the process [1], or that they are representing their skills and work experience accurately. At the same time, being mindful of time constraints is necessary to minimize pain and avoid repelling good candidates with arduously long funnels. This limits freedom of movement quite a bit when it comes to designing a process that is both fair and effective.

This problem always settles into an equilibrium where smaller firms have an edge: it doesn't matter that their interview process isn't very scalable because they're not operating at scale. But once companies get big they all seem to basically revert to leetcode, which is evidence that the currently-innovative practices don't scale up well.

Is it possible to do better than the current meta? Will we ever move beyond "system design challenges on whiteboards" and "leetcoding exercises on Coderpad?" I'm sure we'll get there. Once something is both 1) clearly better and 2) durably works at scale, market forces essentially guarantee it'll be copied until it takes over. But right now it's not clear to me that the most vocal complainers have anything to offer in terms of (repeatable, scalable, sufficiently Glassdoor-proof) alternatives.



[1] Candidates on Zoom calls cheating with a coach next to them (or in their ear) is a thing; feel free to ask me how I know.