Interviewing engineers is hard because it's adversarial

2023-05-18

Hiring is one of the most important functions in any company. The people who are hired end up deciding who else is hired (etc), and this basically determines what kind of company it becomes. The same is true of software engineering careers: the perceived quality of the first few jobs can dramatically affect the trajectory of a career.

This makes interviewing high stakes for everyone involved. Most software engineers hate interviewing and correctly point out that almost all interviews in this industry are bad. They're a stressful time sink, require too many touchpoints over too long a timeline, are often opaque, test skills that barely relate to the actual job, produce seemingly arbitrary results, and, worse, are often influenced by factors unrelated to qualifications.

The terrible state of interviewing is sort of an idle obsession in our industry; there are many opinions but very little in the way of rigorous analysis or discussion. In my experience the most commonly overlooked consideration is that interviewing is necessarily adversarial: the two sides involved have different incentives and goals. Hiring orgs want to find and hire the best candidate for their given budget. Candidates want an offer - the highest one possible, regardless of fit. This is because of 1) the leverage that having multiple offers confers, and 2) the principal-agent problem. This tension makes it hard for either side to approach the problem with a 100% collaborative mindset, which is why most naive ideas for how to make interviews "better" either fail to scale, or would never work in the first place.

So here are just a few considerations that must be accounted for when designing interviews:

Factors to consider when designing an interview

Total ordering: You might be forgiven for thinking an ideal interview has a specific and clearly delineated "bar" that a candidate needs to clear to receive an offer. They either do well enough to get an offer, or they don't. Theoretically this maps to a boolean. In practice, this needs to be an integer, because what if multiple candidates pass and there's only one open position? You need a way to rank candidates, not just pass/fail them.
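The boolean-versus-integer point can be made concrete with a minimal sketch. Everything in it is hypothetical: the `Candidate` type, the numeric `score`, and the `BAR` threshold are illustrative assumptions, not a real grading rubric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    score: int  # hypothetical numeric interview score


BAR = 70  # hypothetical passing threshold


def passes(candidate: Candidate) -> bool:
    """Pass/fail view: every passing candidate collapses to the same value."""
    return candidate.score >= BAR


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Ranked view: passing candidates ordered from strongest to weakest."""
    passing = [c for c in candidates if passes(c)]
    return sorted(passing, key=lambda c: c.score, reverse=True)


pool = [
    Candidate("A", 72),
    Candidate("B", 91),
    Candidate("C", 65),
    Candidate("D", 80),
]

# Three candidates clear the bar, so the boolean alone cannot fill one opening.
passing_names = [c.name for c in pool if passes(c)]
ranked_names = [c.name for c in rank(pool)]
```

With one open position, `passes` leaves A, B, and D indistinguishable, while `rank` puts B first. Two candidates with equal scores would still need a tie-breaker, which is part of why a true total ordering is harder to get than it looks.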

Solution that doesn't work: Just hire the first person to pass the interview. This is equivalent to calling random() on the set of candidates who would pass, which is an obviously suboptimal strategy to adopt. And missing out on a great candidate means they go to a competitor and then compete with you. You don't want to deliberately adopt suboptimal strategies in high stakes zero-sum situations.
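A rough simulation can illustrate why hiring the first passer behaves like a random draw from the passing pool. This is a sketch under loud assumptions: scores are uniformly random integers, a score perfectly reflects quality, and arrival order is unrelated to quality. The function names and parameters are made up for illustration.

```python
import random


def first_to_pass(scores, bar):
    """Score of the first candidate, in arrival order, at or above the bar."""
    for s in scores:
        if s >= bar:
            return s
    return None  # nobody passed


def best_to_pass(scores, bar):
    """Score of the strongest candidate at or above the bar."""
    passing = [s for s in scores if s >= bar]
    return max(passing) if passing else None


def compare(trials=1000, pool_size=20, bar=70, seed=0):
    """Average hired score under each strategy, over trials with a passer."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    first_total = best_total = filled = 0
    for _ in range(trials):
        scores = [rng.randint(0, 100) for _ in range(pool_size)]
        first = first_to_pass(scores, bar)
        if first is None:
            continue  # no hire in this trial under either strategy
        first_total += first
        best_total += best_to_pass(scores, bar)
        filled += 1
    return first_total / filled, best_total / filled


avg_first, avg_best = compare()
```

Under these assumptions the first passer is just an arbitrary member of the passing group, so `avg_first` lands near the middle of the passing range while `avg_best` sits near the top of it.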

Cheating: Glassdoor (etc) makes it extremely easy to cheat by "studying to the test," which causes at least two problems: selectivity unintentionally drifts down over time (more candidates will pass, even though the interview as designed should've failed them), and it becomes difficult to fairly compare candidates, because probably some cheated and some didn't. It's hard to counter this; basically you have a sliding scale where you're always trying to balance "freshness of your questions to mitigate cheating" and "consistency so the questions are 1) of similar difficulty over time and 2) make relative comparisons easier".
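The selectivity drift can be sketched with simple arithmetic. The model below is a deliberately crude assumption, not measured data: any candidate who has seen the question beforehand always passes, and everyone else passes at the rate the question was designed for. The function and parameter names are hypothetical.

```python
def observed_pass_rate(designed_pass_rate: float, leak_fraction: float) -> float:
    """Pass rate the company observes once a question has partly leaked.

    Assumption (hypothetical): candidates who saw the question always pass;
    the rest pass at the designed rate.
    """
    if not (0.0 <= designed_pass_rate <= 1.0 and 0.0 <= leak_fraction <= 1.0):
        raise ValueError("rates must be between 0 and 1")
    return leak_fraction + (1.0 - leak_fraction) * designed_pass_rate


designed = 0.30  # hypothetical: question built so that 30% of candidates pass
drift = [(leak, observed_pass_rate(designed, leak)) for leak in (0.0, 0.25, 0.5)]
```

In this toy model a question designed to pass 30% of candidates passes roughly 47.5% once a quarter of them have seen it, and roughly 65% once half have, without the question itself changing at all.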

Solution that doesn't work: Post the question online to "level the playing field" for everyone! The obvious problem with this approach is that sometimes (regardless of the merits) you're looking to test how someone works on a problem they haven't specifically seen before. Posting everything online ahead of time turns it into a memorization/preparation test instead.
Solution that doesn't work: Frequently rotate questions. It's too hard to consistently design great and novel interview questions, since it takes time and effort to calibrate the difficulty, delivery, and interviewer training to get a question "just right." And the calibration data is noisy and sparse: sometimes you get a bad hire or miss out on a great hire because the interview question was inappropriately leveled. Or maybe it was just luck?

Selectivity: Every interview question should give the candidate more opportunities to demonstrate they are a good fit, and this means questions should be calibrated to the expectations of the role. But in practice it's often necessary to develop a question to weed out clearly unqualified candidates - that is to say, passing the question is necessary but not sufficient for an offer. Suppose you're hiring a senior machine learning engineer. If your goal is "minimize the risk that I proceed with someone from the bottom 30% of the candidate pool, who have objectively never done any machine learning" then "explain to me in your own words how a support vector machine works" might suffice as a screener. Arriving at the "correct" difficulty for screener and main interview questions is a difficult problem, and the candidate pool is not static: in general you'll find candidates magically get better at answering questions over time because of Glassdoor.

Solution that doesn't work: Don't use screeners at all. Just rely on your tougher questions - "if they can't pass a weed-out question then they shouldn't be able to pass the harder ones either." The problem with this approach is it wastes everyone's time. It's just a complete failure of process and resource management to send a candidate through a time-intensive wringer of a multiple-hour interview loop with senior level employees when the candidate is hopelessly unqualified, the outcome is completely certain, and this can be discovered in 15 minutes. "Cutting interview loops short" is theoretically doable, but requires training and expectation-setting and doesn't even fully solve the problem (harder questions require a more skilled interviewer!), so it isn't clearly better than alternatives.

Repeatability & scalability: Interviews need to be somewhat repeatable and controlled, because that's a prerequisite for targeting a specific selectivity and getting a total ordering. Repeatability is highest when isolating variables and removing extraneous details, which is naturally in tension with conducting an interview that "mirrors a real engineer's day-to-day." The average engineer's day-to-day is driven by lots of context about their team and business and that's really hard to simulate for a "realistic" interview, and can be markedly different from one day to the next.

Solution that doesn't work: As part of the interview process, you can pair your candidates with a real engineer/interviewer as they go about their work for the day, but that's going to give every candidate a very different experience. It's also a big commitment for everyone involved!

Trainability: A good heuristic to go on is "if you can't explain what a good answer is and why it's a good answer then you haven't thought rigorously enough about it, or it's a bad question." You need to be able to train people to give the interview question properly, answer clarifying questions, ask appropriate followups, and most important - grade candidates fairly. Since a good question doesn't ask for a factoid that someone can memorize and regurgitate, this can be much harder than it seems. Forcing yourself to actually put pen to paper and lay out what, exactly, is a "good/better/best" answer without accidentally encoding some flaw into it (like implied knowledge of some random factoid) can be surprisingly difficult.

Selection bias / pain tolerance: Given how high stakes hiring is, it's safe to predict that companies would take much more time to interview candidates if they could have unlimited time, money, and candidate patience. But sought-after engineers have lots of options, so they get snapped up quickly. The engineers who are *most* willing to jump through 8-round interviews in a good economy are (statistically speaking) going to be the ones who are mediocre-to-bad. Joel Spolsky famously complained about this almost 2 decades ago (until he switched to complaining about how much Facebook paid). It's the "market for lemons" principle: this person must not be very good, because they're interviewing! This results in an equilibrium where most companies know they have, at best, 3-4 hours of interviewing to decide whether to make an offer or not, because they know that longer processes will repel good candidates and attract only bad candidates, overwhelming any benefits of higher confidence hiring decisions.

Solution that doesn't work: Creating a publicly documented fast lane for preferred "hot" candidates. Besides being manifestly unfair, it would perpetuate and magnify inequalities from random luck and encode biases into policy, without much evidence it would meaningfully change outcomes. E.g., for this to be effective you would need a public policy like "candidates who once worked at Google can skip the screening interviews" but there's no reason to expect that this makes people from Google more likely to apply (a job change is a big move; they'd need to already be intrigued enough by other factors that this tips the scales for them), or that they'd be a better fit for the role.
Solution that doesn't work: "Just" be a very attractive place to work. Candidates have a higher pain tolerance for top employers. This is obviously very difficult - rankings are zero-sum; not everyone can be a top employer. Many industries can't or won't do this at all because pay is a very large component of what makes a workplace "attractive," and in most industries the benefits of a good team are quickly swamped by the cost of overpaying for it.

"I would just do x"

Most people who think there is a Better Way to interview software engineers have an interview process in mind that fails at least one of these considerations. Again, interviewing is adversarial, so you can't assume candidates won't try to game the process [1], or that they are representing their skills and work experience accurately. At the same time, being mindful of time constraints is necessary to minimize pain and avoid repelling good candidates with arduously long funnels. This limits freedom of movement quite a bit when it comes to designing a process that is both fair and effective.

This problem always settles into an equilibrium where smaller firms have an edge: it doesn't matter that their interview process isn't very scalable because they're not operating at scale. But once companies get big they all seem to basically revert to leetcode, which is evidence that the currently-innovative practices don't scale up well.

Is it possible to do better than the current meta? Will we ever move beyond "system design challenges on whiteboards" and "leetcoding exercises on CoderPad"? I'm sure we'll get there. Once something is both 1) clearly better and 2) durably works at scale, market forces essentially guarantee it'll be copied until it takes over. But right now it's not clear to me that the most vocal complainers have anything to offer in terms of (repeatable, scalable, sufficiently Glassdoor-proof) alternatives.

[1] Candidates on Zoom calls cheating with a coach next to them (or in their ear) is a thing; feel free to ask me how I know.

Peter Cai
