On Technical Interviews

by Benjamin Peterson
published 2014-10-31

I’ve recently been doing a lot of interviews with American tech companies, small and large, for a position as a software engineer. In addition to the introvert-unfriendly experience of flying around the country in order to talk to strangers for a whole day, the content of many of the technical interviews has come to bother me.

Allow me to sketch a hypothetical interview experience: Your first interviewer walks in, introduces themselves, and gives the question. The question almost surely involves large arrays of numbers as input. If the interviewer is feeling particularly creative, you might get a contrived story, too. (“Imagine you are given an array of 100,000,000 floating point baseball scores between 0 and 10,000.”) Thinking for a few seconds, you see there’s an easy quadratic or cubic time algorithm. You’ll code this bad algorithm up on a whiteboard with the sinking feeling that there’s an “obvious”, more-efficient solution that you should have thought of the instant the problem was mentioned. This “correct” solution will involve some nontrivial data structure (probably one of the numerous balanced binary trees), and you’ll arrive at it after some patient prodding from the interviewer. You might spend some quality time figuring out how to get indented code blocks to line up on a whiteboard. Then, if there’s still time left in the interview, the parameters of the question may be altered, leaving you with a solution that may or not be able to be adequately extended. As the clock indicates the end of the interview, you wonder whether the interviewer would like to hear about your last interesting coding project. Alas, they scurry out and are replaced with a new interviewer and a new question about binary trees.

Binary Trees and Sorting

Interview questions like the one imagined above can generally be characterized as easy programming competition problems. I’m not particularly bad at this type of problem, but it frustrates me to have all the skills I’ve tried to cultivate as a software engineer reduced to mediocre algorithms and data structures brainteasers. Rather than holistic evaluations of my programming ability, many technical interview problems feel like they were lifted off the exam for an algorithms class. That’s odd for an industry where a formal university education is seen as unnecessary. However, there are even courses to help people cram for the interview “exam”.

I don’t wish to to minimize the importance of algorithms and data structures in software engineering or label them irrelevant. They are an core skill of our craft—but only one of many. Other abilities such as API design, code review, debugging, and interpersonal skills are just as important. As such, it’s clear that the weight given to questions about algorithms and data structures in most tech interviews is way out of proportion. It’s akin to evaluating a writer solely by asking them to diagram sentences and recall the comma rules they learned in grade school. Generously, I would say knowing how to apply dynamic programming has about the same practical use as understanding the difference between fork() and vfork(), but you are far more likely to be asked about the former in an interview. The tech industry closes its eyes, crosses its fingers, and hopes people who know the time complexity of Dijkstra’s algorithm are the same people who can design beautiful APIs, write defensive code, and create comprehensive test suites. It’s depressing to think how many perfectly competent engineers are being passed over because they didn’t know the difference between pre-order and post-order tree traversal. (And this is, of course, merely in addition to those sidelined by the usual biases.)

Tradition and Incentives

I would like to understand how the tech industry arrived at its current interview process. Unfortunately, I don’t know much about the history of the software engineering interview. (If anyone knows of any work on this, please let me know.) At any rate, I suspect the modern Silicon Valley incarnation of the technical interview has been heavily influenced by Google’s hiring practices, which have always emphasized algorithmic knowledge.

Algorithms and data structures usually feature prominently in the university educations of aspiring programmers. At least in the American computer science curriculum, algorithms and data structures form the core classes with systems classes at the periphery. Thus, asking about algorithms in interviews favors younger engineers, especially from elite engineering schools. At the same time, it creates biases against older developers, who haven’t had to think about treaps in a while. Then there is a feedback loop as engineers hired on their data structure chops begin interviewing the next generation of engineers.

(As an aside, I’ve always found it interesting that the computer science you are quizzed about in a tech interview is basically frozen in the 1980s. Nearly all algorithm interview questions can be solved with a handy copy of CLRS. CLRS is a perfectly good book, but theoretical computer science has progressed quite a bit since its publication in ways relevant to today’s working programmer. For example, smoothed analysis provides a better understanding of an algorithm’s practical running time than traditional asymptotic analysis. Randomized algorithms have become critical to modern distributed computing and data processing among other fields. Aside from basic probability questions, though, I’ve never been asked about this exciting area in an interview.)

The typical software engineer will be required to do several technical interviews a week. Interviews are not usually considered to be a particularly interesting part of the job, and engineers naturally work to minimize the effort expended on them. Another reason, therefore, for the prevalence of algorithmic questions is that they are easier for the interviewer. For these problems, there are definitely correct and incorrect answers. Different candidate’s performance can be easily compared. The tech industry has a love of data and often ignores information that can’t easily be evaluated numerically. (A notable exception to this is the infamous “culture fit” hiring criterion.) In contrast to algorithm questions, it’s hard to turn the maintainability of someone’s code into quantitative nuggets ideal for comparison.

Whiteboards

The practice of conducting coding exercises on whiteboards embodies the disconnect between interviews and real-world software engineering. Indeed, all the time in my life spent writing code on whiteboards has been for interviews. The interview problem is usually stressful enough. Why must it also be solved in such an unnatural environment for a programmer to work? I tend to write code (and English) in a nonlinear fashion. This is nearly impossible to do on a whiteboard without ending up with many sloppy-looking arrows indicating the insertion of code at various points. The whiteboard makes the interview feel like a strange oral exam where you’re not allowed to use any of the tools of your craft. I’ve also been asked to write code using Google Docs, which is like doing arithmetic with Roman numerals.

Of all the problems with the modern technical interview, the whiteboard is one of the easiest to eliminate. Candidates should be provided with a computer to program on or, even better, allowed to bring their own laptop in. Perhaps it concerns some that the candidate may be able to simply look up the solution to the question on Google and StackOverflow. If the answer to a certain question is easy to look up and use, that merely demonstrates the irrelevance of the problem to everyday software engineering. Consider a hypothetical situation where a programmer needs to solve some obnoxious “traditional” interview question like finding ascending sequences in an unsorted array of integers. In this (highly unlikely) situation, they would be doing their employer a disservice by not checking the web for an solution first before concocting one themselves. Most programmers don’t spend their days gluing code snippets from the internet together, so surely there can be interview questions that are still “challenging” even with the web available. In fact, the space of possible questions is much richer. For example, with a computer at hand, it’s perfectly reasonable to ask questions that require looking at documentation or cloning a Git repository.

No Easy Solutions

So, what can be done? Interviews should attempt to capture a much more holistic view of a candidate’s programming skills. There’s already been movement in this area. For example, interviews at Stripe can include designing an API, fixing a bug in a popular open source library, or pair programming. Even the old algorithms and data structure questions are probably salvageable by refocusing on problems encountered in their real-world application. Off the top of my head, I can imagine interesting discussions starting from talking about why despite the more appealing asymptotic behavior of heapsort, quicksort is usually faster; how the naive implementation of binary search is broken; or the cache unfriendliness of trees.

Of course, simply changing interview questions is not panacea. I expect engineers will have to devote more energy to engaging with candidates. Holistic interview questions are not as easy for the interviewer as algorithm questions; API design is definitely subjective in ways the running time of an algorithm is not. We must also ensure new interview questions do not create new axes of bias. For example, evaluating a candidate’s open source contributions or doing code walkthroughs in lieu of a normal interview seems promising but is unfair to people who haven’t been doing the unpaid labor of open source.

I certainly don’t have nor do I expect there to be a magic bullet. After all, deciding whether to hire someone with just a few hours of contact of is extremely difficult. What I do know is that the tech industry needs do some experimenting to find out how the technical interview experience can become more valuable for both the employer and the candidate.