Our original idea for our coding challenge was to ask candidates to build a tic-tac-toe game in React, with a few curveballs thrown in around generalizing the solution to larger boards and additional play modes. We scrapped that idea when we discovered ChatGPT could do it trivially.
Here are some things we did to "LLM-proof" our new challenge.
But first, why the scare quotes around "LLM-proof"? Because I'm sure someone could get an LLM to solve the new challenge, but the quality of prompting needed to pull that off would itself be the mark of a strong engineer. The point of an interview process is to find strong engineers, not to find ones who don't use LLMs.
First, we focused the challenge on a product-specific feature instead of a toy problem. Products need to be differentiated to exist at all, so starting the challenge with a unique product feature already sets us up well for finding a task unlike anything in the LLM's training data.
Second, we made abstraction and good design a core part of the challenge. Again, this is something that LLMs aren't particularly great at. This shouldn't be surprising since most code that LLMs are trained on is a mess.
Finally, we playtested the challenge with LLM assistance. The LLM was able to come up with a basic implementation with one or two prompts. However, trying to get the LLM to refactor its code to be more flexible was a lot harder. We gave up after 90 minutes of trying.
Without giving away too much of the challenge, all we wanted was for the LLM to introduce a new TypeScript interface that would DRY up the React code it generated and enable additional implementations of that interface. I couldn't get ChatGPT 4o to do it.
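To make the shape of that refactor concrete without revealing the challenge itself, here's a hypothetical sketch of the general pattern: one interface, multiple implementations, and rendering code written once against the abstraction. Every name here (GridSource, renderGrid, and so on) is invented for illustration and has nothing to do with the actual feature.

```typescript
// Hypothetical sketch of "interface-first" refactoring — not the
// actual challenge. The goal: the consuming (React) code depends
// only on the interface, so new implementations need no changes there.

interface GridSource {
  readonly rows: number;
  readonly cols: number;
  cellAt(row: number, col: number): string;
}

// One implementation: a fixed in-memory grid.
class ArrayGridSource implements GridSource {
  constructor(private cells: string[][]) {}
  get rows(): number { return this.cells.length; }
  get cols(): number { return this.cells[0]?.length ?? 0; }
  cellAt(row: number, col: number): string { return this.cells[row][col]; }
}

// Another implementation, computed on the fly. The renderer below
// supports it with zero changes — that's the flexibility we wanted.
class CheckerboardSource implements GridSource {
  constructor(readonly rows: number, readonly cols: number) {}
  cellAt(row: number, col: number): string {
    return (row + col) % 2 === 0 ? "X" : "O";
  }
}

// Stand-in for a React component: written once against the interface.
function renderGrid(source: GridSource): string {
  const lines: string[] = [];
  for (let r = 0; r < source.rows; r++) {
    let line = "";
    for (let c = 0; c < source.cols; c++) {
      line += source.cellAt(r, c);
    }
    lines.push(line);
  }
  return lines.join("\n");
}
```

In a real React codebase the renderer would be a component taking a `GridSource` prop, but the design question is the same: can you spot the seam, name it as an interface, and move the duplicated logic behind it?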
If some LLM whisperer could get an LLM to make the refactor, great; welcome to the team. But of the dozens of solutions we've seen since implementing the new challenge, the hires we've made all wrote the solution the old-fashioned way.