Code Tours as Code

How do you onboard a new developer to a codebase?

In-person or Zoom walk-throughs are great but don’t scale. If you record the walk-through as a video, it’s just a matter of time before it goes stale.

You can create documents in Confluence or Notion, hoping they won’t become a graveyard of knowledge like all previous attempts. Learning from that, you write Markdown READMEs and store them in the repo. This increases the chance that they get updated as the code changes, but it only slows the process; eventually decay hits them as well.

There are fancy code tour plugins for IDEs like VSCode and IntelliJ. It’s a nice idea in theory. However, as the code changes, the tours end up pointing to the wrong locations, because they are stored as links to files with absolute line numbers. That isn’t feasible to maintain over time.

Let’s try another approach: expressing Code Tours as code and referencing locations of interest by name. Using a city tour metaphor, we’re not giving exact directions. That would fail if a street gets closed or a bus line is rerouted. Instead, we highlight landmarks, and it’s up to the traveling developer to explore at their own pace based on their situation and interest.

This enables us to leverage features of IDEs that are great for code navigation. And refactoring tools help to keep up with code changes and lower the maintenance burden.

Code tours as code

A code tour is just a regular source file. Start by creating a file named readme.js or tour.js in the root folder of an app, service, or module. Here is an example tour for a simple testing-without-mocks example project.

    import { App } from "./app";
    import { CommandLine } from "./infrastructure/command_line";
    import * as rot13 from "./logic/rot13";

    // ## Application layer
    App
      // Main application entry point
      ; App.prototype.run

    // ## Infrastructure layer
    // Infrastructure wrapper for reading command-line arguments and writing to `stdout`
    CommandLine
      // Provides access to command-line arguments
      ; CommandLine.prototype.args
      // Wrapper for printing to console
      ; CommandLine.prototype.writeOutput

    // ## Logic layer
    // Implementation of ROT-13 encoding
    rot13.transform

You can include short comments for an overview. If the file becomes too large, split it into separate files by area or domain like users_tour.js and orders_tour.js.

Going on a tour

To start a tour, open the source file in your favorite IDE or LSP-enabled editor. As you walk through, use “Go to definition” to jump across the codebase and learn more details.

Some editors also provide a “Peek Definition” feature that previews the target source code on hover, so you can browse code even faster without switching to a different file.

[Image: VSCode peek definition]

GitHub has also improved its indexing features and now supports jumping to definitions for some languages, which makes it possible to navigate code tours without cloning or checking out the repository first.

[Image: Code navigation on GitHub]

Keeping tours up-to-date

The big benefit of keeping code tours as code is that using IDE refactoring tools will also change references in the tour and keep it up-to-date.

For many changes, like renaming a function, the IDE refactoring tools will work without an issue. Other changes, like deleting a function, can make the tour outdated. However, the compiler or type checker catches that and reports a diagnostic for the non-existent reference.

To make sure changes that break tours are not merged by accident, you can import the tour files in tests. That way, unknown references will be reported as errors when running checks in CI.
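One caveat, offered as a hedged sketch rather than a definitive recipe: in plain JavaScript without a type checker, reading a method that no longer exists (say, `App.prototype.run` after `run` was deleted) silently evaluates to `undefined`, so importing the tour alone may not flag it. A test can instead assert that every landmark still resolves. The `missingLandmarks` helper and the stand-in `App` class below are hypothetical and only for illustration; a real test would import the classes the tour references.

```javascript
// Hypothetical helper: returns the names of tour landmarks that no longer resolve.
function missingLandmarks(landmarks) {
  return Object.entries(landmarks)
    .filter(([, value]) => value === undefined)
    .map(([name]) => name);
}

// Stand-in for a class the tour references (assumption for illustration).
class App {
  run() {}
}

// Landmarks keyed by the expression used in the tour.
const landmarks = {
  "App": App,
  "App.prototype.run": App.prototype.run,
  // Simulates a method that was deleted after the tour was written.
  "App.prototype.stop": App.prototype.stop,
};

const missing = missingLandmarks(landmarks);
if (missing.length > 0) {
  console.error(`Tour references missing landmarks: ${missing.join(", ")}`);
}
```

In a CI test, the check would throw or use the test runner's assertion instead of logging, so the build fails when a landmark disappears.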

Formatting tips

Logically grouped items in a nested hierarchy are great for readability.

The problem is that auto-formatters can strip the indentation.

The workaround to keep the formatter from stripping the indentation is to pad the code with additional characters. The approach depends on the programming language. For example, in JS/TS, prefix the indented lines with semicolons.
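As a minimal sketch of the semicolon padding, here is a nested grouping in JavaScript. The `OrderService` and `OrderRepository` classes are hypothetical stand-ins defined inline so the snippet runs on its own; a real tour would import them. Whether a particular formatter leaves this layout alone depends on the formatter and its configuration.

```javascript
// Stand-in classes (assumptions for illustration); a real tour imports these.
class OrderService {
  place() {}
  cancel() {}
}
class OrderRepository {
  save() {}
}

// Each indented line starts with a semicolon, which ends the previous
// expression statement and lets the next reference sit at a deeper level.
OrderService
  ; OrderService.prototype.place
  ; OrderService.prototype.cancel
    ; OrderRepository
      ; OrderRepository.prototype.save
```

Every line is still a plain expression statement, so “Go to definition” works on each name just as in a flat list.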

Other languages have different considerations.

For Go and other C-family languages, we can use {} block scopes for indentation. Go does not allow statements at the top level, so we wrap the tour in a function. Referencing a function without calling it is a compile error, so we add () parentheses. See a more detailed example.

    func tour() {
        foo()
        {
            bar()
            {
                baz()
            }
        }
    }

In Clojure, a dash - as a symbol doesn’t cause issues with operands. See in the example that we can indent more naturally.

Conclusion

Code tours as code is a low-effort approach to give an overview of a codebase for onboarding. It works with existing IDEs and editors for navigation. Thanks to refactoring tools, tours are much easier to maintain when code changes. Additionally, CI checks can be added to make sure they stay up-to-date.

This approach is also useful when exploring unfamiliar, undocumented codebases. Take notes in code as you explore and you will end up with a trail of breadcrumbs. Curate the important ones, add some comments with a short summary, and you have a code tour ready for the next person after you.

Does it still make sense to manually curate tours in the age of GenAI? LLMs can be helpful in analyzing a codebase and explaining how things work. However, current models seem to lack the “taste” to exclude unimportant details, and their output can often be overwhelming or plainly wrong.

Alternatively, you can ask the LLM of your choice to draft a code tour for you, giving this article as context in the prompt.
