👋 This page was last updated ~5 years ago. Just so you know.
My honeymoon with the Go language is extremely over.
This article is going to have a different tone from what I’ve been posting the past year - it’s a proper rant. And I always feel bad writing those, because, inevitably, it discusses things a lot of people have been working very hard on.
In spite of that, here we are.
Having invested thousands of hours into the language, and implemented several critical (to my employer) pieces of infrastructure with it, I wish I hadn’t.
If you’re already heavily invested in Go, you probably shouldn’t read this; it’ll just twist the knife. If you work on Go, you definitely shouldn’t read this.
I’ve been suffering Go’s idiosyncrasies in relative silence for too long; there are a few things I really need to get off my chest.
Alright? Alright.
Garden-variety takes on Go
By now, everybody knows Go doesn’t have generics, which makes a lot of problems impossible to model accurately (instead, you have to fall back to reflection, which is extremely unsafe, and the API is very error-prone), error handling is wonky (even with your pick of the third-party libraries that add context or stack traces), package management took a while to arrive, etc.
But everybody also knows Go’s strengths: static linking makes binaries easy to deploy (although Go binaries get very large, even if you strip DWARF tables - stack trace annotations still remain, and are costly).
Compile times are short (unless you need cgo), there’s an interactive runtime profiler (pprof) at arm’s reach, it’s relatively cross-platform (there’s even a tiny variant for embedded), it’s easy to syntax-highlight, and there’s now an official LSP server for it.
I’ve accepted all of these - the good and the bad.
We’re here to talk about the ugly.
Simple is a lie
Over and over, every piece of documentation for the Go language markets it as “simple”.
This is a lie.
Or rather, it’s a half-truth that conveniently covers up the fact that, when you make something simple, you move complexity elsewhere.
Computers, operating systems, networks are a hot mess. They’re barely manageable, even if you know a decent amount about what you’re doing. Nine out of ten software engineers agree: it’s a miracle anything works at all.
So all the complexity is swept under the rug. Hidden from view, but not solved.
Here’s a simple example.
This example does go on for a while, actually - but don’t let the specifics distract you. While it goes rather in-depth, it illustrates a larger point.
Most of Go’s APIs (much like NodeJS’s APIs) are designed for Unix-like operating systems. This is not surprising, as Rob & Ken are from the Plan 9 gang.
So, the file API in Go is modeled after stat: os.Stat returns an os.FileInfo, whose Mode() method returns an os.FileMode - Unix-style permission and file type bits, packed into a uint32.
Makes sense for a Unix, right?
Every file has a mode - the stat command will even dump it as hex for you (stat -c '%f' some-file).
And so, a simple Go program can easily grab those “Unix permission bits”:
On Windows, files don’t have modes. There are no stat, lstat, or fstat syscalls - there’s the FindFirstFile family of functions, which takes a pointer to a WIN32_FIND_DATA structure containing the file attributes (alternatively, CreateFile to open a handle, then GetFileAttributes, or GetFileInformationByHandle).
So, what happens if you run that program on Windows?
It makes up a mode.
Node.js does the same. There’s a single fs.Stats “type” for all platforms.
Using “whatever Unix has” as the lowest common denominator is extremely common in open-source codebases, so it’s not surprising.
Let’s go a little bit further. On Unix systems, you can change the modes of files, to make them read-only, or flip the executable bit.
Run this on Linux, and the permission bits flip just as you’d expect. And now on Windows?
So, no errors. Chmod just silently does… nothing. Which is reasonable - there’s no equivalent to the “executable bit” for files on Windows.
What does Chmod even do on Windows?
It sets or clears the read-only bit. That’s it.
We have an uint32 argument, with four billion two hundred ninety-four million nine hundred sixty-seven thousand two hundred ninety-five possible values, to encode… one bit of information.
That’s a pretty innocent lie. The assumption that files have modes was baked into the API design from the start, and now everyone has to live with it. Just like in Node.js, and probably tons of other languages.
But it doesn’t have to be like that.
A language with a more involved type system, and better designed libraries could avoid that pitfall.
Out of curiosity, what does Rust do?
Oh, here we go again - Rust, Rust, and Rust again.
Why always Rust?
Well, I tried real hard to keep Rust out of all of this. Among other things, because people are going to dismiss this article as coming from “a typical rustacean”.
But for all the problems I raise in this article… Rust gets it right. If I had another good example, I’d use it. But I don’t, so, here goes.
There’s no stat-like function in the Rust standard library. There’s std::fs::metadata, whose signature is pub fn metadata&lt;P: AsRef&lt;Path&gt;&gt;(path: P) -&gt; io::Result&lt;Metadata&gt;.
This function signature tells us a lot already. It returns a Result, which means not only do we know this can fail, we have to handle it: either by panicking on error, with .unwrap() or .expect(), or by matching it against Result::Ok / Result::Err, or by bubbling it up with the ? operator.
The point is, this function signature makes it impossible for us to access an invalid/uninitialized/null Metadata. With a Go function, if you ignore the returned error, you still get the result - most probably a null pointer.
Also, the argument is not a string - it’s a path. Or rather, it’s something that can be turned into a path.
And String does implement AsRef&lt;Path&gt;, so, for simple use cases like std::fs::metadata("/etc/hosts"), it’s not troublesome.
But paths are not necessarily strings. On Unix (!), paths can be any sequence of bytes, except null bytes.
We’ve just made a file with a very naughty name - but it’s a perfectly valid file, even if ls struggles with it.
That’s not something we can represent with a String in Rust, because Rust Strings are valid utf-8, and this isn’t.
Rust Paths, however, are… arbitrary byte sequences.
And so, if we use std::fs::read_dir, we have no problem listing it and getting its metadata:
What about Go?
It… silently prints a wrong version of the path.
See, there’s no “path” type in Go. Just “string”. And Go strings are just byte slices, with no guarantees about what’s inside.
So it prints garbage, whereas in Rust, Path does not implement Display, so we couldn’t do this:
We had to do this:
And if we wanted a friendlier output, we could handle both cases: when the path happens to be a valid utf-8 string, and when it doesn’t:
Go says “don’t worry about encodings! things are probably utf-8”.
Except when they aren’t. And paths aren’t. So, in Go, all path manipulation routines operate on string. Let’s take a look at the path/filepath package.
Package filepath implements utility routines for manipulating filename paths in a way compatible with the target operating system-defined file paths.
The filepath package uses either forward slashes or backslashes, depending on the operating system. To process paths such as URLs that always use forward slashes regardless of the operating system, see the path package.
What does this package give us?
Strings. Lots and lots of strings. Well, byte slices.
Speaking of bad design decisions - what’s that Ext function I see?
Interesting! Let’s try it out.
Right away, I’m in debating mood - is .foo’s extension really .foo? But let’s move on.
This example was run on Linux, so C:\foo.txt\bar’s extension, according to filepath.Ext, is… .txt\bar.
Why? Because the Go standard library makes the assumption that a platform has a single path separator - on Unix and BSD-likes it’s /, and on Windows it’s \.
Except… that’s not the whole truth. I was curious, so I checked:
No funny Unix emulation business going on - just regular old Windows 10.
And yet, in Go’s standard library, the path/filepath package exports those constants:
os, in turn, exports:
So how come filepath.Ext works with both separators on Windows?
Let’s look at its implementation:
Ah. An IsPathSeparator function.
Sure enough:
(Can I just point out how hilarious that “Extension” was deemed long enough to abbreviate to “Ext”, but “IsPathSeparator” wasn’t?)
How does Rust handle this?
It has std::path::is_separator:
And it has std::path::MAIN_SEPARATOR - emphasis on main separator:
The naming alone makes it much clearer that there might be secondary path separators, and the rich Path manipulation API makes it much less likely to find this kind of code, for example:
Or this kind:
Or this… kind:
It turns out Rust also has a “get a path’s extension” function, but it’s a lot more conservative in the promises it makes:
Let’s submit it to the same test:
On Linux:
On Windows:
Like Go, it gives a txt\bar extension for a Windows path on Linux.
Unlike Go, it:
- Doesn’t think “/.foo” has a file extension
- Distinguishes between the “/foo.” case (Some("")) and the “/foo” case (None)
Let’s also look at the Rust implementation of std::path::Path::extension:
Let’s dissect that: first it calls file_name(). How does that work? Is this where it searches for path separators backwards from the end of the path?
No! It calls components which returns a type that implements DoubleEndedIterator - an iterator you can navigate from the front or the back. Then it grabs the first item from the back - if any - and returns that.
The iterator does look for path separators - lazily, in a re-usable way. There is no code duplication, like in the Go library:
So, now we have only the file name. If we had /foo/bar/baz.txt, we’re now only dealing with baz.txt - as an OsStr, not a utf-8 String. We can still have random bytes.
We then map this result through split_file_at_dot, which behaves like so:
- For "foo", return (Some("foo"), None)
- For "foo.bar", return (Some("foo"), Some("bar"))
- For "foo.bar.baz", return (Some("foo.bar"), Some("baz"))
Then, with and_then, we only return after if before wasn’t None.
If we spelled out everything, we’d have:
The problem is carefully modelled. We can look at what we’re manipulating just by looking at its type. If it might not exist, it’s an Option<T>! If it’s a path with multiple components, it’s a &Path (or its owned counterpart, PathBuf). If it’s just part of a path, it’s an &OsStr.
Of course there’s a learning curve. Of course there’s more concepts involved than just throwing for loops at byte slices and seeing what sticks, like the Go library does.
But the result is a high-performance, reliable and type-safe library.
It’s worth it.
Speaking of Rust, we haven’t seen how it handles the whole “mode” thing yet.
So std::fs::Metadata has is_dir() and is_file(), which return booleans. It also has len(), which returns a u64 (unsigned 64-bit integer).
It has created(), modified(), and accessed(), all of which return an Option<SystemTime>. Again - the types inform us on what scenarios are possible. Access timestamps might not exist at all.
The returned time is not an std::time::Instant - it’s an std::time::SystemTime - the documentation tells us the difference:
A measurement of the system clock, useful for talking to external entities like the file system or other processes.
Distinct from the Instant type, this time measurement is not monotonic. This means that you can save a file to the file system, then save another file to the file system, and the second file has a SystemTime measurement earlier than the first. In other words, an operation that happens after another operation in real time may have an earlier SystemTime!
Consequently, comparing two SystemTime instances to learn about the duration between them returns a Result instead of an infallible Duration to indicate that this sort of time drift may happen and needs to be handled.
Although a SystemTime cannot be directly inspected, the UNIX_EPOCH constant is provided in this module as an anchor in time to learn information about a SystemTime. By calculating the duration from this fixed point in time, a SystemTime can be converted to a human-readable time, or perhaps some other string representation.
The size of a SystemTime struct may vary depending on the target operating system.
Source: https://doc.rust-lang.org/std/time/struct.SystemTime.html
What about permissions? Well, there it is:
A Permissions type! Just for that! And we can afford it, too - because types don’t cost anything at runtime. Everything probably ends up inlined anyway.
What does it expose?
Well! It exposes only what all supported operating systems have in common.
Can we still get Unix permissions? Of course! But only on Unix:
Representation of the various permissions on a file.
This module only currently provides one bit of information, readonly, which is exposed on all currently supported platforms. Unix-specific functionality, such as mode bits, is available through the PermissionsExt trait.
Source: https://doc.rust-lang.org/std/fs/struct.Permissions.html
std::os::unix::fs::PermissionsExt is only compiled in on Unix, and exposes the following functions:
The documentation makes it really clear it’s Unix-only:
But it’s not just documentation. This sample program will compile and run on Linux (and macOS, etc.)
But will fail to compile on Windows:
How can we make a program that runs on Windows too? The same way the standard library only exposes PermissionsExt on Unix: with attributes.
Those aren’t #ifdef - they’re not preprocessor directives. There’s no risk of forgetting an #endif. And if you miss if/else chains, there’s a crate for that.
Here’s that sample program on Linux:
And on Windows:
Can you do that in Go? Sure! Kind of!
There’s two ways to do something similar, and both involve multiple files.
Here’s one:
In main.go, we need:
In poke_windows.go, we need:
And in poke_unix.go, we need:
Note how the _windows.go suffix is magic - it’ll get automatically excluded on non-Windows platforms. There’s no magic suffix for Unix systems though!
So we have to add a build constraint, which is:
- A comment
- That must be “near the top of the file”
- That can only be preceded by blank space
- That must appear before the package clause
- That has its own language
From the docs:
A build constraint is evaluated as the OR of space-separated options. Each option evaluates as the AND of its comma-separated terms. Each term consists of letters, digits, underscores, and dots. A term may be negated with a preceding !. For example, the build constraint:
// +build linux,386 darwin,!cgo
corresponds to the boolean formula:
(linux AND 386) OR (darwin AND (NOT cgo))
A file may have multiple build constraints. The overall constraint is the AND of the individual constraints. That is, the build constraints:
// +build linux darwin
// +build 386
corresponds to the boolean formula:
(linux OR darwin) AND 386
Fun! Fun fun fun. So, on Linux, we get:
And on Windows, we get:
Now, at least there’s a way to write platform-specific code in Go.
In practice, it gets old very quickly. You now have related code split across multiple files, even if only one of the functions is platform-specific.
Build constraints override the magic suffixes, so it’s never obvious exactly which files are compiled in. You also have to duplicate (and keep in sync!) function signatures all over the place.
It’s… a hack. A shortcut. And an annoying one, at that.
So what happens when you make it hard for users to do things the right way? (The right way being, in this case, to not compile in code that isn’t relevant for a given platform). They take shortcuts, too.
Even in the official Go distribution, a lot of code just switches on the value of runtime.GOOS at, well, run-time:
“But these are little things!”
They’re all little things. They add up. Quickly.
And they’re symptomatic of the problems with “the Go way” in general. The Go way is to half-ass things.
The Go way is to patch things up until they sorta kinda work, in the name of simplicity.
Lots of little things
Speaking of little things, let’s consider what pushed me over the edge and provoked me to write this whole rant in the first place.
It was this package.
What does it do?
Provides mechanisms for adding idle timeouts to net.Conn and net.Listener.
Why do we need it?
Because the real-world is messy.
If you do a naive HTTP request in Go:
Then it works. When it works.
If the server never accepts your connection - which might definitely happen if it’s dropping all the traffic to the relevant port, then you’ll just hang forever.
If you don’t want to hang forever, you have to do something else.
Like this:
Not so simple, but, eh, whatever, it works.
Unless the server accepts your connection, says it’s going to send a bunch of bytes, and then never sends you anything.
Which definitely, 100%, for-sure, if-it-can-happen-it-does-happen, happens.
And then you hang forever.
To avoid that, you can set a timeout on the whole request, like so:
But that doesn’t work if you’re planning on uploading something large, for example. How many seconds is enough to upload a large file? Is 30 seconds enough? And how do you know you’re spending those seconds uploading, and not waiting for the server to accept your request?
So, getlantern/idletiming adds a mechanism for timing out if there hasn’t been any data transmitted in a while, which is distinct from a dial timeout, and doesn’t force you to set a timeout on the whole request, so that it works for arbitrarily large uploads.
The repository looks innocent enough:
Just a couple files! And even some tests. Also - it works. I’m using it in production. I’m happy with it.
There’s just.. one thing.
I’m sorry?
One hundred and ninety-six packages?
Well, I mean… lots of small, well-maintained libraries isn’t necessarily a bad idea - I never really agreed that the takeaway from the left-pad disaster was “small libraries are bad”.
Let’s look at what we’ve got there:
I’m sure all of these are reasonable. Lantern is a “site unblock” product, so it has to deal with networking a lot, it makes sense that they’d have their own libraries for a bunch of things, including logging (golog) and some network extensions (netx). testify is a well-known set of testing helpers, I use it too!
Let’s keep going:
Uhh….
Wait, I think we..
I can understand some of these but…
STOP! Just stop. Stop it already.
It keeps going on, and on. There’s everything.
YAML, Redis, GRPC, which in turns needs protobuf, InfluxDB, an Apache Kafka client, a Prometheus client, Snappy, Zstandard, LZ4, a chaos-testing TCP proxy, three other logging packages, and client libraries for various Google Cloud services.
What could possibly justify all this?
Let’s review:
Only built-in imports. Good.
This one is the meat of the library, so to say, and it requires a few of the getlantern packages we’ve seen:
It does end up importing golang.org/x/net/http2/hpack - but that’s just because of net/http. These are built-ins, so let’s ignore them for now.
getlantern/hex is self-contained, so, moving on to getlantern/mtime:
That’s it? And that’s why Go ends up fetching the entire github.com/aristanetworks/goarista repository, and all its transitive dependencies?
What does aristanetworks/goarista/monotime even do?
Mh. Let’s look inside issue15006.s
I uh… okay.
What does that issue say?
This is known and I think the empty assembly file is the accepted fix.
It’s a rarely used feature and having an assembly file also make it standout.
I don’t think we should make this unsafe feature easy to use.
And later (emphasis mine):
I agree with Minux. If you’re looking at a Go package to import, you might want to know if it does any unsafe trickery. Currently you have to grep for an import of unsafe and look for non-.go files. If we got rid of the requirement for the empty .s file, then you’d have to grep for //go:linkname also.
That’s… that’s certainly a stance.
But which unsafe feature exactly?
Let’s look at nanotime.go:
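From memory, the whole file boils down to something like this (paraphrased; it won’t compile standalone, since it needs that empty .s companion file in the same package):

```go
package monotime

import (
	_ "unsafe" // required to be allowed to use //go:linkname
)

// nanotime is the runtime's private monotonic clock, pulled in via
// go:linkname - the "unsafe trickery" the Go team wants to keep
// inconvenient
//go:linkname nanotime runtime.nanotime
func nanotime() int64
```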
That’s it. That’s the whole package.
The unsafe feature in question is being able to access unexported (read: lowercase, sigh) symbols from the Go standard library.
Why is that even needed?
If you remember from earlier, Rust has two types for time: SystemTime, which corresponds to your… system’s… time, which can be adjusted via NTP. It can go back, so subtraction can fail.
And it has Instant, which is weakly monotonically increasing - at worst, it’ll give the same value twice, but never less than the previous value. This is useful for measuring elapsed time within a process.
How did Go solve that problem?
At first, it didn’t. Monotonic time measurement is a hard problem, so it was only available internally, in the standard library, not for regular Go developers (a common theme):
And then, it did.
Sort of. In the most “Go way” possible.
I thought some more about the suggestion above to reuse time.Time with a special location. The special location still seems wrong, but what if we reuse time.Time by storing inside it both a wall time and a monotonic time, fetched one after the other?
Then there are two kinds of time.Times: those with wall and monotonic stored inside (let’s call those “wall+monotonic Times”) and those with only wall stored inside (let’s call those “wall-only Times”).
Suppose further that:
- time.Now returns a wall+monotonic Time.
- for t.Add(d), if t is a wall+monotonic Time, so is the result; if t is wall-only, so is the result.
- all other functions that return Times return wall-only Times. These include: time.Date, time.Unix, t.AddDate, t.In, t.Local, t.Round, t.Truncate, t.UTC
- for t.Sub(u), if t and u are both wall+monotonic, the result is computed by subtracting monotonics; otherwise the result is computed by subtracting wall times.
- t.After(u), t.Before(u), t.Equal(u) compare monotonics if available (just like t.Sub(u)), otherwise walls.
- all the other functions that operate on time.Times use the wall time only. These include: t.Day, t.Format, t.Month, t.Unix, t.UnixNano, t.Year, and so on.
Doing this returns a kind of hybrid time from time.Now: it works as a wall time but also works as a monotonic time, and future operations use the right one.
So, as of Go 1.9 - problem solved!
If you’re confused by the proposal, no worries, let’s check out the release notes:
Transparent Monotonic Time support
The time package now transparently tracks monotonic time in each Time value, making computing durations between two Time values a safe operation in the presence of wall clock adjustments. See the package docs and design document for details.
This changed the behavior of a number of Go packages, but, the core team knows best:
So, if you have a package without a minimum required Go version, you can’t be sure you have the “transparent monotonic time support” of Go 1.9, and it’s better to rely on aristanetworks/goarista/monotime, which pulls 100+ packages, because Go packages are “simple” and they’re just folders in a git repository.
The change raised other questions: since time.Time now sometimes packs two types of time, two calls are needed. This concern was dismissed.
In order for time.Time not to grow, both values were packed inside it, which restricted the range of times that could be represented with it:
This issue was raised early on in the design process:
You can check out the complete thread for a full history.
Parting words
This is just one issue. But there are many like it - this one is as good an example as any.
Over and over, Go is a victim of its own mantra - “simplicity”.
It constantly takes power away from its users, reserving it for itself.
It constantly lies about how complicated real-world systems are, and optimizes for the 90% case, ignoring correctness.
It is a minefield of subtle gotchas that have very real implications - everything looks simple on the surface, but nothing is.
The Channel Axioms are a good example. There is nothing axiomatic about them. They are invented truths that were convenient to implement, and that everyone must now work around.
Here’s a fun gotcha I haven’t mentioned yet:
The documentation reads:
BUGS
On ARM, x86-32, and 32-bit MIPS, it is the caller’s responsibility to arrange for 64-bit alignment of 64-bit words accessed atomically. The first word in a variable or in an allocated struct, array, or slice can be relied upon to be 64-bit aligned.
If the condition isn’t satisfied, it panics at run-time. Only on 32-bit platforms. I didn’t have to go far to hit this one - I got bit by this bug multiple times in the last few years.
It’s a footnote. Not a compile-time check. There’s an in-progress lint, for very simple cases, because Go’s simplicity made it extremely hard to check for.
This fake “simplicity” runs deep in the Go ecosystem. Rust has the opposite problem - things look scary at first, but it’s for a good reason. The problems tackled have inherent complexity, and it takes some effort to model them appropriately.
At this point in time, I deeply regret investing in Go.
Go is a Bell Labs fantasy, and not a very good one at that.
April 2022 Update
I wrote this in 2020, and have changed jobs twice since. Both jobs involved Go in some capacity, where it’s supposed to shine (web services). It has not been a pleasant experience either - I’ve lost count of the number of incidents directly caused by poor error handling, or Go default values.
If folks walk away with only one new thought from this, please let it be that: defaults matter. Go lets you whip something up quickly, but making the result “production-ready” is left as an exercise to the writer. Big companies that have adopted it have developed tons of tooling around it, use all available linters, do code generation, check the disassembly, and regularly pay the engineering cost of just using Go at all.
That’s not how most Go code is written though. I’m interested not in what the language lets you do, but what is typical for a language - what is idiomatic, what “everyone ends up doing”, because it is encouraged.
Because that’s the kind of code I inevitably end up being on-call for, and I’m tired of being woken up due to the same classes of preventable errors, all the time. It doesn’t matter that I don’t personally write Go anymore: it’s inescapable. If it’s not internal Go code, it’s in a SAAS we pay for: and no matter who writes it, it fails in all the same predictable ways.
Generics will not solve this. It is neat that they found a way to sneak them into the language, but it’s not gonna change years of poor design decisions, and it’s definitely not gonna change the enormous amount of existing Go code out there, especially as the discourse around them not being the usability+performance win everyone thought they would be keeps unfolding.
As I’ve mentioned recently on Twitter, what makes everything worse is that you cannot replace Go piecemeal once it has taken hold in a codebase: its FFI story is painful, the only good boundary with Go is a network boundary, and there’s often a latency concern there.
Lastly: pointing out that I have been teaching Rust is a lazy and dismissive response to this. For me personally, I have found it to be the least awful option in a bunch of cases. I am yearning for even better languages, ones that tackle the same kind of issues but do it even better. I like to remind everyone that we’re not out there cheering for sports team, just discussing our tools.
If you’re looking to reduce the whole discourse to “X vs Y”, let it be “serde vs crossing your fingers and hoping user input is well-formed”. It is one of the better reductions of the problem: it really is “specifying behavior that should be allowed (and rejecting everything else)” vs “manually checking that everything is fine in a thousand tiny steps”, which inevitably results in missed combinations because the human brain is not designed to hold graphs that big.