Why "Everyone Dies" Gets AGI All Wrong


Being: A reaction to Eliezer Yudkowsky and Nate Soares’s book “If Anyone Builds It, Everyone Dies,” which is getting a fair bit of media attention


I’ve known Eliezer Yudkowsky since the late 1990s. In 2000, he visited my AI company Webmind Inc. in New York City, where we were explicitly trying to build AGI alongside various commercial applications. His mission? To convince us to maybe stop. Or, failing that, to get us thinking really hard about AGI safety and how to make the AGI good for everyone.

While he was encouraging us to slow down and work on ethics instead of building AGI, he also showed me his new programming language called Flare – which he was developing specifically to create AGI. A few years later, in 2005, I published an edited book called “Artificial General Intelligence” which put that term on the map, and Eliezer had a chapter in it explaining his understanding of how to engineer minds and make safe AGI.

This contradiction has persisted through the decades. Eliezer has oscillated between “AGI is the most important thing on the planet and only I can build safe AGI” and “anyone who builds AGI will kill everyone.”

For a period of time in the early aughts I ended up as the Head of Research at Eliezer’s organization, the Singularity Institute for AI (later renamed MIRI), which gave us a chance both to debate these matters more extensively and to try to reconcile our perspectives as best we could – which turned out to be not all that well, and my tenure at the organization ended with an aura of friendly, mutually respectful disagreement on many fundamentals. I later wrote a blog post expressing my views, called “The Singularity Institute’s Scary Idea – And Why I Don’t Buy It,” with largely the same upshot as this very blog post.

Anyway, for anyone who’s been following the US futurist and Singularitarian community for the last couple of decades, Yudkowsky and Soares’s recent book will come as no surprise – it’s basically the same story Eliezer has been telling for the past 15-20 years.

In fact, if anyone wants a deeper look into his perspective, I’d encourage a romp through Eliezer’s book Rationality: From AI to Zombies, which is a wonderful rambling tome – and an utterly different book than one would expect from its name, as the author’s perspective on rationality is a fascinatingly eccentric one. Among other things, one gets the clear sense that every single idea in Nick Bostrom’s influential anti-AGI work Superintelligence was aired earlier and more thoroughly in Eliezer’s writings, albeit in less formal and academic terms.

Anyway – having spent decades actually working on AGI systems while Eliezer has mostly been warning about them and scheming how to stop or prevent them, I feel compelled to respond to his latest articulation of doom. I do have a slight reservation that by doing so I might be drawing more attention to the book than it deserves … but ultimately I feel like, even though most of the book’s thoughts are misguided, the debate it provokes touches so many commonplace fears in our culture that it’s worth having.

Just as Eliezer’s arguments in his book mostly aren’t new, nor is the crux of my response. What is different is the heightened attention and importance attached to these issues now that everyone without blinders on can see we are palpably getting mighty close to actually creating human-level AGI. (No, of course scaled-up LLMs are not going to give us AGI, but the same deeper hardware, software, industry and science trends that have given us LLMs are very likely to keep spawning more and more amazing AI technologies, some combination of which will likely produce AGI on something roughly like the 2029 timeframe Kurzweil projected in his 2005 book The Singularity Is Near, possibly even a little sooner.)

Eliezer is right about one thing: we cannot know with certainty that AGI won’t lead to human extinction – what my old friend Hugo de Garis, also a long-time AGI researcher, likes to call “Gigadeath.” But the leap from uncertainty to “everybody dies” represents a tremendous failure of imagination about both the nature of intelligence and our capacity to shape its development.

This was really the main idea of my “Singularity Institute’s Scary Idea” post way back when – the way they segued so glibly from “AGI might potentially kill everyone and we can’t totally rule it out” to “AGI will almost certainly kill everyone.” For a bunch of self-styled rationalists, it seemed to me they made this leap with an egregious lack of careful rationality. Their arguments in favor of this scary segue were inevitably full of holes – stuff like “a mind randomly chosen from mindspace would have probability near zero of caring about humans.” OK, sure, probably – but choosing a random mind from mindspace (according to any reasonably broad distribution) would be a very bizarre and difficult thing to do. What people will actually do is quite different: more like creating our “mind children,” drawn from the very biased distribution of AGI minds initialized with some version of human values and an initial purpose of serving at least some humans. So the “random mind” argument tells us very little about the AGIs we are actually likely to build.

Specific arguments and counterarguments aside, I feel like the core philosophical flaw in Eliezer’s reasoning on these matters is treating intelligence as pure mathematical optimization, divorced from the experiential, embodied, and social aspects that shape actual minds. If we think about AGI systems as open-ended intelligences – a concept I explore in my 2024 book “The Consciousness Explosion” – we see them as living, self-organizing systems that seek both survival and self-transcendence, evolving in a fashion complexly coupled with their environments. The challenge is not to theoretically analyze their hypothetical behaviors in an isolated way, but rather to enter into the right sort of mutually beneficial interactions with them as they evolve, grow and learn.

An intelligence capable of recursive self-improvement and of transcending from AGI to ASI would naturally tend toward complexity, nuance, and relational adaptability rather than monomaniacal optimization. Eliezer and Soares and their ilk are possessed by the fear that we’ll have a superintelligence that’s A) utterly narrow-minded and pigheaded in pursuing psychopathic or megalomaniacal goals, while at the same time being B) incredibly deep and broad in its understanding of the universe and how to get things done. Yes, this could theoretically happen, but there is no rational reason to assume it’s likely!

AGI development isn’t happening in some abstract space where optimization arguments determine everything. The AGIs that appear on earth will not be random digital minds sampled haphazardly from some notional mind-space. Rather, we and our computer systems are part of a fantastically complex evolving global brain — and AGI systems are emerging as part of this human-digital dynamic. The actual factors that will shape AGI’s impact, within this global-brain context, are concrete and manipulable: the cognitive architecture we choose, who owns and controls it, and what applications shape its early learning experiences.

This is what drives my own technical AGI work – my team at SingularityNET and TrueAGI is creating the Hyperon AGI system along very different lines from LLMs and other modern neural networks, both because we believe LLMs lack the cognitive architecture needed for human-level AGI, and because we want to create AGI systems more capable of intentional beneficial activity. We’re trying to create AGI systems designed for self-understanding, deep reflection, and moral agency. This doesn’t guarantee safety, but it makes beneficial outcomes more likely than architectures focused solely on pursuing narrow rewards for narrow goals (like predicting the next token in a sequence, maximizing the profit of a certain company, or maximizing the success of a certain military).

This also ties in with my reasons for developing decentralized platforms for AI deployment, through SingularityNET and the ASI Alliance. Eliezer’s doom scenarios typically assume AGI emerges from a single actor, but when thousands or millions of diverse stakeholders contribute to and govern AGI’s development, the system is far less likely to embody the narrow, potentially destructive goal functions that alignment pessimists fear. Democratic, transparent, decentralized AGI development isn’t just ethically preferable—it’s technically safer, for reasons similar to why open-source software systems are often safer than their closed-source analogues.

Perhaps the most annoying thing about the AGI doom narrative is how it distracts from real, addressable issues. I’m not just talking about immediate problems like AI bias or military applications – though these are real and serious enough. I’m even more concerned about the transition period between early-stage AGI and superintelligence.

When automation eliminates jobs faster than new opportunities emerge, when countries that can’t afford universal basic income face massive displacement, we risk global terrorism and fascist crackdowns—as we’re already seeing in various parts of the world. These nearer-term challenges could determine whether early-stage AGI grows up in a context that enables beneficial development or one that virtually guarantees poor outcomes.

The LLM revolution has already demonstrated that Eliezer’s model is too simplistic. These systems show that intelligence and values aren’t orthogonal in practice, contrary to what Eliezer, Bostrom and their intellectual allies have often argued. In theory, yes, you could pair an arbitrarily intelligent mind with an arbitrarily stupid value system. But in practice, certain kinds of minds naturally develop certain kinds of value systems.

Mammals, which are more generally intelligent than reptiles or earthworms, also tend to have more compassion and warmth. Humans tend to have a broader scope of compassion than most other mammals, because our greater general intelligence lets us empathize more broadly with systems different from ourselves. There’s deep intertwining between intelligence and values—we even see it in LLMs already, to a limited extent. The fact that we can meaningfully influence their behavior through training hints that value learning is tractable, even for these fairly limited sub-AGI systems.
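
To make the “value learning is tractable” point a bit more concrete, here is a minimal toy sketch – entirely my own illustration, with invented features and data, not anything drawn from the book, from Hyperon, or from any actual LLM training pipeline – of the basic mechanism behind preference-based value learning: fit a scalar value score to pairwise human preferences, Bradley-Terry style, which is the same core idea underlying the reward models used to steer LLM behavior.

```python
# Toy sketch of "value learning" via pairwise preferences (Bradley-Terry style).
# Everything here -- the features, the data, the weights -- is invented purely
# for illustration; it is not code from any real LLM training pipeline.
import math

# Each hypothetical response is summarized by two made-up features.
def features(response):
    return [response["helpful"], response["harmful"]]

# Pairwise preference data: (preferred response, rejected response).
preferences = [
    ({"helpful": 0.9, "harmful": 0.0}, {"helpful": 0.2, "harmful": 0.7}),
    ({"helpful": 0.7, "harmful": 0.1}, {"helpful": 0.8, "harmful": 0.9}),
    ({"helpful": 0.6, "harmful": 0.0}, {"helpful": 0.1, "harmful": 0.2}),
]

w = [0.0, 0.0]   # learned "value model" weights
lr = 0.5         # learning rate

def score(response):
    """Scalar value score: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, features(response)))

# Gradient ascent on the Bradley-Terry log-likelihood:
# maximize P(preferred beats rejected) = sigmoid(score(good) - score(bad)).
for _ in range(200):
    for good, bad in preferences:
        p = 1.0 / (1.0 + math.exp(-(score(good) - score(bad))))
        for i, (gi, bi) in enumerate(zip(features(good), features(bad))):
            w[i] += lr * (1.0 - p) * (gi - bi)

print("learned value weights:", [round(wi, 2) for wi in w])
# Prints a positive weight for "helpful" and a negative weight for "harmful".
```

Trivial as this is, it makes the point: a few preference comparisons are enough to push a learned value signal toward “helpful” and away from “harmful,” and the same basic dynamic, scaled up enormously, is part of what lets us meaningfully shape the behavior of today’s sub-AGI systems.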

After decades of these debates, I’m convinced the real danger isn’t that someone will build AGI—that ship has sailed. The danger we should be most worried about is that fear-mongering will either drive development underground into the hands of the least scrupulous actors, or create regulatory capture that hands control to a small elite.

Fear-mongering that forces centralized AGI development by incompetent or narrow, self-centered actors is far more palpably dangerous than the vast unwashed masses of humanity creating open source AGI in a participatory, democratic way.

Since 2006, I’ve organized the annual AGI research conference, and last year I started a conference series on Beneficial General Intelligence. Next month at the BGI-25 event in Istanbul, we’re bringing together people from around the world to think through how to nudge AGI and ASI development in positive directions. It’s not easy—there are tremendous social and economic forces working against us—but the idea that this effort is doomed to fail is both incorrect and potentially dangerous.

Yudkowsky and Soares’s “everybody dies” narrative, while well-intentioned and deeply felt (I have no doubt they believe their message in their hearts as well as their eccentrically rational minds), isn’t just wrong – it’s profoundly counterproductive. By treating beneficial AGI as impossible, it threatens to become a self-fulfilling prophecy that cedes the field to those who don’t care about safety at all. We need voices advocating for careful, beneficial development, not voices saying we’re all doomed.

The future remains unwritten, and we’re helping to write it. The question isn’t whether humanity will build AGI—it’s how to build it wisely. That means architectures emphasizing compassion and self-reflection, decentralized governance preventing monopolistic or oligopolistic control, and applications in education, healthcare, and science that shape AGI’s values through beneficial action.

This won’t guarantee good outcomes — nothing can do that — but it’s far more likely to succeed than either halting development (next to impossible by this point) or proceeding without considering these factors (genuinely dangerous). After all these years and decades, I remain convinced: the most important work isn’t stopping AGI—it’s making sure we raise our AGI mind children well enough.
