
One of the main topics of this newsletter is the quest to cultivate sustainable and meaningful work in a digital age. Given this objective, it’s hard to avoid confronting the furiously disruptive potentials of AI.
I’ve been spending a lot of time in recent years, in my roles as a digital theorist and technology journalist, researching and writing about this topic, so it occurred to me that it might be useful to capture in one place all of my current thoughts about the intersection of AI and work.
The obvious caveat applies: these predictions will shift — perhaps even substantially — as this inherently unpredictable sector continues to evolve. But here’s my current best stab at what’s going on now, what’s coming soon, and what’s likely just hype.
Let’s get to it…
Where AI Is Already Making a Splash
When generative AI made its show-stopping debut a few years ago, the smart money was on text production becoming the first killer app. For example, business users, it was thought, would soon outsource much of the tedious communication that makes up their day — meeting summaries, email, reports — to AI tools.
A fair amount of this is happening, especially when it comes to lengthy utilitarian communication where the quality doesn’t matter much. I recently attended a men’s retreat, for example, and it was clear that the organizer had used ChatGPT to create the final email summarizing the weekend schedule. And why not? It got the job done and saved some time.
It’s becoming increasingly clear, however, that for most people the act of writing in their daily lives isn’t a major problem that needs to be solved, which is capping the predicted ubiquity of this use case. (A survey of internet users found that only around 5.4% had used ChatGPT to help write emails and letters, and this figure includes the many who may have experimented with the capability once or twice before moving on.)
The application that has instead leaped ahead to become the most exciting and popular use of these tools is smart search. If you have a question, instead of turning to Google you can query a new version of ChatGPT or Claude. These models can search the web to gather information, but unlike a traditional search engine, they can also process the information they find and summarize for you only what you care about. Want the information presented in a particular format, like a spreadsheet or a chart? A high-end model like GPT-4o can do this for you as well, saving even more steps.
Smart search has become the first killer app of the generative AI era because, like any good killer app, it takes an activity most people already do all the time — typing search queries into web sites — and provides a substantially, almost magically better experience. This feels similar to electronic spreadsheets conquering paper ledger books or email immediately replacing voice mail and fax. I would estimate that around 90% of the examples I see online right now from people exclaiming over the potential of AI are people conducting smart searches.
This behavioral shift is appearing in the data. A recent survey conducted by Future found that 27% of US-based respondents had used AI tools such as ChatGPT instead of a traditional search engine. From an economic perspective, this shift matters. Earlier this month, the stock price for Alphabet, the parent company of Google, fell after an Apple executive revealed that Google searches through the Safari web browser had decreased over the previous two months, likely due to the increased use of AI tools.
Keep in mind, web search is a massive business, with Google earning over $175 billion from search ads in 2023 alone. In my opinion, becoming the new Google Search is likely the best bet for a company like OpenAI to achieve profitability, even if it’s not as sexy as creating AGI or automating all of knowledge work (more on these applications later).
The other major success story for generative AI at the moment is computer programming. Individuals with only rudimentary knowledge of programming languages can now produce usable prototypes of simple applications using tools like ChatGPT, and somewhat more advanced projects with AI-enhanced agent-style helpers like Roo Code. This can be really useful for quickly creating tools for personal use or for producing a proof-of-concept for a future product. The tech incubator Y Combinator, for example, made waves when they reported that a quarter of the start-ups in their Winter 2025 batch generated 95% or more of their products’ codebases using AI.
How far can this automated coding take us? An academic computer scientist named Judah Diament recently went viral for noting that the ability of novice users to create simple applications isn’t new. There have been systems dedicated to this purpose for over four decades, from HyperCard to Visual Basic to Flash. As he elaborates: “And, of course, they all broke down when anything slightly complicated or unusual needs to be done (as required by every real, financially viable software product or service).”
This observation created major backlash — as do most expressions of AI skepticism these days — but Diament isn’t wrong. Despite recent hyperbolic statements by tech leaders, many professional programmers aren’t particularly worried that their jobs can be replicated by language model queries, as so much of what they do is experience-based architecture design and debugging, skills for which we currently have no viable AI solution.
Software developers do, however, use AI heavily: not to produce their code from scratch, but as helper utilities. Tools like GitHub’s Copilot are integrated directly into the environments in which these developers already work, making it much simpler to look up obscure library or API calls, or to spit out tedious boilerplate code. The productivity gains here are notable. Programming without help from AI is rapidly becoming rare.
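To make “tedious boilerplate” concrete, here is a small, hypothetical example of the kind of routine scaffolding (a command-line interface plus CSV loading) that coding assistants will typically complete from little more than a comment or a function signature. The script and its names are purely illustrative, not output from Copilot or any particular tool.

```python
# Hypothetical boilerplate an AI coding assistant might fill in
# from a short comment like "CLI tool that previews a CSV file."
import argparse
import csv


def load_rows(path):
    """Read a CSV file and return its rows as a list of dictionaries."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def main():
    parser = argparse.ArgumentParser(description="Preview a CSV file.")
    parser.add_argument("path", help="Path to the input CSV file")
    parser.add_argument("--limit", type=int, default=5,
                        help="Number of rows to print")
    args = parser.parse_args()

    rows = load_rows(args.path)
    print(f"Loaded {len(rows)} rows from {args.path}")
    for row in rows[: args.limit]:
        print(row)


if __name__ == "__main__":
    main()
```

None of this is hard to write, but it is exactly the kind of rote scaffolding that developers are happy to hand off to an assistant.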
The Next Big AI Application
Language model-based AI systems can respond to prompts in pretty amazing ways. But if we focus only on outputs, we underestimate another major source of these models’ value: their ability to understand human language. This so-called natural language processing ability is poised to transform how we use software.
There is a push at the moment, for example, led by Microsoft and its Copilot product (not to be confused with GitHub Copilot), to use AI models to provide natural language interfaces to popular software. Instead of learning complicated sequences of clicks and settings to accomplish a task in these programs, you’ll be able to simply ask for what you need; e.g., “Hey Copilot, can you remove all rows from this spreadsheet where the dollar amount in column C is less than $10, then sort everything that remains by the names in column A? Also, the font is too small, make it somewhat larger.”
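For a sense of what has to happen behind the scenes, here is a rough sketch, in Python with pandas, of the operations that spoken request maps onto. The data and column names are made up for illustration, and the font-size change is omitted since it belongs to the spreadsheet application rather than the data itself.

```python
# Sketch of the spreadsheet operations the natural language request implies.
import pandas as pd

# Hypothetical spreadsheet: column "A" holds names, column "C" dollar amounts.
df = pd.DataFrame({
    "A": ["Dana", "Lee", "Pat", "Sam"],
    "C": [4.50, 25.00, 9.99, 112.00],
})

# "Remove all rows where the dollar amount in column C is less than $10..."
df = df[df["C"] >= 10]

# "...then sort everything that remains by the names in column A."
df = df.sort_values("A")

print(df)
```

The point of a natural language interface is that the user never has to know any of this exists; the model translates the request into the right sequence of expert-level steps.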
Enabling novice users to access expert-level features in existing software will aggregate into huge productivity gains. As a bonus, the models required to understand these commands don’t have to be nearly as massive and complicated as the current cutting-edge models that the big AI companies use to show off their technology. Indeed, they might be small enough to run locally on devices, making them vastly cheaper and more efficient to operate.
Don’t sleep on this use case. Like smart search, it’s not as sexy as AGI or full automation, but I’m increasingly convinced that within the next half-decade or so, informally-articulated commands are going to emerge as one of the dominant interfaces to the world of computation.
What About Agents?
One of the more attention-catching storylines surrounding AI at the moment is the imminent arrival of so-called agents which will automate more and more of our daily work, especially in the knowledge sectors once believed to be immune from machine encroachment.
Recent reports imply that agents are a major part of OpenAI’s revenue strategy for the near future. The company imagines business customers paying up to $20,000 a month for access to specialized bots that can perform key professional tasks. It’s the projection of this trend that led Elon Musk to recently quip: “If you want to do a job that’s kinda like a hobby, you can do a job. But otherwise, AI and the robots will provide any goods and services that you want.”
But progress in creating these agents has recently slowed. To understand why requires a brief snapshot of the current state of generative AI technology…
Not long ago, there was a belief in so-called scaling laws that argued, roughly speaking, that as you continued to increase the size of language models, their abilities would continue to rapidly increase.
For a while this proved true: GPT-2 was much better than the original GPT, GPT-3 was much better than GPT-2, and GPT-4 was a big improvement on GPT-3. The hope was that by continuing to scale these models, you’d eventually get to a system so smart and capable that it would achieve something like AGI, and could be used as the foundation for software agents to automate basically any conceivable task.
More recently, however, these scaling laws have begun to falter. Companies continue to invest massive amounts of capital in building bigger models, trained on ever-more GPUs crunching ever-larger data sets, but the performance of these models has stopped leaping forward the way it did in the past. This is why the long-anticipated GPT-5 has not yet been released, and why, just last week, Meta announced they were delaying the release of their newest, biggest model, as its capabilities were deemed insufficiently better than its predecessor’s.
In response to the collapse of the scaling laws, the industry has increasingly turned its attention in another direction: tuning existing models using reinforcement learning.
Say, for example, you want to make a model that is particularly good at math. You pay a bunch of math PhDs $100 an hour to come up with a lot of math problems with step-by-step solutions. You then take an existing model, like GPT-4, and feed it these problems one-by-one, using reinforcement learning techniques to tell it exactly where it’s getting certain steps in its answers right or wrong. Over time, this tuned model will get better at solving this specific type of problem.
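As a loose illustration of the underlying idea (and only that), here is a toy reinforcement loop in Python: a simple “policy” chooses between a careful and a sloppy solving strategy, correct answers earn a reward, and the weights shift toward whatever earns reward. Real reinforcement tuning of a language model operates on token sequences and graded reasoning steps at enormous scale; everything below, from the two-problem dataset to the stand-in strategies, is a simplified assumption for illustration.

```python
# Toy reinforcement loop: reward correct answers, shift the policy toward them.
import random

# Stand-in for the expert-written problems with verified answers.
problems = [
    {"prompt": "12 * 9", "answer": 108},
    {"prompt": "7 + 15", "answer": 22},
]

# Two stand-in "strategies"; in a real system these would be full
# chains of reasoning produced by the language model.
def strategy_careful(prompt):
    return eval(prompt)                                # always correct

def strategy_sloppy(prompt):
    return eval(prompt) + random.choice([-1, 0, 1])    # sometimes wrong

strategies = [strategy_careful, strategy_sloppy]
weights = [1.0, 1.0]    # the toy "policy" being tuned

def sample_strategy():
    """Pick a strategy index with probability proportional to its weight."""
    r = random.uniform(0, sum(weights))
    cumulative = 0.0
    for i, w in enumerate(weights):
        cumulative += w
        if r <= cumulative:
            return i
    return len(weights) - 1

learning_rate = 0.1
for _ in range(500):    # tuning loop
    example = random.choice(problems)
    i = sample_strategy()
    answer = strategies[i](example["prompt"])
    reward = 1.0 if answer == example["answer"] else 0.0
    # Reinforce: nudge the weight up when the strategy earned reward.
    weights[i] = max(weights[i] + learning_rate * (reward - 0.5), 0.01)

print("Tuned weights (careful vs. sloppy):", weights)
```

After enough iterations, nearly all of the weight sits on the careful strategy, which is the reinforcement learning story in miniature: reward the behavior you want, and the model drifts toward it on that specific type of problem.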
This technique is why OpenAI is now releasing multiple, confusingly-named models, each seemingly optimized for different specialties. These are the result of distinct tunings. They would have preferred, of course, to simply produce a GPT-5 model that could do well on all of these tasks, but that hasn’t worked out as they hoped.
This tuning approach will continue to yield interesting tools, but it will be much more piecemeal and hit-or-miss than what was anticipated when we still believed in scaling laws. Part of the difficulty is that this approach depends on finding the right data for each task you want to tackle. Certain problems, like math, computer programming, and logical reasoning, are well-suited for tuning, as they can be described by pairs of prompts and correct answers. But this is not the case for many other business activities, which can be esoteric and bespoke to a given context. This means many useful activities will remain un-automatable by language model agents for the foreseeable future.
I once said that the real Turing Test for our current age is an AI system that can successfully empty my email inbox, a goal that requires the mastery of any number of complicated tasks. Unfortunately for all of us, this is not a test we’re poised to see passed any time soon.
Are AGI and Superintelligence Imminent?
The Free Press recently published an article titled “AI Will Change What it Means to Be Human. Are We Ready?”. It summarized a common sentiment that has been feverishly promoted by Silicon Valley in recent years: that AI is on the cusp of changing everything in unfathomably disruptive ways.
As the article argues:
OpenAI CEO Sam Altman asserted in a recent talk that GPT-5 will be smarter than all of us. Anthropic CEO Dario Amodei described the powerful AI systems to come as “a country of geniuses in a data center.” These are not radical predictions. They are nearly here.
But here’s the thing: these are radical predictions. Many companies tried to build the equivalent of the proposed GPT-5 and found that continuing to scale up the size of their models wasn’t yielding the desired results. As described above, they’re left tuning the models they already have for specific tasks that are well-described by synthetic data sets. This can produce cool demos and products, but it’s not a route to a singular “genius” system that’s smarter than humans in some general sense.
Indeed, if you look closer at the rhetoric of the AI prophets in recent months, you’ll see a creeping awareness that, in a post-scaling law world, they no longer have a convincing story for how their predictions will manifest.
A recent Nick Bostrom video, for example, which (true to character) predicts Superintelligence might happen in less than two years (!), adds the caveat that this outcome will require key “unlocks” from the industry, which is code for: we don’t know how to build systems that achieve this goal, but, hey, maybe someone will figure it out!
(The AI centrist Gary Marcus subsequently mocked Bostrom by tweeting: “for all we know, we could be just one unlock and 3-6 weeks away from levitation, interstellar travel, immortality, or room temperature superconductors, or perhaps even all four!”)
Similarly, if you look closer at AI 2027, the splashy new doomsday manifesto which argues that AI might eliminate humanity as early as 2030, you won’t find a specific account of what type of system might be capable of such feats of tyrannical brilliance. The authors instead sidestep the issue by claiming that within the next year or so, the language models we’re tuning to solve computer programming tasks will somehow come up with, on their own, code that implements breakthrough new AI technology that mere humans cannot understand.
This is an incredible claim. (What sort of synthetic data set do they imagine could train a language model to crack the secrets of human-level intelligence?) It’s the technological equivalent of looking at the Wright brothers’ Flyer in 1903 and thinking, “well, if they could figure this out so quickly, we should have space travel cracked by the end of the decade.”
The current energized narratives around AGI and Superintelligence seem to be fueled by a convergence of three factors: (1) the fact that scaling laws did apply for the first few generations of language models, making it easy and logical to imagine them continuing to apply up the exponential curve of capabilities in the years ahead; (2) demos of models tuned to do well on specific written tests, which we tend to intuitively associate with intelligence; and (3) tech leaders pounding furiously on the drums of sensationalism, knowing they’re rarely held to account on their predictions.
But here’s the reality: We are not currently on a trajectory to genius systems. We might figure this out in the future, but the “unlocks” required will be sufficiently numerous and slow to master that we’ll likely have plenty of clear signals and warnings along the way. So, we’re not out of the woods on these issues, but at the same time, humanity is not going to be eliminated by the machines in 2030 either.
In the meantime, the breakthroughs that are happening, especially in the world of work, should be both exciting and worrisome enough on their own for now. Let’s grapple with those first.
####
For more of my thoughts on AI, check out my New Yorker archive and my podcast (in recent months, I often discuss AI in the third act of the show).
For more on my thoughts on technology and work more generally, check out my recent books on the topic: Slow Productivity, A World Without Email, and Deep Work.