What’s worth learning if we have AGI?

March 2024. Part of “Letters from the Lab”, a series of informal essays on my research written for patrons. You can also listen to this essay (20 minutes).

GPT-4 recently reached its first birthday. Yet I confess I’ve still not metabolized the changed world we live in. I keep stumbling into plans and beliefs I formed in a pre-LLM world. Rather than the piecemeal updates I’ve been doing, I’d like to step back and understand: what transformational enabling environments become possible in a world with very powerful AI?

But to explore this question, I first need to understand something more fundamental: what is the purpose of learning and growth in a world with very powerful AI? What kinds of knowledge and capacity remain meaningful—or become so?

Contingency as a boundary condition

One common answer is that we’ll still need to know enough to instruct, supervise, and coordinate AI systems. As usually framed, that’s a moving target, one which depends on the AI’s capabilities. What will models be able to do in three years? In my lifetime? These questions rapidly degrade into guessing games. We can ask prediction markets, or consult scaling laws. But we can find more durable answers by looking for bright lines—places where it seems we can never completely outsource a task to AI, no matter how capable the systems become.

A few weeks ago, I asked GPT-4 to translate a fairly complex JavaScript algorithm into Python. The output seemed mostly right, but it contained serious bugs which arose from nuanced differences in the two languages’ numeric primitives. I needed to understand each language well enough to detect and diagnose those errors. As it happens, even after I explained the mistake to GPT-4, it continued to emit variations with the same root problem, so I needed enough programming ability to write a patch myself.
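
The actual bug isn’t specified above, so treat the following as a hypothetical illustration of the kind of nuance involved: JavaScript’s remainder operator truncates (its sign follows the dividend), while Python’s modulo floors (its sign follows the divisor), so a line-by-line translation can silently change behavior.

```python
# Hypothetical illustration only: the translation bug described above isn't
# specified, but this is one classic numeric-primitive mismatch of that flavor.
# JavaScript's `%` is a truncated remainder (sign follows the dividend);
# Python's `%` is a floored modulo (sign follows the divisor).

def js_remainder(a: int, b: int) -> int:
    """Mimic JavaScript's `%` in Python by truncating the quotient toward zero."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return a - b * q

assert -7 % 3 == 2                 # Python: floored modulo
assert js_remainder(-7, 3) == -1   # JavaScript: -7 % 3 evaluates to -1

# A naive translation of, say, a circular-buffer index `i % n` agrees for
# non-negative i, then silently diverges once i goes negative.
```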

But all that is a temporary state of affairs. I gave the model everything it needed to produce a correct solution. It could have formally verified that the translated program had identical execution semantics to the original. The model didn’t need any special information or clarification to improve its answer: it just needed the two languages’ documentation, and enough raw reasoning capacity to perform the translation correctly. So, we don’t need prediction markets or scaling laws to conclude that at some point—maybe next year, maybe in the distant future—I won’t need the knowledge I used to supervise that translation. That fact is a consequence of the task and its fundamental tractability. (Yes, fine: some kinds of program translation are theoretically intractable in the general case. For example, suppose you’re translating from Python to a language that is only defined over total functions. We might like our AI to tell us when an input is untranslatable, but by Rice’s Theorem this is impossible in general, because totality is undecidable. Our AI must sometimes emit an invalid program or fail to halt. I can instead claim that we know of no principled reason why there should be a difference between the set of program translation tasks a human can perform and the set which a sufficiently capable AI can perform.)

Not all tasks are like this. For example, suppose that you’re a composer working on a cello concerto. Could you outsource this to an AI someday, like you can outsource that program translation task?

The first problem you have is one of communication. You have some inchoate sense of what you’d like to express, and you’ll need some knowledge to make those ideas legible to the model. Today’s generative audio models let you describe what you want in words. To control the output, you might need knowledge of vocabulary like “staccato”, or of concepts like altered chord structures. Words aren’t a good medium for communicating musical ideas, so you might also need knowledge of representations like lead sheets, or ADSR envelopes, or spectrograms.

But the main problem with outsourcing your cello concerto to the AI isn’t that it’s hard to precisely communicate what you want to the model. It’s that you don’t yet know what you want. Composing is a process of discovery—what Donald Schön has called “reflecting-in-action”. As you try different themes, you notice how each seems to interact with the musical landscape you’d imagined. These reactions guide the search process, but they also clarify and transform your own intent. You can’t know in advance what knowledge you’ll need, or what context the AI must have, because your attempts uncover that. You understand what the piece wants to become through the process of composing it. That doesn’t mean the AI can’t help you along the way. In fact, it will surely expand your creative reach. But you can’t outsource the process without producing a very different result.

In this way, the composition task is in a different category from the program translation task. But that category isn’t just for the arts. It also encompasses much of what Herb Simon has called “sciences of the artificial”: engineering, architecture, city planning, business, education, medicine, economics, and other sciences concerned “not with how things are but with how they might be”. For many tasks in these domains, the “right” solution must be iteratively negotiated, navigating tradeoffs in a tangled web of factors which neither you nor an AI can specify in advance. The important work often lies not in solving problems but in determining what the problem actually is, as it intersects our evolving and interacting preferences.

For example, people often wonder: why is software so hard to build? Why is it so often buggy and delivered far behind schedule? Why can’t we just specify what we want, and mechanically turn that description into predictably perfect software? Well, those tools do exist, and people sometimes use them to produce or verify especially crucial components. But they don’t get used much in practice, and I believe that’s because in most situations, we can’t specify in advance precisely how we want software to behave. Now, AI demands much less precision than traditional formal modeling tools. In simple cases, we can let an AI fill in the details of a high-level description like “write a Python script to combine these CSVs, removing all the empty rows.” But for more complex software, we discover how we want it to behave in the iterative process of building and reflecting, much as we discover what a cello concerto wants to be in the process of composing it.
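
As a concrete reference point, here is a minimal sketch of what an AI might plausibly produce from that one-line description. The file-matching pattern, the output name, and the definition of an “empty row” are all assumptions it would have to fill in:

```python
import csv
import glob

# A minimal sketch, assuming: the inputs are all *.csv files in the current
# directory, they share a header row, and "empty" means every cell is blank.
combined = []
header = None
for path in sorted(glob.glob("*.csv")):
    if path == "combined.csv":  # don't re-ingest our own output on a rerun
        continue
    with open(path, newline="") as f:
        reader = csv.reader(f)
        file_header = next(reader, None)
        if header is None:
            header = file_header
        for row in reader:
            if any(cell.strip() for cell in row):  # keep rows with any content
                combined.append(row)

with open("combined.csv", "w", newline="") as f:
    writer = csv.writer(f)
    if header:
        writer.writerow(header)
    writer.writerows(combined)
```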

So, if inherently contingent tasks like these need us to stay in the loop, to supervise an AI as it performs an increasing share of the work, what kinds of knowledge and capacity will we require?

Learning to steer

Tools like Midjourney or DALL-E are great for casually generating whimsical pictures. “Illustration of two golden retrievers in Central Park playing chess, at golden hour.” I don’t need any special knowledge or capacity to get a satisfying result. But in this scenario, I don’t have a strong expressive intent; I’m not trying to find a perfect fit for a complex situation. I just want something cute to use in a party invitation.

But I can also use these image models in more sophisticated ways. If I’m creating images to support some of my ideas in a high-stakes presentation, I’ll need to exert more control. I’ll iterate on the image over multiple rounds, noticing and expressing what’s working and not working in each one. After a dozen images, I’ll often realize I’d taken the wrong approach in the first place (“You know, what this really needs is…”), and I’ll reinterpret my past ideas in a new light.

If I’m a professional visual artist, these image models may still meaningfully expand my creative capacity. But to use these tools with complex and personal expressive intent, I’ll need to exert even more control. I’ll have stronger opinions about what I want, visually, and I’ll need to express those nuances to tightly constrain the output space.

To get a controlled result, I can’t just say “I like this image” or “I don’t like that image.” The output space is too large. I’ll need to steer, and for that I’ll need some special expertise.

My mental repertoire of visual ideas and techniques determines how I can frame what I’m trying to do, and the breadth of “moves” I can imagine making. To make good suggestions, I need a strong model of how different framings and moves might impact the output, relative to my intent, as well as how those choices might constrain future moves. With each new image—each new experiment—I need to perceive the consequences clearly, both in the individual elements and in their contributions to the whole. Did the move have the result I expected? What other consequences did it produce, and how do they interact with my aims? Those observations will often shift my sense of what I’m trying to do, or push me to reframe my approach completely.

As a shorthand, I’ll call this “taste”: my repertoire of domain ideas and techniques; my model of how different frames and moves will impact the situation; my ability to perceive and evaluate the results. I’ve been talking about image generation in this section, but I think the same arguments apply to the professional domains I described earlier, like software design, architecture, business, medicine, and so on. Domains outside the arts might call this concept “professional judgment” or “decision-making expertise”, but I claim there’s a unity here in one’s capacity to steer in highly contingent domains.

In a future with powerful AI, I believe we’ll still need this kind of taste to work deeply in these domains. (Of course, one of the wonderful things about AI is that it will radically lower the floor to these domains. If you don’t have much taste, generative models will expand your capacity quite a lot. They’ll let you access some coarse result, one you couldn’t have gotten otherwise, and in many shallower situations that’ll be enough.) When the task depends on humans’ messy situations and ill-defined preferences, and when the action space is so high-dimensional that guess-and-check is intractable, this is the capacity you’ll need to discover what image (or software, or whatever) you want to create. Use of AI will expand what you can imagine and what you can reach, but you’ll still need to steer. In fact, for some tasks, I expect the kind of taste I’ve described will become much more important, as AI increasingly handles less contingent elements.

What about technical expertise?

So far I’ve mostly discussed our ongoing need for fuzzy heuristics and instincts. But today, most learning focuses on acquiring information, detailed conceptual understanding, and technical fluency. How might that change in a future with powerful AI? If I’m an inventor of electronic gadgets, should I still study physics? If I’m a composer, does music theory help me? If I’m an architect, should I still hone my drafting skills?

I find myself quite uncertain here—I think answers will vary quite a lot—but a few considerations come to mind:

Media of communication. Abstract terms and concepts should generally remain important for communication, both with AI systems executing on our behalf and with human collaborators. You’ll need to describe your situation’s constraints; express your idiosyncratic intent; understand and supervise tradeoffs. All this will often benefit from technical language. Domain-specific representations (like music notation) are likely to remain important for high-bandwidth expression.

Media of purpose. Your intent—even internally expressed—will often be framed in terms of complex domain-specific ideas. What kind of understanding would you need to invent Bitcoin, if you had powerful AI to help you? One important element would be a clear sense of what you were trying to achieve. Quite a few digital currencies had been invented prior to Bitcoin, but each had some important problem. Often these problems were quite technical. To conceive of something like Bitcoin as a goal, you’d need to understand issues like Sybil attacks and tradeoffs around deflationary monetary policy. Another way to look at this is that your evaluation function—your sense of a given solution’s value—will depend in part on your domain knowledge.

Repertoire-building concepts. I’ve suggested that one component of taste is your repertoire of domain ideas, moves, and frames. You can build this repertoire by collecting a lifetime of one-off experiences, but it’s awfully inefficient. Conceptual understanding lets you move up the ladder of abstraction, unifying these isolated elements and suggesting others. For example, in cooking, many people make lots of one-off recipes without building any creative flexibility. But if you study the concept of braising rather than making a few recipes which happen to use that technique, you’ll see the unity in those dishes, and more clearly see their differences. You’ll see braising as an abstract “move” you can deploy in many contexts. Say that your AI cookbook suggests a steamed fennel dish. If it’s a cool night, you might realize that you feel more like braising it—but only if you can think in terms of braising, rather than individual recipes.

Bootstrapping ingredients. Another component of taste is your model of how different moves might impact what you’re working on. When you reach instinctively for one move over another, you’re leaning on patterns observed over long experience. But if an AI solves every problem for you start-to-finish, how will you ever learn to recognize these patterns? This suggests that you need technical knowledge just to bootstrap yourself—to apply basic moves in different contexts so that you can internalize their behavior and, perhaps, defer more to an AI in the future.

Loss of agency. Consider the negation: what would be lost if we outsourced all technical knowledge to AI systems? Here I think of Neil Postman’s Technopoly and Ivan Illich’s Tools for Conviviality. When we understand our tools, we can shape our environment and our society according to our values, rather than being (exclusively) shaped by our tools and environment. By contrast, when the tools and systems we depend on are opaque, and few understand them, power concentrates among technocratic elites, and individuals risk being disenfranchised from personal and community autonomy. In a world with strong AI, we might want to maintain detailed technical knowledge just to keep participating in our own destiny.

Fine: these observations suggest you’ll still have need for some technical knowledge in a future with powerful AI. But which kinds of knowledge? Surely some become less useful and others more so.

One shift I expect is in the value of knowledge involved in certain kinds of well-defined execution—parts of tasks where the expected inputs and outputs can be clearly specified. In a world with Mathematica, there may be value in learning how to factor friendly quadratics into binomials, since that helps illuminate the concept of zeros. But I’m not sure how much value remains in memorizing the quadratic formula. Likewise, with today’s AI (and with my broad domain knowledge) I don’t really need to know the details of a programming language’s syntax to work effectively in it.
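
For readers who want the contrast made concrete, here is the kind of “friendly” factoring I have in mind, alongside the formula I’m less sure is worth memorizing (a generic illustration, not drawn from any particular curriculum):

```latex
% Factoring a "friendly" quadratic makes its zeros visible by inspection:
%   x^2 - 5x + 6 = (x - 2)(x - 3), so the zeros are x = 2 and x = 3.
% The general formula, by contrast, is the kind of memorized machinery a tool
% like Mathematica can supply on demand:
\[
  x = \frac{-b \pm \sqrt{b^{2} - 4ac}}{2a}
  \qquad \text{for } ax^{2} + bx + c = 0 .
\]
```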

Work in highly contingent domains may be difficult to outsource, but it will often include straightforward subtasks with few important tradeoffs. If there’s knowledge like language syntax that’s confined to those subtasks, without affecting the more contingent overall task, we may find we no longer need it. The same observation applies to procedural skills involved only in those subtasks. My optometrist used to operate a complex machine to assess my myopia; now an automated laser does that work. But that’s just one part of my appointments, and I’m glad I can still talk with her about the messy tradeoffs involved in choosing a prescription.

It would be interesting to audit curricula in various domains with this notion in mind. How much could be removed? What else should be added?

One concern I have here is that insights and revelations can appear where they’re not expected. Sometimes when I’m doing fairly mechanical interface design work, I notice an opportunity I hadn’t seen before. A physician doing a routine exam might notice something subtle, something that wasn’t even supposed to be tested. I’m not sure how to think about these tradeoffs.

Meaning in doing

Here’s another bright line we can draw: you’ll still want knowledge and capacity for activities where the meaning comes from doing them yourself.

Even if an AI can synthesize a perfect rendition of a melody performed on cello, lots of people will still want to learn to play cello because there’s tremendous pleasure in producing the sound. Many writers might be happy to use automated spell-check, but wouldn’t want to give up the satisfaction of honing a beautiful phrase.

If you read Aristotle or The Bible now, I expect you’ll still want to read it in a future with powerful AI. The meaning in philosophical and spiritual contemplation comes in large part from within. Some part of that meaning may also come from discussion with others, and perhaps one day an AI can help with that, though for the moment I find that I can’t emotionally connect to its responses in discussions on these topics.

I’d be very happy if our AI-powered future allowed more individuals to experience the pleasures of deep action in more domains. I see glimmers of that in non-artists’ experiences with Midjourney, and in non-programmers’ accounts of using GPT-4 to build software.

Some remaining questions

Hard sciences. So far I’ve been talking about arts and “sciences of the artificial”. What about the hard sciences? What do cellular biologists of the future need to know? Here I find myself quite uncertain. If you’re curious about how a particular cellular mechanism works, that’s a complex, ill-defined question, but I’m not sure how contingent it is. It’s a question about objective reality. It’s true that the best answer may depend on who’s asking and why, but there’s plenty of precedent in the scientific literature for what a good characterization of a cellular mechanism looks like. The cellular biologist needs to know enough to ask meaningful questions. Perhaps that requirement alone entails the rest of a typical syllabus; I’m not sure. As Michael Nielsen points out, AI-driven science will likely be bottlenecked on experiment in domains like biology “where historically contingent facts about the world crucially impact a phenomenon”, but then—is our role only in facilitating the experiment?

The limits of taste. Are my claims about taste always true? Can’t a sufficiently powerful AI learn my taste, then produce exactly what I’d want, given all the messy details of my situation? I think the answer will depend on just how personal, illegible, and demanding my intentions are. If I want my cello concerto to precisely capture the ineffable details of my inner emotional world, it’s hard to imagine an AI producing that unless it can accurately simulate my subjective experience. This seems less true in many professional situations, even ones which currently seem quite contingent. Diagnosticians need to use what I’ve called taste to steer their examinations and reflect on their findings. But a multimodal AI should in principle be able to reach the same diagnosis. This situation is pretty objective. An architect’s work is something of a middle ground—there are objective problems to be solved, but the solutions will also express the architect’s somewhat illegible creative preferences.

Expanding the pie. In this essay I’ve implicitly assumed the present world of professions and activities and kinds of knowledge. But in a world with powerful AI, I expect we’ll spend our time in very different ways. We’ll have new kinds of hobbies, new kinds of art, new scientific fields, new professions, new institutions. Without knowing what those things are, can we say anything about what kinds of learning and growth might be valuable for all these new pursuits?


Thanks to Catherine Olsson, Jason Crawford, Joe Edelman, Laura Deming, Michael Nielsen, Sara LaHue, and Sebastian Garren for helpful conversations. Thanks also to David Chapman for introducing me to the work of Donald Schön.

My work is made possible by a crowd-funded research grant from my Patreon community. If you find my work interesting, you can become a member to help make more of it happen. You’ll get more essays like this one, previews of prototypes, and events like seminars and unconferences.

Finally, a special thanks to my sponsor-level patrons as of March 2024: Adam Marblestone, Adam Wiggins, Andrew Sutherland, Andy Schriner, Ben Springwater, Bert Muthalaly, Boris Verbitsky, Calvin French-Owen, Dan Romero, David Wilkinson, fnnch, Heptabase, James Hill-Khurana, James Lindenbaum, Jesse Andrews, Kevin Lynagh, Kinnu, Lambda AI Hardware, Ludwig Petersson, Maksim Stepanenko, Matt Knox, Michael Slade, Mickey McManus, Mintter, Peter Hartree, Ross Boucher, Russel Simmons, Salem Al-Mansoori, Sana Labs, Thomas Honeyman, Todor Markov, Tooz Wu, William Clausen, William Laitinen