Key insights of today’s newsletter:
In a recent TED talk, ex-OpenAI board member Helen Toner makes the case for why we should all have a voice in how AI shapes our future.
No one fully comprehends the technology yet, not even the experts, but we cannot sit still and do nothing.
Ultimately, AI policy is not about choosing progress over safety; it’s about how we manage uncertainty together.
↓ Dive deeper (9 min)
Alignment is the term for the process of making AI models safe. It covers the risks and mitigations associated with today’s AI systems as well as future ones.
Recently, there have been several high-profile departures at OpenAI, including Jan Leike, Daniel Kokotajlo, and Ilya Sutskever, who were all part of the alignment team. In a thread on X, Leike criticized his former employer for not taking safety seriously enough:
“I believe much more of our bandwidth should be spent getting ready for the next generations of models, on security, monitoring, preparedness, safety, adversarial robustness, (super)alignment, confidentiality, societal impact, and related topics. (…) But over the past years, safety culture and processes have taken a backseat to shiny products.”
Given the recent shake-up, I thought it would be appropriate to go over Helen Toner’s recent TED talk. Toner is an AI policy researcher and was a board member at OpenAI until six months ago, when she, together with three other board members, tried to oust CEO Sam Altman but ultimately failed to do so.
While I encourage everyone to watch the talk in full, here are the highlights, accompanied by some reflections of my own. Enjoy.
00:21 No one understands
“Normally, the people building a new technology understand how it works inside and out. But for AI, a technology that’s radically reshaping the world around us, that’s not so. Experts do know plenty about how to build and run AI systems, of course. But when it comes to how they work on the inside, there are serious limits to how much we know.”
Toner is right to point out that no one truly understands AI, not even the experts. They understand all the individual parts that make up the thing, but not exactly why the thing does what it does.
04:18 Looking inside the box
“Researchers sometimes describe deep neural networks, the main kind of AI being built today, as a black box. But what they mean by that is not that it’s inherently mysterious and we have no way of looking inside the box. The problem is that when we do look inside, what we find are millions, billions or even trillions of numbers that get added and multiplied together in a particular way. What makes it hard for experts to know what’s going on is basically just, there are too many numbers, and we don’t yet have good ways of teasing apart what they’re all doing.”
Large language models can be understood as complex, non-linear functions operating in high-dimensional vector spaces, processing input data to generate outputs based on patterns and relationships learned from the training data. In short, it’s math. Really, really complex math. And because of that, it’s incredibly challenging to observe what’s going on.
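To make that concrete, here’s a minimal sketch of the primitive operation at the heart of it all (plain Python with NumPy; the layer sizes and random weights are made up for illustration — a real model has billions of parameters and many more layers):

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy two-layer network. All of its "knowledge" lives in these
# weight matrices; a real LLM has billions of such numbers.
W1 = rng.normal(size=(16, 8))
W2 = rng.normal(size=(8, 4))

def forward(x):
    hidden = np.maximum(0, x @ W1)  # multiply, add, apply a non-linearity
    return hidden @ W2              # multiply and add again

x = rng.normal(size=(16,))  # an input vector (think: a token embedding)
print(forward(x))           # the output: just numbers, no labels attached
```

Looking “inside the box” means staring at W1 and W2, and nothing about those raw numbers tells you why the network behaves the way it does.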
This raises the question: how do we govern something we don’t yet fully comprehend? Toner has some ideas.
05:10 Don’t be intimidated
“First, don’t be intimidated. Either by the technology itself or by the people and companies building it. On the technology, AI can be confusing, but it’s not magical. There are some parts of AI systems we do already understand well, and even the parts we don’t understand won’t be opaque forever. An area of research known as ‘AI interpretability’ has made quite a lot of progress in the last few years in making sense of what all those billions of numbers are doing.”
Interpretability refers to the extent to which an AI system’s internal processes and decisions can be understood and explained in human terms. This is crucial for solving questions around alignment.
When we say a model or system is ‘aligned’, it means the model does what we ask, doesn’t do bad things even when we ask, doesn’t fail catastrophically, and doesn’t actively deceive us. These are all different problems to solve, but all benefit from our ability to interpret what models are doing when those artificial neurons fire.
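To give a flavor of what that looks like in practice, here’s a minimal sketch of one basic interpretability move: attaching a forward hook to a layer so you can record which neurons fire for a given input (PyTorch; the tiny model is a toy stand-in I made up, not any production system):

```python
import torch
import torch.nn as nn

# A toy stand-in for one small piece of a much larger network.
model = nn.Sequential(
    nn.Linear(16, 32),
    nn.ReLU(),
    nn.Linear(32, 4),
)

captured = {}

def save_activations(module, inputs, output):
    # Record what the hidden layer produced for this input.
    captured["hidden"] = output.detach()

# Attach a forward hook to the ReLU layer (index 1 in the Sequential).
model[1].register_forward_hook(save_activations)

x = torch.randn(1, 16)
model(x)

# Interpretability work starts here: which of these 32 numbers respond
# to which patterns in the input, and do any correspond to a "concept"?
print(captured["hidden"])
```

From activations like these, researchers try to work out which inputs reliably make which neurons fire, and what human-understandable concept, if any, that corresponds to.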
While Toner is optimistic about interpretability, I am less so. Models are only getting bigger and more complex, and a significant breakthrough is needed to better comprehend these advanced models.¹ I do agree with her, though, that we shouldn’t be intimidated by this challenge, nor by the technology or the people developing it.
06:10 You have a voice
“And when it comes to those building the technology, technologists sometimes act as though if you’re not elbows deep in the technical details, then you’re not entitled to an opinion on what we should do with it. Expertise has its place, of course, but history shows us how important it is that the people affected by a new technology get to play a role in shaping how we use it. (…) You don’t have to be a scientist or engineer to have a voice.”
I love the fact that Toner says this. You and I both have a say in how we want AI to shape society. It’s easy to feel like you’re stuck in the backseat, but we have a choice. All of us have the responsibility to use this technology in ways that are good for us and for our fellow human beings.
06:53 Policymaking
“A lot of conversations about how to make policy for AI get bogged down in fights between, on the one side, people saying, ‘We have to regulate AI really hard right now because it’s so risky.’ And on the other side, people saying, ‘But regulation will kill innovation, and those risks are made up anyway.’ But the way I see it, it’s not just a choice between slamming on the brakes or hitting the gas. If you’re driving down a road with unexpected twists and turns, then two things that will help you a lot are having a clear view out the windshield and an excellent steering system.”
I don’t have a lot to add to this. Toner does an amazing job of breaking the complex topic of AI safety down to its core question: how do we manage uncertainty?
You could get the idea that there are only two sides to this debate: the accelerationists, who believe we’re not moving fast enough, and the doomers, who believe progress should be halted as soon as possible. But both are counterproductive when it comes to policymaking and distract us from the real and immediate challenge we face.
09:33 Left unchecked
“Left to their own devices, it looks like AI companies might go in a similar direction to social media companies, spending most of their resources on building web apps and fighting for users’ attention. And by default, it looks like the enormous power of more advanced AI systems might stay concentrated in the hands of a small number of companies, or even a small number of individuals.”
I’ve alluded to the idea of AI becoming the new infinite scroll in my latest article. This stuff is already happening. In the battle for our attention, companies are happy to cut corners or rush out new products. And the giants are becoming more gigantic.
But not all is lost. Quite the opposite. It’s early days. We have time to exert control and steer the direction of where all of this is going, as long as we continue to educate ourselves and act on it. Because we can’t wait for the future to unfold. We need to push for futures we actually want.
Watch the talk in full here.
Before you go…
As always, thanks for reading Teaching computers how to talk. Support for this newsletter comes from you. Any contribution, however small, will help me allocate more time to the newsletter and put out the best articles I can.
PS. You can pause or cancel donations anytime.
PPS. If you do end up donating, the universe will shower you with joy for the duration of one entire day. True story.
¹ Anthropic just published a paper that is a step in the right direction. They have identified how millions of concepts are represented inside Claude Sonnet. It’s the first-ever detailed look inside a modern, production-grade large language model. Read more here: https://www.anthropic.com/research/mapping-mind-language-model
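For the technically curious: that work relies on dictionary learning with sparse autoencoders — a wide, sparsity-penalized autoencoder is trained on a model’s internal activations so that each learned feature tends to fire for a single interpretable concept. Here’s an illustrative sketch of the core idea (PyTorch; the dimensions and penalty weight are invented for the example, not Anthropic’s actual setup):

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Decomposes model activations into many sparsely-active features."""

    def __init__(self, d_act=512, d_features=4096):
        super().__init__()
        self.encoder = nn.Linear(d_act, d_features)  # activations -> features
        self.decoder = nn.Linear(d_features, d_act)  # features -> reconstruction

    def forward(self, acts):
        features = torch.relu(self.encoder(acts))  # non-negative, mostly zero
        return self.decoder(features), features

sae = SparseAutoencoder()
acts = torch.randn(64, 512)  # stand-in for activations from a real model

recon, features = sae(acts)

# Training objective: reconstruct the activations faithfully while an L1
# penalty pushes most features to zero, so each one stays interpretable.
l1_weight = 1e-3  # invented value, purely illustrative
loss = nn.functional.mse_loss(recon, acts) + l1_weight * features.abs().mean()
loss.backward()
```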
Everything...EVERYTHING in moderation.
The quote at 6:10, under "you have a voice", is the most important one for me. Expertise tends to be narrow. An "AI expert" is typically someone whose expertise covers how to get a certain class of prediction models to automate certain activities. If you want to know how to, for instance, get a computer to generate captions for images, ask an AI expert.
Problem is, journalists and policymakers (and CEOs, school principals, public agency directors...) go straight to AI experts for matters well beyond their narrow expertise. People who engineer LLMs have no more understanding of human intelligence, education, linguistics, creative arts, or corporate management than a random person off the street. Some of them are nonetheless sought out for their opinions, which they happily provide, on the role of AI in these things. As is their right, of course. But when Geoff Hinton says that LLMs must necessarily develop semantic understanding in order to predict words, he's just shooting from the hip. He has no idea. Don't ask him, ask a linguist! When OpenAI says that GPT-4 has "advanced reasoning capabilities" and broad "general knowledge", they're just tossing around words. Go ask a philosopher of mind, or an epistemologist! When Sundar Pichai talks about how AI enhances learning, he's speaking in his role as a salesman. Go ask a developmental psychologist, or an education researcher! When a software engineer says that an algorithm for generating music is "creative" in the way human musicians are, ask him if he's ever written his own music.
I don't fault AI experts for sharing their opinions about how AI should be used, or what they'll be used for in the future. We all get to have opinions. But I am very concerned that people in positions of authority believe AI experts have expertise on anything beyond programming computers to perform prediction and classification.