I promise I have not abandoned this blog after just three posts. I tried and failed to write this a few times, so I’ve decided to just publish it and move on. Enjoy (or not)!
In January, AI Impacts released a paper analyzing a survey of thousands of AI experts[^1] on their predictions for the field. It’s a follow-up to similar surveys in 2016 and 2022, and the main takeaways were general uncertainty, a shift toward earlier expectations of advanced AI, and a predicted higher chance of very bad outcomes. Example questions:[^2]
- How many years until you expect a 10% probability of high-level machine intelligence existing?
- How likely [in %] do you think it is that the following AI task will be feasible within the next 50 years: routinely and autonomously prove mathematical theorems that are publishable in top mathematics journals today, including generating the theorems to prove.
- Imagine that over the past decade, there had been half as much progress in AI algorithms. You might imagine this as conceptual insights being half as frequent. How much less progress [in %] in AI capabilities would you expect to have seen?
There has been some discussion about the validity of this study. The most common critique I’ve seen (which the paper does acknowledge) is whether being an “AI expert” means you can accurately forecast: having technical expertise does not mean you can predict where things are going. Overall, I think we should be cautious about what we take away from these surveys; it may be important to take a pulse, but it is just one measurement. My take is that these studies measured AI hype more than anything else.
There’s also something troubling to me about the milestone questions. Some participants were asked to estimate the probability of an event given a year (e.g. the second bullet above). But are we really capable of estimating a probability, an actual number between 0 and 100, for a future event? When you dig deep, what are these estimates based on? I think intuitions are important, provided we are upfront that that is what they are. But I worry that invoking probability in this way gives a false sense of objectivity, one that feels quite arrogant, to what is really just vibes[^3]. Questions with graded responses feel more honest, such as those where the choices were one of “{very unlikely, unlikely, even odds, likely, very likely}” or one of “{no, a little, substantial, extreme} concern”. Did the conclusions drawn from this study require more than this level of precision?
I ask this because we are at what I assume is just the beginning of an AI sensationalist era. I wonder if placing too much emphasis on these probability estimates does more harm than good. In addition, the survey samples a very specific part of the population: highly educated people who will likely continue to hold intellectual power in society. What will happen if studies like this are used to drive policy and lawmaking that affect those who don’t have a say in these matters?
But going back to probability, here’s a simple argument: if humans were good at estimating probabilities, we would have better guesses about the future. We wouldn’t succumb to the gambler’s fallacy. Much of betting would be boring. Puzzles like the Monty Hall problem wouldn’t be so interesting, because the solution would be intuitive. We wouldn’t have recency bias. We’d be more rational (which we aren’t) and maximize expected utility (which we don’t).
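If you want to see just how unintuitive this stuff is, here is a quick Python sketch (mine, not anything from the survey or the paper) that simulates the Monty Hall problem. The point is simply that a brute-force count gives an answer most people’s gut refuses to accept:

```python
import random

def monty_hall(trials=100_000):
    """Simulate Monty Hall: return win rates for staying vs. switching."""
    stay_wins = 0
    switch_wins = 0
    for _ in range(trials):
        car = random.randrange(3)    # door hiding the car
        pick = random.randrange(3)   # contestant's first pick
        # Host opens a door that hides a goat and isn't the contestant's pick.
        # (If two such doors exist, which one he opens doesn't change the odds.)
        opened = next(d for d in range(3) if d != pick and d != car)
        # Switching means taking the remaining unopened, unpicked door.
        switched = next(d for d in range(3) if d != pick and d != opened)
        stay_wins += (pick == car)
        switch_wins += (switched == car)
    return stay_wins / trials, switch_wins / trials

if __name__ == "__main__":
    stay, switch = monty_hall()
    print(f"stay:   {stay:.3f}")    # ~1/3
    print(f"switch: {switch:.3f}")  # ~2/3
```

Run it and you get roughly 0.33 for staying and 0.67 for switching: switching doubles your chances, even though intuition screams 50/50.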
My guess is that human understanding and decision-making have little to do with probability estimation, and everything to do with emotion and past experience. We may have good intuitions that guide us toward beneficial actions, but intuition is much less convenient to quantify. Probability is a useful tool to help us reason about uncertainty, if we understand its construction and assumptions. But if we force intuition into a probability estimate, we must understand what we are doing: it is an undefined conversion, completely subjective, and it loses the richness of qualitative experience.
People (at least in the West) have written about quantifying uncertainty using probability since the 17th century. One specific formulation is the von Neumann–Morgenstern utility theorem (1947): an agent (not necessarily human) chooses between lotteries, i.e. probability distributions over outcomes. If the agent’s preferences satisfy four specified axioms (completeness, transitivity, continuity, and independence), then it behaves as if it always prefers the lottery that maximizes expected utility, the probability-weighted average of how subjectively desirable each outcome is. While mathematically sound, this framework does not seem to model human behavior well (e.g. the Ellsberg paradox). If we’re just trying to build machines, maybe that’s okay. After all, something needn’t be human-like to be intelligent or even just useful. But if we are so insistent that we must have autonomous AGI (e/acc?), we must understand how the machines will exist in the context of human society. And then we are back to the very human problem of trying to estimate risk.
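For the curious, here is a rough statement of the theorem in symbols. This is my own paraphrase of the standard textbook formulation, so treat the notation as illustrative rather than canonical:

```latex
% von Neumann–Morgenstern representation theorem (informal paraphrase).
% A lottery L assigns probabilities p_1, ..., p_n to outcomes x_1, ..., x_n.
% If the preference relation \succeq over lotteries is complete, transitive,
% continuous, and independent, then there exists a utility function u with
\[
  L \succeq M \iff \mathbb{E}_L[u] \ge \mathbb{E}_M[u],
  \qquad
  \mathbb{E}_L[u] = \sum_{i=1}^{n} p_i \, u(x_i),
\]
% i.e. the agent acts as if it always picks a lottery maximizing expected utility.
```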
A lot of people, myself included, don’t buy expected utility theory (EUT) as a principle for AI. Writing on whether utility-maximizing AI would avoid exploitation, Adam Bales pokes holes in EUT. To quote him, this whole post is basically just trying to say:
> Still, I believe we learn more from recognising our limitations than from imposing false certainty.
We are far from understanding how humans form intuitions and make decisions, but it’s okay to acknowledge that! And I think understanding ourselves will help us make better machine learning tools that benefit everyone.
In the end, I don’t have enough expertise to have constructive ideas for how to quantify intuitions and risk. What I do know is that people often treat statistics as objective truth, as if to be quantified is to be valid. Numbers certainly have their uses, but we should always be critical when we see intuition and human experience reduced to probabilities, and false certainty imposed in the process.
[^1]: Defined to be participants who published at any of the six top-tier AI conferences in 2022.

[^3]: Reminds me of the weird discourse on Twitter about “p(doom)” (the probability of doom due to advances in AI). Where are you pulling these numbers from? (my guess: your ass)