I No Longer Believe Intelligence to be Magical
Introduction
One significant way my views on risks from strongly superhuman intelligence have changed is that I no longer believe intelligence to be “magical”.
During my 2017 binge of LessWrong, I recall Yudkowsky suggesting that a superintelligence could infer the laws of physics from a single frame of video showing a falling apple (Newton reportedly arrived at his theory of gravity after observing a falling apple).
I now think that’s somewhere between deeply magical and utter nonsense. It hasn’t been shown that a perfect Bayesian engine (with a suitable [hyper]prior) could locate general relativity (or even just Newtonian mechanics) in hypothesis space from a single frame of video.
I’m not even sure a single frame of video of a falling apple carries enough bits to single out the right theory even in principle.
I think I need to investigate in depth what intelligence (even strongly superhuman intelligence) is actually capable of, and not just assume that intelligence can do anything not explicitly forbidden by the fundamental laws. The fields whose fundamental laws bear on cognitive and real-world capabilities seem to be:
Physics
Computer Science
Information Theory
Mathematical Optimisation
The Relevant Question: Marginal Returns to Real World Capability of Cognitive Capabilities
I’ve done some armchair-style thinking on the “returns to real-world capability” of increasing intelligence, and I think Yudkowsky-style arguments around superintelligence are quite magical.
It seems doubtful that higher intelligence delivers the kind of real-world capability those arguments assume. For example, marginal returns to real-world capability from increased predictive power seem to diminish at an exponential rate: better predictive power buys less capability at each step, and a lot less at that. I would say such marginal returns are “sharply diminishing”.
An explanation of “significantly/sharply diminishing”:
Sharply Diminishing Marginal Returns to Real World Capabilities From Increased Predictive Accuracy
A sensible way of measuring predictive accuracy is something analogous to −log(error rate). The following transitions:
90% → 99%
99% → 99.9%
99.9% → 99.99%
99.99% → 99.999%
99.999% → 99.9999%
...
All make the same incremental jump in predictive accuracy.
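As a quick sanity check, here is a minimal sketch of that claim (base 10 is just a convention; any base gives the same picture):

import math

# Accuracy levels from the list above.
accuracies = [0.90, 0.99, 0.999, 0.9999, 0.99999, 0.999999]

# Predictive accuracy measured as -log10(error rate).
scores = [-math.log10(1 - a) for a in accuracies]
print([round(s, 3) for s in scores])   # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Each transition in the list adds the same amount on this scale,
# i.e. every jump is the same-sized increment in predictive accuracy.
print([round(b - a, 3) for a, b in zip(scores, scores[1:])])   # [1.0, 1.0, 1.0, 1.0, 1.0]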
We would like to measure the marginal return to real-world capability of increased predictive accuracy. The most compelling way I found to operationalise “returns to real-world capability” was monetary returns.
I think that’s a sensible operationalisation:
Money is the basic economic unit of account
Money is preeminently fungible
Money can be efficiently levered into other forms of capability via the economy.
I see no obviously better proxy.
(I will however be interested in other operationalisations of “returns to real-world capability” that show different results).
The obvious way to make money from beliefs about propositions is via bets of some form. One way to place a bet and reliably profit is selling insurance. (Insurance is particularly attractive because, in practice, it scales to arbitrary confidence levels and arbitrary returns/capital.)
Suppose you sell insurance against an event X. The policy pays out $1,000,000 if X occurs, and for each prospective client Ci you have a credence P(¬X|Ci) that X will not happen to them.
At a credence of 90%, you cannot sell the policy for less than $100,000 without taking an expected loss: at a price of $100,000 and a credence of 90%, your expected returns on that customer are $0. Assume the customer is willing to pay at most $100,000 for the policy, so $100,000 is the price.
If your credence that X will not happen increased, how would your expected returns change? This is the question we investigate to estimate the real-world capability gains from increased predictive accuracy.
The results are below (expected returns per customer as your credence rises):
90% → 99%: $0 → $90,000
99% → 99.9%: $90,000 → $99,000
99.9% → 99.99%: $99,000 → $99,900
99.99% → 99.999%: $99,900 → $99,990
99.999% → 99.9999%: $99,990 → $99,999
...
As you can see, the marginal returns from equal increments in predictive accuracy are given by the sequence {$90,000, $9,000, $900, $90, $9, ...}: each further increment is worth a tenth of the previous one.
Writing f(n) for the marginal return of the n-th increment (so f(0) = $90,000):
f(n) = $90,000 × 10^(−n), and hence f(0)/f(n) = 1/10^(−n) = 10^n,
i.e. the marginal returns decay geometrically, with f(n) in O(10^(−n)).
(This construction could be extended to other kinds of bets, and I would expect the result to generalise [modulo some minor adjustments] to cross-domain predictive ability. Alas, a shortform is not the place for such elaboration.)
Thus returns to real-world capability of increased predictive accuracy are sharply diminishing.
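Here is a minimal sketch of the computation behind the table and the sequence above, using only the numbers already assumed in the example (a $1,000,000 payout and a $100,000 price):

# Toy check of the insurance example above.
PAYOUT = 1_000_000   # paid out to the customer if X occurs
PRICE = 100_000      # the most the customer is willing to pay for the policy

def expected_profit(credence_not_x: float) -> float:
    """Expected returns per customer, given your credence that X does NOT occur."""
    return PRICE - PAYOUT * (1 - credence_not_x)

credences = [0.90, 0.99, 0.999, 0.9999, 0.99999, 0.999999]
profits = [expected_profit(c) for c in credences]
print([round(p) for p in profits])   # [0, 90000, 99000, 99900, 99990, 99999]

# Marginal return of each equal jump in predictive accuracy:
gains = [round(b - a) for a, b in zip(profits, profits[1:])]
print(gains)                         # [90000, 9000, 900, 90, 9] -- a tenth of the previous, each time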
Marginal Returns to Real World Capabilities From Other Cognitive Capabilities
Of course, predictive accuracy is just one aspect of intelligence; there are many others:
Planning
Compression
Deduction
Induction
Other symbolic reasoning
Concept synthesis
Concept generation
Broad pattern matching
Etc.
And we’d want to investigate the relationship for aggregate cognitive capabilities (“general intelligence”) as well. The example I illustrated earlier merely sought to demonstrate how returns to real-world capability could be “sharply diminishing”.
Marginal Returns to Cognitive Capabilities
Another inquiry important for determining what intelligence is actually capable of is the marginal return on investing cognitive capabilities towards raising cognitive capabilities.
That is, if an agent were improving its own cognitive architecture (recursive self-improvement) or designing successor agents, how would the marginal increase in cognitive capabilities behave across generations? What function characterises it?
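To make the question concrete, here is a purely illustrative toy recursion (the functional form, the constants, and the exponent are invented for illustration, not claims about real systems): if each generation’s capability gain scales as a power β of current capability, the qualitative behaviour of the trajectory depends heavily on β.

def toy_self_improvement(c0: float, k: float, beta: float, generations: int) -> list[float]:
    """Toy model: each generation adds k * c**beta to the current capability c.

    beta > 1: accelerating returns to reinvestment (growth speeds up)
    beta = 1: exponential growth
    beta < 1: growth continues, but each generation's proportional gain shrinks
    """
    trajectory = [c0]
    for _ in range(generations):
        c = trajectory[-1]
        trajectory.append(c + k * c ** beta)
    return trajectory

# Same starting capability and reinvestment rate, different returns exponents:
for beta in (0.5, 1.0, 1.5):
    print(beta, [round(x, 2) for x in toy_self_improvement(1.0, 0.1, beta, 10)])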
Marginal Returns of Computational Resources
And none of this yet addresses the nature of the marginal returns to predictive accuracy from adding extra computational resources.
By “computational resources” I mean the following:
Training compute
Inference compute
Training data
Inference data
Accessible memory
Bandwidth
Energy/power
An aggregation of all of the above
Etc.
The returns to computational resources could further bound how much capability can be purchased by investing additional economic resources. If those returns also diminish “significantly” or “sharply”, the situation becomes that much bleaker.
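As a purely illustrative sketch of how one might begin to model this (the power-law form is loosely inspired by empirical neural scaling laws, but the exponent here is made up and carries no empirical weight):

ALPHA = 0.05  # invented exponent: error_rate ∝ compute ** -ALPHA

def compute_required(error_rate: float, base_error: float = 0.5, base_compute: float = 1.0) -> float:
    """Compute needed to reach a target error rate under the toy power law
    error_rate = base_error * (compute / base_compute) ** -ALPHA."""
    return base_compute * (base_error / error_rate) ** (1 / ALPHA)

# Cost of each successive order-of-magnitude cut in the error rate
# (i.e. each equal jump on the -log(error rate) scale):
for err in (0.1, 0.01, 0.001, 0.0001):
    print(f"error rate {err}: compute ~ {compute_required(err):.2e}")

# Under any such power law, each equal jump in -log(error rate) multiplies the
# required compute by a constant factor (here 10**(1/ALPHA)), so equal gains in
# predictive accuracy get exponentially more expensive in computational resources.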
Marginal Returns to Cognitive Reinvestment
The other avenue to raising cognitive capabilities is investing cognitive capabilities themselves, as when designing successor agents or pursuing recursive self-improvement.
We’d also want to investigate the marginal returns to cognitive reinvestment.
My Current Thoughts
My current sense is that reaching strongly superhuman intelligence would require herculean effort. I no longer think bootstrapping to ASI would be as easy as recursive self-improvement or “scaling”. I’m unconvinced that a hardware overhang would be enough (marginal returns may diminish too fast for that).
I currently expect marginal returns to real-world capability will diminish significantly or sharply for many cognitive capabilities (and the aggregate of them) across some “relevant cognitive intervals”.
I suspect that the same will prove true for the marginal returns to cognitive capability from investing computational resources or other cognitive capabilities.
I don’t plan to rely on my suspicions and want to investigate these issues in depth (I’m currently planning to pursue a Master’s and a PhD, and these are the kinds of questions I’d like to research when I do).
By “relevant cognitive intervals”, I am gesturing at the ranges of general cognitive capability an agent might fall within.
Humans being the only examples of general intelligence we are aware of, I’ll use them as a yardstick.
Some potential “relevant cognitive intervals” that seem particularly pertinent:
Subhuman to near-human
Near-human to beginner human
Beginner human to median human professional
Median human professional to expert human
Expert human to superhuman
Superhuman to strongly superhuman
Conclusions and Next Steps
I plan to investigate the following questions in depth in the future:
Marginal returns to real world capability of increased cognitive capability across various cognitive intervals
Marginal returns to cognitive capability of increased computational resource investment across various cognitive intervals
Marginal returns to cognitive capability of cognitive investment (e.g. when designing successor agents or pursuing recursive self-improvement) across various cognitive intervals