Let me also copy over Forrest’s (my collaborator) notes here:
> People who believe false premises tend to take bad actions.
Argument 3:
- 1; That AGI can very easily be hyped, so that even smart people
can be made to falsely believe that there "might be"
_any_chance_at_all_ that AGI will "bring vastly positive changes".
- ie, strongly motivated marketing will always be stronger than truth,
especially when VC investors can be made to think (falsely)
that they could maybe get 10000X return on investment.
- that the nature of AGI, being unknown, largely artificial,
futuristic, saturated with modernism, tech optimism, and
high geekery, and also very highly funded, means that AGI
capabilities development has arbitrarily strong, intelligent marketing support.
- 2; People (and nearly all other animals) are mostly self-oriented.
- that altruism is usually essentially social signalling,
and is actually of very little value as benefit to anything
other than maybe some temporary social prestige building.
- ie, each possibly participating person will see the possibility
that maybe they could ride "up to riches" on the research bandwagon,
and/or on any major shift in the marketing dynamics;
that in any change there will be winners and losers,
and they want a chance to be "on the winning side", since
everyone has bio-builtin social/market game addiction tendencies
(biases) and they think that they can use their high intelligence
to gain some personal strategic advantage.
- 3; People are *selectively* rational.
- ie, that we should not expect deviations from rational agent models,
because our selective notion of rationality will likely match
our *also* self-selected models of 'rational actors'.
- as such, we can expect that there will be all sorts of
seemingly rational "arguments" that suggest that individual
selfish and self supporting action (favoring tech development)
is maybe "mostly harmless", and that at least some of the risks
are maybe over-emphasized, and that "therefore" we should
maybe shift our actions towards the more (manufactured) "consensus"
that the "robustly good" action is "keep doing AGI capability
development" and also "increase safety work" -- and to assume
that anything else is either impossible or maybe "robustly bad",
or that at the very least, that the things that seem obvious
are probably not at all obvious, for complicated "rational reasons"
that just happen to align with their motivated preferred view.
- 4; thus the false belief that there "might be" some non-zero
small chance that AGI can be "aligned" so as to bring about
whatever positive changes (hype the huge return on investment!)
is so strong/motivating that it dominates all other considerations.
- as that selective motivated reasoning about the possibility that
someone can be part of the winning team and make history is
so strong that even the suggestion that the very notion that
*any* AGI persistently existing is inherently contradictory
with the notion of the continuing survival of life on this planet
is completely rejected without any further examination.
I am honestly very confused about how Forrest can be so confident that radical positive changes will not happen in our lifetime.
More importantly, he seems to be complaining that his opponents have different goals, and claims they're selectively rational. Note, though, that whether behavior is rational can only be judged relative to a given set of goals. To him, his goals are probably much less selfish than those of the people who want AI progress to speed up, so by his lights it is not rational to increase AI capabilities. I too do not think AI progress is beneficial, and believe it is probably harmful, so I'd slow down progress too.
This is critical, because Forrest is misidentifying why the pro-AI-progress people want AI to progress. The fact that their goals differ so much from yours is the reason they want AI to progress; it is not a rationality failure.
Another critical crux is that I am far more optimistic than Forrest or Remmelt about AGI alignment working out in the end. If my pessimism were comparable to theirs, I too would probably advocate far more for governance strategies.
This is for several reasons:
My general prior is that most problems are solvable. This doesn't always hold (see the unsolvability of the halting problem, or the likely impossibility of a perpetual motion machine), but if there isn't a theorem prohibiting a solution and it doesn't rely on violating the laws of physics, I'd say the problem is solvable. AGI alignment is in this spot.
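As an illustrative sketch of what a "theorem prohibiting it" looks like, here is the classical diagonalization argument against a universal halting decider, rendered loosely in Python (the decider `halts` and the names below are hypothetical, chosen only for this illustration):

```python
# Sketch of the halting-problem diagonalization: for ANY candidate
# halting decider, we can construct a program on which it is wrong.

def make_paradox(halts):
    """Build the self-defeating program for a given candidate decider."""
    def paradox(prog):
        if halts(prog, prog):      # candidate claims prog(prog) halts...
            while True:            # ...so do the opposite: loop forever
                pass
        return "halted"            # otherwise: halt immediately
    return paradox

# Any concrete candidate is refuted. E.g. a decider that always says "halts":
always_yes = lambda prog, arg: True
paradox = make_paradox(always_yes)

# always_yes claims paradox(paradox) halts, yet by construction it would
# loop forever. A symmetric contradiction defeats every candidate decider,
# which is the kind of impossibility theorem referred to above.
print(always_yes(paradox, paradox))
```

The point of the sketch is only that unsolvability claims of this kind are established by explicit construction, not by failing to find a solution.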
I believe alignment is progressing (not enough, to be clear), but if AI alignment were as well resourced as AI capabilities research, I'd give it a fair shot at solving the problem.
Finally, time. In the more conservative story described here, AGI still takes 20-30 years to arrive. While an AGI built now would probably be incompatible with life due to instrumental convergence and inner alignment failures, unless you hold extremely pessimistic beliefs about progress in AI alignment, this is the kind of time frame over which I'd place a 60% probability on having a working solution to the AGI alignment problem.
Responding below:
That prior that most problems are solvable is not justified. For starters, you did not provide any reasons above for why beneficial AGI is not like a perpetual motion machine: a "perpetual general benefit machine".
See reasons to shift your prior: https://www.lesswrong.com/posts/Qp6oetspnGpSpRRs4/list-3-why-not-to-assume-on-prior-that-agi-alignment
Again, no reasons are given for the belief that AGI alignment is "progressing", or that it would have a "fair shot" at solving "the problem" if it were as well resourced as capabilities research. There is basically nothing to argue against, because you are not providing arguments yet.
No reasons given, again. This presents instrumental convergence and intrinsic optimisation misalignment failures as the only threat models for artificial general intelligence being incompatible with organic, DNA-based life. It overlooks substrate-needs convergence.
I’ll concede here that I unfortunately do not have good arguments, and I’m updating towards pessimism regarding the alignment problem.
Appreciating your honesty, genuinely!
Always happy to chat further about the substantive arguments. I was initially skeptical of Forrest’s “AGI-alignment is impossible” claim. But after probing and digging into this question intensely over the last year, I could not find anything unsound (in terms of premises) or invalid (in terms of logic) about his core arguments.