human intelligence may be alignment-limited
Previously, I argued that human mental development implies that AI self-improvement from sub-human capabilities is possible, and that human intelligence comes at the cost of a longer childhood and greater divergence from evolutionarily specified goals.
In that post, I raised two hypotheses:
H:mutational_load = Human capabilities are limited because high intelligence requires a degree of genetic precision that normal rates of mutation generation and elimination only rarely allow.
H:drift_bound = The extent of self-improvement in humans is limited by increased value drift outweighing increased capabilities.
Humans have a lot of mental variation. Some people can’t visualize 3D objects. Some people can’t remember faces. Some people have synaesthesia. Such variation also exists among very smart people; there isn’t convergence to a single intellectual archetype. You could argue that what’s needed genetically is precise specification of something lower-level that underlies all that variation, but I don’t think that’s correct.
So, I don’t think H:mutational_load is right. That leaves H:drift_bound as the only hypothesis that seems plausible to me.
Suppose that I’m correct that human intelligence comes at the cost of a longer childhood. The disadvantages of a long childhood vary depending on social circumstances. Humans may have some control mechanism that modifies the amount of mental self-improvement, and thus the length of childhood, depending on the surrounding environment. Certain environments (probably safe ones with ample food) would then be associated with both longer childhoods and a one-time increase in average intelligence. That would also cause greater divergence from evolutionarily specified goals, which may show up as a decrease in fertility rates or an increased rate of obsession with hobbies. That can obviously be pattern-matched to the situation in some countries today, but I don’t mean to say that it’s definitely true; I just want to raise it as a hypothesis.
If H:drift_bound is correct, it would be an example of an optimized system having a strong and adjustable tradeoff between capabilities and alignment, which would be evidence for AI systems also tending to have such a tradeoff.
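To make that tradeoff concrete, here is a minimal toy model (the notation and functional forms are my own assumptions, not part of the original argument). Let $c$ be the amount of mental self-improvement, which also sets the length of childhood. Suppose capability $K(c)$ and value drift $D(c)$ both increase with $c$, and a long childhood carries an environment-dependent cost $R(c, e)$ that is smaller in safe, food-rich environments $e$. If evolution tunes $c$ to maximize

$$F(c) = K(c) - D(c) - R(c, e),$$

then an interior optimum $c^*$ satisfies $K'(c^*) = D'(c^*) + \partial_c R(c^*, e)$: the marginal capability gain is balanced against the marginal drift plus environmental risk. A safer environment shrinks the risk term and pushes $c^*$ up, so longer childhoods, higher capability, and more drift arrive together, which is the adjustable capabilities/alignment tradeoff described above.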
Agents are adaptation-executors with adaptations that accomplish goals, not goal-maximizers. Understanding agents as goal-maximizers is a simplification humans use to make agents easier to reason about. This is as true when the goal is self-improvement as it is for anything else.
“Creation of a more-intelligent agent” involves actions that are different at each step. I consider it an open question whether intelligent systems applying recursive self-improvement tend to remain oriented towards creating more-intelligent agents more than they remain oriented towards non-instrumental specified goals. My view is that one of the following is true:
Instrumental convergence is correct, and can maintain creation of more-intelligent agents as a goal during recursive self-improvement despite the actions/adaptations involved being very different.
Self-improvement has a fixed depth set by the initial design, rather than unlimited potential depth. This may limit AI to approximately human-level intelligence, because drift would be a similarly limiting factor for both humans and AI; but it does seem that many humans have self-improvement as a goal, and some have as a goal the creation of a more-intelligent but different self, or even of a more-intelligent, completely separate agent.
I don’t think drift would necessarily be the same for humans and a wildly different intelligence architecture, but it’s an interesting way to think about it.
Why do you think AGI would have a very different architecture from the one humans have? I’d expect a lot of similarities, just with different hardware.
Different constraints, different development search algo.