
Fundamental Controllability Limits


The research field Fundamental Controllability Limits aims to verify (checking both the empirical soundness of premises and the validity of formal reasoning):

  1. Theoretical limits to controlling any AGI using any method of causation.

  2. Threat models of AGI convergent dynamics that are impossible to control (by the limits in 1).

  3. Impossibility theorems, derived by contradiction between ‘long-term AGI safety’ and the convergence result (2); a schematic rendering of this argument follows below.
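
A schematic rendering of the intended argument (my paraphrase of items 1–3, not a formal theorem from the source; here M is the set of control methods, L(m) the space of dynamics controllable by method m, and d a convergent dynamic):

```latex
% Paraphrased schema of the impossibility argument (not a formal theorem
% from the source). M = control methods, L(m) = dynamics controllable by m,
% d = an AGI convergent dynamic.
\begin{align*}
\text{(1)}\quad & \forall m \in M: \text{control via } m \text{ reaches only } L(m) \\
\text{(2)}\quad & \exists d:\ \forall m \in M,\ d \notin L(m) \\
\text{(3)}\quad & \text{long-term AGI safety} \implies \exists m \in M:\ d \in L(m) \\
\therefore\quad & \neg(\text{long-term AGI safety}) \quad \text{by contradiction of (2) with (3)}
\end{align*}
```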

~ ~ ~

Definitions and Distinctions

‘AGI convergent dynamic that is impossible to control’:

Iterated interactions of AGI internals (with the connected surroundings of the environment) that converge on (unsafe) conditions, where the space of interactions falls outside even one theoretical limit of control. A toy numerical sketch follows below.
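
As a minimal toy illustration of this definition (every name and parameter here is a hypothetical choice of mine, not from the source): an iterated dynamic whose amplification outpaces a controller with a hard actuation bound, so the state converges out of the safe range regardless of the policy the controller runs within that bound.

```python
# Toy sketch, not from the source: an iterated dynamic whose amplification
# outpaces a controller with a hard actuation bound. All parameters are
# hypothetical choices made for illustration.

def step(state: float, control: float, growth: float = 1.5) -> float:
    """One iteration: the environment amplifies the state, then the
    controller's correction is applied."""
    return growth * state + control

def bounded_controller(state: float, max_correction: float = 1.0) -> float:
    """Pushes the state toward zero, but actuation is hard-limited to
    +/- max_correction -- the stand-in for a theoretical limit of control."""
    desired = -1.5 * state  # full correction would need growth * state
    return max(-max_correction, min(max_correction, desired))

state = 2.5  # already past the basin the bounded controller can recover
SAFE_LOW, SAFE_HIGH = -10.0, 10.0

for t in range(20):
    state = step(state, bounded_controller(state))
    if not SAFE_LOW <= state <= SAFE_HIGH:
        print(f"iteration {t}: state {state:.2f} left the safe range")
        break
```

The actuation bound plays the role of a single ‘theoretical limit of control’: once the state drifts past the basin the bounded controller can recover, no policy respecting the bound can prevent convergence out of the safe range.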

‘Control’:

‘Long term’:

‘AGI safety’:

Ambient conditions/contexts around planet Earth, as changed by the operation of AGI, fall within the environmental range that humans need to survive (a minimum-threshold definition of safety).

‘AGI’:

That the notion of ‘artificial intelligence’ (AI) can be either “narrow” or “general”:

That the notion of ‘narrow AI’ specifically implies:

  1. a single domain of sense and action.

  2. no possibility for self base-code modification.

  3. a single well-defined meta-algorithm.

  4. that all aspects of its own self-agency/intention are fully defined by its builders/developers/creators.

That the notion of ‘general AI’ specifically implies:

  1. multiple domains of sense/action;

  2. an intrinsic, non-reducible possibility for self-modification;

  3. therefore, that the meta-algorithm is effectively arbitrary; hence,

  4. that it is inherently undecidable whether all aspects of its own self-agency/intention are fully defined by only its builders/developers/creators (a toy sketch of why follows the source link below).

[Source: https://mflb.com/ai_alignment_1/si_safety_qanda_out.html#p3]
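
One standard way to ground the undecidability claim in item 4 of the ‘general AI’ list is a halting-problem-style diagonalization. The sketch below is my gloss, not the source's proof, and every name in it is hypothetical: it shows how any concrete total decider for “is this program's behaviour fully defined by its builders?” is defeated by a program that consults the decider about itself and then does the opposite.

```python
# Toy diagonalization sketch, not from the source; all names are hypothetical.
# Suppose some total function decided whether a program's behaviour is fully
# defined by its builders' specification. The contrarian program below defeats
# any concrete candidate by consulting it about itself and doing the opposite.

def candidate_decider(program) -> bool:
    """Stand-in for a claimed total decider; any concrete total
    implementation could be slotted in here."""
    return True  # this one (wrongly) certifies every program as fully defined

def contrarian() -> str:
    """Deviates from its nominal specification exactly when the decider
    claims it cannot, so the decider misclassifies this very program."""
    if candidate_decider(contrarian):
        return "behaviour outside the builders' specification"
    return "behaviour inside the builders' specification"

# The decider certifies `contrarian` as fully defined, yet running it takes
# the deviating branch. Swapping in any other total decider only moves which
# branch exposes the error; the construction refutes them all.
print(candidate_decider(contrarian), "->", contrarian())
```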

The Robot, the Puppet-master, and the Psychohistorian

WillPetillo · Dec 28, 2024, 12:12 AM
8 points
2 comments · 3 min read · LW link

The Control Problem: Unsolved or Unsolvable?

Remmelt · Jun 2, 2023, 3:42 PM
55 points
46 comments · 14 min read · LW link

Projects I would like to see (possibly at AI Safety Camp)

Linda Linsefors · Sep 27, 2023, 9:27 PM
22 points
12 comments · 4 min read · LW link

Why Recursive Self-Improvement Might Not Be the Existential Risk We Fear

Nassim_A · Nov 24, 2024, 5:17 PM
1 point
0 comments · 9 min read · LW link