DragonGod comments on DragonGod’s Shortform

6 points

“Foundations of Intelligent Systems” not “Agent Foundations”

I don’t like the term “agent foundations” to describe the kind of research I am most interested in, because:

1. I am unconvinced that “agent” is the “true name” of the artifacts that would determine the shape of humanity’s long-term future.
    1. The most powerful artificial intelligent systems today do not cleanly fit into the agent ontology.
    2. Future foundation models are unlikely to cleanly conform to the agent archetype.
    3. Simulators/foundation models may be the first (and potentially final) form of transformative AI; salvation/catastrophe may be realised without the emergence of superhuman general agents.
    4. I worry that foundational theory that conditions too strongly on the agent ontology may end up being of limited applicability to the intelligent systems that would determine the long-term future of Earth-originating civilisation.
2. I am persuaded by/sympathetic to composite modular architectures for transformative AI, such as Drexler’s “Open Agency” or “Comprehensive AI Services”.
3. I do not want us to build generally capable, strongly autonomous agents; my best-case outcome for AI is for it to amplify human cognition and capabilities, not to replace/supersede us.
    1. I believe that cognition can be largely disconnected from volition.
    2. I also believe that, in principle, augmented/amplified humans (“cyborgs”) can be competitive with superhuman agents.
    3. I find the concept of an aligned sovereign (agent) very uncompelling. It runs counter to my values regarding autonomy and self-actualisation. Human enfeeblement would be a tragedy in its own right.

Research paradigms/approaches I see as belonging to “Foundations of Intelligent Systems”:

	Theoretical	Empirical
Descriptive	Quintin Pope and Alex Turner’s Shard Theory; John Wentworth’s “Basic Foundations for Agent Models”; John Wentworth’s “Natural Abstractions”; John Wentworth’s “Selection Theorems”; Andrew Critch’s “Boundaries”; DeepMind’s “Discovering Agents”	(Ambitious) Mechanistic Interpretability; other science of deep learning approaches
Normative	Garrabrant and Demski’s Embedded Agency; MIRI’s Highly Reliable Agent Design	???

Please suggest more!

DragonGod 3 Mar 2023 0:36 UTC
3 points
As always, let me know if you want me to publish this as a top-level post.