Subagents

Tag

Why Subagents?

johnswentworth 1 Aug 2019 22:17 UTC
174 points
48 comments 7 min read LW link 1 review

Multi-agent predictive minds and AI alignment

Jan_Kulveit 12 Dec 2018 23:48 UTC
63 points
18 comments 10 min read LW link

Building up to an Internal Family Systems model

Kaj_Sotala 26 Jan 2019 12:25 UTC
281 points
86 comments 28 min read LW link 2 reviews

A non-mystical explanation of insight meditation and the three characteristics of existence: introduction and preamble

Kaj_Sotala 5 May 2020 19:09 UTC
134 points
40 comments 12 min read LW link

Mental Mountains

Scott Alexander 27 Nov 2019 5:30 UTC
151 points
14 comments 15 min read LW link 1 review
(slatestarcodex.com)

Forcing yourself to keep your identity small is self-harm

Gordon Seidoh Worley 3 Apr 2021 14:03 UTC
40 points
10 comments 2 min read LW link

Resolving internal conflicts requires listening to what parts want

Richard_Ngo 19 May 2023 0:04 UTC
61 points
0 comments 4 min read LW link

Quick thoughts on the implications of multi-agent views of mind on AI takeover

Kaj_Sotala 11 Dec 2023 6:34 UTC
46 points
14 comments 4 min read LW link

Book Summary: Consciousness and the Brain

Kaj_Sotala 16 Jan 2019 14:43 UTC
170 points
20 comments 26 min read LW link 1 review

The hostile telepaths problem

Valentine 27 Oct 2024 15:26 UTC
375 points
82 comments 15 min read LW link

My current take on Internal Family Systems “parts”

Kaj_Sotala 26 Jun 2022 17:40 UTC
93 points
11 comments 3 min read LW link
(kajsotala.fi)

Book summary: Unlocking the Emotional Brain

Kaj_Sotala 8 Oct 2019 19:11 UTC
326 points
48 comments 21 min read LW link 3 reviews

Consistently Inconsistent

Kaj_Sotala 4 Aug 2011 22:33 UTC
80 points
25 comments 5 min read LW link

Subagents, introspective awareness, and blending

Kaj_Sotala 2 Mar 2019 12:53 UTC
110 points
18 comments 9 min read LW link

Complex Behavior from Simple (Sub)Agents

moridinamael 10 May 2019 21:44 UTC
113 points
13 comments 9 min read LW link 1 review

Subagents, akrasia, and coherence in humans

Kaj_Sotala 25 Mar 2019 14:24 UTC
138 points
31 comments 16 min read LW link

Integrating disagreeing subagents

Kaj_Sotala 14 May 2019 14:06 UTC
146 points
15 comments 21 min read LW link

Subagents, neural Turing machines, thought selection, and blindspots

Kaj_Sotala 6 Aug 2019 21:15 UTC
87 points
3 comments 12 min read LW link

Subagents, trauma and rationality

Kaj_Sotala 14 Aug 2019 13:14 UTC
111 points
4 comments 19 min read LW link

[Question] How effective are tulpas?

Evenflair 9 Mar 2020 17:35 UTC
40 points
60 comments 2 min read LW link

Simulate and Defer To More Rational Selves

LoganStrohl 17 Sep 2014 18:11 UTC
215 points
114 comments 5 min read LW link

[Question] How to select a long-term goal and align my mind towards it?

Alexander 24 Dec 2021 11:40 UTC
19 points
8 comments 2 min read LW link

Shoulder Advisors 101

Duncan Sabien (Deactivated) 9 Oct 2021 5:30 UTC
198 points
124 comments 14 min read LW link 2 reviews

Seven Shiny Stories

Alicorn 1 Jun 2010 0:43 UTC
144 points
34 comments 7 min read LW link

Embedded Agency (full-text version)

15 Nov 2018 19:49 UTC
201 points
17 comments 54 min read LW link

Two Coordination Styles

abramdemski 7 Feb 2018 9:00 UTC
40 points
14 comments 7 min read LW link

Internalizing Internal Double Crux

TurnTrout 30 Apr 2018 18:23 UTC
35 points
12 comments 4 min read LW link

A Master-Slave Model of Human Preferences

Wei Dai 29 Dec 2009 1:02 UTC
97 points
94 comments 3 min read LW link

Self-empathy as a source of “willpower”

Academian 26 Oct 2010 14:20 UTC
83 points
32 comments 2 min read LW link

Robust Agency for People and Organizations

Raemon 19 Jul 2019 1:18 UTC
65 points
10 comments 12 min read LW link

A Framework for Internal Debugging

Matt Goldenberg 16 Jan 2019 16:04 UTC
44 points
3 comments 5 min read LW link

On Internal Family Systems and multi-agent minds: a reply to PJ Eby

Kaj_Sotala 29 Oct 2019 14:56 UTC
41 points
31 comments 25 min read LW link

City of Lights

Alicorn 31 Mar 2010 23:30 UTC
55 points
43 comments 4 min read LW link

Embedded Agency via Abstraction

johnswentworth 26 Aug 2019 23:03 UTC
42 points
20 comments 11 min read LW link

Intrapersonal negotiation

datadataeverywhere 23 Jan 2011 23:02 UTC
34 points
42 comments 4 min read LW link

Neural Basis for Global Workspace Theory

Hazard 22 Jun 2020 4:19 UTC
31 points
9 comments 8 min read LW link

Tentatively considering emotional stories (IFS and “getting into Self”)

Kaj_Sotala 30 Nov 2018 7:40 UTC
40 points
31 comments 4 min read LW link
(kajsotala.fi)

Strategic ignorance and plausible deniability

Kaj_Sotala 10 Aug 2011 9:30 UTC
62 points
59 comments 4 min read LW link

The Game of Masks

Slimepriestess 27 Apr 2022 18:03 UTC
50 points
18 comments 11 min read LW link
(hivewired.wordpress.com)

Should rationalists be spiritual / Spirituality as overcoming delusion

25 Mar 2024 16:48 UTC
49 points
57 comments 29 min read LW link

Indecision and internalized authority figures

Kaj_Sotala 6 Jul 2024 10:10 UTC
67 points
1 comment 2 min read LW link
(kajsotala.fi)

Resolving von Neumann-Morgenstern Inconsistent Preferences

niplav 22 Oct 2024 11:45 UTC
31 points
5 comments 58 min read LW link

Ayn Rand’s model of “living money”; and an upside of burnout

AnnaSalamon 16 Nov 2024 2:59 UTC
209 points
56 comments 5 min read LW link

Hierarchical Agency: A Missing Piece in AI Alignment

Jan_Kulveit 27 Nov 2024 5:49 UTC
104 points
20 comments 11 min read LW link

Mental subagent implications for AI Safety

moridinamael 3 Jan 2021 18:59 UTC
11 points
0 comments 3 min read LW link

The self-unalignment problem

14 Apr 2023 12:10 UTC
150 points
24 comments 10 min read LW link

Goodhart’s Law inside the human mind

Kaj_Sotala 17 Apr 2023 13:48 UTC
117 points
13 comments 16 min read LW link

Game Theory without Argmax [Part 1]

Cleo Nardo 11 Nov 2023 15:59 UTC
70 points
18 comments 19 min read LW link

Game Theory without Argmax [Part 2]

Cleo Nardo 11 Nov 2023 16:02 UTC
31 points
14 comments 13 min read LW link

Sequence introduction: non-agent and multiagent models of mind

Kaj_Sotala 7 Jan 2019 14:12 UTC
123 points
15 comments 7 min read LW link 1 review

System 2 as working-memory augmented System 1 reasoning

Kaj_Sotala 25 Sep 2019 8:39 UTC
109 points
23 comments 16 min read LW link

A mechanistic model of meditation

Kaj_Sotala 6 Nov 2019 21:37 UTC
136 points
11 comments 21 min read LW link

A non-mystical explanation of “no-self” (three characteristics series)

Kaj_Sotala 8 May 2020 10:37 UTC
113 points
65 comments 20 min read LW link 1 review

Craving, suffering, and predictive processing (three characteristics series)

Kaj_Sotala 15 May 2020 13:21 UTC
90 points
56 comments 19 min read LW link

From self to craving (three characteristics series)

Kaj_Sotala 22 May 2020 12:16 UTC
57 points
21 comments 11 min read LW link

On the construction of the self

Kaj_Sotala 29 May 2020 13:04 UTC
71 points
18 comments 17 min read LW link

Three characteristics: impermanence

Kaj_Sotala 5 Jun 2020 7:48 UTC
73 points
4 comments 18 min read LW link

Conflicts Between Mental Subagents: Expanding Wei Dai’s Master-Slave Model

Scott Alexander 4 Aug 2010 9:16 UTC
71 points
81 comments 10 min read LW link

Conditions under which misaligned subagents can (not) arise in classifiers

anon111 Jul 2018 1:52 UTC
12 points
2 comments 2 min read LW link

Synthesis of subagents: exercise

Julija Kobrinovich 20 Sep 2019 17:24 UTC
10 points
2 comments 14 min read LW link

What Value Subagents?

Gordon Seidoh Worley 20 Jul 2017 19:19 UTC
7 points
1 comment 4 min read LW link
(mapandterritory.org)

Wildfire of strategicness

TsviBT 5 Jun 2023 13:59 UTC
38 points
19 comments 1 min read LW link

Subagents of Cartesian Frames

Scott Garrabrant 2 Nov 2020 22:02 UTC
53 points
6 comments 8 min read LW link

Committing, Assuming, Externalizing, and Internalizing

Scott Garrabrant 9 Nov 2020 16:59 UTC
31 points
25 comments 10 min read LW link

Eight Definitions of Observability

Scott Garrabrant 10 Nov 2020 23:37 UTC
34 points
26 comments 12 min read LW link

One: a story

Richard_Ngo 10 Oct 2023 0:18 UTC
30 points
0 comments 4 min read LW link
(www.narrativeark.xyz)

Two Explorations

alkjash 16 Dec 2020 21:27 UTC
63 points
8 comments 9 min read LW link
(radimentary.wordpress.com)

Why Productivity Systems Don’t Stick

Matt Goldenberg 16 Jan 2021 17:45 UTC
61 points
22 comments 3 min read LW link

Non-Coercive Perfectionism

Matt Goldenberg 26 Jan 2021 16:53 UTC
25 points
25 comments 3 min read LW link

[Question] Anyone been through IFS or coherence therapy?

warrenjordan 15 Mar 2021 18:35 UTC
5 points
3 comments 1 min read LW link

Reward Is Not Enough

Steven Byrnes 16 Jun 2021 13:52 UTC
123 points
19 comments 10 min read LW link 1 review

Actually updating

SaraHax 23 Aug 2019 17:46 UTC
56 points
10 comments 4 min read LW link

Announcing the Alignment of Complex Systems Research Group

4 Jun 2022 4:10 UTC
91 points
20 comments 5 min read LW link

The horror of what must, yet cannot, be true

Kaj_Sotala 2 Jun 2022 10:20 UTC
53 points
18 comments 2 min read LW link
(kajsotala.fi)

Shard Theory: An Overview

David Udell 11 Aug 2022 5:44 UTC
166 points
34 comments 10 min read LW link

Many therapy schools work with inner multiplicity (not just IFS)

17 Sep 2022 10:27 UTC
52 points
16 comments 18 min read LW link

Internal communication framework

15 Nov 2022 12:41 UTC
38 points
13 comments 12 min read LW link

Slack matters more than any outcome

Valentine 31 Dec 2022 20:11 UTC
155 points
56 comments 19 min read LW link 1 review

Remarks 1–18 on GPT (compressed)

Cleo Nardo 20 Mar 2023 22:27 UTC
148 points
35 comments 31 min read LW link

Reflection of Hierarchical Relationship via Nuanced Conditioning of Game Theory Approach for AI Development and Utilization

Kyoung-cheol Kim 4 Jun 2021 7:20 UTC
2 points
2 comments 7 min read LW link

Selection processes for subagents

Ryan Kidd 30 Jun 2022 23:57 UTC
36 points
2 comments 9 min read LW link

Self and No-Self

Vaniver 29 Dec 2019 6:15 UTC
48 points
3 comments 2 min read LW link

A Cautionary Note on Unlocking the Emotional Brain

eapache 8 Feb 2020 17:21 UTC
54 points
20 comments 2 min read LW link

The Solitaire Principle: Game Theory for One

alkjash 17 Jan 2018 0:14 UTC
25 points
8 comments 9 min read LW link
(radimentary.wordpress.com)

TDT for Humans

alkjash 28 Feb 2018 5:40 UTC
26 points
7 comments 5 min read LW link
(radimentary.wordpress.com)

Which Parts Are “Me”?

Eliezer Yudkowsky 22 Oct 2008 18:15 UTC
66 points
117 comments 5 min read LW link

Beware Social Coping Strategies

Lulie 5 Feb 2018 4:48 UTC
51 points
24 comments 7 min read LW link

Make an appointment with your saner self

MalcolmOcean 8 Feb 2019 5:05 UTC
28 points
0 comments 4 min read LW link

Integrating Three Models of (Human) Cognition

jbkjr 23 Nov 2021 1:06 UTC
33 points
4 comments 32 min read LW link

Silence

alkjash 18 Mar 2018 4:10 UTC
60 points
17 comments 4 min read LW link
(radimentary.wordpress.com)

Additive and Multiplicative Subagents

Scott Garrabrant 6 Nov 2020 14:26 UTC
20 points
7 comments 12 min read LW link

Prune

alkjash 12 Jan 2018 22:50 UTC
71 points
10 comments 4 min read LW link
(radimentary.wordpress.com)

Prosaic misalignment from the Solomonoff Predictor

Cleo Nardo 9 Dec 2022 17:53 UTC
42 points
3 comments 5 min read LW link

A Clearer Thinking tool that teaches you to use Internal Family Systems concepts

spencerg 28 Apr 2023 13:42 UTC
31 points
1 comment 1 min read LW link
(programs.clearerthinking.org)

Species as Canonical Referents of Super-Organisms

Yudhister Kumar 18 Oct 2024 7:49 UTC
9 points
8 comments 2 min read LW link
(www.yudhister.me)

Alien parasite technical guy

PhilGoetz 27 Jul 2010 16:51 UTC
69 points
55 comments 3 min read LW link

Restricted Antinatalism on Subagents

Josephine 13 May 2021 1:48 UTC
3 points
1 comment 2 min read LW link