
Robust Agents

Last edit: Sep 14, 2020, 11:17 PM by Ruby

Robust Agents are decision-makers who can perform well in a variety of situations. Whereas some humans rely on folk wisdom or instinct, and some AIs might be designed to achieve only a narrow set of goals, a Robust Agent has a coherent set of values and decision-procedures. This coherence enables it to adapt to new circumstances, such as succeeding in an unfamiliar environment or responding to a competitor's new strategy.
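As a purely illustrative sketch of this distinction (the toy states, outcomes, and function names below are invented for this page, not drawn from any of the tagged posts): a narrow agent applies a fixed rule tuned to familiar situations, while a robust agent derives its action from explicitly stated values plus a world model, so the same decision procedure carries over to situations it was never tuned for.

```python
# Hypothetical toy example, illustrative only.
ACTIONS = ["left", "right"]

# Toy world model: predicted outcome of taking an action in a state.
OUTCOMES = {
    "sunny": {"left": 1, "right": 0},
    "rainy": {"left": 0, "right": 2},
    "storm": {"left": 3, "right": -1},  # a situation the narrow rule never saw
}

def narrow_agent(state):
    # Fixed heuristic tuned to sunny/rainy; it carries no notion of "why".
    return "left" if state == "sunny" else "right"

def robust_agent(state):
    # Coherent values (utility) + decision procedure (maximize predicted value).
    utility = lambda outcome: outcome  # values are stated explicitly
    return max(ACTIONS, key=lambda a: utility(OUTCOMES[state][a]))

for state in ["sunny", "rainy", "storm"]:
    print(state, narrow_agent(state), robust_agent(state))
# In "storm", the narrow rule picks "right" (outcome -1), while the
# robust agent picks "left" (outcome 3) without being re-tuned.
```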

Being a Robust Agent

Raemon, Oct 18, 2018, 7:00 AM
151 points
32 comments, 7 min read, LW link, 2 reviews

Upcoming stability of values

Stuart_Armstrong, Mar 15, 2018, 11:36 AM
15 points
15 comments, 2 min read, LW link

Robust Agency for People and Organizations

Raemon, Jul 19, 2019, 1:18 AM
65 points
10 comments, 12 min read, LW link

Desiderata for an AI

Nathan Helm-Burger, Jul 19, 2023, 4:18 PM
9 points
0 comments, 4 min read, LW link

Security Mindset and Ordinary Paranoia

Eliezer Yudkowsky, Nov 25, 2017, 5:53 PM
132 points
25 comments, 29 min read, LW link

Gradations of Agency

Daniel Kokotajlo, May 23, 2022, 1:10 AM
41 points
6 comments, 5 min read, LW link

Humans are very reliable agents

alyssavance, Jun 16, 2022, 10:02 PM
269 points
35 comments, 3 min read, LW link

[Question] What if memes are common in highly capable minds?

Daniel Kokotajlo, Jul 30, 2020, 8:45 PM
40 points
13 comments, 2 min read, LW link

Embedded Agency (full-text version)

Nov 15, 2018, 7:49 PM
201 points
17 comments, 54 min read, LW link

Robust Delegation

Nov 4, 2018, 4:38 PM
116 points
10 comments, 1 min read, LW link

The Power of Agency

lukeprog, May 7, 2011, 1:38 AM
113 points
78 comments, 1 min read, LW link

Subagents, akrasia, and coherence in humans

Kaj_Sotala, Mar 25, 2019, 2:24 PM
140 points
31 comments, 16 min read, LW link

On Being Robust

TurnTrout, Jan 10, 2020, 3:51 AM
45 points
7 comments, 2 min read, LW link

An angle of attack on Open Problem #1

Benya, Aug 18, 2012, 12:08 PM
48 points
85 comments, 7 min read, LW link

Vingean Reflection: Reliable Reasoning for Self-Improving Agents

So8res, Jan 15, 2015, 10:47 PM
37 points
5 comments, 9 min read, LW link

Even Superhuman Go AIs Have Surprising Failure Modes

Jul 20, 2023, 5:31 PM
129 points
22 comments, 10 min read, LW link
(far.ai)

Thoughts on the 5-10 Problem

Tofly, Jul 18, 2019, 6:56 PM
18 points
11 comments, 1 min read, LW link

Can we achieve AGI Alignment by balancing multiple human objectives?

Ben Smith, Jul 3, 2022, 2:51 AM
11 points
1 comment, 4 min read, LW link

Sets of objectives for a multi-objective RL agent to optimize

Nov 23, 2022, 6:49 AM
13 points
0 comments, 8 min read, LW link

Reward is not Necessary: How to Create a Compositional Self-Preserving Agent for Life-Long Learning

Roman Leventov, Jan 12, 2023, 4:43 PM
17 points
2 comments, 2 min read, LW link
(arxiv.org)

Temporally Layered Architecture for Adaptive, Distributed and Continuous Control

Roman Leventov, Feb 2, 2023, 6:29 AM
6 points
4 comments, 1 min read, LW link
(arxiv.org)

A multi-disciplinary view on AI safety research

Roman Leventov, Feb 8, 2023, 4:50 PM
46 points
4 comments, 26 min read, LW link

Robustness to Scale

Scott Garrabrant, Feb 21, 2018, 10:55 PM
130 points
23 comments, 2 min read, LW link, 1 review

On agentic generalist models: we’re essentially using existing technology the weakest and worst way you can use it

Yuli_Ban, Aug 28, 2024, 1:57 AM
10 points
2 comments, 9 min read, LW link

[Aspiration-based designs] 2. Formal framework, basic algorithm

Apr 28, 2024, 1:02 PM
17 points
2 comments, 16 min read, LW link

Beyond the Board: Exploring AI Robustness Through Go

AdamGleave, Jun 19, 2024, 4:40 PM
41 points
2 comments, 1 min read, LW link
(far.ai)

Automated monitoring systems

hiki_t, Nov 28, 2024, 6:54 PM
1 point
0 comments, 2 min read, LW link

We need a universal definition of ‘agency’ and related words

CstineSublime, Jan 11, 2025, 3:22 AM
18 points
1 comment, 5 min read, LW link

AISC project: SatisfIA – AI that satisfies without overdoing it

Jobst Heitzig, Nov 11, 2023, 6:22 PM
12 points
0 comments, 1 min read, LW link
(docs.google.com)

Security Mindset and the Logistic Success Curve

Eliezer Yudkowsky, Nov 26, 2017, 3:58 PM
106 points
49 comments, 20 min read, LW link

Reflection in Probabilistic Logic

Eliezer Yudkowsky, Mar 24, 2013, 4:37 PM
112 points
168 comments, 3 min read, LW link

Tiling Agents for Self-Modifying AI (OPFAI #2)

Eliezer Yudkowsky, Jun 6, 2013, 8:24 PM
88 points
259 comments, 3 min read, LW link

2-D Robustness

Vlad Mikulik, Aug 30, 2019, 8:27 PM
85 points
8 comments, 2 min read, LW link

Metaphilosophical competence can’t be disentangled from alignment

zhukeepa, Apr 1, 2018, 12:38 AM
46 points
39 comments, 3 min read, LW link