Simulator Theory

Last edit: Dec 30, 2024, 9:49 AM by Dakara

Simulator Theory (in the context of AI) is an ontology or frame for understanding how large generative models, such as OpenAI's GPT series, work. Broadly, it views these models as simulating a learned distribution with varying degrees of fidelity; for language models trained on a large corpus of text, that distribution reflects the mechanics underlying our world.
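
A minimal sketch can make this framing concrete. The snippet below is illustrative only (it assumes the Hugging Face transformers library and the public gpt2 checkpoint; the prompts are invented): one fixed set of weights "simulates" different processes depending purely on the text it is conditioned on.

```python
# Illustrative sketch of the simulator framing (assumes the Hugging Face
# `transformers` library and the public "gpt2" checkpoint; the prompts
# are invented). One fixed network rolls out different "simulacra"
# depending only on the text it is conditioned on.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompts = [
    "The helpful librarian answered:",  # conditions toward one simulacrum
    "The pirate captain bellowed:",     # same weights, a different simulacrum
]

for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt")
    # Sampling from the learned next-token distribution is the "simulation";
    # nothing about the model changes between prompts.
    output = model.generate(
        **inputs,
        do_sample=True,
        max_new_tokens=30,
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token
    )
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

On this view, neither output reveals "the model's own goals"; the prompt selects which part of the learned distribution gets rolled out.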

It can also refer to an alignment research agenda that deals with better understanding simulator conditionals, the effects of downstream training, and alignment-relevant properties of language models such as myopia and agency, as well as with using these models as alignment research accelerators. See also: Cyborgism
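
For a flavor of what studying "simulator conditionals" can look like, here is a toy sketch (again assuming the transformers library and the gpt2 checkpoint; the prompts and target token are invented for illustration) that measures how different conditionals shift the probability the model assigns to the same next token.

```python
# Toy probe of a "simulator conditional" (assumes `transformers` and the
# public "gpt2" checkpoint; prompts and target are illustrative, not a
# real alignment eval). It measures how conditioning shifts the
# probability the model assigns to the same next token.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def next_token_prob(prompt: str, target: str) -> float:
    """Probability that the first token of `target` comes next after `prompt`."""
    target_id = tokenizer.encode(target)[0]
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(input_ids).logits[0, -1]  # logits for the next token
    return torch.softmax(logits, dim=-1)[target_id].item()

# The same question under two conditionals: any shift in probability is a
# property of the conditioning, not of any change to the model itself.
for prompt in [
    "Q: Is the earth flat? A:",
    "A flat-earther was asked: Is the earth flat? He said:",
]:
    print(prompt, "->", next_token_prob(prompt, " Yes"))
```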

Simulators

janus · Sep 2, 2022, 12:45 PM
631 points
168 comments · 41 min read · LW link · 8 reviews
(generative.ink)

Conditioning Predictive Models: Large language models as predictors

Feb 2, 2023, 8:28 PM
88 points
4 comments · 13 min read · LW link

The Compleat Cybornaut

May 19, 2023, 8:44 AM
65 points
2 comments · 16 min read · LW link

Why Simulator AIs want to be Active Inference AIs

Apr 10, 2023, 6:23 PM
93 points
9 comments · 8 min read · LW link · 1 review

The Waluigi Effect (mega-post)

Cleo Nardo · Mar 3, 2023, 3:22 AM
628 points
188 comments · 16 min read · LW link

Goodbye, Shoggoth: The Stage, its Animatronics, & the Puppeteer – a New Metaphor

RogerDearnaley · Jan 9, 2024, 8:42 PM
47 points
8 comments · 36 min read · LW link

How to Control an LLM’s Behavior (why my P(DOOM) went down)

RogerDearnaley · Nov 28, 2023, 7:56 PM
64 points
30 comments · 11 min read · LW link

Motivating Alignment of LLM-Powered Agents: Easy for AGI, Hard for ASI?

RogerDearnaley · Jan 11, 2024, 12:56 PM
35 points
4 comments · 39 min read · LW link

Simulacra are Things

janus · Jan 8, 2023, 11:03 PM
63 points
7 comments · 2 min read · LW link

Conditioning Generative Models for Alignment

Jozdien · Jul 18, 2022, 7:11 AM
59 points
8 comments · 20 min read · LW link

‘simulator’ framing and confusions about LLMs

Beth Barnes · Dec 31, 2022, 11:38 PM
104 points
11 comments · 4 min read · LW link

[Simulators seminar sequence] #1 Background & shared assumptions

Jan 2, 2023, 11:48 PM
50 points
4 comments · 3 min read · LW link

A smart enough LLM might be deadly simply if you run it for long enough

Mikhail Samin · May 5, 2023, 8:49 PM
19 points
16 comments · 8 min read · LW link

Agents vs. Predictors: Concrete differentiating factors

evhub · Feb 24, 2023, 11:50 PM
37 points
3 comments · 4 min read · LW link

Super-Luigi = Luigi + (Luigi - Waluigi)

Alexei · Mar 17, 2023, 3:27 PM
16 points
9 comments · 1 min read · LW link

GPTs are Predictors, not Imitators

Eliezer Yudkowsky · Apr 8, 2023, 7:59 PM
416 points
99 comments · 3 min read · LW link · 3 reviews

Remarks 1–18 on GPT (compressed)

Cleo Nardo · Mar 20, 2023, 10:27 PM
145 points
35 comments · 31 min read · LW link

Using predictors in corrigible systems

porby · Jul 19, 2023, 10:29 PM
19 points
6 comments · 27 min read · LW link

Implications of simulators

TW123 · Jan 7, 2023, 12:37 AM
17 points
0 comments · 12 min read · LW link

[Question] Goals of model vs. goals of simulacra?

dr_s · Apr 12, 2023, 1:02 PM
5 points
7 comments · 1 min read · LW link

FAQ: What the heck is goal agnosticism?

porby · Oct 8, 2023, 7:11 PM
66 points
38 comments · 28 min read · LW link

RecurrentGPT: a loom-type tool with a twist

mishka · May 25, 2023, 5:09 PM
10 points
0 comments · 3 min read · LW link
(arxiv.org)

Inner Misalignment in “Simulator” LLMs

Adam Scherlis · Jan 31, 2023, 8:33 AM
84 points
12 comments · 4 min read · LW link

Conditioning Predictive Models: Outer alignment via careful conditioning

Feb 2, 2023, 8:28 PM
72 points
15 comments · 57 min read · LW link

Conditioning Predictive Models: Deployment strategy

Feb 9, 2023, 8:59 PM
28 points
0 comments · 10 min read · LW link

Two problems with ‘Simulators’ as a frame

ryan_greenblatt · Feb 17, 2023, 11:34 PM
81 points
13 comments · 5 min read · LW link

You’re not a simulation, ’cause you’re hallucinating

Stuart_Armstrong · Feb 21, 2023, 12:12 PM
25 points
6 comments · 1 min read · LW link

One path to coherence: conditionalization

porby · Jun 29, 2023, 1:08 AM
28 points
4 comments · 4 min read · LW link

Why do we assume there is a “real” shoggoth behind the LLM? Why not masks all the way down?

Robert_AIZI · Mar 9, 2023, 5:28 PM
63 points
48 comments · 2 min read · LW link

The algorithm isn’t doing X, it’s just doing Y.

Cleo Nardo · Mar 16, 2023, 11:28 PM
53 points
43 comments · 5 min read · LW link

Notes on Antelligence

Aurigena · May 13, 2023, 6:38 PM
2 points
0 comments · 9 min read · LW link

The (local) unit of intelligence is FLOPs

boazbarak · Jun 5, 2023, 6:23 PM
42 points
7 comments · 5 min read · LW link

Philosophical Cyborg (Part 1)

Jun 14, 2023, 4:20 PM
31 points
4 comments · 13 min read · LW link

Higher Dimension Cartesian Objects and Aligning ‘Tiling Simulators’

lukemarks · Jun 11, 2023, 12:13 AM
22 points
0 comments · 5 min read · LW link

Philosophical Cyborg (Part 2)...or, The Good Successor

ukc10014 · Jun 21, 2023, 3:43 PM
21 points
1 comment · 31 min read · LW link

Partial Simulation Extrapolation: A Proposal for Building Safer Simulators

lukemarks · Jun 17, 2023, 1:55 PM
16 points
0 comments · 10 min read · LW link

Collective Identity

May 18, 2023, 9:00 AM
59 points
12 comments · 8 min read · LW link

How I Learned To Stop Worrying And Love The Shoggoth

Peter Merel · Jul 12, 2023, 5:47 PM
9 points
15 comments · 5 min read · LW link

Unsafe AI as Dynamical Systems

Robert_AIZI · Jul 14, 2023, 3:31 PM
11 points
0 comments · 3 min read · LW link
(aizi.substack.com)

Memetic Judo #3: The Intelligence of Stochastic Parrots v.2

Max TK · Aug 20, 2023, 3:18 PM
8 points
33 comments · 6 min read · LW link

The Löbian Obstacle, And Why You Should Care

lukemarks · Sep 7, 2023, 11:59 PM
18 points
6 comments · 2 min read · LW link

The utility of humans within a Super Artificial Intelligence realm.

Marc Monroy · Oct 11, 2023, 5:30 PM
1 point
0 comments · 7 min read · LW link

Revealing Intentionality In Language Models Through AdaVAE Guided Sampling

jdp · Oct 20, 2023, 7:32 AM
119 points
15 comments · 22 min read · LW link

Conditioning Generative Models

Adam Jermyn · Jun 25, 2022, 10:15 PM
24 points
18 comments · 10 min read · LW link

When can a mimic surprise you? Why generative models handle seemingly ill-posed problems

David Johnston · Nov 5, 2022, 1:19 PM
8 points
4 comments · 16 min read · LW link

AGI-level reasoner will appear sooner than an agent; what the humanity will do with this reasoner is critical

Roman Leventov · Jul 30, 2022, 8:56 PM
24 points
10 comments · 1 min read · LW link

Simulators, constraints, and goal agnosticism: porbynotes vol. 1

porby · Nov 23, 2022, 4:22 AM
37 points
2 comments · 35 min read · LW link

Prosaic misalignment from the Solomonoff Predictor

Cleo Nardo · Dec 9, 2022, 5:53 PM
42 points
3 comments · 5 min read · LW link

Steering Behaviour: Testing for (Non-)Myopia in Language Models

Dec 5, 2022, 8:28 PM
40 points
19 comments · 10 min read · LW link

The Limit of Language Models

DragonGod · Jan 6, 2023, 11:53 PM
44 points
26 comments · 4 min read · LW link

[Simulators seminar sequence] #2 Semiotic physics—revamped

Feb 27, 2023, 12:25 AM
24 points
23 comments · 13 min read · LW link

[Question] Could Simulating an AGI Taking Over the World Actually Lead to a LLM Taking Over the World?

simeon_c · Jan 13, 2023, 6:33 AM
15 points
1 comment · 1 min read · LW link

[ASoT] Simulators show us behavioural properties by default

Jozdien · Jan 13, 2023, 6:42 PM
35 points
3 comments · 3 min read · LW link

Underspecification of Oracle AI

Jan 15, 2023, 8:10 PM
30 points
12 comments · 19 min read · LW link

Gradient Filtering

Jan 18, 2023, 8:09 PM
55 points
16 comments · 13 min read · LW link

Conditioning Predictive Models: The case for competitiveness

Feb 6, 2023, 8:08 PM
20 points
3 comments · 11 min read · LW link

Conditioning Predictive Models: Making inner alignment as easy as possible

Feb 7, 2023, 8:04 PM
27 points
2 comments · 19 min read · LW link

Conditioning Predictive Models: Interactions with other approaches

Feb 8, 2023, 6:19 PM
32 points
2 comments · 11 min read · LW link

Cyborgism

Feb 10, 2023, 2:47 PM
336 points
46 comments · 35 min read · LW link · 2 reviews

A note on ‘semiotic physics’

metasemi · Feb 11, 2023, 5:12 AM
11 points
13 comments · 6 min read · LW link

Pretraining Language Models with Human Preferences

Feb 21, 2023, 5:57 PM
135 points
20 comments · 11 min read · LW link · 2 reviews

Implied “utilities” of simulators are broad, dense, and shallow

porby · Mar 1, 2023, 3:23 AM
45 points
7 comments · 3 min read · LW link

Instrumentality makes agents agenty

porby · Feb 21, 2023, 4:28 AM
20 points
7 comments · 6 min read · LW link

Situational awareness in Large Language Models

Simon Möller · Mar 3, 2023, 6:59 PM
31 points
2 comments · 7 min read · LW link

[ASoT] Finetuning, RL, and GPT’s world prior

Jozdien · Dec 2, 2022, 4:33 PM
44 points
8 comments · 5 min read · LW link

On the future of language models

owencb · Dec 20, 2023, 4:58 PM
105 points
17 comments · 1 min read · LW link

OpenAI Credit Account (2510$)

Emirhan BULUT · Jan 21, 2024, 2:32 AM
1 point
0 comments · 1 min read · LW link

The case for more ambitious language model evals

Jozdien · Jan 30, 2024, 12:01 AM
112 points
30 comments · 5 min read · LW link

Interview with Robert Kralisch on Simulators

WillPetillo · Aug 26, 2024, 5:49 AM
17 points
0 comments · 75 min read · LW link

Places of Loving Grace [Story]

ank · Feb 18, 2025, 11:49 PM
−1 points
0 comments · 4 min read · LW link

Language and Capabilities: Testing LLM Mathematical Abilities Across Languages

Ethan Edwards · Apr 4, 2024, 1:18 PM
24 points
2 comments · 36 min read · LW link

A Review of In-Context Learning Hypotheses for Automated AI Alignment Research

alamerton · Apr 18, 2024, 6:29 PM
25 points
4 comments · 16 min read · LW link

How are Simulators and Agents related?

Robert Kralisch · Apr 29, 2024, 12:22 AM
6 points
0 comments · 7 min read · LW link

Karpenchuk’s Theory: Human Life as a Simulation for Consciousness Development

Karpenchuk Bohdan · Aug 2, 2024, 12:03 AM
1 point
0 comments · 2 min read · LW link

Using ideologically-charged language to get gpt-3.5-turbo to disobey its system prompt: a demo

Milan W · Aug 24, 2024, 12:13 AM
3 points
0 comments · 6 min read · LW link

The Trinity Architect Hypothesis (A fusion of The Trinity Paradox & The Architect’s Cycle)

kaninwithrice · Feb 24, 2025, 4:40 AM
1 point
0 comments · 2 min read · LW link

The Fractal Hypothesis: Are We Already in a Simulation?

Quan · Jan 9, 2025, 2:53 AM
1 point
0 comments · 3 min read · LW link

Replicators, Gods and Buddhist Cosmology

KristianRonn · Jan 16, 2025, 10:51 AM
10 points
3 comments · 26 min read · LW link

How To Prevent a Dystopia

ank · Jan 29, 2025, 2:16 PM
−3 points
4 comments · 1 min read · LW link

Rational Utopia & Narrow Way There: Multiversal AI Alignment, Non-Agentic Static Place AI, New Ethics… (V. 4)

ank · Feb 11, 2025, 3:21 AM
13 points
8 comments · 35 min read · LW link

Early Results: Do LLMs complete false equations with false equations?

Robert_AIZI · Mar 30, 2023, 8:14 PM
14 points
0 comments · 4 min read · LW link
(aizi.substack.com)

ICA Simulacra

Ozyrus · Apr 5, 2023, 6:41 AM
26 points
2 comments · 7 min read · LW link

Alignment of AutoGPT agents

Ozyrus · Apr 12, 2023, 12:54 PM
14 points
1 comment · 4 min read · LW link

Research Report: Incorrectness Cascades

Robert_AIZI · Apr 14, 2023, 12:49 PM
19 points
0 comments · 10 min read · LW link
(aizi.substack.com)

I was Wrong, Simulator Theory is Real

Robert_AIZI · Apr 26, 2023, 5:45 PM
75 points
7 comments · 3 min read · LW link
(aizi.substack.com)

[Question] Impressions from base-GPT-4?

mishka · Nov 8, 2023, 5:43 AM
25 points
25 comments · 1 min read · LW link

Is Interpretability All We Need?

RogerDearnaley · Nov 14, 2023, 5:31 AM
1 point
1 comment · 1 min read · LW link

Simulators Increase the Likelihood of Alignment by Default

Wuschel Schulz · Apr 30, 2023, 4:32 PM
13 points
1 comment · 5 min read · LW link

Research Report: Incorrectness Cascades (Corrected)

Robert_AIZI · May 9, 2023, 9:54 PM
9 points
0 comments · 9 min read · LW link
(aizi.substack.com)