Truthseeking, EA, Simulacra levels, and other stuff

27 Oct 2023 23:56 UTC
44 points
12 comments · 9 min read · LW link

[Question] Do you believe “E=mc^2” is a correct and/or useful equation, and, whether yes or no, precisely what are your reasons for holding this belief (with such a degree of confidence)?

l8c · 27 Oct 2023 22:46 UTC
10 points
14 comments · 1 min read · LW link

Value systematization: how values become coherent (and misaligned)

Richard_Ngo · 27 Oct 2023 19:06 UTC
102 points
48 comments · 13 min read · LW link

Techno-humanism is techno-optimism for the 21st century

Richard_Ngo · 27 Oct 2023 18:37 UTC
88 points
5 comments · 14 min read · LW link
(www.mindthefuture.info)

Sanctuary for Humans

Nikola Jurkovic · 27 Oct 2023 18:08 UTC
21 points
9 comments · 1 min read · LW link

Wireheading and misalignment by composition on NetHack

pierlucadoro · 27 Oct 2023 17:43 UTC
34 points
4 comments · 4 min read · LW link

We’re Not Ready: thoughts on “pausing” and responsible scaling policies

HoldenKarnofsky · 27 Oct 2023 15:19 UTC
200 points
33 comments · 8 min read · LW link

Aspiration-based Q-Learning

27 Oct 2023 14:42 UTC
38 points
5 comments · 11 min read · LW link

Linkpost: Rishi Sunak’s Speech on AI (26th October)

bideup · 27 Oct 2023 11:57 UTC
85 points
8 comments · 7 min read · LW link
(www.gov.uk)

ASPR & WARP: Rationality Camps for Teens in Taiwan and Oxford

Anna Gajdova · 27 Oct 2023 8:40 UTC
18 points
0 comments · 1 min read · LW link

[Question] To what extent is the UK Government’s recent AI Safety push entirely due to Rishi Sunak?

Stephen Fowler · 27 Oct 2023 3:29 UTC
23 points
4 comments · 1 min read · LW link

Bayesian Punishment

Rob Lucas · 27 Oct 2023 3:24 UTC
1 point
1 comment · 6 min read · LW link

Online Dialogues Party — Sunday 5th November

Ben Pace · 27 Oct 2023 2:41 UTC
28 points
1 comment · 1 min read · LW link

OpenAI’s new Preparedness team is hiring

leopold · 26 Oct 2023 20:42 UTC
60 points
2 comments · 1 min read · LW link

Fake Deeply

Zack_M_Davis · 26 Oct 2023 19:55 UTC
33 points
7 comments · 1 min read · LW link
(unremediatedgender.space)

Symbol/Referent Confusions in Language Model Alignment Experiments

johnswentworth · 26 Oct 2023 19:49 UTC
94 points
44 comments · 6 min read · LW link

Unsupervised Methods for Concept Discovery in AlphaZero

aogara · 26 Oct 2023 19:05 UTC
9 points
0 comments · 1 min read · LW link
(arxiv.org)

[Question] Nonlinear limitations of ReLUs

magfrump · 26 Oct 2023 18:51 UTC
13 points
1 comment · 1 min read · LW link

AI Alignment Problem: Requirement not optional (A Critical Analysis through Mass Effect Trilogy)

TAWSIF AHMED · 26 Oct 2023 18:02 UTC
−9 points
0 comments · 4 min read · LW link

[Thought Experiment] Tomorrow’s Echo—The future of synthetic companionship.

Vimal Naran · 26 Oct 2023 17:54 UTC
−7 points
2 comments · 2 min read · LW link

Disagreements over the prioritization of existential risk from AI

Olivier Coutu · 26 Oct 2023 17:54 UTC
10 points
0 comments · 6 min read · LW link

[Question] What if AGI had its own universe to maybe wreck?

mseale · 26 Oct 2023 17:49 UTC
−1 points
2 comments · 1 min read · LW link

Changing Contra Dialects

jefftk · 26 Oct 2023 17:30 UTC
25 points
2 comments · 1 min read · LW link
(www.jefftk.com)

5 psychological reasons for dismissing x-risks from AGI

Igor Ivanov · 26 Oct 2023 17:21 UTC
24 points
6 comments · 4 min read · LW link

5. Risks from preventing legitimate value change (value collapse)

Nora_Ammann · 26 Oct 2023 14:38 UTC
13 points
1 comment · 9 min read · LW link

4. Risks from causing illegitimate value change (performative predictors)

Nora_Ammann · 26 Oct 2023 14:38 UTC
8 points
3 comments · 5 min read · LW link

3. Premise three & Conclusion: AI systems can affect value change trajectories & the Value Change Problem

Nora_Ammann · 26 Oct 2023 14:38 UTC
28 points
4 comments · 4 min read · LW link

2. Premise two: Some cases of value change are (il)legitimate

Nora_Ammann · 26 Oct 2023 14:36 UTC
24 points
7 comments · 6 min read · LW link

1. Premise one: Values are malleable

Nora_Ammann · 26 Oct 2023 14:36 UTC
21 points
1 comment · 15 min read · LW link

0. The Value Change Problem: introduction, overview and motivations

Nora_Ammann · 26 Oct 2023 14:36 UTC
32 points
0 comments · 5 min read · LW link

EPUBs of MIRI Blog Archives and selected LW Sequences

mesaoptimizer · 26 Oct 2023 14:17 UTC
44 points
6 comments · 1 min read · LW link
(git.sr.ht)

UK Government publishes “Frontier AI: capabilities and risks” Discussion Paper

A.H. · 26 Oct 2023 13:55 UTC
5 points
0 comments · 2 min read · LW link
(www.gov.uk)

AI #35: Responsible Scaling Policies

Zvi · 26 Oct 2023 13:30 UTC
66 points
10 comments · 55 min read · LW link
(thezvi.wordpress.com)

RA Bounty: Looking for feedback on screenplay about AI Risk

Writer · 26 Oct 2023 13:23 UTC
30 points
6 comments · 1 min read · LW link

Sensor Exposure can Compromise the Human Brain in the 2020s

trevor · 26 Oct 2023 3:31 UTC
17 points
6 comments · 10 min read · LW link

Notes on “How do we become confident in the safety of a machine learning system?”

RohanS · 26 Oct 2023 3:13 UTC
4 points
0 comments · 13 min read · LW link

Apply to the Constellation Visiting Researcher Program and Astra Fellowship, in Berkeley this Winter

Nate Thomas · 26 Oct 2023 3:07 UTC
42 points
10 comments · 1 min read · LW link

CHAI internship applications are open (due Nov 13)

Erik Jenner · 26 Oct 2023 0:53 UTC
34 points
0 comments · 3 min read · LW link

Architects of Our Own Demise: We Should Stop Developing AI Carelessly

Roko · 26 Oct 2023 0:36 UTC
170 points
75 comments · 3 min read · LW link

EA Infrastructure Fund: June 2023 grant recommendations

Linch · 26 Oct 2023 0:35 UTC
21 points
0 comments · 1 min read · LW link

Responsible Scaling Policies Are Risk Management Done Wrong

simeon_c · 25 Oct 2023 23:46 UTC
122 points
35 comments · 22 min read · LW link · 1 review
(www.navigatingrisks.ai)

AI as a science, and three obstacles to alignment strategies

So8res · 25 Oct 2023 21:00 UTC
185 points
80 comments · 11 min read · LW link

My hopes for alignment: Singular learning theory and whole brain emulation

Garrett Baker · 25 Oct 2023 18:31 UTC
61 points
5 comments · 12 min read · LW link

[Question] Lying to chess players for alignment

Zane · 25 Oct 2023 17:47 UTC
96 points
54 comments · 1 min read · LW link

Anthropic, Google, Microsoft & OpenAI announce Executive Director of the Frontier Model Forum & over $10 million for a new AI Safety Fund

Zach Stein-Perlman · 25 Oct 2023 15:20 UTC
31 points
8 comments · 4 min read · LW link
(www.frontiermodelforum.org)

“The Economics of Time Travel”—call for reviewers (Seeds of Science)

rogersbacon · 25 Oct 2023 15:13 UTC
4 points
2 comments · 1 min read · LW link

Compositional preference models for aligning LMs

Tomek Korbak · 25 Oct 2023 12:17 UTC
18 points
2 comments · 5 min read · LW link

[Question] Should the US House of Representatives adopt rank choice voting for leadership positions?

jmh · 25 Oct 2023 11:16 UTC
16 points
6 comments · 1 min read · LW link

Researchers believe they have found a way for artists to fight back against AI style capture

vernamcipher · 25 Oct 2023 10:54 UTC
3 points
1 comment · 1 min read · LW link
(finance.yahoo.com)

Why We Disagree

zulupineapple · 25 Oct 2023 10:50 UTC
7 points
2 comments · 2 min read · LW link