The Alignment Problems

Martín Soto · 12 Jan 2023 22:29 UTC
20 points
0 comments · 4 min read · LW link

Proposal for Inducing Steganography in LMs

Logan Riggs · 12 Jan 2023 22:15 UTC
22 points
3 comments · 2 min read · LW link

Announcing the 2023 PIBBSS Summer Research Fellowship

12 Jan 2023 21:31 UTC
32 points
0 comments · 1 min read · LW link

Victoria Krakovna on AGI Ruin, The Sharp Left Turn and Paradigms of AI Alignment

Michaël Trazzi · 12 Jan 2023 17:09 UTC
40 points
3 comments · 4 min read · LW link
(www.theinsideview.ai)

[Question] What is a disagreement you have around AI safety?

tailcalled · 12 Jan 2023 16:58 UTC
16 points
7 comments · 1 min read · LW link

Reward is not Necessary: How to Create a Compositional Self-Preserving Agent for Life-Long Learning

Roman Leventov · 12 Jan 2023 16:43 UTC
17 points
2 comments · 2 min read · LW link
(arxiv.org)

ChatGPT struggles to respond to the real world

Alex Flint · 12 Jan 2023 16:02 UTC
31 points
9 comments · 24 min read · LW link

Covid 1/12/23: Unexpected Spike in Deaths

Zvi · 12 Jan 2023 14:30 UTC
31 points
2 comments · 8 min read · LW link
(thezvi.wordpress.com)

[Linkpost] Scaling Laws for Generative Mixed-Modal Language Models

Amal · 12 Jan 2023 14:24 UTC
15 points
2 comments · 1 min read · LW link
(arxiv.org)

ea.domains—Domains Free to a Good Home

plex · 12 Jan 2023 13:32 UTC
24 points
0 comments · 1 min read · LW link

VIRTUA: a novel about AI alignment

Karl von Wendt · 12 Jan 2023 9:37 UTC
46 points
12 comments · 1 min read · LW link

Iron deficiencies are very bad and you should treat them

Elizabeth · 12 Jan 2023 9:10 UTC
101 points
30 comments · 11 min read · LW link
(acesounderglass.com)

Nonstandard analysis in ethics

Alok Singh · 12 Jan 2023 5:58 UTC
−1 points
0 comments · 78 min read · LW link
(nickbostrom.com)

Example of the nameless rationalist virtue

Alok Singh · 12 Jan 2023 5:45 UTC
−9 points
2 comments · 1 min read · LW link

FFMI Gains: A List of Vitalities

porby · 12 Jan 2023 4:48 UTC
26 points
1 comment · 7 min read · LW link

[Linkpost] DreamerV3: A General RL Architecture

simeon_c · 12 Jan 2023 3:55 UTC
23 points
3 comments · 1 min read · LW link
(arxiv.org)

Microsoft Plans to Invest $10B in OpenAI; $3B Invested to Date | Fortune

DragonGod · 12 Jan 2023 3:55 UTC
23 points
10 comments · 2 min read · LW link
(fortune.com)

Progress and research disruptiveness

Eleni Angelou · 12 Jan 2023 3:51 UTC
3 points
2 comments · 1 min read · LW link
(www.nature.com)

The Fable of the AI Coomer: Why the Social Prowess of Machines is AI’s Most Proximal Threat

Ace Delgado · 12 Jan 2023 1:15 UTC
−10 points
4 comments · 4 min read · LW link

Write to Think

Michael Samoilov · 12 Jan 2023 0:33 UTC
10 points
2 comments · 2 min read · LW link

Alignment is not enough

Alan Chan · 12 Jan 2023 0:33 UTC
12 points
6 comments · 11 min read · LW link
(coordination.substack.com)

How it feels to have your mind hacked by an AI

blaked · 12 Jan 2023 0:33 UTC
361 points
221 comments · 17 min read · LW link

Categorical-measure-theoretic approach to optimal policies tending to seek power

jacek · 12 Jan 2023 0:32 UTC
31 points
3 comments · 6 min read · LW link

Any person/mind should have the right to suicide

askofa · 12 Jan 2023 0:32 UTC
14 points
13 comments · 2 min read · LW link

Have we really forsaken natural selection?

KatjaGrace · 12 Jan 2023 0:10 UTC
34 points
7 comments · 2 min read · LW link
(worldspiritsockpuppet.com)

[Question] Using Finite Factored Sets for Causal Representation Learning?

David Reber · 11 Jan 2023 22:06 UTC
2 points
3 comments · 1 min read · LW link

GWWC’s Handling of Conflicting Funding Bars

jefftk · 11 Jan 2023 20:30 UTC
19 points
0 comments · 3 min read · LW link
(www.jefftk.com)

How to write a big cartesian product symbol in MathJax

Matthias G. Mayer · 11 Jan 2023 20:21 UTC
6 points
1 comment · 1 min read · LW link

What’s the deal with AI consciousness?

TW123 · 11 Jan 2023 16:37 UTC
6 points
13 comments · 9 min read · LW link
(aiwatchtower.substack.com)

[Question] Any significant updates on long covid risk analysis?

Randomized, Controlled · 11 Jan 2023 14:31 UTC
23 points
11 comments · 1 min read · LW link

internal in nonstandard analysis

Alok Singh · 11 Jan 2023 9:58 UTC
9 points
1 comment · 1 min read · LW link

Compounding Resource X

Raemon · 11 Jan 2023 3:14 UTC
77 points
6 comments · 9 min read · LW link

Running With a Backpack

jefftk · 11 Jan 2023 3:00 UTC
19 points
11 comments · 1 min read · LW link
(www.jefftk.com)

A simple thought experiment showing why recessions are an unnecessary bug in our economic system

skogsnisse · 11 Jan 2023 0:43 UTC
1 point
1 comment · 1 min read · LW link

We don’t trade with ants

KatjaGrace · 10 Jan 2023 23:50 UTC
269 points
108 comments · 7 min read · LW link
(worldspiritsockpuppet.com)

[Question] Who are the people who are currently profiting from inflation?

skogsnisse · 10 Jan 2023 21:39 UTC
1 point
2 comments · 1 min read · LW link

Is Progress Real?

rogersbacon · 10 Jan 2023 17:42 UTC
5 points
14 comments · 14 min read · LW link
(www.secretorum.life)

200 COP in MI: Interpreting Reinforcement Learning

Neel Nanda · 10 Jan 2023 17:37 UTC
25 points
1 comment · 10 min read · LW link

AGI and the EMH: markets are not expecting aligned or unaligned AI in the next 30 years

10 Jan 2023 16:06 UTC
117 points
44 comments · 26 min read · LW link

The Alignment Problem from a Deep Learning Perspective (major rewrite)

10 Jan 2023 16:06 UTC
84 points
8 comments · 39 min read · LW link
(arxiv.org)

Against using stock prices to forecast AI timelines

10 Jan 2023 16:03 UTC
23 points
2 comments · 2 min read · LW link

Sorting Pebbles Into Correct Heaps: The Animation

Writer · 10 Jan 2023 15:58 UTC
26 points
2 comments · 1 min read · LW link
(youtu.be)

Escape Velocity from Bullshit Jobs

Zvi · 10 Jan 2023 14:30 UTC
61 points
18 comments · 5 min read · LW link
(thezvi.wordpress.com)

Scaling laws vs individual differences

beren · 10 Jan 2023 13:22 UTC
44 points
21 comments · 7 min read · LW link

Notes on writing

RP · 10 Jan 2023 4:01 UTC
35 points
11 comments · 3 min read · LW link

Idea: Learning How To Move Towards The Metagame

Algon · 10 Jan 2023 0:58 UTC
10 points
3 comments · 1 min read · LW link

Review AI Alignment posts to help figure out how to make a proper AI Alignment review

10 Jan 2023 0:19 UTC
85 points
31 comments · 2 min read · LW link

Against the paradox of tolerance

pchvykov · 10 Jan 2023 0:12 UTC
1 point
11 comments · 3 min read · LW link

Increased Scam Quality/Quantity (Hypothesis in need of data)?

Beeblebrox · 9 Jan 2023 22:57 UTC
9 points
6 comments · 1 min read · LW link

Wentworth and Larsen on buying time

9 Jan 2023 21:31 UTC
73 points
6 comments · 12 min read · LW link