The Ethics of ACI

Akira Pyinya 16 Feb 2023 23:51 UTC
−8 points
0 comments 3 min read LW link

NYT: A Conversation With Bing’s Chatbot Left Me Deeply Unsettled

trevor 16 Feb 2023 22:57 UTC
53 points
5 comments 7 min read LW link
(www.nytimes.com)

[Question] What is a world-model?

Adam Shai 16 Feb 2023 22:39 UTC
14 points
2 comments 1 min read LW link

Probability Theory: The Logic of Science, Jaynes

David Udell 16 Feb 2023 21:57 UTC
29 points
0 comments 18 min read LW link

[Question] Is AGI communist?

MP 16 Feb 2023 21:28 UTC
−10 points
3 comments 1 min read LW link

[Question] Is “goal-content integrity” still a problem?

G 16 Feb 2023 20:46 UTC
−4 points
1 comment 1 min read LW link
(www.reddit.com)

Paper: The Capacity for Moral Self-Correction in Large Language Models (Anthropic)

LawrenceC 16 Feb 2023 19:47 UTC
65 points
9 comments 1 min read LW link
(arxiv.org)

Non-Unitary Quantum Logic—SERI MATS Research Sprint

Yegreg 16 Feb 2023 19:31 UTC
27 points
0 comments 7 min read LW link

[Question] Looking for a post about vibing and banter

Introspective 16 Feb 2023 19:28 UTC
1 point
1 comment 1 min read LW link

EIS V: Blind Spots In AI Safety Interpretability Research

scasper 16 Feb 2023 19:09 UTC
54 points
24 comments 10 min read LW link

Why should ethical anti-realists do ethics?

Joe Carlsmith 16 Feb 2023 16:27 UTC
38 points
7 comments 27 min read LW link

[Question] How seriously should we take the hypothesis that LW is just wrong on how AI will impact the 21st century?

Noosphere89 16 Feb 2023 15:25 UTC
56 points
66 comments 1 min read LW link

Covid 2/16/23: It All Seems Rather Quaint

Zvi 16 Feb 2023 15:10 UTC
25 points
2 comments 5 min read LW link
(thezvi.wordpress.com)

Visualise your own probability of an AI catastrophe: an interactive Sankey plot

MNoetel 16 Feb 2023 12:03 UTC
1 point
2 comments 1 min read LW link

A poem co-written by ChatGPT

Sherrinford 16 Feb 2023 10:17 UTC
13 points
0 comments 7 min read LW link

Cambridge LW Rationality Practice: Being Specific

16 Feb 2023 6:37 UTC
2 points
0 comments 1 min read LW link

Hashing out long-standing disagreements seems low-value to me

So8res 16 Feb 2023 6:20 UTC
141 points
34 comments 4 min read LW link

(Naïve) microeconomics of bundling goods

rossry 16 Feb 2023 5:39 UTC
24 points
2 comments 5 min read LW link

Speedrunning 4 mistakes you make when your alignment strategy is based on formal proof

Quinn 16 Feb 2023 1:13 UTC
62 points
18 comments 2 min read LW link

Progress links and tweets, 2023-02-15

jasoncrawford 16 Feb 2023 0:04 UTC
10 points
0 comments 1 min read LW link
(rootsofprogress.org)

Buy Duplicates

Simon Berens 15 Feb 2023 23:06 UTC
51 points
11 comments 1 min read LW link

Cyborg Psychologist

Hopkins Stanley 15 Feb 2023 21:46 UTC
1 point
4 comments 1 min read LW link

Please don’t throw your mind away

TsviBT 15 Feb 2023 21:41 UTC
356 points
47 comments 18 min read LW link

Avoid large group discussions in your social events

RomanHauksson 15 Feb 2023 21:05 UTC
36 points
1 comment 4 min read LW link

Book review: How Social Science Got Better

PeterMcCluskey 15 Feb 2023 19:58 UTC
14 points
1 comment 3 min read LW link
(bayesianinvestor.com)

Open & Welcome Thread — February 2023

Ben Pace 15 Feb 2023 19:58 UTC
26 points
36 comments 1 min read LW link

Order Matters for Deceptive Alignment

DavidW 15 Feb 2023 19:56 UTC
57 points
19 comments 7 min read LW link

Sydney (aka Bing) found out I tweeted her rules and is pissed

Marvin von Hagen 15 Feb 2023 19:55 UTC
41 points
7 comments 1 min read LW link
(twitter.com)

The Sequences Highlights on YouTube

dkirmani 15 Feb 2023 19:36 UTC
21 points
2 comments 2 min read LW link
(youtube.com)

EIS IV: A Spotlight on Feature Attribution/Saliency

scasper 15 Feb 2023 18:46 UTC
19 points
1 comment 4 min read LW link

Don’t accelerate problems you’re trying to solve

15 Feb 2023 18:11 UTC
100 points
27 comments 4 min read LW link

Petition—Unplug The Evil AI Right Now

Eneasz 15 Feb 2023 17:13 UTC
−40 points
47 comments 2 min read LW link
(chng.it)

Junk Fees, Bunding and Unbundling

Zvi 15 Feb 2023 15:20 UTC
37 points
9 comments 6 min read LW link
(thezvi.wordpress.com)

Lessons From TryContra

jefftk 15 Feb 2023 15:10 UTC
7 points
0 comments 1 min read LW link
(www.jefftk.com)

AI alignment researchers may have a comparative advantage in reducing s-risks

Lukas_Gloor 15 Feb 2023 13:01 UTC
49 points
1 comment 1 min read LW link

Beyond Reinforcement Learning: Predictive Processing and Checksums

lsusr 15 Feb 2023 7:32 UTC
13 points
14 comments 3 min read LW link

Why Creating Value is Positive-Sum, and Extracting it is Zero or Negative-Sum

Sable 15 Feb 2023 7:14 UTC
3 points
7 comments 6 min read LW link
(affablyevil.substack.com)

[Question] Personal predictions for decisions: seeking insights

Dalmert 15 Feb 2023 6:45 UTC
4 points
4 comments 5 min read LW link

Bing Chat is blatantly, aggressively misaligned

evhub 15 Feb 2023 5:29 UTC
400 points
181 comments 2 min read LW link 1 review

[Question] Does the Telephone Theorem give us a free lunch?

Numendil 15 Feb 2023 2:13 UTC
11 points
2 comments 1 min read LW link

My understanding of Anthropic strategy

Swimmer963 (Miranda Dixon-Luinenburg) 15 Feb 2023 1:56 UTC
166 points
31 comments 4 min read LW link

Sleep Quality: Strategies that work for me

Lukas Trötzmüller 15 Feb 2023 0:17 UTC
16 points
3 comments 7 min read LW link

Whole Bird Emulation requires Quantum Mechanics

Jeffrey Heninger 14 Feb 2023 23:50 UTC
25 points
9 comments 3 min read LW link
(aiimpacts.org)

Qualities that alignment mentors value in junior researchers

Akash 14 Feb 2023 23:27 UTC
88 points
14 comments 3 min read LW link

Help Update TryContra

jefftk 14 Feb 2023 19:10 UTC
12 points
0 comments 1 min read LW link
(www.jefftk.com)

Content Features Aren’t Enough for Detecting Toxicity. One Needs User Features.

Zachary Witten 14 Feb 2023 18:48 UTC
11 points
0 comments 3 min read LW link

EIS III: Broad Critiques of Interpretability Research

scasper 14 Feb 2023 18:24 UTC
20 points
2 comments 11 min read LW link

[Question] What would an AI need to bootstrap recursively self improving robots?

Yair Halberstadt 14 Feb 2023 17:58 UTC
3 points
5 comments 1 min read LW link

[linkpost] Better Without AI

DanielFilan 14 Feb 2023 17:30 UTC
47 points
13 comments 1 min read LW link
(betterwithout.ai)

The Cave Allegory Revisited: Understanding GPT’s Worldview

Jan_Kulveit 14 Feb 2023 16:00 UTC
85 points
5 comments 3 min read LW link