Take 9: No, RLHF/IDA/debate doesn’t solve outer alignment.

Charlie Steiner
Dec 12, 2022, 11:51 AM
33 points
13 comments
2 min read
LW link

Creating a database for base rates

nikos
Dec 12, 2022, 10:09 AM
2 points
1 comment
3 min read
LW link
(forum.effectivealtruism.org)

Trivial GPT-3.5 limitation workaround

Dave Lindbergh
Dec 12, 2022, 8:42 AM
5 points
4 comments
1 min read
LW link

Ponzi schemes can be highly profitable if your timing is good

GeneSmith
Dec 12, 2022, 6:42 AM
10 points
18 comments
5 min read
LW link

Prodding ChatGPT to solve a basic algebra problem

Shmi
Dec 12, 2022, 4:09 AM
14 points
6 comments
1 min read
LW link
(twitter.com)

Wider Default Audio Player in Chrome?

jefftk
Dec 12, 2022, 3:30 AM
11 points
2 comments
1 min read
LW link
(www.jefftk.com)

A brainteaser for language models

Adam Scherlis
Dec 12, 2022, 2:43 AM
47 points
3 comments
2 min read
LW link

Benchmarks for Comparing Human and AI Intelligence

MrThink
Dec 11, 2022, 10:06 PM
9 points
4 comments
2 min read
LW link

Reflections on the PIBBSS Fellowship 2022

Dec 11, 2022, 9:53 PM
32 points
0 comments
18 min read
LW link

A crisis for online communication: bots and bot users will overrun the Internet?

Mitchell_Porter
Dec 11, 2022, 9:11 PM
15 points
11 comments
1 min read
LW link

Finite Factored Sets in Pictures

Magdalena Wache
Dec 11, 2022, 6:49 PM
174 points
35 comments
12 min read
LW link

Formalization as suspension of intuition

adamShimi
Dec 11, 2022, 3:16 PM
54 points
18 comments
1 min read
LW link
(epistemologicalvigilance.substack.com)

An argument on animal consciousness (soliciting criticism)

SciHamster
Dec 11, 2022, 3:12 PM
1 point
2 comments
1 min read
LW link

ChatGPT’s new novel rationality technique of fact checking

ChristianKl
Dec 11, 2022, 1:54 PM
−14 points
7 comments
1 min read
LW link

Reframing inner alignment

davidad
Dec 11, 2022, 1:53 PM
53 points
13 comments
4 min read
LW link

A poem about applied rationality by ChatGPT

ChristianKl
Dec 11, 2022, 1:43 PM
4 points
0 comments
1 min read
LW link

ChatGPT goes through a wormhole hole in our Shandyesque universe [virtual wacky weed]

Bill Benzon
Dec 11, 2022, 11:59 AM
−1 points
2 comments
3 min read
LW link

Using Obsidian if you’re used to using Roam

Solenoid_Entity
Dec 11, 2022, 8:59 AM
19 points
4 comments
2 min read
LW link

[fiction] Our Final Hour

Mati_Roy
Dec 11, 2022, 5:49 AM
23 points
5 comments
3 min read
LW link

Consider using reversible automata for alignment research

Alex_Altair
Dec 11, 2022, 1:00 AM
88 points
30 comments
2 min read
LW link

High level discourse structure in ChatGPT: Part 2 [Quasi-symbolic?]

Bill Benzon
Dec 10, 2022, 10:26 PM
7 points
0 comments
6 min read
LW link

Poll Results on AGI

Niclas Kupper
Dec 10, 2022, 9:25 PM
18 points
0 comments
2 min read
LW link

Reflecting on the 2022 Guild of the Rose Workshops

moridinamael
Dec 10, 2022, 9:21 PM
26 points
7 comments
8 min read
LW link

[Question] Reversing a quantum simulation on the planetary scale

Mythopoeist
Dec 10, 2022, 8:26 PM
2 points
3 comments
1 min read
LW link

ACX Zurich December Meetup

MB
Dec 10, 2022, 7:23 PM
1 point
0 comments
1 min read
LW link

[ASoT] Natural abstractions and AlphaZero

Ulisse Mini
Dec 10, 2022, 5:53 PM
33 points
1 comment
1 min read
LW link
(arxiv.org)

[Question] How promising are legal avenues to restrict AI training data?

thehalliard
Dec 10, 2022, 4:31 PM
9 points
2 comments
1 min read
LW link

Inspiration as a Scarce Resource

zenbu zenbu zenbu zenbu
Dec 10, 2022, 3:23 PM
7 points
0 comments
4 min read
LW link
(inflorescence.substack.com)

Will Manifold Markets/Metaculus have built-in support for reflective latent variables by 2025?

tailcalled
Dec 10, 2022, 1:55 PM
35 points
0 comments
1 min read
LW link

My thoughts on OpenAI’s Alignment plan

Donald Hobson
Dec 10, 2022, 10:35 AM
25 points
1 comment
6 min read
LW link

[Question] How would you improve ChatGPT’s filtering?

Noah Scales
Dec 10, 2022, 8:05 AM
9 points
6 comments
1 min read
LW link

[Question] A thought experiment

sisyphus
Dec 10, 2022, 5:23 AM
3 points
12 comments
1 min read
LW link

patio11’s “Observations from an EA-adjacent (?) charitable effort”

RobertM
Dec 10, 2022, 12:27 AM
43 points
0 comments
1 min read
LW link
(forum.effectivealtruism.org)

A dynamical systems primer for entropy and optimization

Alex_Altair
Dec 10, 2022, 12:13 AM
45 points
3 comments
7 min read
LW link

[Linkpost] The Story Of VaccinateCA

hath
Dec 9, 2022, 11:54 PM
103 points
4 comments
10 min read
LW link
(www.worksinprogress.co)

Prosaic misalignment from the Solomonoff Predictor

Cleo Nardo
Dec 9, 2022, 5:53 PM
42 points
3 comments
5 min read
LW link

Take 8: Queer the inner/outer alignment dichotomy.

Charlie Steiner
Dec 9, 2022, 5:46 PM
31 points
2 comments
2 min read
LW link

[Question] Does a LLM have a utility function?

Dagon
Dec 9, 2022, 5:19 PM
17 points
11 comments
1 min read
LW link

Monthly Roundup #1

Zvi
Dec 9, 2022, 5:10 PM
31 points
6 comments
21 min read
LW link
(thezvi.wordpress.com)

Working towards AI alignment is better

Johannes C. Mayer
Dec 9, 2022, 3:39 PM
8 points
2 comments
2 min read
LW link

You can still fetch the coffee today if you’re dead tomorrow

davidad
Dec 9, 2022, 2:06 PM
96 points
19 comments
5 min read
LW link

ChatGPT’s Misalignment Isn’t What You Think

stavros
Dec 9, 2022, 11:11 AM
3 points
12 comments
1 min read
LW link

ML Safety at NeurIPS & Paradigmatic AI Safety? MLAISU W49

Dec 9, 2022, 10:38 AM
19 points
0 comments
4 min read
LW link
(newsletter.apartresearch.com)

[Question] What are your thoughts on the future of AI-assisted software development?

RomanHauksson
Dec 9, 2022, 10:04 AM
4 points
4 comments
1 min read
LW link

Fear mitigated the nuclear threat, can it do the same to AGI risks?

Igor Ivanov
Dec 9, 2022, 10:04 AM
6 points
8 comments
5 min read
LW link

Setting the Zero Point

Duncan Sabien (Deactivated)
Dec 9, 2022, 6:06 AM
90 points
43 comments
20 min read
LW link
1 review

Systems of Survival

Vaniver
Dec 9, 2022, 5:13 AM
63 points
5 comments
5 min read
LW link

[Question] Do You Have an Internal Monologue?

belkarx
Dec 9, 2022, 3:04 AM
23 points
7 comments
1 min read
LW link

[Question] How is the “sharp left turn defined”?

Chris_Leong
Dec 9, 2022, 12:04 AM
14 points
4 comments
1 min read
LW link

Linkpost for a generalist algorithmic learner: capable of carrying out sorting, shortest paths, string matching, convex hull finding in one network

lovetheusers
Dec 9, 2022, 12:02 AM
7 points
1 comment
1 min read
LW link
(twitter.com)