Investigating Alternative Futures: Human and Superintelligence Interaction Scenarios

Hiroshi Yamakawa, 3 Jan 2024 23:46 UTC
1 point
0 comments, 17 min read, LW link

“Attitudes Toward Artificial General Intelligence: Results from American Adults 2021 and 2023”—call for reviewers (Seeds of Science)

rogersbacon, 3 Jan 2024 20:11 UTC
4 points
0 comments, 1 min read, LW link

What’s up with LLMs representing XORs of arbitrary features?

Sam Marks, 3 Jan 2024 19:44 UTC
157 points
61 comments, 16 min read, LW link

Spirit Airlines Merger Play

sapphire, 3 Jan 2024 19:25 UTC
5 points
12 comments, 1 min read, LW link

$300 for the best sci-fi prompt: the results

RomanS, 3 Jan 2024 19:10 UTC
16 points
19 comments, 7 min read, LW link

Agent membranes/boundaries and formalizing “safety”

Chipmonk, 3 Jan 2024 17:55 UTC
26 points
46 comments, 3 min read, LW link

Safety First: safety before full alignment. The deontic sufficiency hypothesis.

Chipmonk, 3 Jan 2024 17:55 UTC
48 points
3 comments, 3 min read, LW link

Practically A Book Review: Appendix to “Nonlinear’s Evidence: Debunking False and Misleading Claims” (ThingOfThings)

tailcalled, 3 Jan 2024 17:07 UTC
111 points
25 comments, 2 min read, LW link
(thingofthings.substack.com)

Trivial Mathematics as a Path Forward

ACrackedPot, 3 Jan 2024 16:41 UTC
−4 points
2 comments, 2 min read, LW link

Copyright Confrontation #1

Zvi, 3 Jan 2024 15:50 UTC
34 points
7 comments, 18 min read, LW link
(thezvi.wordpress.com)

[Question] Theoretically, could we balance the budget painlessly?

Logan Zoellner, 3 Jan 2024 14:46 UTC
4 points
12 comments, 1 min read, LW link

Johannes’ Biography

Johannes C. Mayer, 3 Jan 2024 13:27 UTC
23 points
0 comments, 10 min read, LW link

What Helped Me—Kale, Blood, CPAP, X-tiamine, Methylphenidate

Johannes C. Mayer, 3 Jan 2024 13:22 UTC
35 points
12 comments, 2 min read, LW link

[Question] Does LessWrong make a difference when it comes to AI alignment?

PhilosophicalSoul, 3 Jan 2024 12:21 UTC
18 points
13 comments, 1 min read, LW link

[Question] Terminology: <something>-ware for ML?

Oliver Sourbut, 3 Jan 2024 11:42 UTC
17 points
27 comments, 1 min read, LW link

Trading off Lives

jefftk, 3 Jan 2024 3:40 UTC
53 points
12 comments, 2 min read, LW link
(www.jefftk.com)

MonoPoly Restricted Trust

ymeskhout, 2 Jan 2024 23:02 UTC
42 points
37 comments, 9 min read, LW link

Agent membranes and causal distance

Chipmonk, 2 Jan 2024 22:43 UTC
20 points
3 comments, 3 min read, LW link

Focusing on Mal-Alignment

John Fisher, 2 Jan 2024 19:51 UTC
1 point
0 comments, 1 min read, LW link

Gentleness and the artificial Other

Joe Carlsmith, 2 Jan 2024 18:21 UTC
291 points
33 comments, 11 min read, LW link

Otherness and control in the age of AGI

Joe Carlsmith, 2 Jan 2024 18:15 UTC
37 points
0 comments, 7 min read, LW link

Apologizing is a Core Rationalist Skill

johnswentworth, 2 Jan 2024 17:47 UTC
152 points
42 comments, 5 min read, LW link

Cortés, AI Risk, and the Dynamics of Competing Conquerors

James_Miller, 2 Jan 2024 16:37 UTC
14 points
2 comments, 3 min read, LW link

OpenAI’s Preparedness Framework: Praise & Recommendations

Akash, 2 Jan 2024 16:20 UTC
66 points
1 comment, 7 min read, LW link

Dating Roundup #2: If At First You Don’t Succeed

Zvi, 2 Jan 2024 16:00 UTC
54 points
29 comments, 47 min read, LW link
(thezvi.wordpress.com)

Looking for Reading Recommendations: Content Moderation, Power & Censorship

Joerg Weiss, 2 Jan 2024 11:37 UTC
2 points
7 comments, 1 min read, LW link

AI Is Not Software

Davidmanheim, 2 Jan 2024 7:58 UTC
56 points
29 comments, 5 min read, LW link

Are Metaculus AI Timelines Inconsistent?

Chris_Leong, 2 Jan 2024 6:47 UTC
16 points
7 comments, 2 min read, LW link

Boston Solstice 2023 Retrospective

jefftk, 2 Jan 2024 3:10 UTC
33 points
0 comments, 6 min read, LW link
(www.jefftk.com)

Steering Llama-2 with contrastive activation additions

2 Jan 2024 0:47 UTC
123 points
29 comments, 8 min read, LW link
(arxiv.org)

Twin Cities ACX Meetup—January 2024

Timothy M., 1 Jan 2024 21:13 UTC
1 point
2 comments, 1 min read, LW link

San Francisco ACX Meetup “First Saturday”

guenael, 1 Jan 2024 20:58 UTC
1 point
1 comment, 1 min read, LW link

Mech Interp Challenge: January—Deciphering the Caesar Cipher Model

CallumMcDougall, 1 Jan 2024 18:03 UTC
17 points
0 comments, 3 min read, LW link

Aldix and the Book of Life

ville, 1 Jan 2024 17:23 UTC
1 point
0 comments, 4 min read, LW link
(medium.com)

Metaculus Hosts ACX 2024 Prediction Contest

ChristianWilliams, 1 Jan 2024 16:38 UTC
4 points
0 comments, 1 min read, LW link
(www.metaculus.com)

The Act Itself: Exceptionless Moral Norms

JohnBuridan, 1 Jan 2024 16:06 UTC
5 points
11 comments, 6 min read, LW link

Deception Chess

Chris Land, 1 Jan 2024 15:40 UTC
7 points
2 comments, 4 min read, LW link

Stop talking about p(doom)

Isaac King, 1 Jan 2024 10:57 UTC
38 points
22 comments, 3 min read, LW link

[Question] What should a non-genius do in the face of rapid progress in GAI to ensure a decent life?

kaler, 1 Jan 2024 8:22 UTC
11 points
16 comments, 1 min read, LW link

A hermeneutic net for agency

TsviBT, 1 Jan 2024 8:06 UTC
58 points
4 comments, 30 min read, LW link

Research Jan/Feb 2024

Stephen Fowler, 1 Jan 2024 6:02 UTC
9 points
0 comments, 2 min read, LW link

2023 in AI predictions

jessicata, 1 Jan 2024 5:23 UTC
107 points
35 comments, 5 min read, LW link

Rhythm Stage Setup Components

jefftk, 1 Jan 2024 3:10 UTC
10 points
4 comments, 2 min read, LW link
(www.jefftk.com)

Bayesian updating in real life is mostly about understanding your hypotheses

Max H, 1 Jan 2024 0:10 UTC
63 points
4 comments, 11 min read, LW link

Dark Art: Inception

Abu Ibrahim, 31 Dec 2023 21:09 UTC
10 points
0 comments, 3 min read, LW link

A case for AI alignment being difficult

jessicata, 31 Dec 2023 19:55 UTC
105 points
56 comments, 15 min read, LW link
(unstableontology.com)

The Roots of Progress 2023 in review

jasoncrawford, 31 Dec 2023 18:16 UTC
22 points
0 comments, 11 min read, LW link
(rootsofprogress.org)

Extended Navel-Gazing On My 2023 Donations

jenn, 31 Dec 2023 18:10 UTC
8 points
0 comments, 1 min read, LW link
(jenn.site)

aisafety.info, the Table of Content

Charbel-Raphaël, 31 Dec 2023 13:57 UTC
23 points
1 comment, 11 min read, LW link

AIOS

samhealy, 31 Dec 2023 13:23 UTC
−3 points
5 comments, 6 min read, LW link