Causality and a Cost Semantics for Neural Networks

scottviteri
21 Aug 2023 21:02 UTC
22 points
1 comment
1 min read

Ideas for improving epistemics in AI safety outreach

mic
21 Aug 2023 19:55 UTC
64 points
6 comments
3 min read

Rice’s Theorem says that AIs can’t determine much from studying AI source code

Michael Weiss-Malik
21 Aug 2023 19:05 UTC
−11 points
4 comments
1 min read

Large Language Models will be Great for Censorship

Ethan Edwards
21 Aug 2023 19:03 UTC
183 points
14 comments
8 min read
(ethanedwards.substack.com)

“Throwing Exceptions” Is A Strange Programming Pattern

Thoth Hermes
21 Aug 2023 18:50 UTC
−2 points
13 comments
6 min read
(thothhermes.substack.com)

[Question] Which possible AI systems are relatively safe?

Zach Stein-Perlman
21 Aug 2023 17:00 UTC
42 points
20 comments
1 min read

Self-shutdown AI

jan betley
21 Aug 2023 16:48 UTC
13 points
2 comments
2 min read

Contextual Translations—Attempt 1

Varshul Gupta
21 Aug 2023 14:30 UTC
−1 points
0 comments
2 min read
(dubverseblack.substack.com)

DIY Deliberate Practice

lynettebye
21 Aug 2023 12:22 UTC
62 points
4 comments
5 min read
(lynettebye.com)

Downstairs Opening: 2br Apartment

jefftk
21 Aug 2023 0:50 UTC
8 points
2 comments
3 min read
(www.jefftk.com)

Efficiency and resource use scaling parity

Ege Erdil
21 Aug 2023 0:18 UTC
49 points
0 comments
20 min read

Ruining an expected-log-money maximizer

philh
20 Aug 2023 21:20 UTC
29 points
32 comments
1 min read
(reasonableapproximation.net)

Steven Wolfram on AI Alignment

Bill Benzon
20 Aug 2023 19:49 UTC
66 points
15 comments
4 min read

[Question] What value does personal prediction tracking have?

fx
20 Aug 2023 18:43 UTC
7 points
3 comments
1 min read

Jan Kulveit’s Corrigibility Thoughts Distilled

brook
20 Aug 2023 17:52 UTC
20 points
1 comment
5 min read

Memetic Judo #3: The Intelligence of Stochastic Parrots v.2

Max TK
20 Aug 2023 15:18 UTC
8 points
33 comments
6 min read

ACX/SSC Boulder meetup- September 23

Josh Sacks
20 Aug 2023 14:16 UTC
1 point
4 comments
1 min read

“Dirty concepts” in AI alignment discourses, and some guesses for how to deal with them

20 Aug 2023 9:13 UTC
65 points
4 comments
3 min read

Call for Papers on Global AI Governance from the UN

Chris_Leong
20 Aug 2023 8:56 UTC
19 points
0 comments
1 min read
(www.linkedin.com)

How do I read things on the internet

Vlad Sitalo
20 Aug 2023 5:43 UTC
16 points
2 comments
8 min read
(vlad.roam.garden)

AI Forecasting: Two Years In

jsteinhardt
19 Aug 2023 23:40 UTC
72 points
15 comments
11 min read
(bounded-regret.ghost.io)

Four management/leadership book summaries

nikola
19 Aug 2023 23:38 UTC
25 points
2 comments
7 min read

Interpreting a dimensionality reduction of a collection of matrices as two positive semidefinite block diagonal matrices

Joseph Van Name
19 Aug 2023 19:52 UTC
16 points
2 comments
5 min read

Will AI kill everyone? Here’s what the godfathers of AI have to say [RA video]

Writer
19 Aug 2023 17:29 UTC
58 points
8 comments
1 min read
(youtu.be)

Ten variations on red-pill-blue-pill

Richard_Kennaway
19 Aug 2023 16:34 UTC
21 points
34 comments
3 min read

Are we running out of new music/movies/art from a metaphysical perspective? (updated)

stephen_s
19 Aug 2023 16:24 UTC
4 points
23 comments
1 min read

[Question] Any ideas for a prediction market observable that quantifies “culture-warisation”?

Ppau
19 Aug 2023 15:11 UTC
6 points
1 comment
1 min read

[Question] Clarifying how misalignment can arise from scaling LLMs

Util
19 Aug 2023 14:16 UTC
3 points
1 comment
1 min read

Chess as a case study in hidden capabilities in ChatGPT

AdamYedidia
19 Aug 2023 6:35 UTC
47 points
32 comments
6 min read

We can do better than DoWhatIMean (inextricably kind AI)

lemonhope
19 Aug 2023 5:41 UTC
25 points
8 comments
2 min read

Supervised Program for Alignment Research (SPAR) at UC Berkeley: Spring 2023 summary

19 Aug 2023 2:27 UTC
20 points
2 comments
6 min read

Could fabs own AI?

lemonhope
19 Aug 2023 0:16 UTC
15 points
0 comments
3 min read

Is Chinese total factor productivity lower today than it was in 1956?

Ege Erdil
18 Aug 2023 22:33 UTC
43 points
0 comments
26 min read

Rationality-ish Meetups Showcase: 2019-2021

jenn
18 Aug 2023 22:22 UTC
10 points
0 comments
5 min read

The U.S. is becoming less stable

lc
18 Aug 2023 21:13 UTC
146 points
68 comments
2 min read

Meetup Tip: Board Games

Screwtape
18 Aug 2023 18:11 UTC
9 points
4 comments
7 min read

[Question] AI labs’ requests for input

Zach Stein-Perlman
18 Aug 2023 17:00 UTC
29 points
0 comments
1 min read

6 non-obvious mental health issues specific to AI safety

Igor Ivanov
18 Aug 2023 15:46 UTC
145 points
24 comments
4 min read

When discussing AI doom barriers propose specific plausible scenarios

anithite
18 Aug 2023 4:06 UTC
5 points
0 comments
3 min read

Risks from AI Overview: Summary

18 Aug 2023 1:21 UTC
25 points
1 comment
13 min read
(www.safe.ai)

Managing risks of our own work

Beth Barnes
18 Aug 2023 0:41 UTC
66 points
0 comments
2 min read

ACI#5: From Human-AI Co-evolution to the Evolution of Value Systems

Akira Pyinya
18 Aug 2023 0:38 UTC
0 points
0 comments
9 min read

Memetic Judo #1: On Doomsday Prophets v.3

Max TK
18 Aug 2023 0:14 UTC
25 points
17 comments
3 min read

Looking for judges for critiques of Alignment Plans

Iknownothing
17 Aug 2023 22:35 UTC
5 points
0 comments
1 min read

How is ChatGPT’s behavior changing over time?

Phib
17 Aug 2023 20:54 UTC
3 points
0 comments
1 min read
(arxiv.org)

Progress links digest, 2023-08-17: Cloud seeding, robotic sculptors, and rogue planets

jasoncrawford
17 Aug 2023 20:29 UTC
15 points
1 comment
4 min read
(rootsofprogress.org)

Model of psychosis, take 2

Steven Byrnes
17 Aug 2023 19:11 UTC
32 points
13 comments
4 min read

[Linkpost] Robustified ANNs Reveal Wormholes Between Human Category Percepts

Bogdan Ionut Cirstea
17 Aug 2023 19:10 UTC
6 points
2 comments
1 min read

Against Almost Every Theory of Impact of Interpretability

Charbel-Raphaël
17 Aug 2023 18:44 UTC
322 points
86 comments
26 min read

Goldilocks and the Three Optimisers

dkl9
17 Aug 2023 18:15 UTC
−10 points
0 comments
5 min read
(dkl9.net)