TW123

Karma: 1,234

Risks from AI Overview: Summary

Aug 18, 2023, 1:21 AM
25 points
1 comment · 13 min read · LW link
(www.safe.ai)

Catastrophic Risks from AI #6: Discussion and FAQ

Jun 27, 2023, 11:23 PM
24 points
1 comment · 13 min read · LW link
(arxiv.org)

Catastrophic Risks from AI #5: Rogue AIs

Jun 27, 2023, 10:06 PM
15 points
0 comments · 22 min read · LW link
(arxiv.org)

Catastrophic Risks from AI #4: Organizational Risks

Jun 26, 2023, 7:36 PM
23 points
0 comments · 21 min read · LW link
(arxiv.org)

Catastrophic Risks from AI #3: AI Race

Jun 23, 2023, 7:21 PM
18 points
9 comments · 29 min read · LW link
(arxiv.org)

Catastrophic Risks from AI #2: Malicious Use

Jun 22, 2023, 5:10 PM
38 points
1 comment · 17 min read · LW link
(arxiv.org)

Catastrophic Risks from AI #1: Introduction

Jun 22, 2023, 5:09 PM
40 points
1 comment · 5 min read · LW link
(arxiv.org)

[MLSN #9] Verifying large training runs, security risks from LLM access to APIs, why natural selection may favor AIs over humans

Apr 11, 2023, 4:03 PM
11 points
0 comments · 6 min read · LW link
(newsletter.mlsafety.org)

[MLSN #8] Mechanistic interpretability, using law to inform AI alignment, scaling laws for proxy gaming

Feb 20, 2023, 3:54 PM
20 points
0 comments · 4 min read · LW link
(newsletter.mlsafety.org)

What’s the deal with AI consciousness?

TW123 · Jan 11, 2023, 4:37 PM
6 points
13 comments · 9 min read · LW link
(aiwatchtower.substack.com)

Implications of simulators

TW123 · Jan 7, 2023, 12:37 AM
17 points
0 comments · 12 min read · LW link

“AI” is an indexical

TW123 · Jan 3, 2023, 10:00 PM
10 points
0 comments · 6 min read · LW link
(aiwatchtower.substack.com)

A Year of AI Increasing AI Progress

TW123 · Dec 30, 2022, 2:09 AM
148 points
3 comments · 2 min read · LW link

Did ChatGPT just gaslight me?

TW123 · Dec 1, 2022, 5:41 AM
123 points
45 comments · 9 min read · LW link
(aiwatchtower.substack.com)

A philosopher’s critique of RLHF

TW123 · Nov 7, 2022, 2:42 AM
55 points
8 comments · 2 min read · LW link

ML Safety Scholars Summer 2022 Retrospective

TW123 · Nov 1, 2022, 3:09 AM
29 points
0 comments · 1 min read · LW link

Announcing the Introduction to ML Safety course

Aug 6, 2022, 2:46 AM
73 points
6 comments · 7 min read · LW link

$20K In Bounties for AI Safety Public Materials

Aug 5, 2022, 2:52 AM
71 points
9 comments · 6 min read · LW link

Examples of AI Increasing AI Progress

TW123 · Jul 17, 2022, 8:06 PM
107 points
14 comments · 1 min read · LW link

Open Problems in AI X-Risk [PAIS #5]

Jun 10, 2022, 2:08 AM
61 points
6 comments · 36 min read · LW link