quetzal_rainbow

Karma: 1,651

[Question] How do you shut down an escaped model?

quetzal_rainbow, 2 Jun 2024 19:51 UTC
15 points
8 comments, 1 min read

Training of superintelligence is secretly adversarial

quetzal_rainbow, 7 Feb 2024 13:38 UTC
15 points
2 comments, 5 min read

There is no sharp boundary between deontology and consequentialism

quetzal_rainbow, 8 Jan 2024 11:01 UTC
8 points
2 comments, 1 min read

Where Does Adversarial Pressure Come From?

quetzal_rainbow, 14 Dec 2023 22:31 UTC
16 points
1 comment, 2 min read

Predictable Defect-Cooperate?

quetzal_rainbow, 18 Nov 2023 15:38 UTC
7 points
1 comment, 2 min read

They are made of repeating patterns

quetzal_rainbow, 13 Nov 2023 18:17 UTC
49 points
4 comments, 2 min read

[Question] How to model uncertainty about preferences?

quetzal_rainbow, 24 Mar 2023 19:04 UTC
10 points
2 comments, 1 min read

[Question] What literature on the neuroscience of decision making can you recommend?

quetzal_rainbow, 16 Mar 2023 15:32 UTC
3 points
0 comments, 1 min read

[Question] What specific thing would you do with AI Alignment Research Assistant GPT?

quetzal_rainbow, 8 Jan 2023 19:24 UTC
45 points
9 comments, 1 min read

[Question] Are there any tools to convert LW sequences to PDF or any other file format?

quetzal_rainbow, 7 Dec 2022 5:28 UTC
2 points
2 comments, 1 min read

quetzal_rainbow’s Shortform

quetzal_rainbow, 20 Nov 2022 16:00 UTC
1 point
97 comments, 1 min read