
Situational Awareness

Tag. Last edit: 16 Jun 2023 14:42 UTC by Mateusz Bagiński

Ajeya Cotra uses the term “situational awareness” to refer to a cluster of skills including “being able to refer to and make predictions about yourself as distinct from the rest of the world,” “understanding the forces out in the world that shaped you and how the things that happen to you continue to be influenced by outside forces,” “understanding your position in the world relative to other actors who may have power over you,” “understanding how your actions can affect the outside world including other actors,” etc.

Alternatively, from an ML perspective, situational awareness can be characterized as a strong form of out-of-context meta-learning applied to situationally relevant statements.

Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover

Ajeya Cotra, 18 Jul 2022 19:06 UTC
365 points
94 comments, 75 min read, LW link, 1 review

Investigating the Ability of LLMs to Recognize Their Own Writing

30 Jul 2024 15:41 UTC
32 points
0 comments, 15 min read, LW link

Owain Evans on Situational Awareness and Out-of-Context Reasoning in LLMs

Michaël Trazzi, 24 Aug 2024 4:30 UTC
55 points
0 comments, 5 min read, LW link

Results from the Turing Seminar hackathon

7 Dec 2023 14:50 UTC
29 points
1 comment, 6 min read, LW link

Paper: On measuring situational awareness in LLMs

4 Sep 2023 12:54 UTC
108 points
16 comments, 5 min read, LW link
(arxiv.org)

Some Quick Follow-Up Experiments to “Taken out of context: On measuring situational awareness in LLMs”

Miles Turpin, 3 Oct 2023 2:22 UTC
31 points
0 comments, 9 min read, LW link

Me, Myself, and AI: the Situational Awareness Dataset (SAD) for LLMs

8 Jul 2024 22:24 UTC
101 points
28 comments, 5 min read, LW link

[Question] Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception?

David Scott Krueger (formerly: capybaralet), 4 Sep 2024 12:40 UTC
17 points
7 comments, 1 min read, LW link

Contingency: A Conceptual Tool from Evolutionary Biology for Alignment

clem_acs, 12 Jun 2023 20:54 UTC
57 points
2 comments, 14 min read, LW link
(acsresearch.org)

The intelligence-sentience orthogonality thesis

Ben Smith, 13 Jul 2023 6:55 UTC
18 points
9 comments, 9 min read, LW link

The Zeroth Skillset

katydee, 30 Jan 2013 12:46 UTC
74 points
109 comments, 2 min read, LW link

Facts vs Interpretations—An Exercise in Cognitive Reframing

Declan Molony, 27 Feb 2024 7:57 UTC
15 points
0 comments, 3 min read, LW link

Revealing Intentionality In Language Models Through AdaVAE Guided Sampling

jdp, 20 Oct 2023 7:32 UTC
119 points
15 comments, 22 min read, LW link

Perceptual Blindspots: How to Increase Self-Awareness

Declan Molony, 26 Mar 2024 5:37 UTC
14 points
3 comments, 2 min read, LW link

LLM Evaluators Recognize and Favor Their Own Generations

17 Apr 2024 21:09 UTC
44 points
1 comment, 3 min read, LW link
(tiny.cc)

Early situational awareness and its implications, a story

Jacob Pfau, 6 Feb 2023 20:45 UTC
29 points
6 comments, 3 min read, LW link

Situational awareness in Large Language Models

Simon Möller, 3 Mar 2023 18:59 UTC
30 points
2 comments, 7 min read, LW link

Refining the Sharp Left Turn threat model, part 2: applying alignment techniques

25 Nov 2022 14:36 UTC
39 points
9 comments, 6 min read, LW link
(vkrakovna.wordpress.com)

LM Situational Awareness, Evaluation Proposal: Violating Imitation

Jacob Pfau, 26 Apr 2023 22:53 UTC
16 points
2 comments, 2 min read, LW link