RSS

Si­tu­a­tional Awareness

TagLast edit: Jun 16, 2023, 2:42 PM by Mateusz Bagiński

Ajeya Cotra uses the term “situational awarenessto refer to a cluster of skills including “being able to refer to and make predictions about yourself as distinct from the rest of the world,” “understanding the forces out in the world that shaped you and how the things that happen to you continue to be influenced by outside forces,” “understanding your position in the world relative to other actors who may have power over you,” “understanding how your actions can affect the outside world including other actors,” etc.

Alternatively, from an ML-perspective, situational awareness can be characterized as a strong form of out-of-context meta-learning applied to situationally-relevant statements.

Without spe­cific coun­ter­mea­sures, the eas­iest path to trans­for­ma­tive AI likely leads to AI takeover

Ajeya CotraJul 18, 2022, 7:06 PM
368 points
95 comments75 min readLW link1 review

Owain Evans on Si­tu­a­tional Aware­ness and Out-of-Con­text Rea­son­ing in LLMs

Michaël TrazziAug 24, 2024, 4:30 AM
55 points
0 comments5 min readLW link

Re­vis­ing Stages-Over­sight Re­veals Greater Si­tu­a­tional Aware­ness in LLMs

Sanyu RajakumarMar 12, 2025, 5:56 PM
7 points
0 comments13 min readLW link

Paper: On mea­sur­ing situ­a­tional aware­ness in LLMs

Sep 4, 2023, 12:54 PM
109 points
16 comments5 min readLW link
(arxiv.org)

Some Quick Fol­low-Up Ex­per­i­ments to “Taken out of con­text: On mea­sur­ing situ­a­tional aware­ness in LLMs”

Miles TurpinOct 3, 2023, 2:22 AM
31 points
0 comments9 min readLW link

[Question] Is there any rigor­ous work on us­ing an­thropic un­cer­tainty to pre­vent situ­a­tional aware­ness /​ de­cep­tion?

David Scott Krueger (formerly: capybaralet)Sep 4, 2024, 12:40 PM
19 points
7 comments1 min readLW link

In­ves­ti­gat­ing the Abil­ity of LLMs to Rec­og­nize Their Own Writing

Jul 30, 2024, 3:41 PM
32 points
0 comments15 min readLW link

Me, My­self, and AI: the Si­tu­a­tional Aware­ness Dataset (SAD) for LLMs

Jul 8, 2024, 10:24 PM
109 points
37 comments5 min readLW link

LM Si­tu­a­tional Aware­ness, Eval­u­a­tion Pro­posal: Vio­lat­ing Imitation

Jacob PfauApr 26, 2023, 10:53 PM
16 points
2 comments2 min readLW link

Contin­gency: A Con­cep­tual Tool from Evolu­tion­ary Biol­ogy for Alignment

clem_acsJun 12, 2023, 8:54 PM
57 points
2 comments14 min readLW link
(acsresearch.org)

The in­tel­li­gence-sen­tience or­thog­o­nal­ity thesis

Ben SmithJul 13, 2023, 6:55 AM
19 points
9 comments9 min readLW link

The Zeroth Skillset

katydeeJan 30, 2013, 12:46 PM
74 points
109 comments2 min readLW link

Re­veal­ing In­ten­tion­al­ity In Lan­guage Models Through AdaVAE Guided Sampling

jdpOct 20, 2023, 7:32 AM
119 points
15 comments22 min readLW link

Facts vs Interpretations

Declan MolonyFeb 27, 2024, 7:57 AM
15 points
1 comment3 min readLW link

Emer­gent Misal­ign­ment and Emer­gent Alignment

Alvin ÅnestrandApr 3, 2025, 8:04 AM
2 points
0 comments8 min readLW link

Do mod­els know when they are be­ing eval­u­ated?

Feb 17, 2025, 11:13 PM
54 points
3 comments12 min readLW link

Per­cep­tual Blindspots: How to In­crease Self-Awareness

Declan MolonyMar 26, 2024, 5:37 AM
14 points
3 comments2 min readLW link

LLM Eval­u­a­tors Rec­og­nize and Fa­vor Their Own Generations

Apr 17, 2024, 9:09 PM
44 points
1 comment3 min readLW link
(tiny.cc)

Cross-con­text ab­duc­tion: LLMs make in­fer­ences about pro­ce­du­ral train­ing data lev­er­ag­ing declar­a­tive facts in ear­lier train­ing data

Sohaib ImranNov 16, 2024, 11:22 PM
36 points
11 comments14 min readLW link

Early situ­a­tional aware­ness and its im­pli­ca­tions, a story

Jacob PfauFeb 6, 2023, 8:45 PM
29 points
6 comments3 min readLW link

Si­tu­a­tional aware­ness in Large Lan­guage Models

Simon MöllerMar 3, 2023, 6:59 PM
31 points
2 comments7 min readLW link

Refin­ing the Sharp Left Turn threat model, part 2: ap­ply­ing al­ign­ment techniques

Nov 25, 2022, 2:36 PM
39 points
9 comments6 min readLW link
(vkrakovna.wordpress.com)
No comments.