Olli Järviniemi
Karma: 1,301
Schelling game evaluations for AI control
Olli Järviniemi
8 Oct 2024 12:01 UTC
65
points
4
comments
11
min read
LW
link
Distinguish worst-case analysis from instrumental training-gaming
Olli Järviniemi
and
Buck
5 Sep 2024 19:13 UTC
37
points
0
comments
5
min read
LW
link
Untrustworthy models: a frame for scheming evaluations
Olli Järviniemi
19 Aug 2024 16:27 UTC
46
points
3
comments
8
min read
LW
link
Near-mode thinking on AI
Olli Järviniemi
4 Aug 2024 20:47 UTC
127
points
8
comments
5
min read
LW
link
An experiment on hidden cognition
Olli Järviniemi
22 Jul 2024 3:26 UTC
25
points
2
comments
7
min read
LW
link
Brief notes on the Wikipedia game
Olli Järviniemi
14 Jul 2024 2:28 UTC
67
points
9
comments
4
min read
LW
link
Dialogue introduction to Singular Learning Theory
Olli Järviniemi
8 Jul 2024 16:58 UTC
97
points
14
comments
8
min read
LW
link
A civilization ran by amateurs
Olli Järviniemi
30 May 2024 17:57 UTC
61
points
7
comments
6
min read
LW
link
Testing for parallel reasoning in LLMs
meemi
and
Olli Järviniemi
19 May 2024 15:28 UTC
3
points
7
comments
9
min read
LW
link
Uncovering Deceptive Tendencies in Language Models: A Simulated Company AI Assistant
Olli Järviniemi
and
evhub
6 May 2024 7:07 UTC
95
points
13
comments
1
min read
LW
link
(arxiv.org)
On precise out-of-context steering
Olli Järviniemi
3 May 2024 9:41 UTC
9
points
6
comments
3
min read
LW
link
Instrumental deception and manipulation in LLMs—a case study
Olli Järviniemi
24 Feb 2024 2:07 UTC
39
points
13
comments
12
min read
LW
link
Urging an International AI Treaty: An Open Letter
Olli Järviniemi
31 Oct 2023 11:26 UTC
48
points
2
comments
1
min read
LW
link
(aitreaty.org)
Olli Järviniemi’s Shortform
Olli Järviniemi
23 Mar 2023 10:59 UTC
3
points
22
comments
1
min read
LW
link
Takeaways from calibration training
Olli Järviniemi
29 Jan 2023 19:09 UTC
38
points
1
comment
3
min read
LW
link