simeon_c
Karma: 1,334
@SaferAI
A Frontier AI Risk Management Framework: Bridging the Gap Between Current AI Practices and Established Risk Management
simeon_c and Henry Papadatos | Mar 13, 2025, 6:29 PM | 10 points | 0 comments | 1 min read | (arxiv.org)

Towards Quantitative AI Risk Management
Henry Papadatos and simeon_c | Oct 16, 2024, 7:26 PM | 28 points | 1 comment | 6 min read

simeon_c’s Shortform
simeon_c | Apr 4, 2024, 9:01 AM | 5 points | 73 comments | 1 min read

Forecasting future gains due to post-training enhancements
elifland, Joel Becker and simeon_c | Mar 8, 2024, 2:11 AM | 31 points | 2 comments | 1 min read | (docs.google.com)

Davidad’s Provably Safe AI Architecture—ARIA’s Programme Thesis
simeon_c | Feb 1, 2024, 9:30 PM | 69 points | 17 comments | 1 min read | (www.aria.org.uk)

A Brief Assessment of OpenAI’s Preparedness Framework & Some Suggestions for Improvement
simeon_c | Jan 22, 2024, 8:08 PM | 14 points | 0 comments | 6 min read | (uploads-ssl.webflow.com)

Responsible Scaling Policies Are Risk Management Done Wrong
simeon_c | Oct 25, 2023, 11:46 PM | 123 points | 35 comments | 22 min read | 1 review | (www.navigatingrisks.ai)

[Question] Do LLMs Implement NLP Algorithms for Better Next Token Predictions?
simeon_c | Sep 19, 2023, 12:28 PM | 5 points | 1 comment | 1 min read

[Question] In the Short-Term, Why Couldn’t You Just RLHF-out Instrumental Convergence?
simeon_c | Sep 16, 2023, 10:44 AM | 21 points | 6 comments | 1 min read

AGI x Animal Welfare: A High-EV Outreach Opportunity?
simeon_c | Jun 28, 2023, 8:44 PM | 29 points | 0 comments | 1 min read

The Cruel Trade-Off Between AI Misuse and AI X-risk Concerns
simeon_c | Apr 22, 2023, 1:49 PM | 24 points | 1 comment | 2 min read

AI Takeover Scenario with Scaled LLMs
simeon_c | Apr 16, 2023, 11:28 PM | 42 points | 15 comments | 8 min read

Navigating AI Risks (NAIR) #1: Slowing Down AI
simeon_c | Apr 14, 2023, 2:35 PM | 11 points | 3 comments | 1 min read | (navigatingairisks.substack.com)

Request to AGI organizations: Share your views on pausing AI progress
Akash and simeon_c | Apr 11, 2023, 5:30 PM | 141 points | 11 comments | 1 min read

[Question] Could Simulating an AGI Taking Over the World Actually Lead to a LLM Taking Over the World?
simeon_c | Jan 13, 2023, 6:33 AM | 15 points | 1 comment | 1 min read

[Linkpost] DreamerV3: A General RL Architecture
simeon_c | Jan 12, 2023, 3:55 AM | 23 points | 3 comments | 1 min read | (arxiv.org)

[Question] Are Mixture-of-Experts Transformers More Interpretable Than Dense Transformers?
simeon_c | Dec 31, 2022, 11:34 AM | 8 points | 5 comments | 1 min read

AGI Timelines in Governance: Different Strategies for Different Timeframes
simeon_c and AmberDawn | Dec 19, 2022, 9:31 PM | 65 points | 28 comments | 10 min read

Extracting and Evaluating Causal Direction in LLMs’ Activations
Fabien Roger and simeon_c | Dec 14, 2022, 2:33 PM | 29 points | 5 comments | 11 min read

Is GPT3 a Good Rationalist? - InstructGPT3 [2/2]
simeon_c | Apr 7, 2022, 1:46 PM | 11 points | 0 comments | 7 min read