Tomáš Gavenčiak
Karma: 180
A researcher in CS theory, AI safety, and other topics.
Posts
Measuring Beliefs of Language Models During Chain-of-Thought Reasoning
Baram Sosis and Tomáš Gavenčiak · Apr 18, 2025, 10:56 PM · 8 points · 0 comments · 13 min read · LW link
Announcing Human-aligned AI Summer School
Jan_Kulveit and Tomáš Gavenčiak · May 22, 2024, 8:55 AM · 50 points · 0 comments · 1 min read · LW link (humanaligned.ai)
InterLab – a toolkit for experiments with multi-agent interactions
Tomáš Gavenčiak, Ada Böhm and Jan_Kulveit · Jan 22, 2024, 6:23 PM · 69 points · 0 comments · 8 min read · LW link (acsresearch.org)
Sparsity and interpretability?
Ada Böhm, RobertKirk and Tomáš Gavenčiak · Jun 1, 2020, 1:25 PM · 41 points · 3 comments · 7 min read · LW link
How can Interpretability help Alignment?
RobertKirk and Tomáš Gavenčiak · May 23, 2020, 4:16 PM · 37 points · 3 comments · 9 min read · LW link
What is Interpretability?
RobertKirk, Tomáš Gavenčiak and Ada Böhm · Mar 17, 2020, 8:23 PM · 39 points · 1 comment · 11 min read · LW link