Truthful AI
Tag
Last edit: 7 Apr 2022 16:40 UTC by Ruby
How do LLMs give truthful answers? A discussion of LLM vs. human reasoning, ensembles & parrots
Owain_Evans · 28 Mar 2024 2:34 UTC · 26 points · 0 comments · 9 min read · LW link
Truthfulness, standards and credibility
Joe_Collman · 7 Apr 2022 10:31 UTC · 12 points · 2 comments · 32 min read · LW link
A tension between two prosaic alignment subgoals
Alex Lawsen · 19 Mar 2023 14:07 UTC · 31 points · 8 comments · 1 min read · LW link
Benchmark Study #2: TruthfulQA (Task, MCQ)
Bruce W. Lee · 6 Jan 2024 2:39 UTC · 11 points · 2 comments · 4 min read · LW link (arxiv.org)
Tall Tales at Different Scales: Evaluating Scaling Trends For Deception In Language Models
Felix Hofstätter, Francis Rhys Ward, HarrietW, LAThomson, Ollie J, Patrik Bartak and Sam F. Brown · 8 Nov 2023 11:37 UTC · 49 points · 0 comments · 18 min read · LW link