leogao

Karma: 2,884

Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision

leogao · 16 Dec 2023 5:39 UTC
53 points
5 comments · 1 min read · LW link

Shapley Value Attribution in Chain of Thought

leogao · 14 Apr 2023 5:56 UTC
101 points
5 comments · 4 min read · LW link

[ASoT] Some thoughts on human abstractions

leogao · 16 Mar 2023 5:42 UTC
42 points
4 comments · 5 min read · LW link

Clarifying wireheading terminology

leogao · 24 Nov 2022 4:53 UTC
65 points
6 comments · 1 min read · LW link

Scaling Laws for Reward Model Overoptimization

20 Oct 2022 0:20 UTC
102 points
13 comments · 1 min read · LW link
(arxiv.org)

[Question] How many GPUs does NVIDIA make?

leogao · 8 Oct 2022 17:54 UTC
27 points
2 comments · 1 min read · LW link

Towards deconfusing wireheading and reward maximization

leogao · 21 Sep 2022 0:36 UTC
81 points
7 comments · 4 min read · LW link

Humans Reflecting on HRH

leogao · 29 Jul 2022 21:56 UTC
26 points
4 comments · 2 min read · LW link

leogao’s Shortform

leogao · 24 May 2022 20:08 UTC
5 points
123 comments · 1 min read · LW link

[ASoT] Consequentialist models as a superset of mesaoptimizers

leogao · 23 Apr 2022 17:57 UTC
37 points
2 comments · 4 min read · LW link

[ASoT] Some thoughts about imperfect world modeling

leogao · 7 Apr 2022 15:42 UTC
7 points
0 comments · 4 min read · LW link

[ASoT] Some thoughts about LM monologue limitations and ELK

leogao · 30 Mar 2022 14:26 UTC
10 points
0 comments · 2 min read · LW link

[ASoT] Some thoughts about deceptive mesaoptimization

leogao · 28 Mar 2022 21:14 UTC
24 points
5 comments · 7 min read · LW link

[ASoT] Searching for consequentialist structure

leogao · 27 Mar 2022 19:09 UTC
26 points
2 comments · 4 min read · LW link

[ASoT] Some ways ELK could still be solvable in practice

leogao · 27 Mar 2022 1:15 UTC
26 points
1 comment · 2 min read · LW link

[ASoT] Observations about ELK

leogao · 26 Mar 2022 0:42 UTC
31 points
0 comments · 3 min read · LW link

What do paradigm shifts look like?

leogao · 16 Mar 2022 19:17 UTC
18 points
2 comments · 1 min read · LW link

EleutherAI’s GPT-NeoX-20B release

leogao · 10 Feb 2022 6:56 UTC
30 points
3 comments · 1 min read · LW link
(eaidata.bmk.sh)

NFTs, Coin Collecting, and Expensive Paintings

leogao · 24 Jan 2022 1:01 UTC
29 points
35 comments · 5 min read · LW link

Retail Investor Advantages

leogao · 7 Dec 2021 2:08 UTC
13 points
13 comments · 1 min read · LW link