Caleb Biddulph

Karma: 711

Caleb Biddulph’s Shortform

Caleb Biddulph · Jan 30, 2025, 9:35 PM
4 points
23 comments · 1 min read · LW link

[Question] Why not train reasoning models with RLHF?

Caleb Biddulph · Jan 30, 2025, 7:58 AM
4 points
4 comments · 1 min read · LW link