dannyhalawi

Karma: 135

Covert Malicious Finetuning

2 Jul 2024 2:41 UTC
89 points
4 comments · 3 min read · LW link

Approaching Human-Level Forecasting with Language Models

29 Feb 2024 22:36 UTC
60 points
6 comments · 3 min read · LW link