Tony Wang

Karma: 280

Covert Malicious Finetuning

2 Jul 2024 2:41 UTC
88 points
4 comments · 3 min read · LW link

Takeaways from a Mechanistic Interpretability project on “Forbidden Facts”

15 Dec 2023 11:05 UTC
33 points
8 comments · 10 min read · LW link

Even Superhuman Go AIs Have Surprising Failure Modes

20 Jul 2023 17:31 UTC
129 points
22 comments · 10 min read · LW link
(far.ai)

Cambridge LW Meetup: When Science Isn’t Enough

13 Apr 2023 17:36 UTC
2 points
0 comments · 1 min read · LW link

Cambridge LW Rationality Practice: Being Specific

16 Feb 2023 6:37 UTC
2 points
0 comments · 1 min read · LW link

Cambridge LW Meetup: Lifehacks

29 Nov 2022 5:45 UTC
2 points
0 comments · 1 min read · LW link

Cambridge LW Meetup: See the Invisible

13 Oct 2022 5:44 UTC
1 point
0 comments · 1 min read · LW link

Cambridge LW Meetup: Authentic Relating Games

19 Sep 2022 14:51 UTC
1 point
0 comments · 1 min read · LW link

Cambridge LW Meetup: Constructive Complaining

13 Aug 2022 4:52 UTC
2 points
0 comments · 1 min read · LW link

Cambridge LW Meetup: Personal Finance

14 Jun 2022 0:12 UTC
3 points
0 comments · 1 min read · LW link

Cambridge LW Meetup: Books That Change

8 May 2022 5:23 UTC
5 points
0 comments · 1 min read · LW link

Cambridge LW Meetup: Bean on Why You Should Stop Worrying and Love the Bomb

5 Apr 2022 18:34 UTC
9 points
0 comments · 1 min read · LW link