Ulisse Mini

Karma: 1,720

Born too late to explore Earth; born too early to explore the galaxy; born at just the right time to save humanity.

https://uli.rocks/about

[Question] What rationality failure modes are there?

Ulisse Mini, Jan 19, 2024, 9:12 AM
42 points
11 comments, 1 min read, LW link

[Question] What ML gears do you like?

Ulisse Mini, Nov 11, 2023, 7:10 PM
25 points
4 comments, 1 min read, LW link

Paper: Understanding and Controlling a Maze-Solving Policy Network

Oct 13, 2023, 1:38 AM
70 points
0 comments, 1 min read, LW link
(arxiv.org)

ActAdd: Steering Language Models without Optimization

Sep 6, 2023, 5:21 PM
105 points
3 comments, 2 min read, LW link
(arxiv.org)

Open problems in activation engineering

Jul 24, 2023, 7:46 PM
51 points
2 comments, 1 min read, LW link
(coda.io)

[ASoT] GPT2 Steering & The Tuned Lens

Ulisse Mini, Jul 1, 2023, 2:12 PM
23 points
0 comments, 2 min read, LW link

LIMA: Less Is More for Alignment

Ulisse Mini, May 30, 2023, 5:10 PM
16 points
6 comments, 1 min read, LW link
(arxiv.org)

TinyStories: Small Language Models That Still Speak Coherent English

Ulisse Mini, May 28, 2023, 10:23 PM
66 points
8 comments, 2 min read, LW link
(arxiv.org)

Steering GPT-2-XL by adding an activation vector

May 13, 2023, 6:42 PM
437 points
98 comments, 50 min read, LW link, 1 review

How to get good at programming

Ulisse Mini, May 5, 2023, 1:14 AM
39 points
3 comments, 2 min read, LW link

Understanding and controlling a maze-solving policy network

Mar 11, 2023, 6:59 PM
332 points
28 comments, 23 min read, LW link

Predictions for shard theory mechanistic interpretability results

Mar 1, 2023, 5:16 AM
105 points
10 comments, 5 min read, LW link

[ASoT] Policy Trajectory Visualization

Ulisse Mini, Feb 7, 2023, 12:13 AM
9 points
2 comments, 1 min read, LW link

Incentives considered harmful

Ulisse Mini, Jan 15, 2023, 6:38 AM
6 points
0 comments, 1 min read, LW link
(uli.rocks)

[Question] Where do you find people who actually do things?

Ulisse Mini, Jan 13, 2023, 6:57 AM
7 points
12 comments, 1 min read, LW link

[Question] Effective Evil Causes?

Ulisse Mini, Dec 30, 2022, 2:56 AM
−12 points
2 comments, 1 min read, LW link

[ASoT] Natural abstractions and AlphaZero

Ulisse Mini, Dec 10, 2022, 5:53 PM
33 points
1 comment, 1 min read, LW link
(arxiv.org)

[ASoT] Probability Infects Concepts it Touches

Ulisse Mini, Dec 7, 2022, 1:48 AM
10 points
4 comments, 1 min read, LW link

Three Fables of Magical Girls and Longtermism

Ulisse Mini, Dec 2, 2022, 10:01 PM
33 points
11 comments, 2 min read, LW link

[ASoT] Reflectivity in Narrow AI

Ulisse Mini, Nov 21, 2022, 12:51 AM
6 points
1 comment, 1 min read, LW link