Bogdan Ionut Cirstea

Karma: 1,629

Automated / strongly-augmented safety research.

Densing Law of LLMs

Bogdan Ionut Cirstea
Dec 8, 2024, 7:35 PM
9 points
2 comments
1 min read
LW link
(arxiv.org)

LLMs Do Not Think Step-by-step In Implicit Reasoning

Bogdan Ionut Cirstea
Nov 28, 2024, 9:16 AM
11 points
0 comments
1 min read
LW link
(arxiv.org)

Do Large Language Models Perform Latent Multi-Hop Reasoning without Exploiting Shortcuts?

Bogdan Ionut Cirstea
Nov 26, 2024, 9:58 AM
9 points
0 comments
1 min read
LW link
(arxiv.org)

Disentangling Representations through Multi-task Learning

Bogdan Ionut Cirstea
Nov 24, 2024, 1:10 PM
14 points
1 comment
1 min read
LW link
(arxiv.org)

Reward Bases: A simple mechanism for adaptive acquisition of multiple reward types

Bogdan Ionut Cirstea
Nov 23, 2024, 12:45 PM
11 points
0 comments
1 min read
LW link

A Little Depth Goes a Long Way: the Expressive Power of Log-Depth Transformers

Bogdan Ionut Cirstea
Nov 20, 2024, 11:48 AM
16 points
0 comments
1 min read
LW link
(openreview.net)

The Computational Complexity of Circuit Discovery for Inner Interpretability

Bogdan Ionut Cirstea
Oct 17, 2024, 1:18 PM
11 points
2 comments
1 min read
LW link
(arxiv.org)

Thinking LLMs: General Instruction Following with Thought Generation

Bogdan Ionut Cirstea
Oct 15, 2024, 9:21 AM
7 points
0 comments
1 min read
LW link
(arxiv.org)

Instruction Following without Instruction Tuning

Bogdan Ionut Cirstea
Sep 24, 2024, 1:49 PM
17 points
0 comments
1 min read
LW link
(arxiv.org)

Validating / finding alignment-relevant concepts using neural data

Bogdan Ionut Cirstea
Sep 20, 2024, 9:12 PM
7 points
0 comments
1 min read
LW link
(docs.google.com)

To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning

Bogdan Ionut Cirstea
Sep 19, 2024, 4:13 PM
21 points
1 comment
1 min read
LW link
(arxiv.org)

AlignedCut: Visual Concepts Discovery on Brain-Guided Universal Feature Space

Bogdan Ionut Cirstea
Sep 14, 2024, 11:23 PM
17 points
1 comment
1 min read
LW link
(arxiv.org)

Universal dimensions of visual representation

Bogdan Ionut Cirstea
Aug 28, 2024, 10:38 AM
10 points
0 comments
1 min read
LW link
(arxiv.org)

[Linkpost] Automated Design of Agentic Systems

Bogdan Ionut Cirstea
Aug 19, 2024, 11:06 PM
8 points
1 comment
1 min read
LW link
(arxiv.org)

[Linkpost] ‘The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery’

Bogdan Ionut Cirstea
Aug 15, 2024, 9:32 PM
20 points
1 comment
1 min read
LW link
(arxiv.org)

[Linkpost] Transcendence: Generative Models Can Outperform The Experts That Train Them

Bogdan Ionut Cirstea
Jun 18, 2024, 11:00 AM
19 points
3 comments
1 min read
LW link
(arxiv.org)

[Linkpost] The Expressive Capacity of State Space Models: A Formal Language Perspective

Bogdan Ionut Cirstea
May 28, 2024, 1:49 PM
4 points
3 comments
1 min read
LW link
(arxiv.org)

[Linkpost] Towards a Theoretical Understanding of the ‘Reversal Curse’ via Training Dynamics

Bogdan Ionut Cirstea
May 11, 2024, 10:59 PM
6 points
0 comments
1 min read
LW link
(arxiv.org)

[Linkpost] MindEye2: Shared-Subject Models Enable fMRI-To-Image With 1 Hour of Data

Bogdan Ionut Cirstea
Mar 10, 2024, 1:30 AM
10 points
0 comments
1 min read
LW link
(openreview.net)

Inducing human-like biases in moral reasoning LMs

Feb 20, 2024, 4:28 PM
23 points
3 comments
14 min read
LW link