During the training process, we observe that CoT often exhibits language mixing, particularly when RL prompts involve multiple languages. To mitigate the issue of language mixing, we introduce a language consistency reward during RL training, which is calculated as the proportion of target language words in the CoT. Although ablation experiments show that such alignment results in a slight degradation in the model’s performance, this reward aligns with human preferences, making it more readable.
I also found this trade-off between human readability and performance noteworthy.
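For concreteness, here is a minimal sketch of how such a reward could be computed. The paper only says it is the proportion of target-language words in the CoT; the whitespace splitting and the language check below are my own assumptions.

```python
# Minimal sketch of a language-consistency reward, assuming whitespace
# word splitting and a caller-supplied language check. The exact
# tokenization and detection method are assumptions, not the paper's.

def language_consistency_reward(cot_text: str, is_target_language) -> float:
    """Fraction of words in the chain of thought accepted by the language check."""
    words = cot_text.split()
    if not words:
        return 0.0
    return sum(1 for w in words if is_target_language(w)) / len(words)

# Toy usage: treat ASCII-only words as "target language" purely for illustration.
reward = language_consistency_reward(
    "Solve for x 然后 substitute",
    is_target_language=lambda w: w.isascii(),
)
print(round(reward, 2))  # 0.8 (4 of the 5 words pass the check)
```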
Yes, “fair” here means that their subjective EVs are equal. The post referenced in the sibling comment calls this “Even Odds”, which is probably a better name.
I did not realize that. Thank you for the reference!
If Alice thinks X happens with a probability of 20% while Bob thinks it’s 40%, what would be a fair bet between them?
I created a Claude Artifact, which calculates a bet such that the expected value is the same for both.
In this case, Bob wins if X happens (he thinks it’s more likely). If Alice bets $100, he should bet $42.86, and the EV of such a bet for both players (according to their beliefs) is $14.29.

EDIT: I updated the calculator to correctly handle the case when A’s probability is higher than B’s.
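For reference, here is a minimal sketch of the calculation (my reconstruction, not the artifact’s actual code): setting both players’ subjective EVs equal gives the stake of the higher-probability player as stake_a * (p_a + p_b) / (2 - p_a - p_b).

```python
def fair_bet(p_a: float, p_b: float, stake_a: float = 100.0):
    """Stake for B (the player with the higher probability) that equalizes
    both players' subjective EVs. Convention: B wins stake_a if X happens,
    A wins stake_b otherwise. Assumes p_b > p_a."""
    # Equalize: p_b*stake_a - (1-p_b)*stake_b == (1-p_a)*stake_b - p_a*stake_a
    stake_b = stake_a * (p_a + p_b) / (2 - p_a - p_b)
    ev = p_b * stake_a - (1 - p_b) * stake_b  # identical from A's point of view
    return stake_b, ev

print(fair_bet(0.20, 0.40))  # (42.857..., 14.285...), i.e. $42.86 and $14.29
```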
Jakub Halmeš's Shortform
The Inner Alignment Problem
I wrote this mostly for personal purposes. I wanted to organize my thoughts about the problem while reading the paper, and publishing the notes, even if no one reads them, forces me to write more clearly and precisely.
I would like some feedback on whether posts like this one are valuable to other people. Please let me know! Thank you.
I wonder if you could take the R1-Zero training regime, penalize or restrict the use of existing words from any language (perhaps only in the scratchpad, not the final response), and obtain a model that can solve math problems by reasoning in a non-existent language.
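A rough sketch of what such a penalty could look like, assuming a word list covering existing languages is available; both the word list and the word-level granularity are assumptions (in practice this would probably operate on tokens):

```python
# Hypothetical penalty term for an R1-Zero-style RL setup: punish scratchpad
# words that appear in a dictionary of existing languages. `known_words` is a
# stand-in for such a dictionary; how to build it is left open.

def existing_word_penalty(scratchpad: str, known_words: set) -> float:
    """Negative reward proportional to the fraction of recognized words."""
    words = scratchpad.lower().split()
    if not words:
        return 0.0
    recognized = sum(1 for w in words if w in known_words)
    return -recognized / len(words)

# Toy usage with a tiny stand-in dictionary.
print(existing_word_penalty("blorp zantu therefore", {"the", "and", "therefore"}))
# -0.333... (one of the three words is an existing English word)
```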