Your argument is plausible, but what evidence do you have that makes it the likely outcome?
Hopenope
The difficulty of alignment is still unknown. It may be totally intractable, or some changes to current methods (deliberative alignment or constitutional AI) plus some R&D automation may get us there.
The recurrence paper is actually scary, but some of its results are questionable. Are 8 layers enough for a 3.5B model? Qwen 0.5B has 24 layers. There is also almost no difference between the 180B and 800B models when r=1 (Table 4). Is the recurrence just compensating for an insufficient number of layers here?
Would you update your timelines if he is telling the truth?
Is CoT faithfulness already obsolete? How does it survive concepts like latent-space reasoning or RL-based manipulation (R1-Zero)? Is it realistic to think that these highly competitive companies will simply not use such techniques and ignore the compute-efficiency gains?
I am not sure that longer timelines are always safer. For example, comparing a two-year timeline to a five-year one, the shorter timeline has several advantages: in both cases you need to outsource a lot of alignment research to AI anyway, and with the shorter one both the amount of compute and the number of players with significant compute are lower, which reduces racing pressure and takeoff speed alike.
What happened to the Waluigi effect? It used to be a big issue, some people argued against it, and then it was pretty much forgotten. Is there any related research, or are there recent demos, that examine it in more detail?
If you have a very short timeline, and you don't think alignment is solvable in such a short time, what can you still do to reduce the chance of x-risk?
Many expert-level benchmarks greatly overestimate the range and diversity of their experts' knowledge. A person with a PhD in physics is probably at undergraduate level in many areas of physics unrelated to their research, and sometimes we even see this within an expert's own domain (neurologists usually forget about nerves that are not clinically relevant).
Overrefusal issues were far more common one to two years ago; models like Gemini 1 and Claude 1-2 had severe overrefusal problems.