I agree with your technical points, but I don't think we could have particularly expected the other path. The safety properties of LLMs seem desirable from an extremely safety-pilled point of view, not from the perspective of the average capabilities researcher, and RL seems to be The Answer to many learning problems.