The lack of reliability eats away a huge amount of productivity. Everything has to be double-checked, and this gets even harder as capabilities increase, because we need to think more about the subtle ways in which the models' output is wrong. Unknown unknowns are always a factor too, but if o3-type models can be trained on less verifiable problems and are not insanely compute-heavy, then 2026 is actually a reasonable guess.
Hopenope
Karma: 51
Many expert-level benchmarks greatly overestimate the range and diversity of their experts’ knowledge. A person with a PhD in physics is probably at undergraduate level in many parts of physics unrelated to their research area, and sometimes we see this even within an expert’s own domain (neurologists, for example, usually forget about nerves that are not clinically relevant).
If you have a very short timeline, and you don’t think that alignment is solvable in such a short time, then what can you still do to reduce x-risk?