Regarding coding in general, I basically only program by prompting these days. I only bother editing the actual code when I notice a persistent bug that the models are unable to fix after multiple iterations.
I don’t know jackshit about web development, yet I have been making progress on a dashboard for alignment research with very little effort. It is very easy to build new projects quickly. The difficulty comes when there is a lot of complexity in the code. It’s still valuable to understand how things work at a high level, as well as the low-level details the model will fail to implement proactively.
I’d be down to do this. Specifically, I want to see whether the models are qualitatively better at alignment research tasks.
In general, what I’m seeing is that there is no big jump with o1 Pro. However, it is possibly getting closer to one-shotting a website based on a screenshot and some details about how the user likes their backend set up.
In the case of math, it might be a bigger jump (especially if you pair it well with Sonnet).