Results are in and updated—it looks like dmartin80 wins.
We previously posted the results, but a participant then investigated our app and found an error in the calculations. We spent some time redoing some of the calculations and found that there were indeed some errors. The main update was that dmartin80 had a much higher Surprise score than originally estimated, and correcting this led to their entry winning.
To help make up for the confusion, we’re awarding an additional $100 prize for 2nd place. This will be awarded to kairos_. I’ll cover this cost personally.
Again, thanks to all who participated!
We have a very basic web application showing some results here. It was coded quickly (with AI) and has some quirks, but if you search around you can get the main information.
We didn’t end up applying the Goodharting penalty for any submissions. No models seemed to goodhart under a cursory glance.
If time permits, we’ll later write a longer post highlighting the posts more and going over lessons learned from this.
Thank you for running the competition! It made me use & appreciate Squiggle more, and I expect that a bunch of my estimation workflows in the future will be generating and then tweaking an AI-generated Squiggle model.