I do not have a gauge for how much I’m actually bringing to this convo, so you should weigh my opinion lightly, however:
I believe your third point kinda nails it. There are models for gains from collective intelligence (groups of agents collaborating), and in those models the benefits of collaboration bottleneck hard on your ability to verify which outputs from the collective are the best. Even with good verification, the returns drop off pretty quickly as more agents collaborate.
10 people collaborating with no communication issues and accurate discrimination between good and bad ideas will beat a lone person on some tasks, and 100 more so.
You do not see jumps like that moving from 1,000 to 1,000,000 unless you assume unrealistic values for those variables.
I think inference-time compute probably works in a similar way: the gains depend on discriminating between right and wrong answers, and they fall off steeply as inference time increases.
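To make the intuition concrete, here's a minimal toy sketch (not anyone's actual setup, all the distributions and numbers are made up for illustration): sample N candidate answers, have a noisy verifier pick the one it scores highest, and look at the true quality of what it picked. Quality climbs fast at small N and flattens, and a noisier verifier flattens it sooner.

```python
# Toy best-of-N simulation with a noisy verifier (illustrative only).
import random
import statistics

def best_of_n(n, verifier_noise, trials=2000):
    """Average true quality of the answer a noisy verifier picks out of n samples."""
    picked = []
    for _ in range(trials):
        # True quality of each candidate answer (assumed standard normal here).
        qualities = [random.gauss(0.0, 1.0) for _ in range(n)]
        # What the verifier sees: true quality plus verification noise.
        scores = [q + random.gauss(0.0, verifier_noise) for q in qualities]
        # Keep the candidate the verifier ranks highest.
        picked.append(qualities[scores.index(max(scores))])
    return statistics.mean(picked)

for noise in (0.5, 2.0):
    print(f"verifier noise = {noise}")
    for n in (1, 10, 100, 1000):
        print(f"  n = {n:>4}: avg quality of picked answer = {best_of_n(n, noise):.2f}")
```

The jump from 1 to 10 samples is big, 10 to 100 is smaller, and past that you're grinding out tiny gains, and the worse the verifier, the earlier it stalls.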
My understanding is that o3 is similar to o1, probably with some specialization to keep long chains of thought coherent? The cost per token in the leaks I've seen is the same as o1's, it came out very quickly after o1, and o1 was already bizarrely better at math and coding than 4o.
Apologies if this was no help; responding with the best intentions.