Why “zero-shot”? You’re talking about getting something right in one try, so wouldn’t “one-shot” make more sense?
Humanity has actually made substantial progress toward formalizing zero-shot reasoning over the past century or so.
I think this paragraph gives an overly optimistic impression of how much progress has been made. We are still very confused about what probabilities really are, we haven’t made any progress on the problem of Apparent Unformalizability of “Actual” Induction, and decision theory seems to have mostly stalled since about 8 years ago (the MIRI paper you cite does not seem to represent a substantial amount of progress over UDT 1.1).
I think working on zero-shot reasoning today will most likely turn out to be unhelpful if:
takeoff is slow (which I assign ~20%)
This isn’t obvious to me. Can you explain why you think this?
In ML, “one-shot” means that you get to look at one example of good behavior (e.g., how to classify an image), and then you have to be able to replicate that good behavior. “Zero-shot” means getting it right without any prior examples. (See also footnote 1.)
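(For readers outside ML, here is a minimal toy sketch of that distinction, which is not from the original post: every image and every label name is represented as a vector in a shared “embedding space”, and classification is nearest-neighbor search. The 2-D vectors and class names below are made up purely for illustration, standing in for what a trained encoder would produce.)

```python
import numpy as np

# Made-up 2-D "embeddings" for illustration only; a real system would
# get these vectors from a trained encoder.
label_embeddings = {   # embeddings of the class names themselves
    "cat": np.array([1.0, 0.1]),
    "dog": np.array([0.1, 1.0]),
}
support_set = {        # one labeled example per class (the "one shot")
    "cat": np.array([0.9, 0.2]),
    "dog": np.array([0.2, 0.9]),
}
query_image = np.array([0.95, 0.15])   # the image we have to classify

def nearest(query, candidates):
    """Return the label whose vector is closest to `query`."""
    return min(candidates, key=lambda label: np.linalg.norm(query - candidates[label]))

# One-shot: we saw one demonstration per class, and classify the query
# by similarity to those demonstrations.
one_shot_prediction = nearest(query_image, support_set)

# Zero-shot: no demonstrations at all; we classify the query by
# similarity to the embeddings of the label names alone.
zero_shot_prediction = nearest(query_image, label_embeddings)

print(one_shot_prediction, zero_shot_prediction)   # -> cat cat
```

The point of the sketch is just that “one-shot” counts demonstrations of the task rather than attempts at it, which is where the terminological disagreement below comes from.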
Why “zero-shot”? You’re talking about getting something right in one try, so wouldn’t “one-shot” make more sense?
I’ve flip-flopped between “one-shot” and “zero-shot”. I’m calling it “zero-shot” in analogy with zero-shot learning, which refers to the ability to perform a task after zero demonstrations. “One-shot reasoning” probably makes more sense to folks outside of ML.
I think this paragraph gives an overly optimistic impression of how much progress has been made. We are still very confused about what probabilities really are, we haven’t made any progress on the problem of Apparent Unformalizability of “Actual” Induction, and decision theory seems to have mostly stalled since about 8 years ago (the MIRI paper you cite does not seem to represent a substantial amount of progress over UDT 1.1).
I used “substantial progress” to mean “real and useful progress”, rather than “substantial fraction of the necessary progress”. Most of my examples happened in the early to mid-1900s, suggesting that if we continue at that rate, we might need at least another century.
This isn’t obvious to me. Can you explain why you think this?
I’d feel much better about delegating the problem to a post-AGI society, because I’d expect such a society to be far more stable if takeoff is slow, and far more capable of taking its merry time to solve the full problem in earnest. (I think it will be more stable because I think it would be much harder for a single actor to attain a decisive strategic advantage over the rest of the world.)
I’m calling it “zero-shot” in analogy with zero-shot learning, which refers to the ability to perform a task after zero demonstrations.
I see. Given this, I think “zero-shot learning” makes sense but “zero-shot reasoning” still doesn’t. In the former, “zero” refers to “zero demonstrations”: you’re learning something without a learning process targeted at that specific thing. In the latter, “zero” isn’t referring to anything, and you’re trying to get the reasoning correct in one attempt, so “one-shot” is a more sensible description.
I used “substantial progress” to mean “real and useful progress”, rather than “substantial fraction of the necessary progress”. Most of my examples happened in the early to mid-1900s, suggesting that if we continue at that rate, we might need at least another century.
Ok, I don’t think we have a substantive disagreement here then. My complaint was that providing only positive examples of progress in that paragraph without tempering them with negative ones is liable to give an overly optimistic impression to people who aren’t familiar with the field.
I’d feel much better about delegating the problem to a post-AGI society, because I’d expect such a society to be far more stable if takeoff is slow, and far more capable of taking its merry time to solve the full problem in earnest. (I think it will be more stable because I think it would be much harder for a single actor to attain a decisive strategic advantage over the rest of the world.)
Are you saying that in the slow-takeoff world, we will be able to coordinate to stop AI progress after reaching AGI and then solve the full alignment problem at leisure? If so, what’s your conditional probability P(successful coordination to stop AI progress | slow takeoff)?
I see. Given this, I think “zero-shot learning” makes sense but “zero-shot reasoning” still doesn’t. In the former, “zero” refers to “zero demonstrations”: you’re learning something without a learning process targeted at that specific thing. In the latter, “zero” isn’t referring to anything, and you’re trying to get the reasoning correct in one attempt, so “one-shot” is a more sensible description.
I was imagining something like “zero failed attempts”, where each failed attempt approximately corresponds to a demonstration.
Are you saying that in the slow-takeoff world, we will be able to coordinate to stop AI progress after reaching AGI and then solve the full alignment problem at leisure? If so, what’s your conditional probability P(successful coordination to stop AI progress | slow takeoff)?
More like, conditioning on getting international coordination after our first AGI, P(safe intelligence explosion | slow takeoff) is a lot higher, like 80%. I don’t think slow takeoff does very much to help international coordination.