I think one of the key intuitions here is that in a high-dimensional problem, random babbling takes far too long to solve the problem, since the space random babbling has to search grows as 2^n in the number of dimensions n. If n is, say, over 100, then it requires more random ideas than anyone will generate in a million years.
Given that most real-world problems are high-dimensional, babbling will get you nowhere near a solution.
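To make that scale concrete, here is a back-of-the-envelope sketch in Python (my own illustrative numbers, not from the comment above: n = 100 binary design dimensions and a generous one random idea per second):

```python
# Toy arithmetic for the 2^n claim (illustrative assumptions): treat an
# "idea" as a choice over n binary dimensions, so random babbling is
# searching a space of 2^n points.
n = 100
search_space = 2 ** n

# Generously assume one random idea per second, nonstop, for a million years.
seconds_per_year = 60 * 60 * 24 * 365
ideas_generated = seconds_per_year * 1_000_000

print(f"search space:     {search_space:.3e}")                    # ~1.268e+30
print(f"ideas generated:  {ideas_generated:.3e}")                 # ~3.154e+13
print(f"fraction covered: {ideas_generated / search_space:.3e}")  # ~2.488e-17
```

Even under those generous assumptions, a million years of nonstop babbling covers only about 2.5 parts in 10^17 of the space.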
This is true. I’m trying to reconcile it with the intuition that we need ideas very different from the ones we already have.
Like, we need a focused chain of ideas (to avoid the combinatorial explosion of babble), but it needs to be “aimed” in a different-enough direction to be more-helpful-than-the-norm, while still pointing in a good-enough direction to be helpful-at-all.
Like, if we’re talking about some specific idea-tree, my mental model is “low branching factor is good, but initialization matters more for getting what will turn out to be the most helpful ideas”. And maybe(?) John’s model is “the existing branch(es) of AI alignment are incomplete, but that should be solved depth-first instead of making new branches”.
This could be correct, but that doesn’t square with our shared belief that AI alignment is pre-paradigmatic. If we really are pre-paradigmatic, I’d expect we should look for more Newtons, but John’s advice points to looking for more Tellers or von Neumanns.
What would John rather have, for the same monetary/effort cost: Another researcher creating a new paradigm (new branches), or another researcher helping him (depth first)?
And if he thinks the monetary/effort cost of helping that researcher can’t possibly be comparable, what precisely does that mean? That, e.g., we would prefer new branches, but there’s no practical way the field of AI alignment can actually support them in any substantial way? Really?
What would John rather have, for the same monetary/effort cost: Another researcher creating a new paradigm (new branches), or another researcher helping him (depth first)?
I think “new approach” vs “existing approach” is the wrong way to look at it. An approach is not the main thing expertise is supposed to involve here. Expertise in this context is much more about understanding the relevant problems/constraints. My main preference is for a new researcher who understands the problems/constraints over one who doesn’t. Among researchers who understand the problems/constraints, I’d rather have one pursuing their own new program than one working on an existing program, but that’s useful if-and-only-if they understand the relevant problems and constraints.
The problem with a random-idea-generator is that the exponential majority of the ideas it generates won’t satisfy any known constraints or address any of the known hard barriers, or even a useful relaxation of the known constraints/barriers.
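To illustrate that exponential falloff, here is a toy sketch (entirely my framing, with made-up independent constraints, not John’s model): if each known constraint independently filters out about half of random ideas, the fraction of random ideas that survive all of them shrinks as roughly 0.5^k in the number of constraints k.

```python
import random

def random_idea(n_bits: int) -> list[int]:
    # A toy "idea": a random setting of n binary design choices.
    return [random.randint(0, 1) for _ in range(n_bits)]

def satisfies(idea: list[int], constraint: int) -> bool:
    # Toy constraint: a particular bit must be set; each such constraint
    # independently rejects about half of all random ideas.
    return idea[constraint] == 1

def acceptance_rate(n_bits: int, n_constraints: int, trials: int = 100_000) -> float:
    hits = 0
    for _ in range(trials):
        idea = random_idea(n_bits)
        if all(satisfies(idea, c) for c in range(n_constraints)):
            hits += 1
    return hits / trials

for k in (1, 5, 10, 15):
    print(f"{k:2d} constraints -> acceptance rate ~ {acceptance_rate(20, k):.5f}")
# Tracks 0.5 ** k: ~0.5, ~0.03125, ~0.00098, ~0.00003
```

In this toy model, by fifteen constraints only a few random ideas in a hundred thousand survive; requiring the idea to also address a known hard barrier only makes the odds worse.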
That said, I do buy your argument at the top of the thread that in fact GPT fails to even generate new bad ideas.
Ah, okay, yeah, that makes sense. The many-paths argument may work, but only if the researcher/idea is even remotely useful for the problem, which a randomly-generated one won’t be. Oops