We discuss the definition of “knowledge” a bit in this appendix; compared to your definitions, we want to only say that the model “knows” the value of X when it is actually behaving differently based on the value of X in order to obtain a lower loss. I think this is strictly weaker than your level 2 (since the model needs to actually be using that knowledge) and incomparable to your level 1 (since the model’s behavior might depend on an estimate of X without the model having any introspective knowledge about that dependence), though I might be misunderstanding.
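(Very roughly, and only as a sketch rather than a real definition: writing $M$ for the model, $L$ for its loss, $D$ for the input distribution, and $\hat{X}$ for whatever internal estimate of $X$ the model’s computation uses, the intuition is something like
$$\mathbb{E}_{x\sim D}\big[L(M(x))\big] \;<\; \mathbb{E}_{x\sim D}\big[L\big(M_{\hat{X}\leftarrow x'}(x)\big)\big] \quad \text{for some counterfactual value } x',$$
where $M_{\hat{X}\leftarrow x'}$ is the model with that estimate overwritten, i.e. the dependence on $X$ is actually paying for lower loss. All of these symbols are my gloss rather than anything official, and “whatever internal estimate the computation uses” is where the imprecision lives.)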
This definitely isn’t well-defined, and this is the main way in which ELK itself is not well-defined and something I’d love to fix. That said, for now I feel like we can just focus on cases where the counterexamples obviously involve the model knowing things (according to this informal definition). Someday in the future we’ll need to argue about complicated border cases, because our solutions work in every obvious case. But I think we’ll have to make a lot of progress before we run into those problems (and I suspect that progress will mostly resolve the ambiguity).
(The situation is reversed if someone tries to make impossibility arguments about ELK—those arguments very often involve murky cases where it’s not really clear if the model knows.)
> First in this list is the case where the correct translation is arbitrarily computationally complex.
> However, this assumption implies that ELK is intractable.
This isn’t clear to me—we argue that direct translation can be arbitrarily complex, and that we need to solve ELK anyway, but we don’t think the translator can be arbitrarily complex relative to the predictor. So we can still hope that jointly learning the (predictor, translator) is not much harder than learning the predictor alone.
> The answer (at least, as I see it) is by arguing that this case is impossible.
If we find a case that’s impossible, I definitely want to try to refine the ELK problem statement, rather than implicitly narrowing the statement to something like “solve ELK in all the cases where it’s possibly possible” (not sure if that’s what you are suggesting here). And right now I don’t know of any cases that seem impossible.
> This definitely isn’t well-defined, and this is the main way in which ELK itself is not well-defined and something I’d love to fix. That said, for now I feel like we can just focus on cases where the counterexamples obviously involve the model knowing things (according to this informal definition). Someday in the future we’ll need to argue about complicated border cases, because our solutions work in every obvious case. But I think we’ll have to make a lot of progress before we run into those problems (and I suspect that progress will mostly resolve the ambiguity).
Well, it might be that a proposed solution follows relatively easily from a proposed definition of knowledge, in some cases. That’s the sort of solution I’m going after at the moment.
This still leaves the question of borderline cases, since the definition of knowledge may be imperfect. So it’s not necessarily that I’m trying to solve the borderline cases.
> We discuss the definition of “knowledge” a bit in this appendix;
Ah, yep, I missed that!
> This isn’t clear to me—we argue that direct translation can be arbitrarily complex, and that we need to solve ELK anyway, but we don’t think the translator can be arbitrarily complex relative to the predictor. So we can still hope that jointly learning the (predictor, translator) is not much harder than learning the predictor alone.
Ahh, I see. I had 100% interpreted the computational complexity of the Reporter to be ‘relative to the predictor’ already. I’m not sure how else it could be interpreted, since the reporter is given the predictor’s state as input, or at least given some form of query access.
What’s the intended mathematical content of the statement “the direct translation can be arbitrarily complex”, then?
Also, why don’t you think the direct translator can be arbitrarily complex relative to the predictor?
> > The answer (at least, as I see it) is by arguing that this case is impossible.
> If we find a case that’s impossible, I definitely want to try to refine the ELK problem statement, rather than implicitly narrowing the statement to something like “solve ELK in all the cases where it’s possibly possible” (not sure if that’s what you are suggesting here). And right now I don’t know of any cases that seem impossible.
Yeah, sorry, poor wording on my part. What I meant in that part was “argue that the direct translator cannot be arbitrarily complex”, although I immediately mention the case you’re addressing here in the parenthetical right after what you quote. In any case, what you say makes sense.
> Yeah, sorry, poor wording on my part. What I meant in that part was “argue that the direct translator cannot be arbitrarily complex”, although I immediately mention the case you’re addressing here in the parenthetical right after what you quote.
Ah, I just totally misunderstood the sentence; the intended reading makes sense.
> Well, it might be that a proposed solution follows relatively easily from a proposed definition of knowledge, in some cases. That’s the sort of solution I’m going after at the moment.
I agree that’s possible, and it does seem like a good reason to try to clarify a definition of knowledge.
> Ahh, I see. I had 100% interpreted the computational complexity of the Reporter to be ‘relative to the predictor’ already. I’m not sure how else it could be interpreted, since the reporter is given the predictor’s state as input, or at least given some form of query access.
> What’s the intended mathematical content of the statement “the direct translation can be arbitrarily complex”, then?
Sorry, what I mean is:
The computational complexity of the reporter can be arbitrarily large.
But it’s not clear that the computational complexity of the reporter can be arbitrarily larger than that of the predictor.
E.g. maybe the reporter’s complexity is only 0.1% of the predictor’s, but that still means the reporter gets arbitrarily complex in the limit where the predictor is arbitrarily complex.
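Put a bit more symbolically (again just a sketch, with $C(\cdot)$ standing in loosely for some measure of computational complexity):
$$\forall B\;\;\exists\,\text{predictor } P:\;\; C(\mathrm{reporter}_P) > B, \qquad \text{while still possibly} \qquad C(\mathrm{reporter}_P) \le \alpha\, C(P)\ \text{ for some fixed } \alpha < 1.$$
That is, “arbitrarily complex” is a claim about absolute size, not about the ratio to the predictor.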
> Also, why don’t you think the direct translator can be arbitrarily complex relative to the predictor?
I assume this was based on my confusing use of “relative to.” But answering just in case: if we are defining “knowledge” in terms of what the predictor actually uses in order to get a low loss, then there’s some hope that the reporter can’t really be more complex than the predictor (for the part that is actually playing a role in the predictor’s computation) plus a term that depends only on the complexity of the human’s model.
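Written as a very informal hoped-for bound, with $C(\cdot)$ again loosely denoting complexity, $P_{\mathrm{used}}$ the part of the predictor’s computation that actually plays a role in getting low loss, and $H$ the human’s model:
$$C(\mathrm{reporter}) \;\lesssim\; C(P_{\mathrm{used}}) \;+\; f\big(C(H)\big)$$
for some $f$ that doesn’t depend on the predictor at all.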