Correct. I’ll just add that a single action can be a large chunk of the program. It doesn’t have to be (god forbid) character by character.
But the (most probable) models don’t know that, so the predictions for the next round are going to be wrong (compared to what the real human would do if called in), because they’re going to be based on the real human not having that memory.
It’ll have some probability distribution over the contents of the humans’ memories. This will depend on which timesteps they actually participated in, so it’ll have a probability distribution over that. I don’t think that’s really a problem though. If humans are taking over one time in a thousand, then it’ll think (more or less) there’s a 1/1000 chance that they’ll remember the last action. (Actually, it can do better by learning that humans take over in confusing situations, but that’s not really relevant here.)
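To make that concrete, here’s a rough sketch of how a model could average its prediction over whether the human saw the last timestep. It’s purely illustrative: the function names and the hard-coded 1/1000 figure are assumptions for the example, not part of the proposal.

```python
# Illustrative sketch only: average the prediction over whether the human
# actually participated in (and so remembers) the previous timestep.
P_TAKEOVER = 1.0 / 1000  # assumed rate at which the real human is called in

def predict_next_action(history, predict_given_memory):
    """Mixture prediction over the human's possible memory states.

    predict_given_memory(history, saw_last) should return a dict mapping
    actions to probabilities, conditioned on whether the human saw the
    last timestep. Both names here are hypothetical.
    """
    dist_saw = predict_given_memory(history, saw_last=True)
    dist_not = predict_given_memory(history, saw_last=False)
    actions = set(dist_saw) | set(dist_not)
    return {a: P_TAKEOVER * dist_saw.get(a, 0.0)
               + (1.0 - P_TAKEOVER) * dist_not.get(a, 0.0)
            for a in actions}
```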
Maybe we can just provide an input to the models that indicates whether the real human was called in for the last time step?
That would work too. With the edit that the model may as well be allowed to depend on the whole history of which actions were human-selected, not just whether the last one was.
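For concreteness, the suggested input might look something like the sketch below; every name in it is invented for illustration and isn’t part of the proposal.

```python
# Sketch of the suggested input: each past step carries a flag saying whether
# that action was selected by the real human, so a model can condition on
# the whole takeover history rather than just the most recent flag.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Step:
    observation: bytes
    action: bytes
    human_selected: bool  # was the real human called in at this timestep?

def model_input(history: List[Step]) -> List[Tuple[bytes, bytes, bool]]:
    # The model sees the full pattern of past takeovers, not just the last one.
    return [(s.observation, s.action, s.human_selected) for s in history]
```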
Actually before we keep going with our discussions, it seems to make sense to double check that your proposal is actually the most promising proposal (for human imitation) to discuss. Can you please take a look at the list of 10 links related to human imitations that I collected (as well as any relevant articles those pages further link to), and perhaps write a post on why your proposal is better than the previous ones, why you made the design choices that you did, and how it addresses or avoids the existing criticisms of human imitations? ETA: I’m also happy to discuss with you your views of past proposals/criticisms here in the comments or through another channel if you prefer to do that before writing up a post.
If humans are taking over one time in a thousand, then it’ll think (more or less) there’s a 1/1000 chance that they’ll remember the last action.
But there’s a model/TM that thinks there’s a 100% chance that the human will remember the last action (because that’s hard-coded into the TM), and that model will do really well in the next update. So we know that any time a human steps in, no matter when, it will cause a big update (at the next update), because it’ll raise models like this from obscurity to prominence. If the AI “knows” this, it will call in the human for every time step, but maybe it doesn’t “know” this? (I haven’t thought this through formally and will leave it to you.)
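As a toy illustration of the kind of update I have in mind (all the numbers here are invented):

```python
# Toy Bayes update; priors and likelihoods are made up for illustration.
# Model A hard-codes that the human remembers the last action (assigns it prob 1).
# Model B thinks the human remembers it only with prob 1/1000 (the takeover rate).
# Suppose the human steps in and acts in a way only consistent with remembering.
prior_A, prior_B = 1e-6, 1.0 - 1e-6  # A starts out in "obscurity"
lik_A, lik_B = 1.0, 1.0 / 1000       # likelihood each model gave to the observed behaviour

evidence = prior_A * lik_A + prior_B * lik_B
posterior_A = prior_A * lik_A / evidence
posterior_B = prior_B * lik_B / evidence
print(posterior_A, posterior_B)  # A's posterior weight jumps by roughly a factor of 1000
```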
With the edit that the model may as well be allowed to depend on the whole history of which actions were human-selected, not just whether the last one was.
I was assuming each model would save that input on its work tape for future use.
In any case, I think I understand your proposal well enough now that we can go back to some of the other questions.
Sorry to put this on hold, but I’ll come back to this conversation after the AAAI deadline on September 5.
Commenting here.