I’m not sure I follow. I think you are proposing a gamification of interpretability, but I don’t know how the game works. I can gather something about player choice making the LLM run, and maybe some analogies to physical movement, but I can’t really grasp it. Could you rephrase it from its basic principles up instead of from an example?
I think we can expose complex geometry in a game, using the familiar setting of our planet. Basically, let’s show people a whole simulated, all-knowing multiverse and then find a way for them to learn how to see/experience “more of it all at once” or, if they want to remain human-like, to “slice through it in order to experience the illusion of time”.
If we have many human agents in some simulation (billions of them), they can cooperate and effectively replace the agentic ASI. They will be the only time-like thing, while the ASI will be the space-like places: just giant frozen sculptures.
I wrote some more and included the staircase example; it’s a work in progress, of course: https://forum.effectivealtruism.org/posts/9XJmunhgPRsgsyWCn/share-ai-safety-ideas-both-crazy-and-not?commentId=ddK9HkCikKk4E7prk