When discussing the GPT-4o model, my son (20) said that it leads to higher-bandwidth communication with LLMs, calling it “a symbiosis.” We discussed that there are further stages beyond this, like Neuralink. I think there is a small chance that this kind of close interaction between a human and a model can be extended so that the pair becomes aligned in the way a human is internally aligned, as follows:
This assumes some background on the Thought Generator, Thought Assessor, and Steering System from the brain-like AGI framework.
The model is already the Thought Generator. The human already has a Steering System; although it is not directly accessible, it can plausibly be reverse-engineered. What is missing is the Thought Assessor: something that learns to predict how well the model satisfies the Steering System.
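To make the wiring concrete, here is a minimal, purely illustrative Python sketch. Nothing in it comes from the brain-like AGI literature or any existing system: `thought_generator`, `steering_system`, the feature encoding, and the training rule are toy stand-ins I made up. Only the division of labor follows the idea above: the generator proposes, the assessor learns to predict the human’s steering signal and filters cheaply, and the slow human signal is used only to train the assessor.

```python
# Hypothetical sketch of the generate -> assess -> steer -> update loop.
# All components are stand-ins; only the overall wiring reflects the proposal.
import random

def thought_generator(prompt: str, n: int = 4) -> list[str]:
    """Stand-in for the LLM: proposes candidate 'thoughts' (completions)."""
    return [f"{prompt} :: candidate {i} :: {random.random():.3f}" for i in range(n)]

def steering_system(thought: str) -> float:
    """Stand-in for the human Steering System: a slow, sparse, ground-truth signal.
    In the proposal this would be the actual human (or a reverse-engineered model of them)."""
    return 1.0 if "candidate 0" in thought else random.uniform(0.0, 0.5)

def features(thought: str) -> list[float]:
    """Toy feature encoding of a thought; a real assessor would use embeddings."""
    return [len(thought) / 100.0, thought.count("0") / 10.0, 1.0]

class ThoughtAssessor:
    """Learns to predict the steering signal so thoughts can be scored cheaply and often."""
    def __init__(self, dim: int = 3, lr: float = 0.1):
        self.w = [0.0] * dim
        self.lr = lr

    def predict(self, thought: str) -> float:
        return sum(wi * xi for wi, xi in zip(self.w, features(thought)))

    def update(self, thought: str, reward: float) -> None:
        # One online least-squares step: move the prediction toward the observed reward.
        x = features(thought)
        err = reward - self.predict(thought)
        self.w = [wi + self.lr * err * xi for wi, xi in zip(self.w, x)]

assessor = ThoughtAssessor()
for step in range(50):
    candidates = thought_generator("plan my day")
    best = max(candidates, key=assessor.predict)   # assessor filters, cheaply and often
    reward = steering_system(best)                 # human feedback, sparse and slow
    assessor.update(best, reward)                  # assessor learns to anticipate the human
```

The only point of the sketch is the division of labor: the expensive, low-bandwidth Steering System is queried rarely, while the learned assessor stands between it and the generator and is consulted constantly.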
Staying close to the individual human may be better than searching for global solutions, or at least it may allow smaller-scale optimization and iteration.
Now, I don’t think this is automatically safe. The human Steering System is already running outside its specs, and a powerful model can find its breaking points (just as global commerce finds the breaking points of our appetites). But these are problems we already have, and this setup provides a “scale model” for working on them.