I feel your points are very intelligent. I also agree that specializing AI is a worthwhile direction.
It’s very uncertain whether it will work, but all approaches are very uncertain, so humanity’s best chance is to work on many uncertain approaches.
Unfortunately, I disagree that it will happen automatically. Gemini 1.5 (and probably Gemini 2.0 and GPT-4) are Mixture of Experts models. I’m no expert, but I think that means that for each token of text, a gating (“weighting”) function decides which of the sub-models should output the next token, or how much weight to give each sub-model’s output.
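To make what I mean concrete, here is a toy sketch of that kind of routing. This is just my own illustration with made-up sizes and a tanh stand-in for each expert, not how Gemini or GPT-4 actually implement it: a small gating function scores every expert for the current token, only the top-scoring experts run, and their outputs are mixed by those scores.

```python
import numpy as np

# Toy sketch of Mixture-of-Experts routing (my own illustration only).
rng = np.random.default_rng(0)

D_MODEL, N_EXPERTS, TOP_K = 8, 4, 2                      # arbitrary toy sizes
W_gate = rng.normal(size=(D_MODEL, N_EXPERTS))           # gating ("weighting") function
W_experts = rng.normal(size=(N_EXPERTS, D_MODEL, D_MODEL))  # one toy "expert" per slice

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_layer(token_vec):
    scores = softmax(token_vec @ W_gate)     # how much to trust each expert for this token
    chosen = np.argsort(scores)[-TOP_K:]     # only the top-k experts actually run
    out = np.zeros(D_MODEL)
    for i in chosen:
        out += scores[i] * np.tanh(token_vec @ W_experts[i])
    return out / scores[chosen].sum()        # renormalize over the chosen experts

print(moe_layer(rng.normal(size=D_MODEL)))
```

The only point of the sketch is the routing pattern: all the experts live inside one model, and the gate decides who does the talking on each token.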
So maybe there is an AI psychiatrist, an AI mathematician, and an AI biologist inside Gemini and o1. Which one is doing the talking depends on what question is asked, or which part of the question the overall model is answering.
The problem is that they all output words to the same stream of consciousness, and refer to past sentences with the words “I said this,” rather than “the biologist said this.” They think that they are one agent, and so they behave like one agent.
My idea, which I only thought of thanks to your paper, is to do the opposite: the experts within the Mixture of Experts model, or even the same AI on different days, would refer to themselves not as “I” but as “he,” so they behave like many agents.
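As a rough sketch of how that framing could be enforced (hypothetical roles and function names, not anything from your paper or a real system): keep one shared transcript, but always render each statement as “the biologist said this” rather than “I said this,” even when an expert re-reads its own earlier words.

```python
# Toy sketch of third-person framing for a team of experts (hypothetical).
transcript = []  # shared log of (role, statement) pairs

def record(role, statement):
    transcript.append((role, statement))

def render(log):
    # Every past statement is attributed to its role, never to "I",
    # even when an expert re-reads its own earlier words.
    return "\n".join(f"The {role} said: {text}" for role, text in log)

record("biologist", "the sample shows no contamination")
record("mathematician", "the growth rate looks roughly exponential")

# Whichever expert speaks next sees other agents' claims, not its own memories:
print(render(transcript))
```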
:) thank you for your work!
I’m not disagreeing with your work; I’m just a little less optimistic than you and don’t think things will go well unless effort is made. You wrote the 100-page paper, so you probably understand effort better than I do :)
Happy holidays!
You mentioned Mixture of Experts. That’s interesting. I’m not an expert in this area, but I speculate that in an MoE-like architecture, when one expert is working, the others are idle. This way we don’t need to run all the experts simultaneously, which indeed saves computation, but it doesn’t save memory. However, if an expert is shared among different tasks, then when it isn’t needed for one task it can handle other tasks, so it can stay busy all the time.
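A back-of-the-envelope illustration of that trade-off, with completely made-up numbers: in a sparse MoE every expert’s weights stay loaded, but only the routed experts spend compute on any given token.

```python
# Hypothetical numbers, only to illustrate "saves computation but not memory".
n_experts = 8
params_per_expert = 2_000_000_000   # made-up size
experts_per_token = 2               # top-k routing

memory_params = n_experts * params_per_expert          # all experts stay in memory
active_params = experts_per_token * params_per_expert  # only these do work per token

print(f"parameters held in memory: {memory_params:,}")
print(f"parameters active per token: {active_params:,} "
      f"({active_params / memory_params:.0%} of the total)")
```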
The key point here is the independence of the experts, including what you mentioned: that each expert has an independent sense of self. A possible bad scenario is that although there are many experts, they all passively follow the commands of a Leader AI. In that case, the AI team is essentially no different from a single superintelligence. Extra effort is indeed needed to achieve this independence. Thank you for pointing this out!
Happy holidays, too!
I agree; it takes extra effort to make the AI behave like a team of experts.
Thank you :)
Good luck sharing your ideas. If things aren’t working out, try changing strategies. Maybe instead of giving people a 100-page paper, tell them the idea you think is “the best” and focus on that one idea. Add a little note at the end: “By the way, if you want to see many other ideas from me, I have a 100-page paper here.”
Maybe even think of different ideas.
I cannot tell you which way is better; just keep trying different things. I don’t know what is right because I’m also having trouble sharing my ideas.