EDIT: ignore my nonsense and see Vladimir_Nesov’s comment below.
That’s a good comparison. The agents within the human brain that Minsky talks about really resemble a Mixture of Experts AI’s “experts.”[1]
The common theme is that both the human brain and a Mixture of Experts AI “believe” they are a single process, when they are actually many processes. The difference is that a Mixture of Experts has the potential to become self-aware of its “society of the mind” and see it in action, while humans might never see their internal agents.
If the Mixture of Experts allowed each expert to know which text was written by itself and which was written by the other experts, it would gain valuable information (in addition to being easier to align, as my post argues).
A Self-Aware Mixture of Experts might actually be more intelligent, since it’s important to know which expert is responsible for each mistake, which expert produced each brilliant insight, and how the experts’ opinions differ.
I admit there is a ton of mixing going on, e.g. each next word is written by a different expert, words are a weighted average between experts, etc. But you might simplify things by assigning each paragraph (or line) to the one expert who seemed to have the most control over it.
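As a toy sketch of that attribution rule, assuming the MoE runtime exposes per-token router weights (the `router_weights` array and the token-to-paragraph mapping here are hypothetical inputs, not any real framework’s API):

```python
import numpy as np

def attribute_paragraphs(router_weights, paragraph_ids):
    """Assign each paragraph to the expert with the most routing mass over it.

    router_weights: (num_tokens, num_experts) per-token gate weights
                    (assumed to be exposed by the MoE runtime).
    paragraph_ids:  (num_tokens,) index of the paragraph each token belongs to.
    Returns {paragraph_id: expert_id}.
    """
    authors = {}
    for pid in np.unique(paragraph_ids):
        mask = paragraph_ids == pid
        # Total gate weight each expert received across this paragraph's tokens.
        mass = router_weights[mask].sum(axis=0)
        authors[int(pid)] = int(mass.argmax())
    return authors

# Toy example: 5 tokens, 3 experts, 2 paragraphs.
weights = np.array([[0.7, 0.2, 0.1],
                    [0.6, 0.3, 0.1],
                    [0.1, 0.8, 0.1],
                    [0.2, 0.7, 0.1],
                    [0.1, 0.6, 0.3]])
print(attribute_paragraphs(weights, np.array([0, 0, 1, 1, 1])))  # {0: 0, 1: 1}
```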
There will be silly misunderstandings like:
Alice: Thank you, Bob, for your insights.
A few tokens later:
Bob: Thank you, Bob, for your insights. However, I disagree because—oh wait, I am Bob. Haha, that happened again.
I guess the system can prevent these misunderstandings by editing “Bob” into “myself” when the main author changes to Bob. It might add new paragraph breaks if needed. Or, if it’s too awkward to assign a paragraph to a certain author, it might have a tendency to assign it to another author or to “Anonymous.” It’s not a big problem.
If one paragraph addresses a specific expert and asks her to reply in the next paragraph, the system might force the weighting function to allow her to author the next paragraph, even if that’s not her expertise.
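One way such forcing could work, as a minimal sketch assuming the router’s pre-softmax logits can be biased toward the addressed expert (the function name and bias value are made up for illustration):

```python
import numpy as np

def route_with_forced_expert(router_logits, forced_expert, bias=5.0):
    """Bias the gating network toward the expert who was addressed,
    so she ends up with the dominant weight for the next paragraph.

    router_logits: (num_experts,) raw scores from the gating network.
    forced_expert: index of the addressed expert.
    bias:          how strongly to favor her (illustrative value).
    """
    logits = router_logits.copy()
    logits[forced_expert] += bias
    exp = np.exp(logits - logits.max())  # softmax over the biased logits
    return exp / exp.sum()

# The addressed expert (index 2) wins despite a low raw score.
print(route_with_forced_expert(np.array([1.0, 3.0, 0.5]), forced_expert=2))
```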
I think the benefits of a Self-Aware Mixture of Experts are worth the costs.
Sometimes, when I’m struggling with self-control, I also wish I were more self-aware of which part of myself is outputting my behaviour. According to Minsky’s The Society of Mind, the human brain also consists of agents. I can sorta sense that there is this one agent (or set of agents) in me which gets me to work and do what I should do, and another agent which gets me to waste time and make excuses. But I never quite notice when I transition from the work agent to the excuses agent. I notice it a bit when I switch back to the work agent, but by then the damage has been done.
PS: I only skimmed his book on Google Books and didn’t actually read it.

[1] I guess only top-level agents in the human brain resemble MoE experts. He talks about millions of agents forming hierarchies.
Experts in MoE transformers are just smaller MLPs[1] within each of the dozens of layers, and when processing a given prompt can be thought of as instantiated on top of each of the thousands of tokens. Each of them only does a single step of computation, not big enough to implement much of anything meaningful. There are only vague associations between individual experts and any coherent concepts at all.
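For concreteness, a stripped-down toy sketch of one such MoE layer, with arbitrary small dimensions (not DeepSeek’s): each expert is just a small two-matrix MLP, and the router picks a few experts on top of each token.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_expert, num_experts, top_k = 16, 8, 4, 2  # toy sizes

# Each expert is one small MLP: matrix, nonlinearity, matrix.
W_in = rng.normal(size=(num_experts, d_model, d_expert)) * 0.1
W_out = rng.normal(size=(num_experts, d_expert, d_model)) * 0.1
W_router = rng.normal(size=(d_model, num_experts)) * 0.1

def moe_layer(x):
    """x: (num_tokens, d_model). One layer's routed mixture-of-experts MLP."""
    logits = x @ W_router                      # (num_tokens, num_experts)
    out = np.zeros_like(x)
    for t in range(x.shape[0]):                # experts are "instantiated"
        top = np.argsort(logits[t])[-top_k:]   # on top of each token
        gates = np.exp(logits[t, top])
        gates /= gates.sum()                   # renormalize over chosen experts
        for gate, e in zip(gates, top):
            hidden = np.maximum(x[t] @ W_in[e], 0.0)  # a single step of computation
            out[t] += gate * (hidden @ W_out[e])
    return out

print(moe_layer(rng.normal(size=(3, d_model))).shape)  # (3, 16)
```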
For example, in DeepSeek-V3, which is an MoE transformer, there are 257 experts in each of the layers 4-61[2] (so about 15K experts), and each expert consists of two 2048x7168 matrices, about 30M parameters per expert, out of the total of 671B parameters.

[1] Multilayer perceptrons: multiplication by a big matrix, followed by a nonlinearity, followed by multiplication by another big matrix.

[2] Section 4.2 of the report, “Hyper-Parameters”.
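Those figures can be sanity-checked directly from the numbers quoted above:

```python
moe_layers = 61 - 4 + 1               # layers 4-61 each have their own experts
experts_per_layer = 257
params_per_expert = 2 * 2048 * 7168   # two 2048x7168 matrices

print(moe_layers * experts_per_layer)  # 14906 -> "about 15K experts"
print(params_per_expert)               # 29360128 -> "about 30M per expert"
# Expert parameters alone account for most of the 671B total:
print(moe_layers * experts_per_layer * params_per_expert)  # ~437.6B
```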
Oops you’re right! Thank you so much.

I have to admit I was on the bad side of the Dunning–Kruger curve haha. I thought I understood it, but actually I understood so little that I didn’t know what I needed to understand.