Extrapolation capability is wielded by shoggoths and makes masks possible, but it’s not wielded by the masks themselves. Just as humans can’t predict next tokens given a prompt anywhere near as well as LLMs can, neither can LLM characters: they can’t disregard the rest of the context outside the target prompt to access their “inner shoggoth”, let alone make use of that capability level for something more useful. So agency in masks doesn’t automatically take advantage of the extrapolation capability in shoggoths; merely becoming agentic doesn’t turn masks superintelligent. This creates the danger of only slightly superhuman AGIs that immediately muck up alignment security, once LLM masks do get to autonomous agency (which I’m almost certain they will eventually, unless something else happens first).
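To make the distinction concrete, here is a minimal sketch using the Hugging Face transformers library, with GPT-2 as a stand-in base model; the model choice, prompts, and framing are illustrative assumptions, not anything from the original argument. The underlying predictor’s full next-token distribution exists in the forward pass, while anything a prompted “character” says about the next token is itself just more generated text conditioned on the whole context, with no privileged read access to that distribution.

```python
# Sketch: the "shoggoth's" next-token distribution vs. what a prompted character can say about it.
# Assumes the transformers library and GPT-2 as a small stand-in base model.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The capital of France is"

# 1) The underlying predictor: the full next-token distribution computed in the
#    forward pass. This is what the training objective optimizes; no simulated
#    character gets to read it directly.
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]
probs = torch.softmax(logits, dim=-1)
top = torch.topk(probs, k=5)
print("Base-model next-token distribution:")
for p, idx in zip(top.values, top.indices):
    print(f"  {tokenizer.decode([idx.item()])!r}: {p.item():.3f}")

# 2) The "mask": ask the model, in-character, to state the most likely next token.
#    Its answer is just sampled text conditioned on the whole meta-prompt, not a
#    readout of the distribution computed above.
meta_prompt = (
    "You are a helpful assistant. What single token is most likely to follow "
    f"the text {prompt!r}? Answer with the token only.\nAnswer:"
)
meta_inputs = tokenizer(meta_prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**meta_inputs, max_new_tokens=5, do_sample=False)
print("Character's stated guess:",
      tokenizer.decode(out[0, meta_inputs.input_ids.shape[1]:]))
```

The gap between (1) and (2) is the point: the character’s verbalized guess is typically far worse calibrated than the distribution the same network just computed, because the mask only gets whatever capability the training distribution surfaces in text.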
It’s only shoggoths themselves waking up (learning to use situationally aware deliberation within the residual stream rather than the context window) that makes an immediate qualitative capability discontinuity more likely (for LLMs). Looking at GPT-4’s capability to solve complicated tasks without thinking out loud in tokens, I suspect that merely a slightly different SSL schedule with a sufficiently giant LLM might trigger that. Hence I’ve recently been operating under a one-year lower bound on AGI timelines (lower 25% quantile), until the literature implies a negative result for that experiment (with GPT-4-level scale being necessary, this might take a while). This outcome would both reduce the chances of direct alignment and increase the chances that alignment security gets sorted.