So one claim is that a theory of post-AGI effects often won’t say anything about pre-AGI AI, so it mostly doesn’t get updated by pre-AGI observations. My take on LLM alignment asks to distinguish human-like LLM AGIs from stronger AGIs (or weirder LLMs), with theories of stronger AGIs not naturally characterizing the issues with human-like LLMs. Like, human-like LLMs aren’t concerned with optimizing for LLM superstimuli while their behavior remains in the human imitation regime, where caring about LLM-specific things hasn’t had a chance to gain influence. When the mostly faithful imitation nature of LLMs breaks with enough AI tinkering, the way human nature is breaking now towards the influence of AGIs, we get another phase change, to stronger AGIs.
This seems like a pattern: theories of extremal later phases are bounded within their scopes and say little about the preceding phases that transition into them. If the phase transition boundaries get muddled in thinking about this, we get misleading impressions about how the earlier phases work, even though navigating those earlier phases is instrumental for managing transitions into the much more concerning later phases.