The second statement seems pretty plausible (when we consider human-accessible AGI designs, at least), but I’m not super confident of it, and I’m not resting my argument on it.
The weaker statement you provide doesn’t seem like it’s addressing my concern. I expect there are ways to get highly capable reasoning (sufficient for, e.g., gaining decisive strategic advantage) without understanding low-K “good reasoning”; the concern is that said systems are much more difficult to align.