On one hand, arbitrary agents, or at least a large class of agents, or at least the (proto-)AGIs that humans make, might simply turn out to agree with us already on the features we abstract from our surroundings; a better-grounded and better-developed theory of semantics would allow us to confirm this and become more optimistic about the feasibility of alignment.
On the other, such agents might prove in general to have inner ontologies totally unrelated to our own, or only somewhat different but in enduring and hazardous ways; a better theory of semantics would warn us of this in advance and suggest other routes to AGI, or perhaps motivate a total halt to development.
I feel like these two paragraphs are just fleshing out the thing you said earlier and aren’t really needed
the next paragraph is kind of like that but making a sort of novel point so maybe they’re necessary? I’d try to focus them on saying things you haven’t yet said
I agree that those three paragraphs are bloated. My issue is this—I don’t yet know which of those three branches is true (natural abstractions exist all the time vs. NAs can exist but only if you put them there vs. NAs do not, in general, exist, and they break immediately) but whichever it is, I think a better theory of semantics would help tell us which one it is, and then also be a necessary prerequisite to the obvious resulting plan.