Yes and no.
I do think you’re pointing to the right problems—basically the same problems Shminux was pointing at in his comment, and the same problems which I think are the most promising entry point to progress on embedded agency in general.
That said, I think “word boundaries” is a very misleading label for this class of problems. It suggests that the problem is something like “draw a boundary around points in thing-space which correspond to the word ‘tree’”, except for concepts like “values” or “person” rather than “tree”. Drawing a boundary in thing-space isn’t really the objective here; the problem is that we don’t know what the right parameterization of thing-space is or whether that’s even the right framework for grounding these concepts at all.
Here’s how I’d pose it. Over the course of history, humans have figured out how to translate various human intuitions into formal (i.e. mathematical) models. For instance:
Game theory gave a framework for translating intuitions about “strategic behavior” into math
Information theory gave a framework for translating intuitions about information into math (a concrete sketch of this case follows the list)
More recently, work on causality gave a framework for translating intuitions about counterfactuals into math
In the early days, people like Galileo showed how to translate physical intuitions into math
A good heuristic: if a class of intuitive reasoning is useful and effective in practice, then there’s probably some framework which would let us translate those intuitions into math. In the case of embedded-agency-related problems, we don’t yet have the framework—just the intuitions.
With that in mind, I’d pose the problem as: build a framework for translating intuitions about “values”, “people”, etc. into math. That’s what we mean by the question “what is X?”.
Ooh, that is very insightful. The word-boundary problem around “values” feels fuzzy and ill-defined, but that doesn’t mean that the thing we care about is actually fuzzy and ill-defined.