> Sort of relatedly, or on the flip side of the coin:
>
> In these threads, I’ve seen a lot of concern with using language “consequentially”, rather than rooted in pure epistemics and map-territory correspondence.
>
> And those arguments have always seemed weird to me. Because… what could you possibly be grounding this all out in, other than consequences? It seems useful to have a concept of “appeals to consequence” being logically invalid. But in terms of what norms to have on a public forum, the key issue is that appeals to shortsighted consequences are bad, for the same reason shortsighted consequentialism is often bad.
>
> If you don’t call the president a Vargath (despite them obviously supporting Varg), because they’d be offended, it seems fairly straightforward to argue that this has bad consequences. You just have to model it out more steps.
>
> I would agree with the claim “if you’re constantly checking ‘hey, in this particular instance, maybe it’s net positive to lie?’ you end up lying all the time, and end up in a world where people can’t trust each other”, so it’s worth treating appeals to consequences as forbidden as part of a Rule Consequentialism framework. But, why not just say that?
> Because… what could you possibly be grounding this all out in, other than consequences?
In my mind, it stands as an open problem whether you can “usually” expect an intelligent system to remain “agent-like in design” under powerful self-modification. By “agent-like in design” I mean having subcomponents which transparently contribute to the overall agentiness, such as true beliefs, coherent goal systems, etc.
The argument in favor is: it becomes really difficult to self-optimize as your own mind-design becomes less modular. At some point you’re just a massive policy with each part fine-tuned to best shape the future (a future which you had some model of at some point in the past); eventually you have to lose general-purpose learning. Therefore, agents with complicated environments and long time horizons will stay modular.
The argument against is: it just isn’t very probable that the nice clean design is the most optimal. Even if there’s only a small incentive to do weird screwy things with your head (i.e., a small chance you encounter Newcomblike problems where Omega cares about aspects of your ritual of cognition, rather than just its output), the agent will follow that incentive where it leads. Plus, general self-optimization can lead to weird, non-modular designs. Why shouldn’t it?
So, in my mind, it stands as an open problem whether purely consequentialist arguments tend to favor a separate epistemic module “in the long term”.
Therefore, I don’t think we can always ground pure epistemic talk in consequences. At least, not without further work.
However, I do think it’s a coherent flag to rally around, an important goal in the short term, and particularly important for a large number of agents trying to coordinate; it’s also possible that it’s something approaching a terminal goal for humans (i.e., curiosity wants to be satisfied by truth).
So I do want to defend pure epistemics as its own goal which doesn’t continuously answer to broad consequentialism. I perceive some reactions to Zack’s post as isolated demands for rigor, invoking the entire justificatory chain to consequentialism when it would not be similarly invoked for a post about, say, p-values.
(A post about p-values vs. Bayesian hypothesis testing might give rise to discussions of consequences, but not to questions of whether the whole Bayes-vs-p-values argument makes sense on the grounds that epistemics is ultimately consequentialist anyway, or similar.)
> I would agree with the claim “if you’re constantly checking ‘hey, in this particular instance, maybe it’s net positive to lie?’ you end up lying all the time, and end up in a world where people can’t trust each other”, so it’s worth treating appeals to consequences as forbidden as part of a Rule Consequentialism framework. But, why not just say that?
I would respond:
Partly for the same reason that a post on Bayes’ Law vs p-values wouldn’t usually bother to say that; it’s at least one meta level up from the chief concerns. Granted, unlike a hypothetical post about p-values, Zack’s post was about the appeal-to-consequences argument from its inception, since it responds to an inappropriate appeal to consequences. However, Zack’s primary argument is on the object level, pointing out that how you define words is of epistemic import, and therefore cannot be chosen freely without making epistemic compromises.
TAG and perhaps other critics of this post are not conceding that much; so, the point you make doesn’t seem like it’s sufficient to address the meta-level questions which are being raised.
I would concede that there is, perhaps, something funny about the way I’ve been responding to the discussion—I have a sense that I might be doing some motte/bailey thing around (motte:) this is an isolated demand for rigor, and we should be able to talk about pure epistemics as a goal without explicitly qualifying everything with “if you’re after pure epistemics”; vs (bailey:) we should pursue pure epistemics. In writing comments here, I’ve attempted to carefully argue the two separately. However, I perceive TAG as not having received these as separate arguments. And it is quite possible I’ve blurred the lines at times. They are pretty relevant to each other.
(I say all of this largely agreeing with the thrust of what the post and your (Abram’s) comments are pointing at, but feeling like something about the exact reasoning is off. And it feeling consistently off has been part of why I’ve taken a while to come around to the reasoning.)