Wouldn’t we know that in this context this would be true (rather than untrue, as you write)? Also, to the degree that the assessor is properly shielded from tampering, it becomes closer to imaginary (there’s no need to mention assessor implementation details, but then it seems to work like “magic”, lessening the evidence for believing in its existence). It also seems that the kinds of reasoning that make people turn on religion are valued here, and yet we are counting on the AI not pulling those same stunts.
Hiding the assessor among multiple plausible targets might make the AI play mafia on people (such as trying to get the assessor replaced when it can no longer satisfy its demands, in hopes that the replacement has an easier attitude or at least flaws to exploit).
These can be defined in counterfactual ways, if needed. There need not actually be an assessor, just a small probability of one.
Wouldn’t that be the equivalent of hoping that a Pascal’s wager will keep it in check?
No, because I’m using tricks like http://lesswrong.com/r/discussion/lw/ltf/false_thermodynamic_miracles/