we won’t see examples like this if the algorithms that produce this kind of behavior take longer to produce the behavior than the amount of time we’ve let them run.
Are you suggesting that Deep Blue would behave in this way if we gave it enough time to run? If so, can you explain the mechanism by which this would occur?
I think Stuart Armstrong and Tom Everitt are the main people who’ve done work in this area, and their work on this stuff seems quite underappreciated.
Can you share links?
I don’t know how Deep Blue worked. My impression is that it doesn’t use learning, so the answer would be no.
A starting point for Tom and Stuart’s work: https://scholar.google.com/scholar?rlz=1C1CHBF_enCA818CA819&um=1&ie=UTF-8&lr&cites=1927115341710450492
BoMAI is in this vein as well ( https://arxiv.org/pdf/1905.12186.pdf ).