Richard_Kennaway answers [missing post]

Richard_Kennaway 17 Apr 2023 8:12 UTC
3 points
1
Why are you calling various parts of the program “wanting”, “liking”, “reward”, “confidence”, etc.? For those attributes to be really there, they would have to be discoverable from the code alone, even if all the comments were omitted and all the variable names replaced by arbitrary ones like G36287.
- elbow921 17 Apr 2023 11:45 UTC
  1 point
  0
  As this algorithm executes, the variables last and 2last come to hold the program’s two most recent outputs. The even indexes of L1 hold the average input (reward?) observed for each number of ones the program outputted across its last two outputs. I called the odd indexes of L1 ‘confidence’ because, as they get higher, the corresponding average reward changes less in response to new evidence. When L1 becomes entangled with the input generation process, the algorithm chooses whichever outputs make the inputs higher on average. That is why I called the input ‘reward’. L2 reads off the average reward given the last two outputs. The algorithm chooses outputs that bring the number of ones outputted closer to the number that has yielded the highest inputs in the past. This makes L2 analogous to ‘wanting’.
  - Richard_Kennaway 17 Apr 2023 12:31 UTC
    2 points
    0
    In effect, you’re saying that all reinforcement learners experience pleasure and suffering. But how do these algorithms “feel from the inside”? What does it mean for the variable A to be “obvious to the algorithm”? We know how we feel, but how do you determine whether there is anything it is like to be that program? Are railway lines screaming in pain when the wheel flanges rub against them? Does ChatGPT feel sorrow when it apologises on being told that its output was bad?
    
    I see no reason to attribute emotional states to any of these things.
    - elbow921 17 Apr 2023 15:03 UTC
      1 point
      0
      By ‘obvious to the algorithm’ I mean that, to the algorithm, A is referenced with no intermediate computation. This is how pleasure and pain feel to me. I do not believe all reinforcement learning algorithms feel pleasure or pain. A simple example that does not suffer is the Simpleton iterated prisoner’s dilemma strategy. I believe pain and pleasure are effective ways to implement reinforcement learning. In animals, reinforcement learning is called operant conditioning. See Reinforcement learning on a chicken for a chicken that has experienced it. I do not know of any algorithm for determining whether there is anything it is like to be a given program. I suspected this program experienced pleasure and pain because of its parallels to the neuroscience of pleasure and pain.
    - Signer 17 Apr 2023 18:14 UTC
      1 point
      0
      
      > We know how we feel
      
      The description of our feelings is not fundamentally different from the description of any reinforcement learner. They both describe the same thing—physical reality—just with different language and precision.
      
      > I see no reason to attribute emotional states to any of these things.
      
      The reason is that they are abstractly analogous to emotional states in humans, just as an emotional state in one human may be abstractly analogous to an emotional state in another human.
      - Richard_Kennaway 17 Apr 2023 19:21 UTC
        2 points
        0
        I cannot see “abstractly analogous” as sufficient grounds. Get abstract enough and everything is “abstractly analogous” to everything.
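
For readers without access to the missing post, the bookkeeping elbow921 describes earlier in the thread might look roughly like the sketch below. This is a reconstruction from that one description, not the original program. The class name, the method names, the use of a running mean, the tie-breaking, and the absence of any exploration step are all assumptions. The list `table` stands in for what the thread calls L1, and `second_last` for `2last`. The thread's L2 is not modelled separately.

```python
class TwoOutputLearner:
    """Hypothetical sketch of the learner described in the thread."""

    def __init__(self):
        # table[2*k]     : running average input seen when k of the last
        #                  two outputs were ones (k = 0, 1, 2)
        # table[2*k + 1] : how many inputs went into that average
        #                  (the "confidence" slot in the thread's terms)
        self.table = [0.0, 0] * 3
        self.last = 0         # most recent output
        self.second_last = 0  # output before that ("2last" in the thread)

    def observe(self, value):
        """Fold one input into the average for the current count of ones."""
        k = self.last + self.second_last
        self.table[2 * k + 1] += 1
        n = self.table[2 * k + 1]
        # Running mean: the larger n is, the less one input moves the average.
        self.table[2 * k] += (value - self.table[2 * k]) / n

    def act(self):
        """Emit the bit that moves the count of ones toward the best count."""
        # Count of ones with the highest average so far (ties: lowest count).
        best = max(range(3), key=lambda k: self.table[2 * k])
        # After emitting bit o, the last two outputs contain self.last + o ones.
        out = min((0, 1), key=lambda o: abs(self.last + o - best))
        self.second_last, self.last = self.last, out
        return out
```

Nothing in the sketch depends on the names. Replacing `table`, `observe`, and `act` with arbitrary identifiers leaves its behaviour unchanged, which is the situation Richard_Kennaway's opening question asks about.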