I don’t understand very much about mathematics, computer science, or programming, so I think that, for the most part, I’ve expressed myself in natural language to the greatest extent that I possibly can. I’m encouraged that about an hour and a half before my previous reply, DefectiveAlgorithm made the exact same argument that I did, albeit more briefly. It discourages me that he tabooed ‘values’ and you immediately used it anyway. Just in case you did decide to reply, I wrote a Python-esque pseudocode example of what I think the very high-level source code of an AGI with an arbitrary terminal value would look like. With little technical background, my understanding is very high level, with lots of black boxes. I encourage you to do the same, so that we may compare. I would prefer that you write yours before I give you mine, so that you are not anchored by my example. This way you are forced to conceive of the AI as a program and do away with ambiguous wording. What do you say?
I’ve asked Nornagest to provide links or further reading on the value stability problem. I don’t know enough about it to say anything meaningful. I thought that wireheading scenarios were only a problem for AIs whose values were loaded with reinforcement learning.
“[W]hatever an AI values it will try to optimize for in the future.”
On this at least we agree.
Of course, my sense of time is not very good, and so I may be overly biased toward seeing immediate reward as worthwhile, when an AI with a better sense of time might automatically go for optimization over all time.
From what I understand, even if you’re biased, it’s not a bad assumption. To my knowledge, in scenarios with AGIs that have their values loaded with reinforcement learning, the AGIs are usually given the terminal goal of maximizing the time-discounted integral of their future reward signal. So they ‘bias’ the AGI in the way that you may be biased. Maybe that’s so that it ‘cares’ about the rewards its handlers give it more than the far greater far-future rewards that it could stand to gain from wireheading itself? I don’t know. My brain is tired. My question looks wrong to me.
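(For concreteness, ‘maximizing the time-discounted integral of the future reward signal’ amounts to something like the toy sketch below. The reward sequences and the discount factor are made up, and discounted_return is just an illustrative name.)

    # Toy illustration of a time-discounted sum of future rewards.
    # The reward sequences and the discount factor are made up.
    def discounted_return(rewards, gamma=0.9):
        return sum((gamma ** t) * r for t, r in enumerate(rewards))

    # A modest reward now outweighs a much larger reward far in the
    # future once the discount is applied.
    print(discounted_return([10, 0, 0, 0, 0]))                        # 10.0
    print(discounted_return([0, 0, 0, 0, 0, 0, 0, 100], gamma=0.5))   # 0.78125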
“It discourages me that he tabooed ‘values’ and you immediately used it anyway.”
In fairness, I only used it to describe how the term had come to be used in this context in the first place, not to try to continue with my point.
“I wrote a Python-esque pseudocode example of what I think the very high-level source code of an AGI with an arbitrary terminal value would look like. With little technical background, my understanding is very high level, with lots of black boxes. I encourage you to do the same, so that we may compare.”
I’ve never done something like this. I don’t know Python, so mine would actually just be pseudocode, if I can do it at all. Do you mean you’d like to see something like this?
    while (world_state != desired_state)
        world_state = get_world_state()
        plan = make_plan(world_state, desired_state)
        execute_plan(plan)
    end while
ETA: I seem to be having some trouble getting the while block to indent. It seems that whether I put 4, 6, or 8 spaces in front of the line, I only get the same level of indentation (which is different from Reddit and Stack Overflow), and backticks do something altogether different.

Unfortunately it’s a longstanding bug that preformatted blocks don’t work.

Something like that. I posted my pseudocode in an open thread a few days ago to get feedback, and I couldn’t get indentation to work either, so I posted mine to Pastebin and linked it.
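In actual, runnable Python, the same loop might look something like this. Everything here (the toy ‘world’, get_world_state, make_plan, execute_plan) is a stand-in I made up for illustration, not a real agent design:

    # A minimal, runnable stand-in for the sense-plan-act loop above.
    # The "world" is just a number we nudge toward a target value.

    def get_world_state(world):
        return world["value"]

    def make_plan(state, desired_state):
        # Plan a single step toward the desired state.
        return 1 if state < desired_state else -1

    def execute_plan(world, step):
        world["value"] += step

    world = {"value": 0}
    desired_state = 5

    while get_world_state(world) != desired_state:
        state = get_world_state(world)
        step = make_plan(state, desired_state)
        execute_plan(world, step)

    print(world)  # {'value': 5}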
I’m still going through the Sequences, and I read Terminal Values and Instrumental Values the other day. Eliezer gives a pseudocode example of an ideal Bayesian decision system (along with its data types), of which an AGI would be a computationally tractable approximation. If you can show me what you mean in terms of that post, then I might be able to understand you. It doesn’t look like I was far off conceptually, but thinking of it his way is better than thinking of it my way. My way’s kind of intuitive, I guess (or I wouldn’t have been able to make it up), but his is accurate.
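Roughly, as I understand that post, the decision system scores each action by the probability-weighted utility of its possible outcomes and picks the best one. Here’s a crude sketch of that idea; the actions, outcomes, probabilities, and utilities are all invented:

    # Crude expected-utility chooser: for each action, sum
    # P(outcome | action) * U(outcome) over outcomes and take the best.
    # All the probabilities and utilities below are invented.
    utility = {"cured": 1.0, "unchanged": 0.0, "harmed": -1.0}

    outcome_probs = {
        "administer_drug": {"cured": 0.7, "unchanged": 0.2, "harmed": 0.1},
        "do_nothing":      {"cured": 0.1, "unchanged": 0.9, "harmed": 0.0},
    }

    def expected_utility(action):
        return sum(p * utility[outcome]
                   for outcome, p in outcome_probs[action].items())

    best = max(outcome_probs, key=expected_utility)
    print(best, expected_utility(best))  # administer_drug 0.6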
I also found his paper (Paper? More like book) Creating Friendly AI. Probably a good read for avoiding amateur mistakes, which we might be making. I intend to read it. Probably best not to try to read it in one sitting.
Even though I don’t want you to think of it this way, here’s my pseudocode, just to give you an idea of what was going on in my head. If you see a name followed by parentheses, that is the name of a function. ‘def’ defines a function; the stuff that follows it is the function itself. If you see a function name without a ‘def’, it’s being called rather than defined. Functions might call other functions. The names inside the parentheses that follow a function are its arguments (function inputs). If you see something that is clearly a name and isn’t followed by parentheses, it’s an object: it holds some sort of data. In this example, all of the objects are first created as return values of functions (function outputs). And anything that isn’t indented at least once isn’t actually code, so ‘For AGI in general’ is not a for loop, lol.

http://pastebin.com/UfP92Q9w
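To make those conventions concrete, here’s a trivial example in the same style (the names are invented, not taken from the pastebin):

    def make_plan(world_model, terminal_value):   # 'def' defines the function
        # Whatever follows 'return' is the function's output.
        return {"model": world_model, "goal": terminal_value}

    # Calling the function: the names in the parentheses are arguments,
    # and 'plan' is an object created from the function's return value.
    plan = make_plan("current_world_model", "arbitrary_terminal_value")
    print(plan)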
Okay, I am convinced. I really, really appreciate you sticking with me through this and persistently finding different ways to phrase your side and then finding ways that other people have phrased it.
For reference, it was the link to the paper/book that did it. The parts of it that are immediately relevant here are chapter 3 and section 4.2.1.1 (and, optionally, section 5.3.5). In particular, chapter 3 explicitly describes an order of operations for goal and subgoal evaluation, and the other two sections show how wireheading is discounted as a failing strategy within a system with a well-defined order of operations. Whatever problems there may be with value stability, this has helped to clear out a whole category of mistakes that I might have made.
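My loose reading of that argument, restated as a toy example of my own (nothing here is taken from the paper; the goal, actions, and predictions are all invented): the agent scores candidate actions by what its current goal says about the predicted outcome, so an action that only tampers with its own evaluation machinery doesn’t actually score any better than doing nothing.

    # Toy restatement of the idea: candidate actions are scored by the
    # *current* goal applied to the *predicted* world, not by whatever the
    # post-action evaluator would report. Everything here is invented.
    def predict_world(world, action):
        new_world = dict(world)
        if action == "build_factory":
            new_world["paperclips"] += 100
        elif action == "hack_own_evaluator":
            new_world["reported_score"] = float("inf")  # only the report changes
        return new_world

    def current_goal(world):
        # The goal is about paperclips in the world, not about the report.
        return world["paperclips"]

    world = {"paperclips": 0, "reported_score": 0}
    actions = ["do_nothing", "build_factory", "hack_own_evaluator"]

    best = max(actions, key=lambda a: current_goal(predict_world(world, a)))
    print(best)  # build_factory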
Again, I really appreciate the effort that you put in. Thanks a load.
No problem, pinyaka.
And thank you for sticking with me! It’s really hard to stick it out when there’s no such thing as an honest disagreement and disagreement is inherently disrespectful!
ETA: See the ETA in this comment to understand how my reasoning was wrong but my conclusion was correct.