It actually is a perfect example of how LW is interested in science:
There is the fact that some people have no mental imagery, but live totally normal lives. That’s amazing! They’re more different than you usually imagine scifi aliens to be! And yet there is no obvious difference. It is awesome. How does that even work? Do they have mental imagery somewhere inside but no reflection on it? Etc, etc etc.
And the first thing that was done with this awesome fact here was to ‘update’ in the direction of trusting the PUA community’s opinion of women more than women themselves, and that was done by the author. It is not even a sufficiently complete update, because the PUA community (especially the manipulative misogynists with zero morals and the ideal of becoming a clinical sociopath as per checklist, whose bragging has selection bias and unscientific data collection written all over it) is itself prone to the typical mind fallacy (as well as a bunch of other fallacies) when it sees women as beings just as morally reprehensible as its members are.
This, cousin_it, is a case example of why you shouldn’t be writing good work for LW. Some time back you were on the verge of something cool, perhaps even a proof that defining real-world ‘utility’ is incredibly computationally expensive for UDT. Instead, well, yeah, there’s the local ‘consensus’ on AI behaviour, and you explore for potential confirmations of it.
the manipulative misogynists with zero morals and the ideal of becoming a clinical sociopath as per checklist, whose bragging has … unscientific data collection written all over it
Whatever data on physiology the Nazis collected correctly, we are relying on today. Even when very bad guys collect data properly, the data is usable. When it’s online bragging by people fascinated with ‘negs’… not so much. The data being badly collected is the required condition; the collectors trying to be sociopaths does not suffice by itself.
Some time back you were on the verge of something cool, perhaps even a proof that defining real-world ‘utility’ is incredibly computationally expensive for UDT. Instead, well, yeah, there’s the local ‘consensus’ on AI behaviour, and you explore for potential confirmations of it.
You seem to be saying: “you were close to realizing this problem was unsolvable, but instead you decided to spend your time exploring possible solutions.”
Generally, you seem to be continually frustrated about something to do with wireheading, but you’ve never really made your position clear, and I can’t tell where it is coming from. Yes, it is easy to build systems which tear themselves to pieces, literally or internally. Do you have any more substantive observation? We see a path to building systems which have values over the real world. It is full of difficulties, but the wireheading problems seem understood and approachable / resolved. Can you clarify what you are talking about, in the context of UDT?
We see a path to building systems which have values over the real world.
The path he sees has values over an internal model, but the internal model is perfect AND faster than the real world, which stretches it a fair lot if you ask me. It’s not really a path; he’s simply relying on “a sufficiently advanced model is indistinguishable from the real thing”. And we still can’t define what paperclips are if we don’t know the exact model that will be used, as the definition is only meaningful in the context of a model.
The objection I have is that it is (a) unnecessary to define values over the real world (the alternatives work fine for, e.g., finding imaginary cures for imaginary diseases which we make match real diseases), (b) very difficult or impossible to define values over the real world, and (c) values over the real world are necessary for the doomsday scenario. If this can be narrowed down, then there is precisely the bit of AI architecture that has to be avoided.
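As a minimal sketch of the distinction behind (a) and (b), not anyone’s actual proposal and with made-up names throughout: an optimizer whose objective is defined purely over a supplied model is easy to write down, while its ‘real world’ counterpart has no obvious type for its first argument.

    # Toy illustration only: optimization whose objective lives entirely inside a model.
    def optimize_over_model(model, candidate_actions, score_in_model):
        """Pick the action that scores best *inside the given model*.

        Nothing here asks whether `model` matches reality; making the model
        match the real disease (or the real factory) is the operator's job.
        """
        return max(candidate_actions, key=lambda a: score_in_model(model, a))

    # A "values over the real world" version would need something like
    #     score_in_reality(world, action)
    # where `world` is not a data structure we can actually hand to the program,
    # which is the (b) part of the objection.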
We humans are messy creatures. It is very plausible (in light of the potential irreducibility of ‘values over the real world’) that we value internal states of the model, and that we also receive negative reinforcement for model-world inconsistencies (when the model’s prediction of the senses does not match the senses), resulting in a learned preference not to lose correspondence between model and world, rather than a straightforward “I value real paperclips, therefore I value having a good model of the world”, which looks suspiciously simple and matches the observations poorly (no matter how much you tell yourself you value real paperclips, you may still procrastinate).
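A toy sketch of that ‘learned correspondence’ story, with made-up numbers and a hypothetical reward function just to make the mechanism concrete: reward is driven by valued internal states plus a penalty whenever the model’s prediction of the senses misses the actual senses, and nothing in it mentions real paperclips.

    # Illustrative toy only: reward = valued internal state - penalty for prediction error.
    def reward(valued_internal_state, predicted_senses, actual_senses, penalty_weight=1.0):
        prediction_error = sum((p - a) ** 2 for p, a in zip(predicted_senses, actual_senses))
        return valued_internal_state - penalty_weight * prediction_error

    print(reward(10.0, [1.0, 0.0], [1.0, 0.0]))  # model in sync with the senses: 10.0
    print(reward(10.0, [1.0, 0.0], [0.0, 1.0]))  # model drifted from the senses: 8.0

An agent trained on such a signal ends up preferring not to let the model drift from the world, without any primitive term that says ‘value real paperclips’.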
edit: and if I don’t make my position clear, that is because I am opposed to fuzzy, ill-defined woo where the distinction between models and worlds is poorly defined and the intelligence is a monolithic blob. It’s hard to state an objection to an ill-defined idea which always off-shoots into some anthropomorphic idea (e.g. wireheading gets replaced with the real-world goal of having a physical wire in a physical head that is to be kept alive along with the wire).
It is very plausible [...] that we value internal states of the model, and that we also receive negative reinforcement for model-world inconsistencies [...], resulting in a learned preference not to lose correspondence between model and world
Generally correct; we learn to value good models, because they are more useful than bad models. We want rewards, therefore we want to have good models, therefore we are interested in the world out there. (For a reductionist, there must be a mechanism explaining why and how we care about the world.)
Technically, sometimes the most correct model is not the most rewarded model. For example, it may be better to believe a lie and be socially rewarded by members of my tribe who share the belief than to have a true belief that gets me killed by them. There may be other situations, not necessarily social, where perfect knowledge is out of reach, and a better approximation may lie in the “valley of bad rationality”.
it is (a) unnecessary to define values over the real world (the alternatives work fine for, e.g., finding imaginary cures for imaginary diseases which we make match real diseases) [...] there is precisely the bit of AI architecture that has to be avoided.
In other words, make an AI that only cares about what is inside the box, and it will not try to get out of the box.
That assumes that you will feed the AI all the necessary data and verify that the data is correct and complete, because the AI will be just as happy with any kind of data. If you give the AI incorrect information, it will not care, because it has no definition of “incorrect”, even in situations where the AI is smarter than you and could have noticed an error that you didn’t. In other words, you are responsible for giving the AI the correct model, and the AI will not help you with this, because it does not care about the correctness of the model.
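To make the ‘just as happy with any kind of data’ point concrete, a deliberately trivial, hypothetical example (the table and names are invented for illustration): nothing on the AI side objects when the supplied data is wrong, so verification stays with the operator.

    # Toy example: the optimizer is indifferent to whether its input data is correct.
    def best_treatment(symptom_to_treatment, observed_symptom):
        # No notion of "the symptom report is wrong" exists here; it just uses what it was handed.
        return symptom_to_treatment[observed_symptom]

    table = {"fever": "drug_A", "cough": "drug_B"}
    print(best_treatment(table, "cough"))  # if "cough" was a data-entry error, nothing here notices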
You put it backwards… making an AI that cares about truly real stuff as its prime drive is likely impossible, and we certainly don’t know how to do that, nor do we need to. edit: i.e. you don’t have to sit and work and work and work and find out how to make some positronic mind not care about the real world. You get this simply by omitting some mission-impossible work. Specifying what you want, in some form, is unavoidable.
Regarding verification, you can have the AI search for the code that best predicts the input data, and then, if you are falsifying the data, that code will include a model of your falsifications.
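A brute-force toy version of that idea, under assumptions invented here (real proposals are Solomonoff-induction-flavoured searches over programs, not a hand-written list of three hypotheses): the best-predicting ‘code’ ends up being the one that models the tampering.

    # Toy sketch: pick whichever candidate "code" predicts the observed inputs best.
    def pick_best_predictor(hypotheses, observations):
        """hypotheses: functions mapping time step -> predicted observation."""
        def errors(h):
            return sum(1 for t, obs in enumerate(observations) if h(t) != obs)
        return min(hypotheses, key=errors)

    true_signal = lambda t: t % 3
    falsified = lambda t: (t % 3) ^ 1 if t % 5 == 0 else t % 3  # operator tampers with every 5th reading

    observations = [falsified(t) for t in range(30)]
    best = pick_best_predictor([true_signal, falsified, lambda t: 0], observations)
    print(best is falsified)  # True: the winning hypothesis includes a model of the falsification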
And the first thing that was done with this awesome fact here was to ‘update’ in the direction of trusting the PUA community’s opinion of women more than women themselves, and that was done by the author. It is not even a sufficiently complete update, because the PUA community (especially the manipulative misogynists with zero morals and the ideal of becoming a clinical sociopath as per checklist, whose bragging has selection bias and unscientific data collection written all over it) is itself prone to the typical mind fallacy (as well as a bunch of other fallacies) when it sees women as beings just as morally reprehensible as its members are.
This is a really good point …
This, cousin_it, is a case example of why you shouldn’t be writing good work for LW.
… which utterly fails to establish the claim that you attempt to use it for.
… which utterly fails to establish the claim that you attempt to use it for.
Context, man, context. cousin_it’s misgivings are about the low local standards. This article is precisely a good example of such low local standards, and note that I was not picking a strawman here; it was chosen as an example of the best. The article would have been torn to shreds in most other intelligent places (consider the arstechnica Observatory forum) for the bit that I am talking about.
edit: also, on the ‘good point’: this is how a lot of the rationality here goes: handling partial updates incorrectly. You have a fact that affects literally every opinion a person has about another person, and you proceed to update in the direction of confirming your opinions and your choice of what to trust. LW has an awfully low standard for anything that agrees with local opinions. This pops up in utility discussions, too. E.g. certain things (the possibility of a huge world) scale down all utilities in the system, leaving all actions unchanged. But the actual update that happens in agents that do not handle meta-reasoning correctly for a real-time system updates some A before some B, and then suddenly there are enormous differences between utilities. It’s just a broken model. Theoretically speaking, A being updated and B not being updated is in some sense more accurate than neither being updated, but everything that depends on the relation of A and B is messed up by the partial update. The algorithms for real-time belief updating are incredibly non-trivial (as are the algorithms for Bayesian probability calculation on graphs in general, given cycles and loops). The theoretical understanding behind the rationalism here is just really, really, really poor.
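To spell out the utility example with toy numbers (invented here purely for illustration): a uniform rescaling of every utility changes no decision, but applying the same rescaling to A before B manufactures an enormous, purely artifactual gap.

    # Toy numbers: uniform rescaling preserves the decision, partial rescaling does not.
    utilities = {"A": 3.0, "B": 2.0}   # A is the better action
    scale = 1e-6                       # e.g. a "huge world" consideration that discounts everything

    fully_updated = {k: v * scale for k, v in utilities.items()}
    print(max(fully_updated, key=fully_updated.get))          # "A": relative order preserved

    partially_updated = dict(utilities)
    partially_updated["A"] *= scale    # A updated, B not yet
    print(max(partially_updated, key=partially_updated.get))  # "B": a million-fold gap that is pure artifact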
In the interest of the discussion, here is the article in question
the manipulative misogynists with zero morals and the ideal of becoming a clinical sociopath as per checklist, whose bragging has selection bias and unscientific data collection written all over it
A classic Arson, Murder, and Jaywalking right there.
I don’t know, given the harm bad data collection can do, I’m not sure being a clinical sociopath is much worse.