Do you have some reliable way of recruiting? What’s the policy alternative? You do what you gotta do; if it ends up being just you, nonetheless, you do what you gotta do. Zero people won’t make fewer mistakes than one person.
If we condition on having all other variables optimized, I’d expect a team to adopt very high standards of proof, and recognize limits to its own capabilities, biases, etc. One of the primary purposes of organizing a small FAI team is to create a team that can actually stop and abandon a line of research/design (Eliezer calls this “halt, melt, and catch fire”) that cannot be shown to be safe (given limited human ability, incentives and bias). If that works (and it’s a separate target in team construction rather than a guarantee, but you specified optimized non-talent variables) then I would expect a big shift of probability from “UFAI” to “null.”
I’m not sure if he had both math and philosophy in mind when he wrote that or just math, but in any case surely the same principle applies to the philosophy. If you don’t reach a high confidence that the philosophy behind some FAI design is correct, then you shouldn’t move forward with that design, and if there is only one philosopher on the team, you just can’t reach high confidence in the philosophy.
if there is only one philosopher on the team, you just can’t reach high confidence in the philosophy.
This does not sound correct to me. Resolutions of simple confusions usually look pretty obvious in retrospect. Or do you mean something broader by “philosophy” than trying to figure out free will?
Did you read the rest of that thread where I talked about how in cryptography we often used formalizations of “security” that were discovered to be wrong years later, and that’s despite having hundreds of people in the research community constantly trying to attack each other’s ideas? I don’t see how formalizing Friendliness could be not just easier and less error prone than formalizing security, but so much so that just one person is enough to solve all the problems with high confidence of correctness.
Or do you mean something broader by “philosophy” than trying to figure out free will?
I mean questions like your R1 and R2, your “nonperson predicate”, how to distinguish between moral progress and moral error / value drift, anthropic reasoning / “reality fluid”. Generally, all the problems that need to be solved for building an FAI besides the math and the programming.
Yes, formalizing Friendliness is not the sort of thing you’d want one person doing. I agree. I don’t consider that “philosophy”, and it’s the sort of thing other FAI team members would have to be able to check. We probably want at least one high-grade actual cryptographer.
Of the others, the nonperson predicate and the moral-progress parts are the main ones where it’d be unusually hard to solve and then tell that it had been solved correctly. I would expect both of those to be factorable-out, though—that all or most of the solution could just be published outright. (Albeit recent experience with trolls makes me think that no insight enabling conscious simulations should ever be published; people would write suffering conscious simulations and run them just to show off… how confident they were that the consciousness theory was wrong, or something. I have a newfound understanding of the utter… do-anything-ness of trolls. This potentially makes it hard to publicly check some parts of the reasoning behind a nonperson predicate.) Anthropic reasoning / “reality fluid” is the sort of thing I’d expect to be really obvious in retrospect once solved. R1 and R2 should be both obvious in retrospect, and publishable.
I have hopes that an upcoming post on the Löb Problem will offer a much more concrete picture of what some parts of the innards of FAI development and formalizing look like.
Yes, formalizing Friendliness is not the sort of thing you’d want one person doing. I agree. I don’t consider that “philosophy”, and it’s the sort of thing other FAI team members would have to be able to check.
In principle, creating a formalization of Friendliness consists of two parts, conceptualizing Friendliness, and translating the concept into mathematical language. I’m using “philosophy” and “formalizing Friendliness” interchangeably to refer to both of these parts, whereas you seem to be using “philosophy” to refer to the former and “formalizing Friendliness” for the latter.
I guess this is because you think you can do the first part, then hand off the second part to others. But in reality, constraints about what kinds of concepts can be expressed in math and what proof techniques are available means that you have to work from both ends at the same time, trying to jointly optimize for philosophical soundness and mathematical feasibility, so there is no clear boundary between “philosophy” and “formalizing”.
(I’m inferring this based on what happens in cryptography. The people creating new security concepts, the people writing down the mathematical formalizations, and the people doing the proofs are usually all the same, I think for the above reason.)
My experience to date has been a bit different—the person asking the right question needs to be a high-grade philosopher, while the people trying to answer it only need enough high-grade philosophy to understand-in-retrospect why that exact question is being asked. Answering can then potentially be done with either math talent or philosophy talent. The person asking the right question can be less good at doing clever advanced proofs but does need an extremely solid understanding of the math concepts they’re using to state the kind-of-lemma they want. Basically, you need high math and high philosophy on both sides but there’s room for S-class-math people who are A-class philosophers but not S-class-philosophers, being pointed in the right direction by S-class-philosophers who are A-class-math but not S-class-math. If you’ll pardon the fuzzy terminology.
Re non-person predicates, do you even have a non-sharp (but non-trivial) lower bound for it? How do you know that the Sims from the namesake game aren’t persons? How do we know that Watson is not suffering indescribably when losing a round of Jeopardy? And that imagining someone (whose behavior you can predict with high accuracy) suffering is not as bad as “actually” making someone suffer? If this bound has been definitively established, I’d appreciate a link.
It’s unclear where our intuitions on the subject come from or how they work, and they are heavily … distorted … by various beliefs and biases. OTOH, it seems unlikely that rocks are conscious and we just haven’t extrapolated far enough to realize. It’s also unclear whether personhood is binary or there’s some kind of sliding scale. Nevertheless, it seems clear that a fly is not worth killing people over.
Even a person who has never introspected about their moral beliefs can still know that murder is wrong. They’re more likely to make mistakes, but still.
(Albeit recent experience with trolls makes me think that no insight enabling conscious simulations should ever be published; people would write suffering conscious simulations and run them just to show off… how confident they were that the consciousness theory was wrong, or something. I have a newfound understanding of the utter… do-anything-ness of trolls. This potentially makes it hard to publicly check some parts of the reasoning behind a nonperson predicate.)
I get the impression that you have something different in mind as far as ‘trolls’ go than fools who create stereotypical conflicts on the internet. What kind of trolls are these?
My psychological model says that all trolls are of that kind; some trolls just work harder than others. They all do damage in exchange for attention and the joy of seeing others upset, while exercising the limitless human ability to persuade themselves it’s okay. If you make it possible for them to do damage on their home computers with no chance of being arrested and other people being visibly upset about it, a large number will opt to do so. The amount of suffering they create can be arbitrarily great, so long as they can talk themselves into believing it doesn’t matter for some stupid reason, and other people are being visibly upset enough to give them the attention-reward.
4chan would have entire threads devoted to building worse hells. Yes. Seriously. They really would. And then they would instantiate those hells. So if you ever have an insight that constitutes incremental progress toward being able to run lots of small, stupid, suffering conscious agents on a home computer, shut up. And if somebody actually does it, don’t be upset on the Internet.
4chan would have entire threads devoted to building worse hells. Yes. Seriously. They really would. And then they would instantiate those hells.
They really would at that. It seems you are concerned here about malicious actual trolls specifically. I suppose if the technology and knowledge were disseminated to that degree (before something actually foomed), then that would be the most important threat. My first thoughts had gone towards researchers with the capabilities and interest to research this kind of technology themselves, who are merely callous and indifferent to the suffering of their simulated conscious ‘guinea pigs’ for the aforementioned reasons.
So if you ever have an insight that constitutes incremental progress toward being able to run lots of small, stupid, suffering conscious agents on a home computer
At what level of formalization does this kind of ‘incremental progress’ start to count? I ask because your philosophical essays on reductionism, consciousness and zombies seem like incremental progress towards that end (but I certainly wouldn’t consider them a mistake to publish or a net risk).
What is the suffering of a few in the face of Science? Pain is all relative, as is eternity. We’ve done far worse. I’m sure we have.
(I’m not a huge fan of SCP in general, but I like a few stories with the “infohazard” tag, and I’m amused by how LW-ish those can get.)
At what level of formalization does this kind of ‘incremental progress’ start to count? I ask because your philosophical essays on reductionism, consciousness and zombies seem like incremental progress towards that end (but I certainly wouldn’t consider them a mistake to publish or a net risk).
Eliezer could argue that the incremental progress towards stopping the risk outweighs the danger, same as with the general FAI/uFAI secrecy debate.
Eliezer could argue that the incremental progress towards stopping the risk outweighs the danger, same as with the general FAI/uFAI secrecy debate.
I think EY vastly overrates security through obscurity. Szilard keeping results about graphite and neutrons secret happened before the Internet; now there’s this thing called the Streisand effect.
talk themselves into believing it doesn’t matter for some stupid reason
That stupid reason is, at core, nihilistic solipsism—and it’s not as stupid as you’d think. I’m not saying it’s right, but it does happen to be the one inescapable meme-trap of philosophy.
To quote your own fic, their reason is “why not?”—and their consciousness was not grown such that your impassioned defense of compassion and consideration has any intrinsic weight in their utility function.
So if you ever have an insight that constitutes incremental progress toward being able to run lots of small, stupid, suffering conscious agents on a home computer, shut up.
“The Sims” is often heralded as the best-selling videogame of all time, and it attracts players of all ages, races and genders from all across the world and from all walks of life.[citation needed]
Now imagine if the toons in the game could actually feel what was happening to them and react believably to their environment and situation and events?
I’m sure I don’t need to quote the Rules of Acquisition; everyone here should know where this leads if word of such a technique gets out.
Now imagine if the toons in the game could actually feel what was happening to them and react believably to their environment and situation and events?
There have always been those who would pull the wings off flies, stomp on mice, or torture kittens. Setting roosters, fish, or dogs to fight each other to death remains a well-known spectacle in many rural parts of the world. In Shakespeare’s day, Londoners enjoyed watching dogs slowly kill bulls or bears, or be killed by them; in France they set bushels of cats on fire to watch them burn. Public executions and tortures, gladiatorial combat among slaves, and other nonconsensual “blood sports” have been common in human history.
The average individual could not hold private gladiatorial contests, on a whim, at negligible cost. Killing a few innocents by torture as a public spectacle is a far smaller scale of cruelty than repeatedly torturing large groups as private entertainment, for as little as the average individual would have paid for their ticket to the cockfight.
Also, some people reckon the suffering of animals doesn’t matter. They’re wrong, but they wouldn’t care about most of your examples (or at least they would claim it’s because they increase the risk you’ll do the same to humans, which is a whole different kettle of fish.)
Now imagine if the toons in the game could actually feel what was happening to them and react believably to their environment and situation and events?
Now imagine if the toons in the game could actually feel what was happening to them and react believably to their environment and situation and events?
Why do you always have to ask subtly hard questions? I can just see your smug face, smiling that smug smile of yours with that slight tilt of the head as we squirm trying to rationalize something up quick.
Here’s my crack at it: They don’t have what we currently think is the requisite code structure to “feel” in a meaningful way, but of course we are too confused to articulate the reasons much further.
Thank you, I’m flattered. I have asked Eliezer the same question; not sure if anyone will reply. I hoped that there was a simple answer to this, related to the complexity of information processing in the substrate, like the brain or a computer, but I can’t seem to find any discussion of it online. Probably I’m using the wrong keywords.
related to the complexity of information processing in the substrate
Not directly related. I think it has a lot to do with being roughly isomorphic to how a human thinks, which requires large complexity, but a particular complexity.
When I evaluate such questions IRL, like in the case of helping out an injured bird, or feeding my cat, I notice that my decisions seem to depend on whether I feel empathy for the thing. That is, do my algorithms recognize it as a being, or as a thing.
But then empathy can be hacked or faulty (see for example pictures of African children, cats and small animals, ugly disfigured people, far-away people, etc.), so I think of a sort of “abstract empathy” that does the job of recognizing morally valuable beings without all the bugs of my particular implementation of it.
In other words, I think it’s a matter of moral philosophy, not metaphysics.
Well, I can’t speak for the latest games, but I’ve personally read (some of) the core AI code for the toons in the first game of the series, and there was nothing in there that made a model of said code or attempted any form of what I’d even call “reasoning” throughout. No consciousness or meta-awareness.
By being simulated by the code simulating the game in which they “are”, they could to some extent be said to be “aware” of certain values like their hunger level, if you really want to stretch wide the concept of “awareness”. However, there seems to be no consciousness anywhere to be ‘aware’ (in the anthropomorphized sense) of this.
Since my priors are such that I consider it extremely unlikely that consciousness can exist without self-modeling and even more unlikely that consciousness is nonphysical, I conclude that there is a very low chance that they can be considered a “mind” with a consciousness that is aware of the pain and stimuli they receive.
The overall system is also extremely simple, in relative terms, considering the kind of AI code that’s normally discussed around these parts.
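To make that concrete, here is a toy sketch (nothing like the actual shipped code, just an illustration of the architecture I’m describing): the simulation reads and writes a handful of motive variables and picks whichever advertised action best raises the lowest one. The “awareness” begins and ends with those numbers; nothing anywhere models the toon itself.

```python
# Toy sketch of a needs-driven game agent (illustrative only, not real game code).
from dataclasses import dataclass, field

@dataclass
class Toon:
    # The only "awareness" the agent has: a few scalar motive levels in [0, 1].
    motives: dict = field(default_factory=lambda: {"hunger": 0.3, "energy": 0.8, "fun": 0.6})

    def choose_action(self, actions):
        # Greedily service the lowest motive; no self-model, no reasoning about minds.
        worst = min(self.motives, key=self.motives.get)
        return max(actions, key=lambda a: a["effects"].get(worst, 0.0))

    def apply(self, action):
        for motive, delta in action["effects"].items():
            self.motives[motive] = min(1.0, max(0.0, self.motives[motive] + delta))

actions = [
    {"name": "eat",      "effects": {"hunger": 0.5}},
    {"name": "nap",      "effects": {"energy": 0.4}},
    {"name": "watch_tv", "effects": {"fun": 0.3, "energy": -0.1}},
]

toon = Toon()
chosen = toon.choose_action(actions)   # picks "eat", since hunger is the lowest motive
toon.apply(chosen)
print(chosen["name"], toon.motives)
```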
Now imagine if the toons in the game could actually feel what was happening to them and react believably to their environment and situation and events?
The favourite Sim household of my housemate was based on “Buffy the Vampire Slayer”. Complete with a graveyard constructed in the backyard, through the judicious application of “remove ladder” from the swimming pool.
Why would them feeling it help them “react believably to their environment and situation and events”? If they’re dumb enough to “run lots of small, stupid, suffering conscious agents on a home computer”, I mean.
Of course, give Moore time and this objection will stop applying.
We’re already pretty close to making game characters have believable reactions, but only through clever scripting and a human deciding that situation X warrants reaction Y, and then applying mathematically-complicated patterns of light and prerecorded sounds onto the output devices of a computer.
If we can successfully implement a system that has that-function-we-refer-to-when-we-say-”consciousness” and that-f-w-r-t-w-w-s-”really feel pain”, then it seems an easy additional step to implement the kind of events triggering the latter function and the kind of outputs from the former function that would be believable and convincing to human players. I may be having faulty algorithmic intuitions here though.
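For what I mean by “clever scripting” above, a minimal sketch (the situation names and asset files are made up): the believable output is a hand-authored lookup, with nothing on the other side of it doing any feeling.

```python
# Hand-authored situation -> reaction table; the "believability" lives entirely in
# the human-chosen animation and audio clips, not in anything the NPC computes.
SCRIPTED_REACTIONS = {
    "takes_damage": {"animation": "flinch", "sound": "grunt_03.wav"},
    "ally_dies":    {"animation": "kneel",  "sound": "sob_01.wav"},
    "wins_fight":   {"animation": "cheer",  "sound": "laugh_02.wav"},
}

def react(situation: str) -> dict:
    # Anything the designers didn't anticipate falls back to a neutral idle loop.
    return SCRIPTED_REACTIONS.get(situation, {"animation": "idle", "sound": None})

print(react("ally_dies"))  # {'animation': 'kneel', 'sound': 'sob_01.wav'}
```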
Well, if they were as smart as humans, sure. Even as smart as dogs, maybe. But if they’re running lots of ’em on a home PC, then I must have been mistaken about how smart you have to be for consciousness.
How often does 4chan torture animals? That’s pretty easy to pull off. Are they doing it all the time and I haven’t noticed, or is there some additional force preventing it (e.g. Anonymous would hunt them down and post their details online, or 4chan all just like animals.)
Not often. Hurting animals is generally considered Not OK on 4chan, to the extent that anything is Not OK on 4chan.
There are a few pictures and stories that get passed around (some kids kicking a cat against a wall like a football, shoveldog, etc), but many fewer than the human gore pictures. 4channers mostly aggregate this stuff from all over and post it to be edgy and drive people who aren’t edgy enough away from 4chan.
And yeah, to the extent that people do torture animals in current events (as opposed to past stories), vast hordes of moralfags and raiders from 4chan tend to hunt them down and ruin their lives.
And yeah, to the extent that people do torture animals in current events (as opposed to past stories), vast hordes of moralfags and raiders from 4chan tend to hunt them down and ruin their lives.
I wonder if this might happen to people running hells too? I lack the domain expertise to judge if this is ludicrous or impossible to predict or what.
I remember that once, a Facebook page was hacked into (I guess) and started posting pictures and stories about tortured animals. Everybody went WTF and the page was shut down a few days later.
4chan all just like animals
I’ve never been there, but plenty of people on the internet do. Facebook pages against vivisection etc. seem to get way more likes than those in favour of it, the meme that humanity had better become extinct because wildlife would be better off is quite widespread, and some people even rejoice when a hunter dies (though this is a minority stance).
You know, I want to say you’re completely and utterly wrong. I want to say that it’s safe to at least release The Actual Explanation of Consciousness if and when you should solve such a thing.
But, sadly, I know you’re absolutely right re the existence of trolls which would make a point of using that to create suffering. Not just to get a reaction, but some would do it specifically to have a world they could torment beings.
My model is not that all those trolls are identical: I’ve seen some who will explicitly and unambiguously draw the line and recognize that egging on suicidal people is something that One Does Not Do, but I’ve also seen that all too many gleefully do exactly that.
But, sadly, I know you’re absolutely right re the existence of trolls which would make a point of using that to create suffering. Not just to get a reaction, but some would do it specifically to have a world they could torment beings.
My model is not that all those trolls are identical: I’ve seen some who will explicitly and unambiguously draw the line and recognize that egging on suicidal people is something that One Does Not Do, but I’ve also seen that all too many gleefully do exactly that.
It’s worth noting that private torture chambers seem different to trolling, but a troll can still set up a torture chamber—they just care about people’s reaction to it, not the torture itself.
So if you ever have an insight that constitutes incremental progress toward being able to run lots of small, stupid, suffering conscious agents on a home computer, shut up.
Most any incremental progress towards AGI, or even “just” EMs, would be dual use (if not centuple use) and could be (ab)used for helping achieve such enterta … vile and nefarious purposes.
In fact, it is hard to imagine realistic technological progress that can solely be used to run lots of small, stupid, suffering conscious agents but not as a stepping stone towards more noble pursuits (… such as automated poker playing agents).
So if you ever have an insight that constitutes incremental progress toward being able to run lots of small, stupid, suffering conscious agents on a home computer, shut up.
I sometimes wonder if this does not already exist, except for the suffering and consciousness being merely simulated. That is, computer games in which the entire purpose is to inflict unspeakable acts on powerless NPCs, acts whose depiction in prose or pictures would be grossly illegal almost everywhere. But I’ve never heard of such a thing actually existing.
That is, computer games in which the entire purpose is to inflict unspeakable acts on powerless NPCs, acts whose depiction in prose or pictures would be grossly illegal almost everywhere.
What sort of acts are we talking here? Because I’m genuinely having trouble thinking of any “acts whose depiction in prose or pictures would be grossly illegal almost everywhere” except maybe pedophilia. Censorship and all that.
And there are some fairly screwed-up games out there, although probably not as bad as they could be if designed with that in mind (as opposed to, y’know, the enjoyment of the player.)
I’ve never heard of such a thing actually existing.
Well would you, if it was grossly illegal to describe the contents?
I didn’t want to be explicit, but you thought of the obvious example.
I can’t think of any other examples, though.
probably not as bad as they could be if designed with that in mind (as opposed to, y’know, the enjoyment of the player.)
For the sort of 4chan people Eliezer mentioned, these would be completely congruent.
… maaaybe. Again, I’m not sure exactly what you have in mind.
I’ve never heard of such a thing actually existing.
Well would you, if it was grossly illegal to describe the contents?
It is well known that illegal pornography exists on non-interactive media. For interactive media, all I’ve ever heard of is 18-rated sex scenes.
Good point. Indeed, it’s well known that child porn exists on some level.
In fact … I do vaguely recall something about a Japanese game about rape causing a moral panic of some kind, so …
EDIT: In fact, it featured kids too! RapeLay. It’s … fairly horrible, although I think someone with the goal of pure horribleness would do .. better? Worse? Whatever.
If you make it possible for them to do damage on their home computers with no chance of being arrested and other people being visibly upset about it, a large number will opt to do so. [...] 4chan would have entire threads devoted to building worse hells. Yes. Seriously. They really would. And then they would instantiate those hells.
I wish I could disagree with you, and, suspiciously, I find myself believing that there would be enough vigilante justice to discourage hellmaking—after all, the trolls are doing it for the attention, and if that attention comes in the form of people posting your details and other people breaking into your house to steal your computer and/or murder you (for the greater good), then I doubt there will be many takers.
I just wish I could trust that doubt.*
*(Not expressing a wish for trust pills.)
EDIT: Animal experimentation and factory farming are still popular, but they have financial incentive … and I vaguely recall that some trolls kicked a dog across a football field or something and were punished by Anonymous. That’s where the analogy comes from, anyway, so I’d be interested if someone knows more.
Are people free of responsibility for rising to the bait?
You ask that as if there were some conservation of responsibility, where the people rising to the bait are responsible for doing so, and this somehow means the troll is not responsible for deliberately giving the opportunity, and by the troll not being responsible, this somehow means the troll is not causing damage.
The damage is the result of interactions of multiple causes, and for each of those causes that is a decision by an agent, you can consider that agent “responsible”. This can get complicated when causes include not-so-agenty behaviors of quasi-agents. At some point “responsibility” stops being a useful concept, and you just have to decide whatever policy it was supposed to inform by actually looking at the likely consequences.
Public notice: I’m now considering this user a probable troll and will act accordingly.
In the future, I may so consider and act, under these sorts of circumstances, without such public notice.
Does any (old-time, trusted) user want to volunteer as the mod who goes back and deletes all the troll comments once a user has been designated a troll?
Does any (old-time, trusted) user want to volunteer as the mod who goes back and deletes all the troll comments once a user has been designated a troll?
That seems like something that should be automated. I could give you a script that (for example) goes through a user page and deletes (or rather sends the appropriate button press that would be a delete command when run by a mod) any comment with less than 0 karma. It would make more sense for this to be an implemented feature of the site but “something I could implement in an hour without accessing the source code” seems like a suitable proxy for “is easy to make a feature of the site, one way or another”.
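Something along these lines, with the caveat that the URL, endpoint and selectors below are placeholders rather than the site’s actual markup or API (a mod would have to fill those in and attach their own logged-in session):

```python
# Hypothetical sketch of the cleanup script described above.
# The URL, endpoint, and CSS selectors are placeholders, not the real site's API.
import requests
from bs4 import BeautifulSoup

USER_PAGE = "http://lesswrong.com/user/EXAMPLE_USER/comments"   # placeholder
DELETE_ENDPOINT = "http://lesswrong.com/api/delete_comment"     # placeholder

def negative_karma_comment_ids(html):
    soup = BeautifulSoup(html, "html.parser")
    ids = []
    for comment in soup.select(".comment"):              # placeholder selector
        score_tag = comment.select_one(".score")          # placeholder selector
        if score_tag is None:
            continue
        try:
            score = int(score_tag.get_text().split()[0])
        except (ValueError, IndexError):
            continue
        if score < 0:
            ids.append(comment.get("id"))
    return ids

def main():
    session = requests.Session()   # a mod's login cookies would need to be attached here
    html = session.get(USER_PAGE).text
    for comment_id in negative_karma_comment_ids(html):
        # Only has any effect when run with moderator credentials.
        session.post(DELETE_ENDPOINT, data={"id": comment_id})

if __name__ == "__main__":
    main()
```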
He could have just been talking about trolling in the abstract. And even if not, after reading a bit of his history, his “trolling”, if any, is at most at the level of rhetorical questions. I’m not really a fan of his commenting, but if he’s banned, I’d say “banned for disagreement” will be closer to the mark as a description of what happened than “banned for trolling”, though not the whole story.
You can respond to the argument (it might even do you good), or you can refuse to consider criticism. It’s your choice, from which I will draw my own conclusions.
You can state that you’ve never trolled in this site (not even “low-level”) and promise to never troll here (not even “low-level”) in the future.
As a sidenote, previously you argued that people who respond to trolls are also to blame. Now you argue that EY would be to blame if he does not respond to a possible troll.
From this discrepancy I just drew my own conclusions.
A troll (no matter how low-level) wastes time, ruins people’s mood, and destroys trust.
Every actual troll increases the probability that some innocent newcomer will be wrongly accused of trolling, making a community just that tiny bit more hostile to newcomers, and thus less open to the outside world. Then those newcomers judge the community badly for the community’s negative judgment of them. Bad feelings all around, and a community which is now less receptive to new ideas, because they might just be trolling.
So, yeah, trolling does damage. Trolling is bad, bad, bad.
It could be argued that such trolling can cause circumstantial damage or emotional damage, through intermediaries, with an example that takes a couple of weak steps in reasoning.
It could be argued back that this damage is circumstantial, and therefore not knowingly caused by the troll, and the example taken apart at each of those weak points by assigning actual numbers to their probabilities.
Then it could be counter-argued again that the number of possible circumstances, or combinations of circumstances, that would bring about some form of damage is so large, compared to those that would not, that it ends up being more likely than not that at least one of the incredibly many possible sets of circumstances will apply to any given instance of trolling.
I need a sanity check for motivated stopping here, but I don’t see any good further counter-argument that I could steel-man and would show that this isn’t a case of “trolling causes damage in a predictable manner”, unless my prior that such damage would not occur in the absence of trolling is completely wrong.
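To put a toy number on that last step (with independence assumptions pulled entirely out of thin air): if a given instance of trolling has n distinct ways it could do damage, each with small probability p_i, then

\[
P(\text{some damage}) \;=\; 1 - \prod_{i=1}^{n} (1 - p_i) \;\ge\; 1 - e^{-\sum_{i} p_i},
\]

which crawls toward 1 as the number of channels grows; 200 channels at p_i = 0.02 each already gives roughly a 98% chance that at least one of them fires.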
That’s like saying you shouldn’t drive on your street when Joe is driving because Joe is a bad driver. It’s true that you should update and avoid driving when Joe is out. But you find that out after Joe crashes his car into you. At which point, damage has been done to your car that updating won’t fix.
If you present a depressed person with strong arguments that they should commit suicide, this is likely to cause their beliefs to change. So changing their beliefs back to their old level so that they can continue functioning as before (as opposed to killing themselves) will require work in addition to realizing they shouldn’t talk to you anymore, possibly in the form of support and hugs from other supportive people. Similarly, if your car has been damaged in an accident, it will require additional work to run again, such as replacing deformed parts. The car won’t magically start running once Joe is off the road.
(Albeit recent experience with trolls makes me think that no insight enabling conscious simulations should ever be published; people would write suffering conscious simulations and run them just to show off… how confident they were that the consciousness theory was wrong, or something. I have a newfound understanding of the utter… do-anything-ness of trolls. This potentially makes it hard to publicly check some parts of the reasoning behind a nonperson predicate.)
At least for now, it’d take a pretty determined troll to build an em for the sole purpose of being a terrible person. Not saying some humanity-first movement mightn’t pull it off, but by that point you could hopefully have legal recognition (assuming there’s no risk of accidental fooming and they pass the Turing test).
I don’t think we’re talking ems, we’re talking conscious algorithms which aren’t necessarily humanlike or even particularly intelligent.
And as for the Turing Test, one oughtn’t confuse consciousness with intelligence. A 6-year old human child couldn’t pass off as an adult human, but we still believe the child to be conscious, and my own memories indicate that I indeed was at that age.
Well, I think consciousness, intelligence and personhood are sliding scales anyway, so I may be imagining the output of a Nonperson Predicate somewhat differently to LW norm. OTOH, I guess it’s not a priori impossible that a simple human-level AI could fit on something available to the public, and such an insight would be … risky, yeah. Upvoted.
First of all, I also believe that consciousness is most probably a sliding scale.
Secondly, again you just used “human-level” without specifying human-level at what, at intelligence or at consciousness; as such I’m not sure whether I actually communicated adequately my point that we’re not discussing intelligence here, but just consciousness.
Resolutions of simple confusions usually look pretty obvious in retrospect.
Can you give some more examples of this, besides “free will”? (I don’t understand where your intuitions comes from that certain problems will turn out to have solutions that are obvious in retrospect, and that such feelings of obviousness are trustworthy. Maybe it would help me see your perspective if I got some more past examples.)
I don’t class that as a problem that is discussed by professional philosophers. It’s more of a toy question that introduces the nature of phil. problems—and the importance of asking “it depends on what you mean...”—to laypeople.
It’s not an example that lends much credence to the idea that all problems can be solved that way, even apart from the generalisation-from-one-example issue.
I’m not claiming it proves anything, and I’m not taking sides in this discussion. Someone asked for an example of something—something which varies from person to person depending on whether they’ve dissolved the relevant confusions—and I provided what I thought was the best example. It is not intended to prove anyone’s point; arguments are not soldiers.
It wasn’t an argument at all. That you chose to interpret it as an enemy soldier is your mistake, not mine. It’s not a weak soldier, it’s a … medic or something.
Do you have an example in mind where a certain philosophical question claimed to have been solved or dissolved by Eliezer turned out to be not solved after all, or the solution was wrong?
Do you have an example in mind where a certain philosophical question claimed to have been solved or dissolved by Eliezer turned out to be not solved after all, or the solution was wrong?
Order-dependence and butterfly effects—knew about this and had it in mind when I wrote CEV, I think it should be in the text.
Counterfactual Mugging—check, I don’t think I was calling TDT a complete solution before then but the Counterfactual Mugging was a class of possibilities I hadn’t considered. (It does seem related to Parfit’s Hitchhiker which I knew was a problem.)
Solomonoff Induction—again, I think you may be overestimating how much weight I put on that in the first place. It’s not a workable AI answer for at least two obvious reasons I’m pretty sure I knew about from almost-day-one, (a) it’s uncomputable and (b) it can’t handle utility functions over the environment. However, your particular contributions about halting-oracles-shouldn’t-be-unimaginable did indeed influence me toward my current notion of second-order logical natural induction over possible models of axioms in which you could be embedded. Albeit I stand by my old reply that Solomonoff Induction would encompass any computable predictions or learning you could do about halting oracles in the environment. (The problem of porting yourself onto any environmental object is something I already knew AIXI would fail at.)
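(For concreteness, the standard formula behind point (a): the Solomonoff prior assigns a binary sequence x the weight

\[
M(x) \;=\; \sum_{p \,:\, U(p) = x*} 2^{-\ell(p)},
\]

where U is a universal prefix Turing machine, the sum ranges over programs p whose output begins with x, and \ell(p) is the length of p in bits. Summing over all programs makes M only approximable from below, never exactly computable; and since it is a distribution over percept sequences rather than over states of the environment, there is nothing in it for an environmental utility function to attach to, which is point (b).)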
Order-dependence and butterfly effects—knew about this and had it in mind when I wrote CEV, I think it should be in the text.
Ok, I checked the CEV writeup and you did mention these briefly. But that makes me unsure why you claimed to have solved metaethics. What should you do if your FAI comes back and says that your EV shows no coherence due to order dependence and butterfly effects (assuming it’s not some kind of implementation error)? If you’re not sure the answer is “nothing”, and you don’t have another answer, doesn’t that mean your solution (about the meaning of “should”) is at least incomplete, and possibly wrong?
Counterfactual Mugging—check, I don’t think I was calling TDT a complete solution before then but the Counterfactual Mugging was a class of possibilities I hadn’t considered. (It does seem related to Parfit’s Hitchhiker which I knew was a problem.)
You said that TDT solves Parfit’s Hitchhiker, so I don’t know if you would have kept looking for more problems related to Parfit’s Hitchhiker and eventually come upon Counterfactual Mugging.
Solomonoff Induction—again, I think you may be overestimating how much weight I put on that in the first place. It’s not a workable AI answer for at least two obvious reasons I’m pretty sure I knew about from almost-day-one, (a) it’s uncomputable and (b) it can’t handle utility functions over the environment
Both of these can be solved without also solving halting-oracles-shouldn’t-be-unimaginable. For (a), solve logical uncertainty. For (b), switch to UDT-with-world-programs.
Also, here is another problem that maybe you weren’t already aware of.
What should you do if your FAI comes back and says that your EV shows no coherence due to order dependence and butterfly effects (assuming it’s not some kind of implementation error)?
Wouldn’t that kind of make moral reasoning impossible?
Do you have some reliable way of recruiting? What’s the policy alternative? You do what you gotta do; if it ends up being just you, nonetheless, you do what you gotta do. Zero people won’t make fewer mistakes than one person.
Quoting Carl Shulman from about a year ago:
I’m not sure if he had both math and philosophy in mind when he wrote that or just math, but in any case surely the same principle applies to the philosophy. If you don’t reach a high confidence that the philosophy behind some FAI design is correct, then you shouldn’t move forward with that design, and if there is only one philosopher on the team, you just can’t reach high confidence in the philosophy.
This does not sound correct to me. Resolutions of simple confusions usually look pretty obvious in retrospect. Or do you mean something broader by “philosophy” than trying to figure out free will?
Did you read the rest of that thread where I talked about how in cryptography we often used formalizations of “security” that were discovered to be wrong years later, and that’s despite having hundreds of people in the research community constantly trying to attack each other’s ideas? I don’t see how formalizing Friendliness could be not just easier and less error prone than formalizing security, but so much so that just one person is enough to solve all the problems with high confidence of correctness.
I mean questions like your R1 and R2, your “nonperson predicate”, how to distinguish between moral progress and moral error / value drift, anthropic reasoning / “reality fluid”. Generally, all the problems that need to be solved for building an FAI besides the math and the programming.
Yes, formalizing Friendliness is not the sort of thing you’d want one person doing. I agree. I don’t consider that “philosophy”, and it’s the sort of thing other FAI team members would have to be able to check. We probably want at least one high-grade actual cryptographer.
Of the others, the nonperson predicate and the moral-progress parts are the main ones where it’d be unusually hard to solve and then tell that it had been solved correctly. I would expect both of those to be factorable-out, though—that all or most of the solution could just be published outright. (Albeit recent experience with trolls makes me think that no insight enabling conscious simulations should ever be published; people would write suffering conscious simulations and run them just to show off… how confident they were that the consciousness theory was wrong, or something. I have a newfound understanding of the utter… do-anything-ness of trolls. This potentially makes it hard to publicly check some parts of the reasoning behind a nonperson predicate.) Anthropic reasoning / “reality fluid” is the sort of thing I’d expect to be really obvious in retrospect once solved. R1 and R2 should be both obvious in retrospect, and publishable.
I have hopes that an upcoming post on the Löb Problem will offer a much more concrete picture of what some parts of the innards of FAI development and formalizing look like.
In principle, creating a formalization of Friendliness consists of two parts, conceptualizing Friendliness, and translating the concept into mathematical language. I’m using “philosophy” and “formalizing Friendliness” interchangeably to refer to both of these parts, whereas you seem to be using “philosophy” to refer to the former and “formalizing Friendliness” for the latter.
I guess this is because you think you can do the first part, then hand off the second part to others. But in reality, constraints about what kinds of concepts can be expressed in math and what proof techniques are available means that you have to work from both ends at the same time, trying to jointly optimize for philosophical soundness and mathematical feasibility, so there is no clear boundary between “philosophy” and “formalizing”.
(I’m inferring this based on what happens in cryptography. The people creating new security concepts, the people writing down the mathematical formalizations, and the people doing the proofs are usually all the same, I think for the above reason.)
My experience to date has been a bit different—the person asking the right question needs to be a high-grade philosopher, while the people trying to answer it only need enough high-grade philosophy to understand-in-retrospect why that exact question is being asked. Answering can then potentially be done with either math talent or philosophy talent. The person asking the right question can be less good at doing clever advanced proofs but does need an extremely solid understanding of the math concepts they’re using to state the kind-of-lemma they want. Basically, you need high math and high philosophy on both sides but there’s room for S-class-math people who are A-class philosophers but not S-class-philosophers, being pointed in the right direction by S-class-philosophers who are A-class-math but not S-class-math. If you’ll pardon the fuzzy terminology.
What happened (if you don’t mind sharing)?
Re non-person predicates, do you even have a non-sharp (but non-trivial) lower bound for it? How do you know that the Sims from the namesake game aren’t persons? How do we know that Watson is not suffering indescribably when losing a round of Jeopardy? And that imagining someone (whose behavior you can predict with high accuracy) suffering is not as bad as “actually” making someone suffer? If this bound has been definitively established, I’d appreciate a link.
It’s unclear where our intuitions on the subject come from or how they work, and they are heavily … distorted … by various beliefs and biases. OTOH, it seems unlikely that rocks are conscious and we just haven’t extrapolated far enough to realize. It’s also unclear whether personhood is binary or there’s some kind of sliding scale. Nevertheless, it seems clear that a fly is not worth killing people over.
Even a person who has never introspected about their moral beliefs can still know that murder is wrong. They’re more likely to make mistakes, but still.
I get the impression that you have something different in mind as far as ‘trolls’ go than fools who create stereotypical conflicts on the internet. What kind of trolls are these?
The kind who persuade depressed people to commit suicide. The kind who post people’s address on the internet. The kind that burn the Koran in public.
My psychological model says that all trolls are of that kind; some trolls just work harder than others. They all do damage in exchange for attention and the joy of seeing others upset, while exercising the limitless human ability to persuade themselves it’s okay. If you make it possible for them to do damage on their home computers with no chance of being arrested and other people being visibly upset about it, a large number will opt to do so. The amount of suffering they create can be arbitrarily great, so long as they can talk themselves into believing it doesn’t matter for some stupid reason, and other people are being visibly upset enough to give them the attention-reward.
4chan would have entire threads devoted to building worse hells. Yes. Seriously. They really would. And then they would instantiate those hells. So if you ever have an insight that constitutes incremental progress toward being able to run lots of small, stupid, suffering conscious agents on a home computer, shut up. And if somebody actually does it, don’t be upset on the Internet.
They really would at that. It seems you are concerned here about malicious actual trolls specifically. I suppose if the technology and knowledge were disseminated to that degree (before something actually foomed), then that would be the most important threat. My first thoughts had gone towards researchers with the capabilities and interest to research this kind of technology themselves, who are merely callous and indifferent to the suffering of their simulated conscious ‘guinea pigs’ for the aforementioned reasons.
At what level of formalization does this kind of ‘incremental progress’ start to count? I ask because your philosophical essays on reductionism, consciousness and zombies seem like incremental progress towards that end (but I certainly wouldn’t consider them a mistake to publish or a net risk).
Related.
(I’m not a huge fan of SCP in general, but I like a few stories with the “infohazard” tag, and I’m amused by how LW-ish those can get.)
Eliezer could argue that the incremental progress towards stopping the risk outweighs the danger, same as with the general FAI/uFAI secrecy debate.
I think EY vastly overrates security through obscurity. Szilard keeping results about graphite and neutrons secret happened before the Internet; now there’s this thing called the Streisand effect.
I can’t find the quote on that page. Is it from somewhere else (or an earlier version) or am I missing something?
White text. (Apparently there’s a few more hidden features in the entry, but I only found this one.)
I, um, still can’t find it. This white text is on the page you linked to, yes? About the videos that are probably soultraps?
EDIT: Nevermind, got it.
Ah, thanks.
That stupid reason is, at core, nihilistic solipsism—and it’s not as stupid as you’d think. I’m not saying it’s right, but it does happen to be the one inescapable meme-trap of philosophy.
To quote your own fic, their reason is “why not?”—and their consciousness was not grown such that your impassioned defense of compassion and consideration has any intrinsic weight in their utility function.
“The Sims” is often heralded as the best-selling videogame of all time, and it attracts players of all ages, races and genders from all across the world and from all walks of life.[citation needed]
Now imagine if the toons in the game could actually feel what was happening to them and react believably to their environment and situation and events?
I’m sure I don’t need to quote the Rules of Acquisition; everyone here should know where this leads if word of such a technique gets out.
There have always been those who would pull the wings off flies, stomp on mice, or torture kittens. Setting roosters, fish, or dogs to fight each other to death remains a well-known spectacle in many rural parts of the world. In Shakespeare’s day, Londoners enjoyed watching dogs slowly kill bulls or bears, or be killed by them; in France they set bushels of cats on fire to watch them burn. Public executions and tortures, gladiatorial combat among slaves, and other nonconsensual “blood sports” have been common in human history.
What’s the difference?
Scale.
The average individual could not hold private gladiatorial contests, on a whim, at negligible cost. Killing a few innocents by torture as a public spectacle is a far smaller scale of cruelty than repeatedly torturing large groups as private entertainment, for as little as the average individual would have paid for their ticket to the cockfight.
Also, some people reckon the suffering of animals doesn’t matter. They’re wrong, but they wouldn’t care about most of your examples (or at least they would claim it’s because they increase the risk you’ll do the same to humans, which is a whole different kettle of fish.)
Not to mention the sizeable fraction of car drivers who will swerve in order to hit turtles. What the hell is wrong with my species?
Link is broken.
… seriously? Poor turtles >:-(
Previous discussion of this on LW
It was mentioned recently on Yvain’s blog and a few months ago on LW (can’t find it right now).
How do you know that they don’t?
How do you know that they don’t?
Why do you always have to ask subtly hard questions? I can just see your smug face, smiling that smug smile of yours with that slight tilt of the head as we squirm trying to rationalize something up quick.
Here’s my crack at it: They don’t have what we currently think is the requisite code structure to “feel” in a meaningful way, but of course we are too confused to articulate the reasons much further.
Thank you, I’m flattered. I have asked Eliezer the same question; not sure if anyone will reply. I hoped that there was a simple answer to this, related to the complexity of information processing in the substrate, like the brain or a computer, but I can’t seem to find any discussion of it online. Probably I’m using the wrong keywords.
Not directly related. I think it has a lot to do with being roughly isomorphic to how a human thinks, which requires large complexity, but a particular complexity.
When I evaluate such questions IRL, like in the case of helping out an injured bird, or feeding my cat, I notice that my decisions seem to depend on whether I feel empathy for the thing. That is, do my algorithms recognize it as a being, or as a thing.
But then empathy can be hacked or faulty (see for example pictures of African children, cats and small animals, ugly disfigured people, far-away people, etc.), so I think of a sort of “abstract empathy” that does the job of recognizing morally valuable beings without all the bugs of my particular implementation of it.
In other words, I think it’s a matter of moral philosophy, not metaphysics.
Information integration theory seems relevant.
Well, I can’t speak for the latest games, but I’ve personally read (some of) the core AI code for the toons in the first game of the series, and there was nothing in there that made a model of said code or attempted any form of what I’d even call “reasoning” throughout. No consciousness or meta-awareness.
By being simulated by the code simulating the game in which they “are”, they could to some extent be said to be “aware” of certain values like their hunger level, if you really want to stretch wide the concept of “awareness”. However, there seems to be no consciousness anywhere to be ‘aware’ (in the anthropomorphized sense) of this.
Since my priors are such that I consider it extremely unlikely that consciousness can exist without self-modeling and even more unlikely that consciousness is nonphysical, I conclude that there is a very low chance that they can be considered a “mind” with a consciousness that is aware of the pain and stimuli they receive.
The overall system is also extremely simple, in relative terms, considering the kind of AI code that’s normally discussed around these parts.
I used to torture my own characters to death a lot, back in the day.
EDIT: Not to mention what I did when playing Roller Coaster Tycoon.
The favourite Sim household of my housemate was based on “Buffy the Vampire Slayer”. Complete with a graveyard constructed in the backyard, through the judicious application of “remove ladder” from the swimming pool.
And this is all without any particular malice!
Why would them feeling it help them “react believably to their environment and situation and events”? If they’re dumb enough to “run lots of small, stupid, suffering conscious agents on a home computer”, I mean.
Of course, give Moore time and this objection will stop applying.
We’re already pretty close to making game characters have believable reactions, but only through clever scripting and a human deciding that situation X warrants reaction Y, and then applying mathematically-complicated patterns of light and prerecorded sounds onto the output devices of a computer.
If we can successfully implement a system that has that-function-we-refer-to-when-we-say-”consciousness” and that-f-w-r-t-w-w-s-”really feel pain”, then it seems an easy additional step to implement the kind of events triggering the latter function and the kind of outputs from the former function that would be believable and convincing to human players. I may be having faulty algorithmic intuitions here though.
Well, if they were as smart as humans, sure. Even as smart as dogs, maybe. But if they’re running lots of ’em on a home PC, then I must have been mistaken about how smart you have to be for consciousness.
In case anyone doubts this, as a long-time observer of the 4chan memeplex, I concur.
Related:
How often does 4chan torture animals? That’s pretty easy to pull off. Are they doing it all the time and I haven’t noticed, or is there some additional force preventing it (e.g. Anonymous would hunt them down and post their details online, or 4chan all just like animals.)
Not often. Hurting animals is generally considered Not OK on 4chan, to the extent that anything is Not OK on 4chan.
There are a few pictures and stories that get passed around (some kids kicking a cat against a wall like a football, shoveldog, etc), but many fewer than the human gore pictures. 4channers mostly aggregate this stuff from all over and post it to be edgy and drive people who aren’t edgy enough away from 4chan.
And yeah, to the extent that people do torture animals in current events (as opposed to past stories), vast hordes of moralfags and raiders from 4chan tend to hunt them down and ruin their lives.
I wonder if this might happen to people running hells too? I lack the domain expertise to judge if this is ludicrous or impossible to predict or what.
Really depends on whether the beings in the hell are cute and empathetic.
Humans don’t like to hurt things that are cute and empathetic, and don’t like them getting hurt. Otherwise we don’t care.
I remember that once, a Facebook page was hacked into (I guess) and started posting pictures and stories about tortured animals. Everybody went WTF and the page was shut down a few days later.
I’ve never been there, but plenty of people on the internet do. Facebook pages against vivisection etc. seem to get way more likes than those in favour of it, the meme that humanity had better become extinct because wildlife would be better off is quite widespread, and some people even rejoice when a hunter dies (though this is a minority stance).
You know, I want to say you’re completely and utterly wrong. I want to say that it’s safe to at least release The Actual Explanation of Consciousness if and when you should solve such a thing.
But, sadly, I know you’re absolutely right re the existence of trolls which would make a point of using that to create suffering. Not just to get a reaction, but some would do it specifically to have a world they could torment beings.
My model is not that all those trolls are identical: I’ve seen some who will explicitly and unambiguously draw the line and recognize that egging on suicidal people is something that One Does Not Do, but I’ve also seen that all too many gleefully do exactly that.
It’s worth noting that private torture chambers seem different to trolling, but a troll can still set up a torture chamber—they just care about people’s reaction to it, not the torture itself.
Most any incremental progress towards AGI, or even “just” EMs, would be dual use (if not centuple use) and could be (ab)used for helping achieve such enterta … vile and nefarious purposes.
In fact, it is hard to imagine realistic technological progress that can solely be used to run lots of small, stupid, suffering conscious agents but not as a stepping stone towards more noble pursuits (… such as automated poker playing agents).
I sometimes wonder if this does not already exist, except for the suffering and consciousness being merely simulated. That is, computer games in which the entire purpose is to inflict unspeakable acts on powerless NPCs, acts whose depiction in prose or pictures would be grossly illegal almost everywhere. But I’ve never heard of such a thing actually existing.
What sort of acts are we talking here? Because I’m genuinely having trouble thinking of any “acts whose depiction in prose or pictures would be grossly illegal almost everywhere” except maybe pedophilia. Censorship and all that.
And there are some fairly screwed-up games out there, although probably not as bad as they could be if designed with that in mind (as opposed to, y’know, the enjoyment of the player.)
Well would you, if it was grossly illegal to describe the contents?
I didn’t want to be explicit, but you thought of the obvious example.
For the sort of 4chan people Eliezer mentioned, these would be completely congruent.
It is well known that illegal pornography exists on non-interactive media. For interactive media, all I’ve ever heard of is 18-rated sex scenes.
I can’t think of any other examples, though.
… maaaybe. Again, I’m not sure exactly what you have in mind.
Good point. Indeed, it’s well known that child porn exists on some level.
In fact … I do vaguely recall something about a Japanese game about rape causing a moral panic of some kind, so …
EDIT: In fact, it featured kids too! RapeLay. It’s … fairly horrible, although I think someone with the goal of pure horribleness would do … better? Worse? Whatever.
Wishing I could disagree with you, and, suspiciously, I find myself believing that there would be enough vigilante justice to discourage hellmaking—after all, the trolls are doing it for the attention, and if that attention comes in the form of people posting your details and other people breaking into your house to steal your computer and/or murder you (for the greater good) then I doubt there will be many takers.
I just wish I could trust that doubt.*
*(Not expressing a wish for trust pills.)
EDIT: Animal experimentation and factory farming are still popular, but they have financial incentive … and I vaguely recall that some trolls kicked a dog across a football field or something and were punished by Anonymous. That’s where the analogy comes from, anyway, so I’d be interested if someone knows more.
Is contrarianism (which is all that a lot of low-level trolling is) actually damaging? Are people free of responsibility for rising to the bait?
You ask that as if there were some conservation of responsibility, where the people rising to the bait are responsible for doing so, and this somehow means the troll is not responsible for deliberately giving the opportunity, and by the troll not being responsible, this somehow means the troll is not causing damage.
The damage is the result of interactions of multiple causes, and for each of those causes that is a decision by an agent, you can consider that agent “responsible”. This can get complicated when causes include not-so-agenty behaviors of quasi-agents. At some point “responsibility” stops being a useful concept, and you just have to decide whatever policy it was supposed to inform by actually looking at the likely consequences.
I’m still not convinced there is any damage in the kind of low level trolling which is just teasing.
Public notice: I’m now considering this user a probable troll and will act accordingly.
In the future, I may so consider and act, under these sorts of circumstances, without such public notice.
Does any (old-time, trusted) user want to volunteer as the mod who goes back and deletes all the troll comments once a user has been designated a troll?
Thank you! Intentional troll or not, this user’s extremely prolific posting of low-value comments is something I’d rather not see here.
That seems like something that should be automated. I could give you a script (along the lines of the sketch below) that goes through a user page and, for any comment with less than 0 karma, sends the button press that would be a delete command when run by a mod. It would make more sense for this to be an implemented feature of the site, but “something I could implement in an hour without accessing the source code” seems like a suitable proxy for “is easy to make a feature of the site, one way or another”.
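For concreteness, a minimal sketch of what such a script might look like (TypeScript-ish, meant to be run from the browser console on a user page; the “.comment”, “.score” and “.delete-button” selectors are placeholders I made up, since the site’s actual markup would need to be inspected first):

    // Sketch only: walk the comments on a user page and, for each one whose
    // displayed karma is below zero, click whatever control issues the delete
    // command when the logged-in account is a moderator.
    function deleteNegativeKarmaComments(): void {
      const comments = document.querySelectorAll<HTMLElement>(".comment");
      comments.forEach((comment) => {
        // Read the displayed karma score for this comment.
        const scoreText = comment.querySelector(".score")?.textContent ?? "";
        const score = parseInt(scoreText, 10);
        if (!Number.isNaN(score) && score < 0) {
          // Simulate the button press a moderator would make on the delete control.
          comment.querySelector<HTMLElement>(".delete-button")?.click();
        }
      });
    }

    deleteNegativeKarmaComments();

Pagination and any confirmation dialogs would need handling too, but that’s the general shape.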
That sounds helpful. Let’s give this a shot. (I’m running updated Chrome on Win7 if that’s relevant.)
He could have just been talking about trolling in the abstract. And even if not, after reading a bit of his history, his “trolling”, if any, is at most at the level of rhetorical questions. I’m not really a fan of his commenting, but if he’s banned, I’d say “banned for disagreement” will be closer to the mark as a description of what happened than “banned for trolling”, though not the whole story.
Hi.
You can respond to the argument (it might even do you good), or you can refuse to consider criticism. It’s your choice, from which I will draw my own conclusions.
Hi.
You can state that you’ve never trolled in this site (not even “low-level”) and promise to never troll here (not even “low-level”) in the future.
As a sidenote, previously you argued that people who respond to trolls are also to blame. Now you argue that EY would be to blame if he does not respond to a possible troll.
From this discrepancy I just drew my own conclusions.
...um, Aris, you’re feeding the troll...
A troll (no matter how low-level) wastes time, ruins people’s mood, and destroys trust.
Every actual troll increases the probability that some other innocent newcomer will get accused of trolling wrongly, making a community just that tiny bit more hostile to newcomers, and thus less open to the outside world. Then those newcomers judge the community badly for the community’s negative judgment of them. Bad feelings all around, and a community which is now less receptive of new ideas, because they might just be trolling.
So, yeah, trolling does damage. Trolling is bad, bad, bad.
Doesn’t it take two to waste time?
Yes, most crimes take at least two people, the victim and the perpetrator.
Isn’t “crimes” just a wee bit overheated?
For very minor moral crimes (e.g. insults) the same applies: the insulter and the insulted. The spitter and the spat upon. The troll and the trolled.
Insults are highly subjective too.
It could be argued that such trolling can cause circumstantial damage or emotional damage, through intermediaries, with an example that takes a couple of weak steps in reasoning.
It could be argued back that this is circumstantial, and therefore not caused knowingly by the troll, and that the example can be taken apart at each of those weak points by assigning actual probabilities to them.
Then it could be counter-argued that the number of possible circumstances, or combinations of circumstances, that would bring about some form of damage is so large, compared to those that would not, that it ends up being more likely than not that at least one of those many possible sets of circumstances will apply to any given instance of trolling.
I need a sanity check for motivated stopping here, but I don’t see any good further counter-argument that I could steel-man and that would show this isn’t a case of “trolling causes damage in a predictable manner”, unless my prior that such damage would not occur in the absence of trolling is completely wrong.
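To put toy numbers on that last step (the figures are purely illustrative, not data about actual trolling): if there are $n$ independent ways a given instance of trolling could cause damage, each with small probability $p$, then

    P(\text{at least one applies}) = 1 - (1 - p)^n,

and with, say, $p = 0.02$ and $n = 50$ this is already about $1 - 0.98^{50} \approx 0.64$, i.e. more likely than not.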
It could be argued that there is an opposite process by which people label undamaging behaviour as trolling so that, e.g., they don’t have to update.
That’s like saying you shouldn’t drive on your street when Joe is driving because Joe is a bad driver. It’s true that you should update and avoid driving when Joe is out. But you find that out after Joe crashes his car into you. At which point, damage has been done to your car that updating won’t fix.
Too much metaphor. What is this damage?
If you present a depressed person with strong arguments that they should commit suicide, this is likely to cause their beliefs to change. So changing their beliefs back to their old level so that they can continue functioning as before (as opposed to killing themselves) will require work in addition to realizing they shouldn’t talk to you anymore, possibly in the form of support and hugs from other supportive people. Similarly, if your car has been damaged in an accident, it will require additional work to run again, such as replacing deformed parts. The car won’t magically start running once Joe is off the road.
And who is doing that?
I think trolls.
Well, no-one is encouraging suicide here, so there are no trolls here.
Uhh. There are no trolls here, therefore trolls do not cause damage?
I dare say high-level trolls cause all sorts of damage, but what’s the relevance?
How are these related? One is epistemology and one is ontology.
At least for now, it’d take a pretty determined troll to build an em for the sole purpose of being a terrible person. Not saying some humanity-first movement mightn’t pull it off, but by that point you could hopefully have legal recognition (assuming there’s no risk of accidental fooming and they pass the Turing test).
I don’t think we’re talking ems, we’re talking conscious algorithms which aren’t necessarily humanlike or even particularly intelligent.
And as for the Turing Test, one oughtn’t confuse consciousness with intelligence. A 6-year-old human child couldn’t pass itself off as an adult human, but we still believe the child to be conscious, and my own memories indicate that I indeed was at that age.
Well, I think consciousness, intelligence and personhood are sliding scales anyway, so I may be imagining the output of a Nonperson Predicate somewhat differently to the LW norm. OTOH, I guess it’s not a priori impossible that a simple human-level AI could fit on something available to the public, and such an insight would be … risky, yeah. Upvoted.
First of all, I also believe that consciousness is most probably a sliding scale.
Secondly, again you just used “human-level” without specifying human-level at what, at intelligence or at consciousness; as such I’m not sure whether I actually communicated adequately my point that we’re not discussing intelligence here, but just consciousness.
Well, they do seem to be correlated in any case. However, I was referring to consciousness (whatever that is.)
Can you give some more examples of this, besides “free will”? (I don’t understand where your intuitions comes from that certain problems will turn out to have solutions that are obvious in retrospect, and that such feelings of obviousness are trustworthy. Maybe it would help me see your perspective if I got some more past examples.)
A tree falls in a forest with no-one to hear it. Does it make a sound?
And the other example being generalised from isn’t that good.
I don’t class that as a problem that is discussed by professional philosophers. It’s more of a toy question that introduces the nature of phil. problems—and the importance of asking “it depends on what you mean...”—to laypeople.
I agree, but that’s not what I was aiming for. It’s an example of obviousness after the fact, not philosophers being wrong/indecisive.
It’s not an example that lends much credence to the idea that all problems can be solved that way, even apart from the generalisation-from-one-example issue.
I’m not claiming it proves anything, and I’m not taking sides in this discussion. Someone asked for an example of something—something which varies from person to person depending on whether they’ve dissolved the relevant confusions—and I provided what I thought was the best example. It is not intended to prove anyone’s point; arguments are not soldiers.
The counterargument to “arguments are not soldiers” is “a point should have a point”.
It wasn’t an argument at all. That you chose to interpret it as an enemy soldier is your mistake, not mine. It’s not a weak soldier, it’s a … medic or something.
Do you have an example in mind where a certain philosophical question claimed to have been solved or dissolved by Eliezer turned out to be not solved after all, or the solution was wrong?
Shut Up and Divide?
Beware Selective Nihilism
Also, instances where Eliezer didn’t seem to realize that a problem existed until someone pointed it out to him:
Marcello on responses to moral arguments being possibly order-dependent or subject to butterfly effects
Nesov’s Counterfactual Mugging
Me arguing that Solomonoff Induction may not be adequate
Eliezer seemingly not realizing that making certain kinds of physics “unimaginable” is a bad idea
Order-dependence and butterfly effects—knew about this and had it in mind when I wrote CEV, I think it should be in the text.
Counterfactual Mugging—check; I don’t think I was calling TDT a complete solution before then, but the Counterfactual Mugging was a class of possibilities I hadn’t considered (a quick expected-value sketch of the setup follows below). (It does seem related to Parfit’s Hitchhiker, which I knew was a problem.)
Solomonoff Induction—again, I think you may be overestimating how much weight I put on that in the first place. It’s not a workable AI answer for at least two obvious reasons I’m pretty sure I knew about from almost day one: (a) it’s uncomputable and (b) it can’t handle utility functions over the environment. However, your particular contributions about halting-oracles-shouldn’t-be-unimaginable did indeed influence me toward my current notion of second-order logical natural induction over possible models of axioms in which you could be embedded. Albeit I stand by my old reply that Solomonoff Induction would encompass any computable predictions or learning you could do about halting oracles in the environment. (The problem of porting yourself onto any environmental object is something I already knew AIXI would fail at.)
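(Aside, a quick expected-value sketch of the Counterfactual Mugging as Nesov usually states it: Omega flips a fair coin; on tails it asks you for $100, and on heads it pays you $10000 iff it predicts you would have handed over the $100 on tails. Choosing a policy before the flip gives

    E[\text{pay}] = \tfrac{1}{2}(+\$10000) + \tfrac{1}{2}(-\$100) = \$4950, \qquad E[\text{refuse}] = \$0,

so the precommitted policy pays, while an agent that only starts reasoning after seeing tails sees nothing but the lost $100 and refuses; that gap is the class of possibilities referred to above.)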
Ok, I checked the CEV writeup and you did mention these briefly. But that makes me unsure why you claimed to have solved metaethics. What should you do if your FAI comes back and says that your EV shows no coherence due to order dependence and butterfly effects (assuming it’s not some kind of implementation error)? If you’re not sure the answer is “nothing”, and you don’t have another answer, doesn’t that mean your solution (about the meaning of “should”) is at least incomplete, and possibly wrong?
You said that TDT solves Parfit’s Hitchhiker, so I don’t know if you would have kept looking for more problems related to Parfit’s Hitchhiker and eventually come upon Counterfactual Mugging.
Both of these can be solved without also solving halting-oracles-shouldn’t-be-unimaginable. For (a), solve logical uncertainty. For (b), switch to UDT-with-world-programs.
Also, here is another problem that maybe you weren’t already aware of.
Wouldn’t that kind of make moral reasoning impossible?
Both.