Here, “good” simply means “our targeted property.” My point is that if WE holds for any property P, we could get a P-Waluigi through some (pseudo-)naive targeting of anti-P.
I don’t get your second point: we’re talking about simulacra, not agents, and obviously this idea would only be part of a larger solution at best. For any property P, I expect there are several anti-Ps, so you don’t have to instantiate an actually bad Luigi. My idea is more about trapping deception within a single layer.
This wouldn’t work. Wawaluigis are not Luigis.
I’m confused. Could you clarify? I interpret your “Wawaluigi” as two successive layers of deception within a simulacrum, which is unlikely if WE is reliable, right?
I didn’t say anything about Wawaluigis, and I agree that they are not Luigis because, as I said, a Waluigi layer is not a one-to-one operator. My guess is about a normal Waluigi layer, but one that produces a desirable Waluigi rather than a harmful one.