a) I think at least part of what’s gone on is that Eliezer has been misunderstood and has been facing the same actually quite dumb arguments a lot, and he is now (IMO) too quick to round new arguments off to something he’s got cached arguments for. (I’m not sure whether this is exactly what went on in this case, but it seems plausible without carefully rereading everything.)
b) I do think that when Eliezer wrote this post, there were a bunch of people making quite dumb arguments that were literally “the solution to AI ethics/alignment is [my preferred elegant system of ethics] / [just have it track smiling faces] / [other explicit hardcoded solutions that were genuinely impractical]”
I think I personally also did not get what you were trying to say for a while, so I don’t think the problem here is just Eliezer (although it might be me making a similar mistake to the one I hypothesize Eliezer to have made, for reasons that are correlated with his)
I do generally think a criticism I have of Eliezer is that he has, comparatively, spent too much time focused on the dumber 3⁄4 of arguments, instead of engaging directly with top critics, who are often actually making more subtle points (and that he has been a bit too slow to update that this is what’s going on)
Wish there was a system where people could pay money to bid up what they believed were the “top arguments” that they wanted me to respond to. Possibly a system where I collect the money for writing a diligent response (albeit note that in this case I’d weigh the time-cost of responding as well as the bid for a response); but even aside from that, some way of canonizing what “people who care enough to spend money on that” think are the Super Best Arguments That I Should Definitely Respond To. As it stands, whatever I respond to, there’s somebody else to say that it wasn’t the real argument, and this mainly incentivizes me to sigh and go on responding to whatever I happen to care about more.
(I also wish this system had been in place 24 years ago so you could scroll back and check out the wacky shit that used to be on that system earlier, but too late now.)
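(For concreteness, here is a minimal sketch of what a bid-up-the-top-arguments queue could look like. Everything here is made up for illustration: the argument names, the dollar amounts, and the claim step, which elides the hard part of verifying that a response was actually diligent.)

```python
from collections import defaultdict

class ArgumentBidQueue:
    def __init__(self):
        # argument_id -> {backer_name: total amount pledged}
        self.pledges = defaultdict(dict)

    def pledge(self, argument_id: str, backer: str, amount: float) -> None:
        """A backer adds money toward seeing argument_id responded to."""
        current = self.pledges[argument_id].get(backer, 0.0)
        self.pledges[argument_id][backer] = current + amount

    def ranking(self) -> list:
        """Arguments sorted by total money pledged: the canonized 'top arguments' list."""
        totals = {arg: sum(backers.values()) for arg, backers in self.pledges.items()}
        return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

    def claim(self, argument_id: str, response_url: str) -> float:
        """The responder posts a diligent reply and collects the pot.
        (In practice backers would presumably need some way to vet the response.)"""
        pot = sum(self.pledges.pop(argument_id, {}).values())
        print(f"Response to {argument_id!r} posted at {response_url}; payout ${pot:.2f}")
        return pot


# Example: two backers bid up one argument; the responder reads the ranking and claims it.
queue = ArgumentBidQueue()
queue.pledge("argument-about-corrigibility", "alice", 40.0)
queue.pledge("argument-about-corrigibility", "bob", 25.0)
queue.pledge("argument-about-pivotal-acts", "carol", 10.0)
print(queue.ranking())  # [('argument-about-corrigibility', 65.0), ('argument-about-pivotal-acts', 10.0)]
queue.claim("argument-about-corrigibility", "https://example.com/response")
```

A fancier version could refund backers who judge a response non-responsive, or decay stale pledges, but the core mechanic is just a money-weighted priority queue.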
I do think such a system would be really valuable, and it’s the sort of thing the LW team should try to build. (I’m mostly not going to respond to this idea right now, but I’ve filed it away as something to revisit more seriously with Lightcone. Seems straightforwardly good.)
But it feels slightly orthogonal to what I was trying to say. Let me try again.
(this is now officially a tangent from the original point, but it feels important to me)
It would be good if the world could (deservedly) trust that the best x-risk thinkers have a good group epistemic process for resolving disagreements.
At least two steps that seem helpful for that process are:
Articulating clear lists of the best arguments, such that people can prioritize refuting them (or updating on them).
But, before that, there is a messier process of “people articulating half-formed versions of those arguments, struggling to communicate through different ontologies, being slightly confused.” And there is some back-and-forth process typically needed to make progress.
It is that “before” step where things seem to be going wrong, to me. (I haven’t re-read Matthew’s post or your response comment from a year ago in enough detail to have a clear sense of what, if anything, went wrong. But to illustrate the ontology: I think that instance was roughly in the liminal space between the two steps.)
Half-formed confused arguments in different ontologies are probably “wrong”, but that isn’t necessarily because they are completely stupid; it can be because they are half-formed. And maybe the final version of the argument is good, or maybe not, but it’s at least a less stupid version of that argument. And if Alice rejects a confused, stupid argument in a loud way, without understanding the generator that Bob was trying to pursue, Bob is often rightly annoyed that Alice didn’t really hear them and didn’t really engage.
Dealing with confused half-formed arguments is expensive, and I’m not sure it’s worth people’s time, especially given that confused half-formed arguments are hard to distinguish from “just wrong” ones.
But, I think we can reduce wasted motion on the margin.
A hopefully-cheap-enough TAP that might help, if more people adopted it, is something like:
<TAP> When responding to a wrong argument (which might be completely stupid, or might be a half-formed thing going in an eventually interesting direction)
<ACTION> Preface the response with something like: “I think you’re saying X. Assuming so, I think this is wrong because [insert argument].” End the response with: “If this seemed to be missing the point, can you try saying your thing in different words, or clarify?”
(If it feels too expensive to articulate what X is, one could instead start with something more like “It looks at first glance like this is wrong because [insert argument]” and still end with the “check if missing the point?” closing note.)
I think more-of-that-on-the-margin from a bunch of people would save a lot of time spent in aggro-y escalation spirals.
re: top-level posts
This doesn’t quite help when, instead of replying to someone, you’re writing a top-level post responding to an abstracted argument (e.g. “The Sun is big, but superintelligences will not spare Earth a little sunlight”).
I’d have to think more about what to do for that case, but, the sort of thing I’m imagining is a bit more scaffolding that builds towards “having a well indexed list of the best arguments.” Maybe briefly noting early on “This essay is arguing for [this particular item in List of Lethalities]” or “This argument is adding a new item to List of Lethalities” (and then maybe update that post, since it’s nice to have a comprehensive list).
This doesn’t feel like a complete solution, but the sort of thing I’d be looking for is cheap things you can add to posts that help bootstrap towards a clearer list-of-the-best-arguments existing.
I would suggest formulating this like a literal attention economy (sketched in code at the end of this comment):
1. You set a price for your attention (probably something like $1): the price at which, even if the post is a waste of your time, the money makes it worth it.
2. “Recommenders” can recommend content to you by paying that price.
3. If the content was worth your time, you pay the recommender the $1 back plus a couple cents.
The idea is that the recommenders would get good at predicting what posts you’d pay them for. And since you aren’t a causal decision theorist they know you won’t scam them. In particular, on average you should be losing money (but in exchange you get good content).
This doesn’t necessarily require new software. Just tell people to send PayPals with a link to the content.
With custom software, there could theoretically exist a secondary market for “shares” in the payout from step 3, to make things more efficient. That way the best recommenders could sell their shares and use that money to recommend more content before you pay out.
If the system is bad at recommending content, at least you get paid!
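(For concreteness, a minimal sketch of steps 1–3 as described above. The $1 price and the $0.05 bonus are placeholder numbers, the payment plumbing, PayPal or otherwise, is elided, and the secondary market for shares isn’t modeled.)

```python
ATTENTION_PRICE = 1.00      # step 1: the price to put something in front of you
GOOD_CONTENT_BONUS = 0.05   # the "couple cents" paid on top when a recommendation pays off

class AttentionMarket:
    def __init__(self):
        self.balances = {"you": 0.0}

    def recommend(self, recommender: str, link: str) -> None:
        """Step 2: a recommender pays the attention price to surface a link to you."""
        self.balances.setdefault(recommender, 0.0)
        self.balances[recommender] -= ATTENTION_PRICE
        self.balances["you"] += ATTENTION_PRICE
        print(f"{recommender} recommends {link}")

    def judge(self, recommender: str, worth_it: bool) -> None:
        """Step 3: if the content was worth your time, refund the price plus the bonus.
        If not, you keep the $1: 'at least you get paid'."""
        if worth_it:
            payout = ATTENTION_PRICE + GOOD_CONTENT_BONUS
            self.balances["you"] -= payout
            self.balances[recommender] += payout


# Example: one good recommendation, one dud.
market = AttentionMarket()
market.recommend("good_recommender", "https://example.com/great-post")
market.judge("good_recommender", worth_it=True)    # this recommender nets +$0.05
market.recommend("spammer", "https://example.com/waste-of-time")
market.judge("spammer", worth_it=False)            # you keep their $1
print(market.balances)
# {'you': 0.95, 'good_recommender': 0.05, 'spammer': -1.0}
# Note: with mostly-good recommenders (the intended equilibrium), "you" ends up
# net negative on average, paying a few cents per piece of good content.
```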