Raemon comments on CFAR Takeaways: Andrew Critch

Raemon Feb 25, 2024, 7:53 PM
4 points
0
POV, this is the key observation that should’ve, and should still, instigate a basic attempt to model what humans actually are and what is actually up in today’s humans. It’s too basic a confusion/surprise to respond to by patching the symptoms without understanding what’s underneath.
On one hand, when you say it like that, it does seem pretty significant.
I’m not sure I think there’s that much confusion to explain? Like, my mainline story here is:
1. Humans are mostly a kludge of impulses which vary in terms of how coherent / agentic they are. Most of them have wants that are fairly basic, and don’t lend themselves super well to strategic thinking. (I think most of them also consider strategic thinking sort of uncomfortable/painful). This isn’t that weird, because, like, having any kind of agency at all is an anomaly. Most animals have only limited agency and wanting-ness.
2. There’s some selection effect where the people who might want to start Rationality Orgs are more agentic, have more complex goals, and find deliberate thinking about their wants and goals more natural/fun/rewarding.
3. The “confusion” is mostly a typical mind error on the part of people like us, and if you look at evolution the state of most humans isn’t actually that weird or surprising.
Perhaps something I’m missing or confused about is what exactly Critch (or, you, if applicable?) mean by “people don’t seem to want things.” I maybe am surprised that the filtering effect of people who showed up at CFAR workshops or similar still didn’t want things.
Can you say a bit more about what you’ve experienced, and what felt surprising or confusing about it?
What links here?
- sunwillrise's comment on The Field of AI Alignment: A Postmortem, and What To Do About It by johnswentworth (Dec 27, 2024, 9:21 AM; 16 points)
- AnnaSalamon Feb 25, 2024, 9:09 PM
  59 points
  15
  Some partial responses (speaking only for myself):
  1. If humans are mostly a kludge of impulses, including the humans you are training, then… what exactly are you hoping to empower using “rationality training”? I mean, what wants-or-whatever will they act on after your training? What about your “rationality training” will lead them to take actions as though they want things? What will the results be?
  1b. To illustrate what I mean: once I taught a rationality technique to SPARC high schoolers (probably the first year of SPARC, not sure; I was young and naive). One of the steps in the process involved picking a goal. After walking them through all the steps, I asked for examples of how it had gone, and was surprised to find that almost all of them had picked such goals as “start my homework earlier, instead of successfully getting it done at the last minute and doing recreational math meanwhile”… which I’m pretty sure was not their goal in any wholesome sense, but was more like ambient words floating around that they had some social allegiance to. I worry that if you “teach” “rationality” to adults who do not have wants, without properly noticing that they don’t have wants, you set them up to be better-hijacked by the local memeset (and to better camouflage themselves as “really caring about AI risk” or whatever) in ways that won’t do anybody any good because the words that are taking the place of wants don’t have enough intelligence/depth/wisdom in them.
  2. My guess is that the degree of not-wanting that is seen among many members of the professional and managerial classes in today’s anglosphere is more extreme than the historical normal, on some dimensions. I think this partially because:
  a. IME, my friends and I as 8-year-olds had more wanting than I see in CFAR participants a lot of the time. My friends were kids who happened to live on the same street as me growing up, so probably pretty normal. We did have more free time than typical adults.
  i. I partially mean: we would’ve reported wanting things more often, and an observer with normal empathy would on my best guess have been like “yes it does seem like these kids wish they could go out and play 4-square” or whatever. (Like, wanting you can feel in your body as you watch someone, as with a dog who really wants a bone or something).
  ii. I also mean: we tinkered, toward figuring out the things we wanted (e.g. rigging the rules different ways to try to make the 4-square game work in a way that was fun for kids of mixed ages, by figuring out laxer rules for the younger ones), and we had fun doing it. (It’s harder to claim this is different from the adults, but, like, it was fun and spontaneous and not because we were trying to mimic virtue; it was also this way when we saved up for toys we wanted. I agree this point may not be super persuasive though.)
  b. IME, a lot of people act more like they/we want things when on a multi-day camping trip without phones/internet/work. (Maybe like Critch’s post about allowing oneself to get bored?)
  c. I myself have had periods of wanting things, and have had periods of long, bleached-out not-really-wanting-things-but-acting-pretty-“agentically”-anyway. Burnout, I guess, though with all my CFAR techniques and such I could be pretty agentic-looking while quite burnt out. The latter looks to me more like the worlds a lot of people today seem to me to be in, partly from talking to them about it, though people vary of course and hard to know.
  d. I have a theoretical model in which there are supposed to be cycles of yang and then yin, of goal-seeking effort and then finding the goal has become no-longer-compelling and resting / getting bored / similar until a new goal comes along that is more compelling. CFAR/AIRCS participants and similar people today seem to me to often try to stop this process—people caffeinate, try to work full days, try to have goals all the time and make progress all the time, and on a large scale there’s efforts to mess with the currency to prevent economic slumps. I think there’s a pattern to where good goals/wanting come from that isn’t much respected. I also think there’s a lot of memes trying to hijack people, and a lot of memetic control structures that get upset when members of the professional and managerial classes think/talk/want without filtering their thoughts carefully through “will this be okay-looking” filters.
  All of the above leaves me with a belief that the kinds of not-wanting we see are more “living human animals stuck in a matrix that leaves them very little slack to recover and have normal wants, with most of their ‘conversation’ and ‘attempts to acquire rationality techniques’ being hijacked by the matrix they’re in rather than being earnest contact with the living animals inside” and less “this is simple ignorance from critters who’re just barely figuring out intelligence but who will follow their hearts better and better as you give them more tools.”
  Apologies for how I’m probably not making much sense; happy to try other formats.
  - AnnaSalamon Feb 26, 2024, 4:04 AM
    9 points
    5
    A related tweet by Qiaochu:
    (I don’t necessarily agree with QC’s interpretation of what was going on as people talked about “agency”—I empathize some, but empathize also with e.g. Kaj’s comment in a reply that Kaj doesn’t recognize this from Kaj’s 2018 CFAR mentorship training, and did not find pressures there to coerce particular kinds of thinking).
    My point in quoting this is more like: if people don’t have much wanting of their own, and are immersed in an ambient culture that has opinions on what they should “want,” experiences such as QC’s seem sorta like the thing to expect. Which is at least a bit corroborated by QC reporting it.
    - Elizabeth Feb 27, 2024, 2:03 AM
      8 points
      2
      I’m not sure if this is a disagreement or supporting evidence, but: I remember you saying you didn’t want to teach SPARC kids too much [word similar to agency but not quite that. Maybe good at executing plans?], because they’d just use it to [coerce] themselves more. This was definitely before covid, maybe as far back as 2015 or 2016. I’m almost certain it was before QC even joined CFAR. It was a helpful moment for me.
  - Wei Dai Feb 28, 2024, 7:18 AM
    4 points
    0
    
    If humans are mostly a kludge of impulses, including the humans you are training, then… what exactly are you hoping to empower using “rationality training”? I mean, what wants-or-whatever will they act on after your training? What about your “rationality training” will lead them to take actions as though they want things? What will the results be?
    
    To give a straight answer to this, if I was doing rationality training (if I was agenty enough to do something like that), I’d have the goal that the trainees finish the training with the realization that they don’t know what they want or don’t currently want anything, but they may eventually figure out what they want or want something, and therefore in the interim they should accumulate resources/optionality, avoid doing harm (things that eventually might be considered irreversibly harmful), and push towards eventually figuring out what they want. And I’d probably also teach a bunch of things to mitigate the risk that the trainees too easily convince themselves that they’ve figured out what they want.
    - Kaj_Sotala Feb 28, 2024, 6:29 PM
      4 points
      2
      There’s something about this framing that feels off to me and makes me worry that it could be counterproductive. I think my main concerns are something like:
      1) People often figure out what they want by pursuing things they think they want and then updating on the outcomes. So making them less certain about their wants might prevent them from pursuing the things that would give them the information for actually figuring it out.
      2) I think that people’s wants are often underdetermined and they could end up wanting many different things based on their choices. E.g. most people could probably be happy in many different kinds of careers that were almost entirely unlike each other, if they just picked one that offered decent working conditions and committed to it. I think this is true for a lot of things that people might potentially want, but to me the framing of “figure out what you want” implies that people’s wants are a lot more static than this.
      I think this 80K article expresses these kinds of ideas pretty well in the context of career choice:
      The third problem [with the advice of “follow your passion”] is that it makes it sound like you can work out the right career for you in a flash of insight. Just think deeply about what truly most motivates you, and you’ll realise your “true calling”. However, research shows we’re bad at predicting what will make us happiest ahead of time, and where we’ll perform best. When it comes to career decisions, our gut is often unreliable. Rather than reflecting on your passions, if you want to find a great career, you need to go and try lots of things.
      The fourth problem is that it can make people needlessly limit their options. If you’re interested in literature, it’s easy to think you must become a writer to have a satisfying career, and ignore other options.
      But in fact, you can start a career in a new area. If your work helps others, you practice to get good at it, you work on engaging tasks, and you work with people you like, then you’ll become passionate about it. The ingredients of a dream job we’ve found are most supported by the evidence, are all about the context of the work, not the content. Ten years ago, we would have never imagined being passionate about giving career advice, but here we are, writing this article.
      Many successful people are passionate, but often their passion developed alongside their success, rather than coming first. Steve Jobs started out passionate about zen buddhism. He got into technology as a way to make some quick cash. But as he became successful, his passion grew, until he became the most famous advocate of “doing what you love”.
  - TekhneMakre Feb 27, 2024, 6:30 AM
    4 points
    2
    It makes sense, but I think it’s missing that adults who try to want in the current social world get triggered and/or traumatized as fuck because everyone else is behaving the way you describe.
    - AnnaSalamon Feb 27, 2024, 6:34 AM
      2 points
      0
      Totally. Yes.