I appreciate this comment a lot. Thank you. I appreciate that it’s sharing an inside view, and your actual best guess, despite these things being the sort of thing that might get social push-back!
My own take is that people depleting their long-term resources and capacities is rarely optimal in the present context around AI safety.
My attempt to share my reasoning is pretty long, sorry; I tried to use bolding to make it skimmable.
In terms of my inside-view disagreement, if I try to reason about people as mere means to an end (e.g. “labor”):
0. A world where I’d agree with you. If all that would/could impact AI safety was a particular engineering project (e.g., Redwood’s ML experiments, for concreteness), and if the time-frame of a person’s relevance to that effort was relatively short (e.g., a year or two, either because AI was in two years, or because there would be an army of new people in two years), I agree that people focusing obsessively for 60 hours/week would probably produce more than the same people capping their work at 35 hrs/week.
But (0) is not the world we’re in, at least right now. Specific differences between a world where I’d agree with you, and the world we seem to me to be in:
1. Having a steep discount rate on labor seems like a poor predictive bet to me. I don’t think we’re within two years of the singularity; I do think labor is increasing, but not at a crazy rate; and a person who keeps their wits and wisdom about them, who pays attention and cares and thinks and learns, and especially someone who is relatively new to the field and/or relatively young (which is the case for most such engineers, I think), can reasonably hope to be more productive in 2 years than they are now, which on my best guess can roughly counterbalance (or more than counterbalance) that increase.
E.g., if they get hired at Redwood and then stay there, you’ll want veterans a couple of years later who already know your processes and skills.
(In 2009, I told myself I needed only to work hard for ~5 years, maybe 10, because after that I’d be a negligible portion of the AI safety effort, so it was okay to cut corners. I still think I’m a non-negligible portion of the effort.)
1.1. Trying a thing to see if it works (e.g. 60 hrs/week of obsession, to see how that is) might still be sensible, but more like “try it and see if it works, especially if that risk and difficulty is appealing, since ‘appealingness’ is often an indicator that a thing will turn out to make sense / to yield useful info / to be the kind of thing one can deeply/sincerely try rather than forcing oneself to mimic, etc.” and not like “you are nothing and don’t matter much after two years, so run yourself into the ground while trying to make a project go.” I suppose your question is about accepting a known probability of running yourself into the ground, but I’m having trouble booting that sim; to me the two mindsets are pretty different. I do think many people are too averse to risk and discomfort; but also that valuing oneself over the long term is correct and important. Sorry if I’m dodging the question here.
2. There is no single project that is most of what matters in AI safety today, AFAICT. Also, such projects as exist are partly managerially bottlenecked. And so it isn’t “have zero impact” vs “be above Redwood’s/project such-and-such’s hiring line,” it is “be slightly above a given hiring line” (and contribute the difference between that spot and the person who would fill it next, or between that project having one just-above-margin person and having one fewer but more managerial slack) vs “be alive and alert and curious as you take an interest in the world from some other location”, which is more continuous-ish.
3. We are confused still, and the work is often subtle, such that we need people to notice subtle mismatches between what they’re doing and what makes sense to do, subtle adjustments to specific projects and to which projects make sense at all, subtle updates from how the work is going that can be propagated to some larger set of things, etc. We need people who care and don’t just want to signal that they kinda look like they care. We need people who become smarter and wiser and more oriented over time and who have deep scientific aesthetics, and other aesthetics. We need people who can go for what matters even when it means backtracking or losing face. We don’t mainly need people as something like fully needing-to-be-directed, subjugated labor, who try for the appearances while lacking an internal compass. I expect more of this from folks who average 35 hrs/week than 60 hrs/week in most cases (not counting brief sprints, trying things for a while to test and stretch one’s capacities, etc. — all of which seems healthy and part of fully inhabiting this world to me). Basically because of the things pointed out in Raemon’s post about slack, or Ben’s post about the Sabbath. Also because 60 hrs/week for long periods of time often means unconsciously writing off important personal goals (cf. Critch’s post about addiction to work), and IMO writing off deep goals for the long term makes it hard to sincerely care about things.
(4. I do agree there’s something useful about being able to work on other people’s projects, or on mundane non-glamorous projects, that many don’t have, and that naive readings of my #3 might tend to pull away from. I think the deeper readings of #3 don’t, but it could be discussed.)
If I instead try to share my actual views, despite these being kinda wooey and inarticulate and hard to justify, instead of trying to reason about people as means to an end:
A. I still agree that in a world where all that would/could impact AI safety was a particular engineering project (e.g., Redwood’s ML experiments, for concreteness), and if the time-frame of a person’s relevance to that effort was relatively short (e.g., a year or two, or probably even five years), people focusing obsessively for 60 hours/week would be in many ways saner-feeling, more grounding, and more likely to produce the right kind of work in the right timeframe than the same people capping their work at 35 hrs/week. (Although even here, vacations, sabbaths, or otherwise carefully maintaining enough of the right kinds of slack and leisure that deep new things can bubble up seems really valuable to me; otherwise I expect a lot of people working hard at dumb subtasks.)
A2. I’m talking about “saner-feeling” and “more grounding” here, because I’m imagining that if people are somehow capping their work at 35 hrs/week, this might be via dissociating from how things matter, and dissociation sucks and has bad side-effects on the quality of work and of team conversation and such IMO. This is really the main thing I’m optimizing for ~in general; I think sane grounded contexts where people can see what causes will have what effects and can acknowledge what matters will mostly cause a lot of the right actions, and that the main question is how to cause such contexts, whether that means 60 hrs/week or 35 hrs/week or what.
A3. In this alternate world, I expect people will kinda naturally reason about themselves and one another as means to an end (to the end of us all surviving), in a way that won’t be disoriented and won’t be made out of fear and belief-in-belief and weird dissociation.
B. In the world we seem to actually be in, I think all of this is pretty different:
B1. It’s hard to know what safety strategies will or won’t help how much.
B2. Lots of people have “belief in belief” about safety strategies working. Often this is partly politically motivated/manipulated, e.g. people wanting to work at an organization and to rise there via buying into that organization’s narrative; an organization wanting its staff and potential hires to buy its narrative so they’ll work hard and organize their work in particular ways and be loyal.
B3. There are large “unknown unknowns,” large gaps in the total set of strategies being done, maybe none of this makes sense, etc.
B4. AI timelines are probably more than two years, probably also more than five years, although it’s hard to know.
C. In a context like the hypothetical one in A, people talking about how some people are worth much more than others, about what tradeoffs will have what effects, etc. will for many cash out in mechanistic reasoning and so be basically sane-making and grounding. (Likewise, I suspect battlefield triage, or mechanistic reasoning from a group of firefighters considering rescuing people from a burning building, is pretty sane-making.)
In a context like the one in B (which is the one I think we’re in), people talking about themselves and other people as mere means to an end, about how much more some people are worth than others such that those other people are a waste for the first people to talk to, and so on, will tend to increase social fear, decrease sharing of actual views, and increase weird status stuff and the feeling that one ought not question current social narratives, I think. It will tend to erode trust, erode freedom to be oneself or to share data about how one is actually thinking and feeling, and increase the extent to which people cut off their own and others’ perceptual faculties. The opposite of sane-ifying/grounding.
To gesture a bit at what I mean: a friend of mine, after attending a gathering of EA elites for the first time, complained that it was like: “So, which of the 30 organizations that we all agree has no more than a 0.1% chance of saving the world do you work for?”, followed by talking shop about the specifics within that plan, with almost no attention to the rest of the probability mass.
So I think we ought mostly not to reason about ourselves and others as “labor” as though we’re in simple microecon world, given the world we’re in, and given that it encourages writing off a bunch of people’s perceptual abilities, etc. Though I also think that you, Buck (or others), speaking your mind, including when you’re reasoning this way, is extremely helpful! We of course can’t stop wrong views by taking my best guess at which views are right and doing belief-in-belief about it; we have to converse freely and see what comes out.
(Thanks to Justis for saying some of this to me in the comments prior to me posting.)
Thanks for all these comments. I agree with a bunch of this. I might try later to explain more precisely where I agree and disagree.