Re: “Targeted Outreach to Experienced Researchers”
Please apply to work with the aforementioned AISFB Hub! I am actively trying to hire people who I think would be good fits for this type of role, and offer mentorship / funding / access to, and models of, the space. Note that you’ll need to have AI safety knowledge (for example, I want you to have read / have a plan for reading all of the main readings in the AGISF Technical Curriculum) and high generalist competence, as two of the most important qualifications.
I think most people will not be a good fit for this role (there are more complicated status hierarchies and cultures within experienced researchers than are visible at first glance), and like Akash I caution against unilateral action here. I’m psyched about meeting people who are good fits, however, and urge you to apply to work with me if you think that could be you!
Vael Gates
Some Rudi communication style anecdotes:
Rudi: “Aren’t you a beautiful young woman!” almost immediately when we saw each other on video call for the first time (I identify as nonbinary) (<-- this anecdote is from a few years ago and from memory, though, so he might have just said something quite similar)
Rudi, in a Google Calendar invite note, as a closing: “Let’s talk, dear Vael...as I recall, we liked each other a lot.:)”
Me in an email back:
”(Ah, and just one other quick note: in the Google Calendar invite, you’ve included the line “Let’s talk, dear Vael...as I recall, we liked each other a lot.:)”. This feels like flirting to me, and I’m not sure but imagine you wouldn’t include this in emails to men, so I just wanted to state a preference that I’d enjoy if sentences like this weren’t included in the future! Many thanks, and looking forward to talking to you in June!)”
Rudi back:
“Hi Vael,
Of course, and thank you for nicely stating your preference, and just for the record I would include a phrase like this with men, women, or non-gendered individuals. (Also for the record, maybe I should re-think this.) And I still appreciate your observation, and will endeavor to be more circumspect in the future. :)
Warm and decidedly professional regards,
Rudi :)”
I’ve similarly heard he doesn’t do this with men. He also answered my questions when emailing back and forth. But yeah, be ready!
Seems like it’s great to do one-on-ones with interested and skilled people from all sorts of fields, and top researchers in similar fields could be a good group to prioritize! Alas, I feel like the current bottleneck is people who are good fits to do these one-on-ones (I’m looking to hire people, but not currently doing them myself); there are many people I’d ideally want to reach.
Announcing the AI Safety Field Building Hub, a new effort to provide AISFB projects, mentorship, and funding
Thanks for doing that Kat!
Sure! This isn’t novel content; the vast majority of it is drawn from existing lists, so it’s not even particularly mine. I think just make sure the things within are referenced correctly, and you should be good to go!
With respect to the fact that I don’t immediately point people at LessWrong or the Alignment Forum (I actually only very rarely include the “Rationalist” section in the email—not unless I’ve decided to bring it up in person, and they’ve reacted positively), there are different philosophies on AI alignment field-building. One of the active disagreements right now is how much we want new people coming into AI alignment to be the type of person who enjoys LessWrong, or whether it’s good to be targeting a broader audience.
I’m personally currently of the opinion that we should be targeting a broader audience, where there’s a place for people who want to work in academia or industry separate from the main Rationalist sphere, and the people who are drawn towards the Rationalists will find their way there either on their own (I find people tend to do this pretty easily when they start Googling), or with my nudging if they seem to be that kind of person.
I don’t think this is much “shying away from reality”—it feels more like engaging with it, trying to figure out if and how we want AI alignment research to grow, and how to best make that happen given the different types of people with different motivations involved.
A great point, thanks! I’ve just edited the “There’s also a growing community working on AI alignment” section to include MIRI, and also edited some of the academics’ names and links.
I don’t think it makes sense for me to list Eliezer’s name in the part of that section where I’m listing names, since I’m only listing some subset of academics who (vaguely gesturing at a cluster) are sort of actively publishing in academia, mostly tenure track and actively recruiting students, and interested in academic field-building. I’m not currently listing names of researchers in industry or non-profits (e.g. I don’t list Paul Christiano, or Chris Olah), though that might be a thing to do.
Note that I didn’t choose this list of names very carefully, so I’m happy to take suggestions! This doc came about because I had an email draft that I was haphazardly adding things to as I talked to researchers and needed to promptly send them resources, getting gradually refined when I spotted issues. I thus consider it a work-in-progress and appreciate suggestions.
Resources I send to AI researchers about AI safety
Vael Gates: Risks from Advanced AI (June 2022)
I’ve been finding “A Bird’s Eye View of the ML Field [Pragmatic AI Safety #2]” to have a lot of content that would likely be interesting to the audience reading these transcripts. For example, the incentives section rhymes with the types of things interviewees would sometimes say. I think the post generally captures and analyzes a lot of the flavor of, and contextualizes, what it was like to talk to researchers.
It was formatted based on the typical academic “I am conducting a survey on X, $Y for Z time” template, and notably didn’t mention AI safety. The intro was basically this:
My name is Vael Gates, and I’m a postdoctoral fellow at Stanford studying how productive and active AI researchers (based on submissions to major conferences) perceive AI and the future of the field. For example:
- What do you think are the largest benefits and risks of AI?
- If you could change your colleagues’ perception of AI, what attitudes/beliefs would you want them to have?
My response rate was generally very low, which biased the sample towards… friendly, sociable people who wanted to talk about their work and/or help out and/or wanted money, and had time. I think it was usually <5% response rate for the NeurIPS / ICML sample off the top of my head. I didn’t A/B test the email. I also offered more money for this study than the main academic study, and expect I wouldn’t have been able to talk to the individually-selected researchers without the money component.
Transcripts of interviews with AI researchers
Thanks Alex :). Comment just on this section:
“The annoying thing here is that I believe the only difference between me and another task doer in this situation is that I have more accurate beliefs, or I have a higher belief threshold for making claims (or something similar, like that I only use statement for communicating beliefs and not for socially enforcing a commitment to myself).”
As someone who was in this situation with Alex recently (wanting a commitment from him, in order to make plans with other people that relied on this initial commitment), I think there’s maybe an additional thing in my psychology and not in Alex’s which is about self-forcing.
I’m careful about situations where I’m making a very strong commitment to something, because it means that if I’ve planned the timing wrong, I’ll get the thing done but with high self-sacrifice. I’m committing to skipping sleep, or fun hangouts I otherwise had planned, or relaxing activities, to get the thing done by the date I said it’d be done. I’m capable and willing to force myself to do this, if the other person wants a commitment from me enough. It’s not 100% certain I’ll succeed—e.g. I might be hit by a car—but I’m certain enough of success that people would expect me to succeed barring an emergency, which is mostly what I expect from other people when they’re for-real-for-real committing to something.
So when I’m asking someone to for-real-for-real commit to me, I’m asking “are you ready to do self-sacrifice if you don’t get it done by this date, barring an emergency? It’s fine if it’s a later date, I just want the certainty of being able to build on this plan”. And I do think there are a bunch of different kinds of commitments in day-to-day life, where I make looser commitments all the time, but I do have a category for “for-real-for-real commitment”, and will track other people’s failures to meet my expectations when I believe they’ve made a “for-real-for-real” commitment to me. I might track this more carefully than other people do, though—it feels like it kinda rhymes with autism and high conscientiousness, maybe also high-performance environments, but I don’t know.
Anyway, this all might be the same thing as “I only use statement for communicating beliefs and not for socially enforcing a commitment to myself”. I’m not sure I’d use exactly the “socially enforcing a commitment to myself” phrase; in my mind, it feels like a social commitment and also feels like “I’m now putting my personal integrity on the line, since I’m making a for-real-for-real commitment, so I’d better do what I said I would, even if no one’s looking”.
Amusingly, I think Alex and I are both using self-integrity here, but one hypothesis is that maybe I’m very willing and able to force myself to do things, and this makes up the difference in which concepts we’re each referring to by “(strong) commitment”?
Always fun getting unduly detailed with very specific pieces of models :P.
Self-studying to develop an inside-view model of AI alignment; co-studiers welcome!
“Alpha Zero scales with more computing power, I think AlphaFold 2 scales with more computing power, Mu Zero scales with more computing power. Precisely because GPT-3 doesn’t scale, I’d expect an AGI to look more like Mu Zero and particularly with respect to the fact that it has some way of scaling.”
I thought GPT-3 was the canonical example of a model type that people are worried will scale? (i.e. it’s discussed in https://www.gwern.net/Scaling-hypothesis?)
Recently I was also trying to figure out what resources to send to an economist, and couldn’t find a list that existed either! The list I came up with is subsumed by yours, except:
- Questions within Some AI Governance Research Ideas
- “Further Research” section within an OpenPhil 2021 report: https://www.openphilanthropy.org/could-advanced-ai-drive-explosive-economic-growth
- The AI Objectives Institute just launched, and they may have questions in the future
I see there’s an associated talk now! https://www.youtube.com/watch?v=EIhE84kH2QI