Making a research platform for AI Alignment at https://ai-plans.com/
Come critique AI Alignment plans and get feedback on your alignment plan!
A challenge for folks interested: spend 2 weeks without media based entertainment.
“CESI’s Artificial Intelligence Standardization White Paper released in 2018 states
that “AI systems that have a direct impact on the safety of humanity and the safety of life,
and may constitute threats to humans” must be regulated and assessed, suggesting a broad
threat perception (Section 4.5.7). In addition, a TC260 white paper released in 2019 on AI
safety/security worries that “emergence” (涌现性) by AI algorithms can exacerbate the
black box effect and “autonomy” can lead to algorithmic “self-improvement” (Section
3.2.1.3).”
From https://concordia-consulting.com/wp-content/uploads/2023/10/State-of-AI-Safety-in-China.pdf
I disagree with this paragraph today: “A lot of what AI does currently, that is visible to the general public seems like it could be replicated without AI”
I was talking about what a farmer could do. A consumer can get their eggs/milk from such a farmer and fund or invest in such a farm, if they’re able to.
Or talk to a local farm about setting aside some chickens, pay for them to be given extra space, better treatment, etc.
I don’t really know what you mean about the EA reducetarian stuff.
Also, if you as an individual want to be healthy, not contribute to harming animals, and have the time, space, money, willingness, etc. to raise some chickens, why not?
Exercise in general is pretty great, yes. Especially if done outdoors, imo.
Could a solution to some of this be to raise some chickens for eggs, treat them nicely, give them space to roam, etc?
Obviously the best would be to raise cows as well, treat them well, not kill the male calves, etc.- but that’s much less of an option for most.
This is great! Thank you for doing this! Might add some of these to ai-plans.com!
Yes, winning is fun!
I think this kind of thing makes people feel like you’re pushing a message, to which the automatic response is to push back.
What I’ve found works is to be agreeable and inviting, meet them at their own values, and present it as a hard problem to solve which isn’t being competently tackled by some other dumb group (not us, we wouldn’t do this).
That kind of thing. Had a 100% success rate so far.
I’m simplifying my approach, since I’m not spending a lot of time on this- but if you assume I’m not a dumbass and think about what kind of approach like this could work well, without being dumb in the sense of failing to actually address the problem, you’ll probably get what I mean.
I’m generally disincentivized to post or put effort into a post from the system where someone can just heavily downvote my post, without even giving a reason.
A simple way to improve this system would be to require someone to comment or give a reason when heavily upvoting or downvoting something.
“In the ancestral environment, politics was a matter of life and death.”—this is a pretty strong statement to make with no evidence to back it up.
What about orgs such as ai-plans.com, which aim to be exponentially useful for AI Safety?
I think your ideas are some of the most promising I’ve seen- I’d love to see them pursued further, though I’m concerned about the air-gapping.
Hi Ruby! Thanks for the great feedback!! Sorry for the late reply, I’ve been working on the site!
So, we’re not doing just criticisms anymore- we’re ranking plans by Total Strength score minus Total Vulnerabilities score. Quite a few researchers have been posting their plans on the site!
Going to do a full rebuild soon, to make the site look nicer and be even faster to work on.
We’re also holding regular critique-a-thons. The last one went very well!
We had 40+ submissions and produced what I think is really great work!
We also made a Broad List of Vulnerabilities in the first two days! https://docs.google.com/document/d/1tCMrvJEueePNgb2_nOEUMc_UGce7TxKdqI5rOJ1G7C0/edit?usp=sharing
On not getting all of a plan’s details without talking to the person a lot- I think this is a vulnerability in communication.
A serious plan, with the intention of actually solving the problem, should have the effort put into it to make it clear to a reader what it actually is, what problems it aims to solve, why it aims to solve them and how it seeks to do so.
A failure to do so is silly for any serious strategy.
The good thing is, that if such a vulnerability is pointed out, on AI-Plans.com, the poster can see the vulnerability and iterate on it!
This was really great. Thanks for making it.
I was curious why Trump was dropping some of the best takes!
Yeah, I think you’re right- at least about the sequences.
I think something more specific about attitudes would be more accurate and useful.
When I say media, I mean social media, movies, videos, books, etc.- any type of recording or similar thing that you’re using as entertainment.
I’m trying this myself. I’ve done single days before, sometimes 2 or 3 days, but failed to keep it consistent. I did find that when I did it, my work output was far higher and of greater quality, I had a much better sleep schedule, and I was generally in a much more enjoyable mood.
I also ended up spending more time with friends and family, meeting new people, trying interesting things, spending time outdoors, etc.
This time I’m building up to it- starting with 1 media free hour a day, then 2 hours, then 3, etc.
I think building up to it will let me build new habits which will stick more.