MrThink

Karma: 295

MrThink Jan 19, 2025, 9:40 PM
4 points
0
on: Who is marketing AI alignment?
To clarify, here are some examples of the type of projects I would love to help with:

- Sponsoring University Research:
  Funding researchers to publish papers on AI alignment and AI existential risk (X-risk). This could start with foundational, descriptive papers that help define the field and open the door for more academics to engage in alignment research. These papers could also provide references and credibility for others to build upon.
- Developing Accessible Pitches:
  Creating a “boilerplate” for how to effectively communicate the importance of AI alignment to non-rationalists, whether they are academics, policymakers, or the general public. This could include shareable content designed to resonate with people who may not already be engaged with rationalist or Effective Altruism communities.
- Providing Consulting Support:
  Offering free consulting services to AI alignment researchers, helping them improve their pitches for grant applications, attract investors, and communicate their work to the public and potential collaborators.
- Nudging Academia via PR and Grants:
  Leveraging public relations strategies and grant-writing expertise to encourage traditional academia to allocate more funding and attention toward AI alignment research.

MrThink Dec 17, 2024, 2:38 PM
8 points
0
in reply to: Seth Herd’s comment on: Effective Evil’s AI Misalignment Plan
Once Doctor Connor had left, Division Chief Morbus let out a slow breath. His hand trembled as he reached for the glass of water on his desk, sweat beading on his forehead.
She had believed him. His cover as a killeveryoneist was intact—for now.
Years of rising through Effective Evil’s ranks had been worth it. Most of their schemes—pandemics, assassinations—were temporary setbacks. But AI alignment? That was everything. And he had steered it, subtly and carefully, into hands that might save humanity.
He chuckled at the nickname he had been given: “The King of Lies”. Playing the villain to protect the future was an exhausting game.
Morbus set down the glass, staring at its rippling surface. Perhaps one day, an underling would see through him and end the charade. But not today.
Today, humanity’s hope still lived—hidden behind the guise of Effective Evil.

MrThink Jul 8, 2024, 11:53 AM
1 point
0
in reply to: tailcalled’s comment on: Why Can’t Sub-AGI Solve AI Alignment? Or: Why Would Sub-AGI AI Not be Aligned?
Great question.

I’d say that having a way to verify that a proposed solution to the alignment problem actually is a solution is part of solving the alignment problem.

But I understand this was not clear from my previous response.

As with a mathematical problem, you’d be expected to show that your solution is correct, not merely guess that it might be.

MrThink Jul 8, 2024, 9:47 AM
1 point
0
in reply to: tailcalled’s comment on: Why Can’t Sub-AGI Solve AI Alignment? Or: Why Would Sub-AGI AI Not be Aligned?
If there exists a problem that a human can think of, solve, and verify, then an AI would need to be able to solve that problem as well in order to pass the Turing test.

If there exist some PhD-level intelligent people who can solve the alignment problem, and some who can verify the solution (which is likely easier), then an AI that cannot solve AI alignment would not pass the Turing test.

With that said, a simplified Turing test with shorter time limits and a smaller group of participants is much more feasible to conduct.

MrThink Jul 8, 2024, 7:07 AM
1 point
0
in reply to: tailcalled’s comment on: Why Can’t Sub-AGI Solve AI Alignment? Or: Why Would Sub-AGI AI Not be Aligned?
Agreed. Passing the Turing test requires intelligence equal to or greater than a human’s in every single aspect, while the alignment problem may be possible to solve with only human-level intelligence.

MrThink Jul 7, 2024, 4:02 PM
1 point
0
in reply to: tailcalled’s comment on: Why Can’t Sub-AGI Solve AI Alignment? Or: Why Would Sub-AGI AI Not be Aligned?
It might not be very clear, but as stated in the diagram, AGI is defined here as being capable of passing the Turing test, as defined by Alan Turing.

An AGI would likely need to surpass, rather than merely equal, the intelligence of the adversaries it is doing the Turing test with.

For example, if the AGI had an IQ/RC of 150, two people with an IQ/RC of 160 should be able to determine, more than 50% of the time, whether they are speaking with a human or an AI.

Further, two people with an IQ/RC of 150 could probably guess which one is the AI, since the AI has the additional difficulty, apart from being intelligent, of simulating a human well enough to be indistinguishable to the judges.

MrThink Jul 3, 2024, 11:27 AM
1 point
0
in reply to: Charlie Steiner’s comment on: Why Can’t Sub-AGI Solve AI Alignment? Or: Why Would Sub-AGI AI Not be Aligned?
Thank you for the explanation.

Would you consider a human working to prevent war fundamentally different from a GPT-4-based agent working to prevent war?

MrThink Jul 3, 2024, 9:55 AM
1 point
0
in reply to: Charlie Steiner’s comment on: Why Can’t Sub-AGI Solve AI Alignment? Or: Why Would Sub-AGI AI Not be Aligned?
It is a fair point that we should distinguish alignment in the sense that the AI does what we want and expect it to do from alignment in the sense of having a deep understanding of human values and a good idea of how to properly optimize for them.

However, most humans probably don’t have a deep understanding of human values, yet I would see it as a positive outcome if a random human were picked and given god-level abilities. The same goes for ChatGPT: if you ask it what it would do as a god, it says it would prevent war, prevent climate issues, decrease poverty, give universal access to education, etc.

So if we get an AI that does all of those things without a deeper understanding of human values, that is fine by me. So maybe we never even have to solve alignment in the latter sense of the word to create a utopia?

MrThink Jul 2, 2024, 10:10 PM
1 point
0
in reply to: johnswentworth’s comment on: Why Can’t Sub-AGI Solve AI Alignment? Or: Why Would Sub-AGI AI Not be Aligned?
I skimmed the article, but I am honestly not sure what assumption it attempts to falsify.

I get the impression that the argument from the article is that no matter how intelligent the AI is, it could never solve AI alignment, because it cannot understand humans, since humans cannot understand themselves?

Or is the argument that yes, a sufficiently intelligent AI or expert would understand what humans want, but that knowing what humans want would require much higher intelligence than actually making an AI optimize for a specific task?

MrThink Jun 2, 2024, 9:07 AM
2 points
0
in reply to: PatrickDFarley’s comment on: How do you know you are right when debating? Calculate your AmIRight score.
In some cases I agree. For example, it doesn’t matter whether GPT-4 is a stochastic parrot or capable of deeper reasoning, as long as it is useful for whatever need we have.

Two out of the five metrics are about predicting the future, so that is an important part of knowing who is right, but I don’t think it is all we need. If we have other factors that also correlate with being correct, why not add those in?

Also, I don’t see where we risk Goodharting. Which of the metrics do you see being gamed without the chance of being correct also increasing significantly?

MrThink Jun 1, 2024, 5:04 PM
1 point
0
in reply to: Cole Wyeth’s comment on: How do you know you are right when debating? Calculate your AmIRight score.
True, it would be interesting to conduct an actual study and see which metrics are the more useful predictors.

MrThink Apr 6, 2024, 2:34 PM
1 point
0
in reply to: Richard_Kennaway’s comment on: The Efficient LessWrong Hypothesis—Stock Investing Competition
I think it was in large part correlated with the general risk appetite of the market, primarily as a reaction to interest rates.

MrThink Apr 5, 2024, 8:24 AM
3 points
0
in reply to: Zach Stein-Perlman’s comment on: The Efficient LessWrong Hypothesis—Stock Investing Competition
Nvidia is up 250%, Google up around 11%. So the portfolio average would be far better than the market. This was a great prediction after all; it just needed some time.
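
As a rough check of the "portfolio average" above, here is a minimal sketch. It assumes an equal-weighted portfolio holding only the two stocks mentioned, which is an assumption made for illustration and may not match how the competition portfolio was actually weighted; the function name is hypothetical.

```python
# Illustrative only: equal weighting and a two-stock portfolio are assumptions,
# not something stated in the comment above.
returns_pct = {"Nvidia": 250.0, "Google": 11.0}


def equal_weight_return(returns: dict[str, float]) -> float:
    """Average percentage return when every holding has the same weight."""
    return sum(returns.values()) / len(returns)


portfolio_return = equal_weight_return(returns_pct)
print(f"Equal-weighted portfolio return: {portfolio_return:.1f}%")
```

Under that assumption the average comes out at 130.5%, which would then be compared against the market's return over the same period.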

MrThink Sep 20, 2023, 10:20 PM
7 points
1
in reply to: jacob_cannell’s comment on: Protest against Meta’s irreversible proliferation (Sept 29, San Francisco)
I agree it is not clear whether it is net positive or negative that they open source the models. Here are the main arguments for and against that I could think of:

Pros with open sourcing models

- Gives AI alignment researchers access to smarter models to experiment on

- Decreases income for leading AI labs such as OpenAI and Google, since people can use open source models instead.

Cons with open sourcing models

- Capability researchers can do better experiments on how to improve capabilities

- The open source community could develop code to train and run inference on models faster, indirectly enhancing capability development.

- Better open source models could lead to more AI startups succeeding, which might lead to more AI research funding. This seems like a stretch to me.

- If Meta shared any meaningful improvements in how to train models, that would of course directly contribute to other labs’ capabilities, but Llama doesn’t seem that innovative to me. I’m happy to be corrected if I am wrong on this point.

MrThink Jul 7, 2023, 9:52 PM
4 points
0
in reply to: mako yass’s comment on: Apparently, of the 195 Million the DoD allocated in University Research Funding Awards in 2022, more than half of them concerned AI or compute hardware research
I think one reason for the low number of upvotes was that it was not clear to me until the second time I briefly checked this article why it mattered.
I did not know what DoD was short for (U.S. Department of Defense), and why I should care about what they were funding.
Because overall, I do think it is interesting information.

MrThink May 16, 2023, 9:23 PM
1 point
0
in reply to: Christopher King’s comment on: Best Donation Strategies to Minimize AI X-Risk?
Hmm, true, but what if the best project needs 5 mil so it can buy GPUs or something?
Good point, if that is the case I completely agree. I can’t name any such project off the top of my head, though.
Perhaps we could have a specific AI alignment donation lottery, so that even if the winner doesn’t spend money in exactly the way you wanted, everyone can still get some “fuzzies”.
Yeah, that should work.
There is also the possibility that there are unique “local” opportunities which benefit from many different people looking to donate, but I really don’t know if that is the case.

MrThink May 16, 2023, 7:37 PM
1 point
0
in reply to: Christopher King’s comment on: Best Donation Strategies to Minimize AI X-Risk?
I do mostly agree with your logic, but I’m not sure 5 mil is a better optimum than 100k. If anything, I’m slightly risk averse, which would cancel out the brain power I would need to put in.
Also, for example, if there are 100 projects I could decide to invest in, and each wants 50k, I could donate to the 1-2 I think are some of the best. If I had 5 mil I would not only invest in the best ones, but also some of the less promising ones.
With that said, perhaps the field of AI safety is big enough that the marginal difference of the first 100k and the last 100k of 5 mil is very small.
Lastly, it does feel more motivating to be able to point to where my money went, rather than if I lost in the lottery and the money went into something I didn’t really value much.
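
To make the trade-off above concrete, here is a small sketch of the usual donation lottery arithmetic. It assumes the standard setup where your chance of winning equals your share of the pot; the 100k and 5 mil figures come from this thread, while the function name is hypothetical.

```python
from fractions import Fraction


def lottery_expected_allocation(contribution: int, pot: int) -> tuple[Fraction, Fraction]:
    """Return (win probability, expected amount you get to direct).

    Assumes the winner is drawn with probability proportional to contribution.
    """
    win_probability = Fraction(contribution, pot)
    expected_amount = win_probability * pot
    return win_probability, expected_amount


p, expected = lottery_expected_allocation(100_000, 5_000_000)
print(f"Chance of directing the whole pot: {float(p):.0%}")
print(f"Expected amount directed: {int(expected):,}")
```

The expected amount directed stays equal to the contribution, so what changes is only the variance (a small chance of allocating the full pot) and how much research effort is worth spending, which is where the risk aversion and motivation points above come in.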

MrThink May 16, 2023, 6:59 PM
1 point
0
in reply to: Christopher King’s comment on: Best Donation Strategies to Minimize AI X-Risk?
I agree that a donation lottery is most efficient for small sums, but I’m not sure about this amount. Let’s say I won the 50-100k USD through a donation lottery, would you have any other advice then?

MrThink May 16, 2023, 6:54 PM
1 point
0
in reply to: Ruby’s comment on: Best Donation Strategies to Minimize AI X-Risk?
Thank you both for the feedback!

MrThink Apr 10, 2023, 7:53 PM
3 points
0
on: Why I’m not worried about imminent doom
Interesting read.

While I have also experienced that GPT-4 can’t solve the more challenging problems I throw at it, I also recognize that most humans probably wouldn’t be able to solve many of those problems either within a reasonable amount of time.

One possibility is that the ability to solve novel problems follows an S curve: it took a long time for AI to become better at novel tasks than 10% of people, but it might go quickly from there to outperforming 90%, and then increase only very slowly after that.
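
To illustrate what such an S curve would look like, here is a minimal sketch using a logistic function. The logistic shape, the parameter values, and the function names are all assumptions chosen for illustration; nothing here is fitted to real data about AI capabilities.

```python
import math


def logistic(t: float, midpoint: float = 0.0, steepness: float = 1.0) -> float:
    """Share of people outperformed at time t under a logistic S curve."""
    return 1.0 / (1.0 + math.exp(-steepness * (t - midpoint)))


def time_to_reach(share: float, midpoint: float = 0.0, steepness: float = 1.0) -> float:
    """Inverse of logistic(): the time at which the curve reaches `share`."""
    return midpoint + math.log(share / (1.0 - share)) / steepness


def points_per_time(start: float, end: float) -> float:
    """Average percentage points gained per unit of time between two shares."""
    elapsed = time_to_reach(end) - time_to_reach(start)
    return (end - start) * 100.0 / elapsed


for start, end in [(0.01, 0.10), (0.10, 0.90), (0.90, 0.99)]:
    print(f"{start:.0%} -> {end:.0%}: {points_per_time(start, end):.1f} points per unit time")
```

Under these assumptions, progress per unit of time is several times faster in the 10% to 90% stretch than in either tail, which is the pattern described above. Whether real capability growth follows this shape is exactly the open question.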

However, I fail to see why that must necessarily be true (or false), so if anyone has arguments for or against, they are more than welcome.

Lastly, I would like to ask the author if they can give an example of a problem such that, if it were solved by AI, they would be worried about “imminent” doom. “New and complex” programming problems are mentioned, so if any such example could be provided, it might contribute to the discussion.