You can directly write/paste your own lyrics (Custom Mode). And v3, which came out fairly recently, is a general improvement, in case you haven’t tried it in a while.
They seem to be created with https://app.suno.ai/. And yes, it is really easy to create songs: you can either have it generate the lyrics for you based on a prompt (the default), or write/paste the lyrics yourself (Custom Mode). Songs can be up to ~2 minutes long, I think.
Yeah, this seems to be a big part of it. If you instead switch it to the probability at the market midpoint, Manifold is basically perfectly calibrated, and Kalshi is, if anything, overconfident (Metaculus still looks underconfident overall).
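For concreteness, here’s a minimal sketch of the kind of calibration check involved (a sketch under my own assumptions, not any platform’s actual API or methodology; names are hypothetical): take each market’s stated probability at the midpoint of its lifetime, bucket markets by that probability, and compare each bucket’s average probability to the fraction of its markets that resolved YES.

```python
from collections import defaultdict

def calibration_table(forecasts, n_buckets=10):
    """Bucket markets by stated probability and compare each bucket's
    average probability to the fraction that resolved YES.

    forecasts: list of (probability, resolved_yes) pairs, where
    probability is sampled at the midpoint of the market's lifetime.
    """
    buckets = defaultdict(list)
    for p, resolved_yes in forecasts:
        # Clamp so p == 1.0 lands in the top bucket.
        b = min(int(p * n_buckets), n_buckets - 1)
        buckets[b].append((p, resolved_yes))
    table = []
    for b in sorted(buckets):
        pairs = buckets[b]
        avg_p = sum(p for p, _ in pairs) / len(pairs)
        freq = sum(y for _, y in pairs) / len(pairs)
        table.append((avg_p, freq, len(pairs)))
    return table  # perfect calibration: avg_p ~= freq in every bucket

# Toy data: (midpoint probability, resolved YES?)
data = [(0.9, True), (0.8, True), (0.7, False), (0.3, False), (0.1, False)]
for avg_p, freq, n in calibration_table(data, n_buckets=5):
    print(f"predicted ~{avg_p:.2f}, resolved YES {freq:.2f} (n={n})")
```

Underconfidence shows up as extreme buckets being even more extreme in outcome than in stated probability (e.g. markets at 80% resolving YES 90% of the time); overconfidence is the reverse.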
No, the letter has not been falsified.
Just to clarify: ~700 out of ~770 OpenAI employees have signed the letter (~90%).
Out of the 10 authors of the autointerpretability paper, only 5 have signed the letter, which is much lower than the average rate. One of the 10 is no longer at OpenAI and so couldn’t have signed it, which makes it more natural to count this as 5⁄9 rather than 5⁄10. Either way, it’s still well below the average rate.
Ah, nice catch, I’ll update my comment.
There is an updated list of 702 who have signed the letter (as of the time I’m writing this) here: https://www.nytimes.com/interactive/2023/11/20/technology/letter-to-the-open-ai-board.html (direct link to pdf: https://static01.nyt.com/newsgraphics/documenttools/f31ff522a5b1ad7a/9cf7eda3-full.pdf)
Nick Cammarata left OpenAI ~8 weeks ago, so he couldn’t have signed the letter.
Out of the remaining 6 core research contributors:
3⁄6 have signed it: Steven Bills, Dan Mossing, and Henk Tillman
3⁄6 have still not signed it: Leo Gao, Jeff Wu, and William Saunders
Out of the non-core research contributors:
2⁄3 have signed it: Gabriel Goh and Ilya Sutskever
1⁄3 has still not signed it: Jan Leike
That being said, it looks like Jan Leike has tweeted that he thinks the board should resign: https://twitter.com/janleike/status/1726600432750125146
And that tweet was liked by Leo Gao: https://twitter.com/nabla_theta/likes
Still, it is interesting that this group is clearly underrepresented among people who have actually signed the letter.
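As a rough sanity check on “clearly underrepresented”, here’s a minimal back-of-the-envelope calculation, under the (strong, surely false) assumption that each eligible employee independently signs at the company-wide base rate of 702 out of ~770:

```python
from math import comb

p = 702 / 770  # company-wide base rate of signing

# Of the paper's 9 authors still at OpenAI, only 5 have signed.
# P(X <= 5) for X ~ Binomial(9, p): the chance of seeing this few
# signatures if the authors signed at the base rate.
n, k = 9, 5
prob = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))
print(f"P(<= {k} of {n} sign at base rate {p:.1%}) = {prob:.4f}")  # ~0.005
```

So even taking the independence assumption as only a crude model, 5⁄9 is quite unlikely to be chance alone.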
Edit: Updated to note that Nick Cammarata is no longer at OpenAI, so he couldn’t have signed the letter. For what it’s worth, he has liked at least one tweet that called for the board to resign: https://twitter.com/nickcammarata/likes
“It seems like a strategy by investors or even large tech companies to create a self-fulfilling prophecy to create a coalition of OpenAI employees, when there previously was none.”
How is this more likely than the alternative, which is simply that this is an already-existing coalition that supports Sam Altman as CEO? Considering that he was CEO until he was suddenly removed yesterday, it would be surprising if most employees and investors didn’t support him. Unless I’m misunderstanding what you’re claiming here?
If you follow the link, under the section “Free Market Seen as Best, Despite Inequality”, Vietnam is the country with the highest agreement by far with the statement “Most people are better off in a free market economy, even though some people are rich and some are poor” (95%!)
That being said, while it is the most pro-capitalism country, it is clearly not the most capitalist country (although it’s not that bad: it ranks 72nd out of 176 countries, https://www.heritage.org/index/ranking), and it would likely be more capitalist today if South Vietnam had won.
Small typo/correction: Waymo and Cruise each claim 10k rides per week, not riders.
Note that another way of phrasing the poll is:
Everyone responding to this poll chooses between a blue pill or red pill.
if you choose red pill, you live
if you choose blue pill, you die unless >50% of ppl choose blue pill
Which do you choose?
I bet the poll results would be very different if it was phrased this way.
Does anyone doubt that, with at most a few very incremental technological steps from today, one could train a multimodal, embodied large language model (“RobotGPT”), to which you could say, “please fill up the cauldron”, and it would just do it, using a reasonable amount of common sense in the process — not flooding the room, not killing anyone or going to any other extreme lengths, and stopping if asked?
Indeed, isn’t PaLM-SayCan an early example of this?
To be precise, Alphabet owns DeepMind. Google and DeepMind are sister companies.
So it’s possible for something to benefit Google without benefiting DeepMind, or vice versa.
“A scenario where a group of human thugs [rips and devours your entire family] is still okay-ish in some sense, because no state was involved; at least you have avoided the horrors of non-consensual taxation!”
Sorry, this doesn’t pass the ITT.
Yes, anarcho-capitalists accept that ~everyone will hire a security agency. This isn’t a refutation of anarchism.
The point is that security agencies have an incentive to compete on quality, whereas current governments don’t (as much), so the quality of security agencies would be higher than the quality of governments today.
I agree that there is a good chance that this solution is not actually SOTA, and that it is important to distinguish the three sets.
There’s a further distinction between 3 guesses per problem (which is allowed according to the original specification as Ryan notes), and 2 guesses per problem (which is currently what the leaderboard tracks [rules]).
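For concreteness, here’s a minimal sketch of how top-k scoring works (function and variable names are mine, not the official harness’s, and it simplifies each problem to a single test output, whereas real ARC tasks can have several): a problem counts as solved if any of the first k attempted grids exactly matches the expected output, so moving from 2 to 3 guesses can only raise a score.

```python
def top_k_score(solutions, answers, k):
    """Fraction of problems solved, where a problem is solved if any of
    the first k attempted output grids exactly matches the expected grid.

    solutions: problem id -> list of attempted grids (lists of lists)
    answers:   problem id -> expected grid
    """
    solved = sum(
        any(attempt == answers[pid] for attempt in attempts[:k])
        for pid, attempts in solutions.items()
    )
    return solved / len(answers)

# Toy example: problem "b" is only solved by its third guess.
answers = {"a": [[1, 0]], "b": [[2, 2]]}
solutions = {
    "a": [[[0, 0]], [[1, 0]]],
    "b": [[[0, 0]], [[1, 1]], [[2, 2]]],
}
print(top_k_score(solutions, answers, k=2))  # 0.5
print(top_k_score(solutions, answers, k=3))  # 1.0
```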
Some additional comments / minor corrections:
AFAICT, the current SOTA-on-the-private-test-set with 3 submissions per problem is 37%, and that solution scores 54% on the public eval set.
The SOTA-on-the-public-eval-set is at least 60% (see thread).
I think this is a typo and you mean the opposite.
From looking into this a bit, it seems pretty clear that the public eval set and the private test set are not IID. They’re “intended” to be the “same” difficulty, but AFAICT this essentially just means that they both consist of problems that are feasible for humans to solve.
It’s not the case that a fixed set of eval/test problems was created and then randomly distributed between the public eval set and the private test set. At your link, Chollet says “the [private] test set was created last” and the problems in it are “more unique and more diverse” than the public eval set. He confirms that here:
Bottom line: I would expect Ryan’s solution to score significantly lower than 50% on the private test set.