I work at the Alignment Research Center (ARC). I write a blog on stuff I’m interested in (such as math, philosophy, puzzles, statistics, and elections): https://ericneyman.wordpress.com/
Eric Neyman
Since no one is giving answers, I’ll give my super uninformed take. If anyone replies with a disagreement, you should presume that they are right.
During a recession, countries want to spend their money on economic stimulus programs that create jobs and get their citizens to spend more. China seems to be doing this.
Is spending on AI development good for these goals? I’m tempted to say no. One exception is building power plants, which China would maybe need to eventually do in order to build sufficiently large models.
At the same time, China seems to have a pretty big debt problem. Its debt-to-GDP ratio was 288% in 2023 (I think this number accounts not only for national debt but also for local government debt and maybe personal debt, which I think China has a lot of compared to other countries like the United States). This might in practice constrain how much it can spend.
So China is in a position of wanting to spend, but not spend too much, and AI probably isn’t a great place for it to spend in order to accomplish its immediate goals.
In other words, I think the recession makes AGI development a lower priority for the Chinese government. It seems quite plausible to me that the recession might delay the creation of a large government project for building AGI by a few years.
(Again, I don’t know stuff about this. Maybe someone will reply saying “Actually, China has already created a giant government project for building AGI” with a link.)
Thanks! This makes me curious: is sports betting anomalous (among forms of consumption) in terms of how much it substitutes for financial investing?
I think the “Provably Safe ML” section is my main crux. For example, you write:
One potential solution is to externally gate the AI system with provable code. In this case, the driving might be handled by an unsafe AI system, but its behavior would have “safety in the loop” by having simpler and provably safe code restrict what the driving system can output, to respect the rules noted above. This does not guarantee that the AI is a safe driver—it just keeps such systems in a provably safe box.
I currently believe that if you try to do this, you will either have to restrict the outputs so much that the car wouldn’t be able to drive well, or else fail to prove that the actions allowed by the gate are safe. Perhaps you can elaborate on why this approach seems like it could work?
(I feel similarly about other proposals in that section.)
For what it’s worth, I don’t have any particular reason to think that that’s the reason for her opposition.
But it seems like SB1047 hasn’t been very controversial among CA politicians.
I think this isn’t true. Concretely, I bet that if you looked at the distribution of Democratic No votes among bills that reached Newsom’s desk, this one would be among the highest (7 No votes and a bunch of not-voting, which I think is just a polite way to vote No; source). I haven’t checked and could be wrong!
My take is basically the same as Neel’s, though my all-things-considered guess is that he’s 60% or so to veto. My position on Manifold is in large part an emotional hedge. (Otherwise I would be placing much smaller bets in the same direction.)
I believe that Pelosi had never once spoken out against a state bill authored by a California Democrat before this.
Probably no longer willing to make the bet, sorry. While my inside view is that Harris is more likely to win than Nate Silver’s 72%, I defer to his model enough that my “all things considered” view now puts her win probability around 75%.
[Edit: this comment is probably retracted, although I’m still confused; see discussion below.]
I’d like clarification from Paul and Eliezer on how the bet would resolve, if it were about whether an AI could get IMO silver by 2024.
Besides the solutions not fitting within the time constraints (which I think is kind of a cop-out, because the process seems pretty parallelizable), I think the main reason that such a bet would resolve no is that problems 1, 2, and 6 had the form “find the right answer and prove it right”, whereas the DeepMind AI was given the right answer and merely had to prove it right. Often, finding the right answer is a decent part of the challenge of solving an Olympiad problem. Quoting more extensively from Manifold commenter Balasar:
The “translations” to Lean do some pretty substantial work on behalf of the model. For example, in the theorem for problem 6, the Lean translation that the model is asked to prove includes an answer that was not given in the original IMO problem.
```lean
theorem imo_2024_p6
    (IsAquaesulian : (ℚ → ℚ) → Prop)
    (IsAquaesulian_def : ∀ f, IsAquaesulian f ↔
      ∀ x y, f (x + f y) = f x + y ∨ f (f x + y) = x + f y) :
    IsLeast {(c : ℤ) | ∀ f, IsAquaesulian f →
      {(f r + f (-r)) | (r : ℚ)}.Finite ∧
      {(f r + f (-r)) | (r : ℚ)}.ncard ≤ c} 2
```
The model is supposed to prove that “there exists an integer c such that for any aquaesulian function f there are at most c different rational numbers of the form f(r) + f(−r) for some rational number r, and find the smallest possible value of c”.

The original IMO problem does not include that the smallest possible value of c is 2, but the theorem that AlphaProof was given to solve has the number 2 right there in the theorem statement. Part of the problem is to figure out what 2 is.
I’m now happy to make this bet about Trump vs. Harris, if you’re interested.
Looks like this bet is voided. My take is roughly that:
To the extent that our disagreement was rooted in a difference in how much to weight polls vs. priors, I continue to feel good about my side of the bet.
I wouldn’t have made this bet after the debate. I’m not sure to what extent I should have known that Biden would perform terribly. I was blindsided by how poorly he did, but maybe shouldn’t have been.
I definitely wouldn’t have made this bet after the assassination attempt, which I think increased Trump’s chances. But that event didn’t update me on how good my side of the bet was when I made it.
I think there’s like a 75-80% chance that Kamala Harris wins Virginia.
I frequently find myself in the following situation:
Friend: I’m confused about X
Me: Well, I’m not confused about X, but I bet it’s because you have more information than me, and if I knew what you knew then I would be confused.

(E.g. my friend who knows more chemistry than me might say “I’m confused about how soap works”, and while I have an explanation for why soap works, their confusion is at a deeper level, where if I gave them my explanation of how soap works, it wouldn’t actually clarify their confusion.)
This is different from the “usual” state of affairs, where you’re not confused but you know more than the other person.
I would love to have a succinct word or phrase for this kind of being not-confused!
Yup, sounds good! I’ve set myself a reminder for November 9th.
I’d have to think more about 4:1 odds, but definitely happy to make this bet at 3:1 odds. How about my $300 to your $100?
(Edit: my proposal is to consider the bet voided if Biden or Trump dies or isn’t the nominee.)
I think the FiveThirtyEight model is pretty bad this year. This makes sense to me, because it’s a pretty different model: Nate Silver owns the former FiveThirtyEight model IP (and will be publishing it on his Substack later this month), so FiveThirtyEight needed to create a new model from scratch. They hired G. Elliott Morris, whose 2020 forecasts were pretty crazy in my opinion.
Here are some concrete things about FiveThirtyEight’s model that don’t make sense to me:
There’s only a 30% chance that Pennsylvania, Michigan, or Wisconsin will be the tipping point state. I think that’s way too low; I would put this probability around 65%. In general, their probability distribution over which state will be the tipping point state is way too spread out.
They expect Biden to win by 2.5 points; currently he’s down by 1 point. I buy that there will be some amount of movement toward Biden in expectation because of the economic fundamentals, but 3.5 points seems like too much as an average case.
I think their Voter Power Index (VPI) doesn’t make sense. VPI is a measure of how likely a voter in a given state is to flip the entire election. Their VPIs are way too similar. To pick a particularly egregious example, they think that a vote in Delaware is 1/7th as valuable as a vote in Pennsylvania. This is obvious nonsense: a vote in Delaware is less than 1% as valuable as a vote in Pennsylvania. In 2020, Biden won Delaware by 19%. If Biden wins 50% of the vote in Delaware, he will have lost the election in an almost unprecedented landslide.
I claim that the following is a pretty good approximation to VPI: (probability that the state is the tipping-point state) × (number of electoral votes) / (number of voters). If you use their tipping-point state probabilities, you’ll find that Pennsylvania’s VPI should be roughly 4.3 times larger than New Hampshire’s. Instead, FiveThirtyEight has New Hampshire’s VPI being (slightly) higher than Pennsylvania’s.

Edit: I retract this: the approximation should instead be (tipping-point state probability) / (number of voters). Their VPI numbers now seem pretty consistent with their tipping point probabilities to me, although I still think their tipping point probabilities are wrong.
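The corrected approximation is easy to sanity-check numerically. Here is a minimal Python sketch; all the tipping-point probabilities and voter counts below are hypothetical round numbers I made up for illustration, not FiveThirtyEight’s actual figures:

```python
# Sketch of the (corrected) VPI approximation:
#   VPI ∝ (probability the state is the tipping-point state) / (number of voters)
# All inputs below are hypothetical, for illustration only.

def approx_vpi(tipping_point_prob: float, num_voters: int) -> float:
    """Chance your state decides the election, spread over its voters."""
    return tipping_point_prob / num_voters

# Hypothetical inputs (NOT FiveThirtyEight's numbers):
states = {
    "Pennsylvania":  {"tipping_point_prob": 0.20,   "voters": 6_900_000},
    "New Hampshire": {"tipping_point_prob": 0.01,   "voters": 800_000},
    "Delaware":      {"tipping_point_prob": 0.0001, "voters": 500_000},
}

vpi = {name: approx_vpi(s["tipping_point_prob"], s["voters"])
       for name, s in states.items()}

# Normalize so Pennsylvania = 1, to compare relative voter power.
rel = {name: v / vpi["Pennsylvania"] for name, v in vpi.items()}
for name, r in rel.items():
    print(f"{name}: {r:.4f}")
```

With numbers in this ballpark, a Delaware vote comes out at well under 1% of a Pennsylvania vote, which is the shape of the discrepancy described above: any VPI table that puts safe states within an order of magnitude of the closest swing states is inconsistent with its own tipping-point probabilities.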
The Economist also has a model, which gives Trump a 2/3 chance of winning. I think that model is pretty bad too. For example, I think Biden is much more than 70% likely to win Virginia and New Hampshire. I haven’t dug into the details of the model to get a better sense of what I think they’re doing wrong.
One example of (2) is disapproving of publishing AI alignment research that may advance AI capabilities. That’s because you’re criticizing the research not on the basis of “this is wrong” but on the basis of “it was bad to say this, even if it’s right”.
People like to talk about decoupling vs. contextualizing norms. To summarize, decoupling norms encourage arguments to be assessed in isolation from their surrounding context, while contextualizing norms consider the context around an argument to be really important.
I think it’s worth distinguishing between two kinds of contextualizing:
(1) If someone says X, updating on the fact that they are the sort of person who would say X. (E.g. if most people who say X in fact believe Y, contextualizing norms are fine with assuming that your interlocutor believes Y unless they say otherwise.)
(2) In a discussion where someone says X, considering “is it good for the world to be saying X” to be an importantly relevant question.
I think these are pretty different and it would be nice to have separate terms for them.
My Manifold market on Collin Burns, lead author of the weak-to-strong generalization paper
Indeed! This is Theorem 9.4.2.
Update: the strangely-textured fluid turned out to be a dentigerous cyst, which was the best possible outcome. I won’t need a second surgery :)
Yeah, that’s right—see this section for the full statements.