This post was a very dense read, and it was hard for me to digest what the main conclusions were supposed to be. Could you write out some concrete scenarios that you consider central examples of schemers causing non-concentrated failures? While reading the post, I never knew which situation to imagine: An AI doing philosophical alignment research but intentionally producing promising-looking crackpottery? An AI building cyber-security infrastructure but deliberately leaving in a lot of vulnerabilities? An AI advising the President but with a bias towards advocating for integrating AI into the military?
These problems seem pretty different in terms of which approaches are promising for preventing them, so it would be useful to know which non-concentrated failures you think are most likely, so we can read the post with those in mind.
As another point, it would really help if you wrote conclusion sections. The post makes a lot of different points, and it's hard to tell which ones you consider most important for the reader to take away. A conclusion section would help a lot with that.
In general, I think that among all the people I know, you might have the biggest gap between how good you are at explaining concepts in person and how hard your blog posts are to follow. (Strangely, your LW comments are also very good and digestible, more similar to your in-person communication than to your long-form posts; I don't know why.) I think it could be high leverage for you to experiment with making your posts more readable. Using more concrete examples and writing conclusion sections would go a long way towards improving your posts in general, but I felt compelled to comment here because this post was especially hard to read without them.
Thanks for the reply. If you have time, I'm still interested in hearing a realistic, central example of a non-concentrated failure that would be good to keep in mind while reading the post.