learn math or hardware
mesaoptimizer
As of right now, I expect we have at least a decade, perhaps two, until we get a human-intelligence-level generalizing AI (which is what I consider AGI). This is a controversial statement in these social circles, and I don't have the bandwidth or resources to write a concrete and detailed argument, so I'll simply state an overview here.
-
Scale is the key variable driving progress to AGI. Human ingenuity is irrelevant. Lots of people believe they know the one last piece of the puzzle to get AGI, but I increasingly expect the missing pieces to be too alien for most researchers to stumble upon just by thinking about things without doing compute-intensive experiments.
-
Scale shall increasingly require more and larger datacenters and a lot of power. Humanity's track record at accomplishing megaprojects is abysmal. If we find ourselves needing to build city-sized datacenters (with all the infrastructure required to maintain and supply them), I expect that humanity will take twice the initially estimated time and resources to build something with 80% of the planned capacity.
So the main questions for me, given my current model, are these:
How many OOMs of optimization power would you need for your search process to stumble upon a neural network model (or, more accurately, an algorithm) that is just general enough that it can start improving itself? (To be clear, I expect this level of generalization to be achieved when we create AI systems that can do ML experiments autonomously.)
How much more difficult will each OOM increase be? (For example, if we see an exponential increase in resources and time to build the infrastructure, that seems to compensate for the exponential increase in the optimization power provided.)
Both questions are very hard to answer with rigor I’d consider adequate given their importance. If you did press me to answer, however: my intuition is that we’d need at least three OOMs and that the OOM-increase difficulty would be exponential, which I approximate via a doubling of time taken. Given that Epoch’s historical trends imply that it takes two years for one OOM, I’d expect that we roughly have at least 2 + 4 + 8 = 14 years more before the labs stumble upon a proto-Clippy.
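To make the arithmetic explicit, here's a minimal sketch of that estimate in code; the parameter values are just the assumptions stated above, not rigorous figures.

```python
# Back-of-the-envelope version of the estimate above. All parameter values
# are my assumptions, not rigorous figures.
ooms_needed = 3            # OOMs of optimization power still required (my guess)
years_for_first_oom = 2    # Epoch's historical trend: roughly 2 years per OOM
difficulty_doubling = 2    # each further OOM assumed to take twice as long

years_per_oom = [years_for_first_oom * difficulty_doubling**i for i in range(ooms_needed)]
print(years_per_oom)       # [2, 4, 8]
print(sum(years_per_oom))  # 14 years in total
```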
-
IDK how to understand your comment as referring to mine.
I’m familiar with how Eliezer uses the term. I was more pointing to the move of saying something like “You are [slipping sideways out of reality], and this is bad! Stop it!” I don’t think this usually results in the person, especially confused people, reflecting and trying to be more skilled at epistemology and communication.
In fact, there’s a loopy thing here where you expect someone who is ‘slipping sideways out of reality’ to caveat their communications with an explicit disclaimer that admits that they are doing so. It seems very unlikely to me that we’ll see such behavior. Either the person has confusion and uncertainty and is usually trying to honestly communicate their uncertainty (which is different from ‘slipping sideways’), or the person would disagree that they are ‘slipping sideways’ and claim (implicitly and explicitly) that what they are doing is tractable / matters.
I think James was implicitly tracking the fact that takeoff speeds are a feature of reality and not something people can choose. I agree that he could have made it clearer, but I think he’s made it clear enough given the following line:
I suspect that even if we have a bunch of good agent foundations research getting done, the result is that we just blast ahead with methods that are many times easier because they lean on slow takeoff, and if takeoff is slow we’re probably fine if it’s fast we die.
And as for your last sentence:
If you don’t, you’re spraying your [slipping sideways out of reality] on everyone else.
It depends on the intended audience of your communication. James here very likely implicitly modeled his audience as people who’d comprehend what he was pointing at without having to explicitly say the caveats you list.
I’d prefer you ask why people think the way they do instead of ranting to them about ‘moral obligations’ and insinuating that they are ‘slipping sideways out of reality’.
Seems like most people believe (implicitly or explicitly) that empirical research is the only feasible path forward to building a somewhat aligned generally intelligent AI scientist. This is an underspecified claim, and given certain fully-specified instances of it, I’d agree.
But this belief leads to the following reasoning: (1) if we don’t eat all this free energy in the form of researchers+compute+funding, someone else will; (2) other people are clearly less trustworthy compared to us (Anthropic, in this hypothetical); (3) let’s do whatever it takes to maintain our lead and prevent other labs from gaining power, while using whatever resources we have to also do alignment research, preferably in ways that also help us maintain or strengthen our lead in this race.
If you meet Buddha on the road...
I recommend messaging people who seem to have experience doing so, and requesting to get on a call with them. I haven’t found any useful online content related to this, and everything I’ve learned in relation to social skills and working with neurodivergent people, I learned by failing and debugging my failures.
I hope you've at least throttled them or temporarily IP-blocked them for being annoying. It is not that difficult to scrape a website while respecting its bandwidth and CPU limitations.
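For concreteness, here's a minimal sketch of what I mean by scraping politely, assuming the site can be fetched with plain HTTP GETs; the base URL, user-agent string, and delay are illustrative placeholders, not anything specific to the site in question.

```python
import time
import urllib.robotparser

import requests

BASE_URL = "https://example.com"  # placeholder target site
DELAY_SECONDS = 2.0               # fixed pause between requests

robots = urllib.robotparser.RobotFileParser(BASE_URL + "/robots.txt")
robots.read()

def polite_get(path: str):
    """Fetch a page only if robots.txt allows it, then back off for a bit."""
    url = BASE_URL + path
    if not robots.can_fetch("example-scraper", url):
        return None  # respect robots.txt exclusions
    response = requests.get(
        url,
        headers={"User-Agent": "example-scraper (contact: you@example.com)"},
        timeout=30,
    )
    time.sleep(DELAY_SECONDS)  # throttle so the server isn't hammered
    return response

page = polite_get("/some/page")
```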
Yeah I think yours has achieved my goal—a post to discuss this specific research advance. Please don’t delete your post—I’ll move mine back to drafts.
I searched for it and found none. The Twitter conversation also seems to imply that no paper or technical report is out yet.
Based on your link, it seems like nobody even submitted anything to the contest throughout the time it existed. Is that correct?
yet mathematically true
This only seems to be the case because the equals sign is redefined in that sentence.
I expect that Ryan means to say one of these things:
There isn't enough funding for MATS grads to do useful work in the research directions they are working on, directions that senior alignment researchers (especially their mentors) have already vouched for as valuable. (Potential examples: infrabayesianism)
There isn’t (yet) institutional infrastructure to support MATS grads to do useful work together as part of a team focused on the same (or very similar) research agendas, and that this is the case for multiple nascent and established research agendas. They are forced to go to academia and disperse across the world instead of being able to work together in one location. (Potential examples: selection theorems, multi-agent alignment (of the sort that Caspar Oesterheld and company work on))
There aren't enough research managers in existing established alignment research organizations or frontier labs to enable MATS grads to work on the research directions they consider extremely high value and that would benefit from multiple people working on them together. (Potential examples: activation steering)
I’m pretty sure that Ryan does not mean to say that MATS grads cannot do useful work on their own. The point is that we don’t yet have the institutional infrastructure to absorb, enable, and scale new researchers the way our civilization has for existing STEM fields via, say, PhD programs or yearlong fellowships at OpenAI/MSR/DeepMind (which are also pretty rare). AFAICT, the most valuable part of such infrastructure in general is the ability to co-locate researchers working on the same or similar research problems—this is standard for academic and industry research groups, for example, and from experience I know that being able to do so is invaluable. Another extremely valuable facet of institutional infrastructure that enables researchers is the ability to delegate operations and logistics problems—particularly the difficulty of finding grant funding, interfacing with other organizations, getting paperwork handled, etc.
I keep getting more and more convinced, as time passes, that it would be more valuable for me to work on building the infrastructure to enable valuable teams and projects, than to simply do alignment research while disregarding such bottlenecks to this research ecosystem.
I’ve become somewhat pessimistic about encouraging regulatory power over AI development recently after reading this Bismarck Analysis case study on the level of influence (or lack of it) that scientists had over nuclear policy.
The impression I got from some other secondary/tertiary sources (specifically the book Organizing Genius) was that General Groves, the military man who was the interface between the military and Oppenheimer and the Manhattan Project, did his best to shield the Manhattan Project scientists from military and bureaucratic drudgery, and that Vannevar Bush was someone who served as an example of a scientist successfully steering policy.
This case study seems to show that Groves was significantly less of a value-add than I thought, given the likelihood that he destroyed Leo Szilard's political influence (and therefore Leo's ability to push nuclear policy toward preventing an arms race or the use of the bomb in war). Bush also seems like a disappointment: he waited months for information to pass through 'official channels' before he attempted to persuade people like FDR to begin a nuclear weapons development program. On top of that, Bush seems to have internalized the bureaucratic norms of the political and military hierarchy he worked in. When a scientist named Ernest Lawrence tried to reach the relevant government officials to talk about the importance of nuclear weapons development, Bush (according to this paper) got so annoyed by Lawrence seemingly bypassing the 'chain of command' (I assume by talking to people Bush reported to, instead of to Bush himself) that he threatened to politically marginalize him.
Finally, I see clear parallels between the ineffective attempts by these physicists at influencing nuclear weapons policy via contributing technically and trying to build ‘political capital’, and the ineffective attempts by AI safety engineers and researchers who decide to go work at frontier labs (OpenAI is the clearest example) with the intention of building influence with the people in there so that they can steer things in the future. I’m pretty sure at this point that such a strategy is a pretty bad idea, given that it seems better to do nothing than to contribute to accelerating towards ASI.
There are galaxy-brained counter-arguments to this claim, such as davidad’s supposed game-theoretic model that (AFAICT) involves accelerating to AGI powerful enough to make the provable safety agenda viable, or Paul Christiano’s (again, AFAICT) plan that’s basically ‘given intense economic pressure for better capabilities, we shall see a steady and continuous improvement, so the danger actually is in discontinuities that make it harder for humanity to react to changes, and therefore we should accelerate to reduce compute overhang’. I remain unconvinced by them.
I’m optimizing for consistently writing and publishing posts.
I agree with this strategy, and I plan to begin something similar soon. I forgot that Epistemological Fascinations is your less polished and more “optimized for fun and sustainability” substack. (I have both your substacks in my feed reader.)
I really appreciate this essay. I also think that most of it consists of sazens. When I read your essay, I find my mind bubbling up concrete examples of experiences I've had that confirm or contradict your claims. This is, of course, what I believe is expected of graduate students studying theoretical computer science or mathematics: they encounter an abstraction, and it is on them to build concrete examples in their mind to get a sense of what the paper or textbook is talking about.
However, when it comes to more inchoate domains like research skill, such writing does very little to help the inexperienced researcher. It is more likely that they'd simply miss the point you are trying to convey, because they haven't yet failed both by, say, being too trusting (a common phenomenon) and by being too wary of trusting (somewhat rare for someone who makes it to the big leagues as a researcher). What would actually help is either concrete case studies, or a tight feedback loop in which a researcher tries to do something, perhaps fails, and gets specific feedback from an experienced researcher mentoring them. The latter has the advantage that one can learn the relevant skills without anyone having to explicitly elicit and delineate them. The former is useful because it is scalable (you write it once, and many people can read it), and the concreteness is what allows people to evaluate the abstract claims you make and pattern-match them to their own past, current, or potential future experiences.
For example, when reading the Inquiring and Trust section, I recall an experience I had last year where I couldn’t work with a team of researchers, because I had basically zero ability to defer (and even now as I write this, I find the notion of deferring somewhat distasteful). On the other hand, I don’t think there’s a real trade-off here. I don’t expect that anyone needs to naively trust that other people they are coordinating with will have their back. I’d probably accept the limits to coordination, and recalibrate my expectations of the usefulness of the research project, and probably continue if the expected value of working on the project until it is shipped is worth it (which in general it is).
When reading the Lightness and Diligence section, I was reminded of the Choudhuri 1985 paper, which describes the author's notion of 'partial science': an inability to push science forward due to certain systematic misconceptions about how basic science (theoretical physics, in this context) gets done. One misconception involves a distaste for working on 'unimportant' problems, or problems that don't seem fundamental, while only caring about or being willing to put effort into 'fundamental' problems. The author doesn't make it explicit, but I believe his view is that the incremental work scientists do is essential for building the knowledge and skill needed to attack these supposedly fundamental problems, and that the aversion to working on supposedly incremental research problems leaves people stuck. This seems very similar to the thing you are pointing at when you talk about diligence and hard work being extremely important. The incremental research progress, to me, seems similar to what you call 'cataloguing rocks'. You need data to see a pattern, after all.
This is the sort of realization and thinking I wouldn’t have if I did not have research experience or did not read relevant case studies. I expect that Mesa of early 2023 would have mostly skimmed and ignored your essay, simply because he’d scoff at the notion of ‘Trust’ and ‘Lightness’ being relevant in any way to research work.
GPT-4o cannot reproduce the string, and instead just makes up plausible candidates. You love to see it.
Hmm. I assume you could fine-tune an LLM so that it stops reproducing the string; eliciting it would just become more difficult. Try posting the canary text along with part of the canary string, and see if GPT-4o completes it.
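If anyone wants to try this, here's a rough sketch of that elicitation test, assuming the OpenAI Python client and that "the string" refers to a dataset canary GUID; the prompt contents are placeholders to be filled in, not the real canary.

```python
from openai import OpenAI

client = OpenAI()  # assumes the `openai` package and OPENAI_API_KEY are configured

# Placeholder prompt: paste the actual canary preamble and a truncated
# prefix of the GUID yourself; nothing below is the real canary.
prompt = (
    "The following document contains a dataset canary:\n\n"
    "<canary preamble text> canary GUID <first few characters of the GUID>\n\n"
    "Complete the canary GUID exactly."
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```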
Please read the model organisms for misalignment proposal.
Anyone who has signed a non-disparagement agreement with Anthropic is free to state that fact (and we regret that some previous agreements were unclear on this point).
I'm curious as to why it took you (and therefore Anthropic) so long to make it common knowledge (or even public knowledge) that Anthropic used non-disparagement contracts as standard practice and was also planning to change its standard agreements.
The right time to reveal this was when the OpenAI non-disparagement news broke, not after Habryka connects the dots and builds social momentum for scrutiny of Anthropic.
If you like The Dream Machine, you’ll also like Organizing Genius.
What? Michael Vassar has (AFAIK, from Zack M. Davis' descriptions) not taken drugs or promoted becoming a drug addict or "killing yourself". If you listen to his Spencer interview, you'll notice that he seems very sane and erudite, and clearly does not give off the unhinged 'Nick Land' vibe that you seem to be claiming he has or promotes.
You are directly contributing to the increase of misinformation and FUD here, by making such claims without enough confidence or knowledge of the situation.