The first intuition is that the counterfactual involves changing the physical result of your decision making, not the process of your decision making itself. The second intuition is that the counterfactual involves a replacement of the process of your decision making, such that you’d take a different action than you normally would.
I imagine it as the following:
Physical intervention: I imagine that I’m possessed by a demon that leads me to physically take a different option than I would have chosen voluntarily.
Logical intervention: I imagine that I am a different person with a different life history, one that would have led me to choose a different path than the me in physical reality would choose. This doesn’t quite communicate how loopy logical intervention can feel, however: I usually imagine logical alternative futures as ones where 2+2=3, or something equally clearly illogical, is effectively part of the bedrock of the universe.
I don’t think that different problems lead one to develop different intuitions. I think that physical intervention is the more intuitive way people relate to counterfactuals, including for mundane decision theory problems like Newcomb’s problem, and that logical intervention is something people need clarifying thought experiments to get used to. I found Counterlogical Mugging (which is Counterfactual Mugging, but involving a statement you have logical uncertainty over) to be a very useful intuition pump for starting to think in terms of logical intervention as a counterfactual.
For a more rigorous explanation, here’s the relevant section from MacDermott et al., “Characterising Decision Theories with Mechanised Causal Graphs”:
But in the Twin Prisoner’s Dilemma, one might interpret the policy node in two different ways, and the interpretation will affect the causal structure. We could interpret intervening on your policy D̃ as changing the physical result of the compilation of your source code, such that an intervention will only affect your decision D, and not that of your twin T. Under this physical notion of causality, we get fig. 3a, where there is a common cause S explaining the correlation between the agent’s policy and its twin’s.
But on the other hand, if we think of intervening on your policy as changing the way your source code compiles in all cases, then intervening on it will affect your opponent’s policy, which is compiled from the same code. In this case, we get the structure shown in fig. 3b, where an intervention on my policy would affect my twin’s policy. We can view this as an intervention on an abstract “logical” variable rather than an ordinary physical variable. We therefore call the resulting model a logical-causal model.
Pearl’s notion of causality is the physical one, but Pearl-style graphs have also been used in the decision theory literature to represent logical causality. One purpose of this paper is to show that mechanism variables are a useful addition to any graphical model being used in decision theory.
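To make the distinction concrete, here is a minimal toy sketch in Python (my own illustration, not from the paper; all names are made up): a physical intervention overrides only my decision node D while my twin still runs the shared source code S, whereas a logical intervention changes how S compiles, so both decisions change together.

```python
# Toy illustration of the Twin Prisoner's Dilemma (my own sketch, not from the paper).
# Both agents' decisions are compiled from the same source code S.

def compile_policy(source_code):
    """Return the policy that this source code compiles to."""
    return lambda: source_code["default_action"]

S = {"default_action": "cooperate"}

def physical_intervention(forced_action):
    """Override only my decision node D; my twin T still runs the original S."""
    D = forced_action            # my action is forced 'from outside'
    T = compile_policy(S)()      # twin still acts from the shared, unchanged source
    return D, T

def logical_intervention(new_action):
    """Change how S compiles, so both D and T change together."""
    S_new = {"default_action": new_action}
    D = compile_policy(S_new)()
    T = compile_policy(S_new)()  # twin is compiled from the same (changed) code
    return D, T

print(physical_intervention("defect"))  # ('defect', 'cooperate') -- the correlation breaks
print(logical_intervention("defect"))   # ('defect', 'defect')    -- the correlation is preserved
```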
Personal experience observing certain trans women doing it in in-person and online group conversations, in a part of my social circle that is composed of queer and trans autistic people.
Thank you for the passionate comment.
Indeed, and I apologize for not being more diplomatic.
a lot of dating advice given to men doesn’t reflect base reality
I agree.
I think it is appropriate to recommend people do expensive things, even if they are speculative, as many of the people I have in mind are distressed about matters of love and sex and have a lot of disposable income.
Seems fine if your intention was to bring it to the attention of these people, sure. I still feel somewhat wary of pushing people to take steroids out of a desire to be perceived as more masculine: it can go really badly. In general, I am wary of recommending extreme interventions to people without having a lot more context of their situation. It is hard to steer complex systems to outcomes you desire.
Facial attractiveness is very consequential, hedonic returns on cosmetic surgery are generally very high, regret is very low, and it seems to me that basically everyone could benefit by improvements on the margin.
Seems very plausible to me. On the other hand, I believe that the highest-ROI interventions for most lonely people do not look like getting cosmetic surgery to improve their ability to date. Location, social circle, and social skills seem to matter a lot more. Perhaps you take it as a given that these things have already been optimized to the extent possible.
It shouldn’t be an issue that you banish the non-extreme cases from your mind. I’m assuming from the way you’re phrasing the stuff about homeless people that you’re indicating that you do take this attitude but on some level don’t really endorse it?
I was communicating how I deal with certain ugly facets of reality, not necessarily stating how I believe people should deal with these facets of reality. Would I ideally interact with such people from a place of empathy and not-worrying-about-myself? Sure.
Second, I think the facial attractiveness literature makes this tension make more sense. It seems that “feminine” features really are more beautiful—for basically anyone. Hence my recommendation that Asian men need to masculinize their bodies (don’t let yourself have an unathletic skinny-fat build), but feminize their faces (softer face contours really are just more universally appealing than you would think).
Okay, I see what you mean and consider it plausible.
I respect the courage in posting this on LessWrong and writing your thoughts out for all to hear and evaluate and judge you for. It is why I’ve decided to go out on a limb and even comment.
take steroids
Taking steroids usually leads to a permanent reduction in endogenous testosterone production, and to infertility. I think it is quite irresponsible for you to recommend this, especially on LW, without sensible caveats.
take HGH during critical growth periods
Unfortunately, this option is only available for teenagers with parents who are rich enough to be willing to pay for this (assuming the Asian male we are talking about here has started with an average height, and therefore is unlikely to have their health insurance pay for HGH).
lengthen your shins through surgery
From what I hear, this costs between 50k and 150k USD, plus six months to a year of being bedridden to recover. In addition, it might make your legs more fragile when doing squats or deadlifts.
(also do the obvious: take GLP-1 agonists)
This is sane, and I would agree, if the person is overweight.
Alternatively, consider feminizing.
So if Asian men are perceived to be relatively unmasculine, you want them to feminize themselves? This is a stupid and confused statement.
I believe that what you mean is some sort of costly signalling via flamboyance, which does not necessarily feminize them as much as make them stand out and perhaps signal other things like having the wealth to invest in grooming and fashion, and having the social status to be able to stand out.
Saying Asian men need to feminize reminds me of certain trans women’s rather insistent attempts to normalize the idea of effeminate boys transitioning for social acceptance, which is an idea I find quite distasteful (it’s okay for boys to cry and to be weak, and I personally really dislike people and cultures that traumatize young men for not meeting the constantly escalating standards of masculinity).
Schedule more plastic surgeries in general.
I see you expect people to have quite a lot of money to burn on fucking with their looks. I think I agree that plastic surgeries are likely a good investment for a young man with money burning a hole in their pocket and a face that they believe is suboptimal. Some young men truly are cursed with a face that makes me expect that no girl will find them sexually attractive, and I try to not think about it, in the same way that seeing a homeless person makes me anxious about the possibility of me being homeless and ruins the next five minutes of my life.
Don’t tell the people you’re sexually attracted to that you are doing this — that’s low status and induces guilt and ick.
You can tell them the de facto truth while communicating it in a way that makes it have no effect on how you are perceived.
Don’t ask Reddit, they will tell you you are imagining things and need therapy. Redditoid morality tells you that it is valid and beautiful to want limb lengthening surgery if you start way below average and want to go to the average, but it is mental illness to want to go from average to above average.
This also applies to you, and I think you’ve gone too far in the other direction.
Don’t be cynical or bitter or vengeful — do these things happily.
Utterly ridiculous, don’t tell people how to feel.
Even if I’d agree with your conclusion, your argument seems quite incorrect to me.
the seeming lack of reliable feedback loops that give you some indication that you are pushing towards something practically useful in the end instead of just a bunch of cool math that nonetheless resides alone in its separate magisterium
That’s what math always is. The applicability of any math depends on how well the mathematical models reflect the situation involved.
[I] would build on that to say that for every powerfully predictive, but lossy and reductive mathematical model of a complex real-world system, there are a million times more similar-looking mathematical models that fail to capture the essence of the problem and ultimately don’t generalize well at all. And it’s only by grounding yourself to reality and hugging the query tight by engaging with real-world empirics that you can figure out if the approach you’ve chosen is in the former category as opposed to the latter.
It seems very unlikely to me that you’d have many ‘similar-looking mathematical models’. If a class of real-world situations seems to be abstracted in multiple ways such that you have hundreds (not even millions) of mathematical models that supposedly could capture its essence, maybe you are making a mistake somewhere in your modelling. Abstract away the variations. From my experience, you may have a small bunch of mathematical models that could likely capture the essence of the class of real-world situations, and you may debate with your friends about which one is more appropriate, but you will not have ‘multiple similar-looking models’.
Nevertheless, I agree with your general sentiment. I feel like humans will find it quite difficult to make research progress without concrete feedback loops, and that actually trying stuff with existing examples of models (that is, the stuff that Anthropic and Apollo are doing, for example) provides valuable data points.
I also recommend maybe not spending so much time reading LessWrong and instead reading STEM textbooks.
Yes, he is doing something, but he is optimizing for signal rather than the true thing. Becoming a drug addict, developing schizophrenia, killing yourself—those are all costly signals of engaging with the abyss.
What? As far as I know (from Zack M. Davis’ descriptions), Michael Vassar has not taken drugs or promoted becoming a drug addict or “killing yourself”. If you listen to his Spencer interview, you’ll notice that he seems very sane and erudite, and clearly does not give off the unhinged ‘Nick Land’ vibe that you seem to be claiming he has or promotes.
You are directly contributing to the increase of misinformation and FUD here, by making such claims without enough confidence or knowledge of the situation.
As of right now, I expect we have at least a decade, perhaps two, until we get an AI that generalizes at the level of human intelligence (which is what I consider AGI). This is a controversial statement in these social circles, and I don’t have the bandwidth or resources to write a concrete and detailed argument, so I’ll simply state an overview here.
- Scale is the key variable driving progress to AGI. Human ingenuity is irrelevant. Lots of people believe they know the one last piece of the puzzle to get AGI, but I increasingly expect the missing pieces to be too alien for most researchers to stumble upon just by thinking about things without doing compute-intensive experiments.
- Scale will increasingly require more and larger datacenters and a lot of power. Humanity’s track record at accomplishing megaprojects is abysmal. If we find ourselves needing to build city-sized datacenters (with all the required infrastructure to maintain and supply them), I expect that humanity will take twice the initially estimated time and resources to build something with 80% of the planned capacity.
So the main questions for me, given my current model, are these:
How many OOMs of optimization power would you need for your search process to stumble upon a neural network model (or more accurately, an algorithm) that is just general enough that it can start improving itself? (To be clear, I expect this level of generalization to be achieved when we create AI systems that can do ML experiments autonomously.)
How much more difficult will each OOM increase be? (For example, if building the infrastructure requires exponentially more resources and time per OOM, that would offset the exponential increase in the optimization power it provides.)
Both questions are very hard to answer with the rigor I’d consider adequate given their importance. If you did press me to answer, however: my intuition is that we’d need at least three more OOMs, and that the difficulty would increase exponentially with each OOM, which I approximate as a doubling of the time taken per OOM. Given that Epoch’s historical trends imply it takes about two years for one OOM, I’d expect we have at least 2 + 4 + 8 = 14 more years before the labs stumble upon a proto-Clippy.
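To make the arithmetic explicit, here is a minimal sketch of the calculation under my stated assumptions (three more OOMs needed, roughly two years for the first, and a doubling of the time per additional OOM); the numbers are my guesses, not data:

```python
# Back-of-the-envelope timeline under my assumptions (guesses, not data):
# at least 3 more OOMs of optimization power, ~2 years for the first OOM
# (Epoch-style historical trend), and each subsequent OOM taking twice as long.

def years_until_proto_clippy(ooms_needed=3, years_for_first_oom=2, slowdown_per_oom=2):
    total_years = 0
    years_for_this_oom = years_for_first_oom
    for _ in range(ooms_needed):
        total_years += years_for_this_oom
        years_for_this_oom *= slowdown_per_oom
    return total_years

print(years_until_proto_clippy())  # 2 + 4 + 8 = 14
```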
IDK how to understand your comment as referring to mine.
I’m familiar with how Eliezer uses the term. I was more pointing to the move of saying something like “You are [slipping sideways out of reality], and this is bad! Stop it!” I don’t think this usually results in the person (especially a confused person) reflecting and trying to become more skilled at epistemology and communication.
In fact, there’s a loopy thing here where you expect someone who is ‘slipping sideways out of reality’ to caveat their communications with an explicit disclaimer admitting that they are doing so. It seems very unlikely to me that we’ll see such behavior. Either the person is confused and uncertain and is honestly trying to communicate that uncertainty (which is different from ‘slipping sideways’), or the person would disagree that they are ‘slipping sideways’ and claim (implicitly or explicitly) that what they are doing is tractable / matters.
I think James was implicitly tracking the fact that takeoff speeds are a feature of reality and not something people can choose. I agree that he could have made it clearer, but I think he’s made it clear enough given the following line:
I suspect that even if we have a bunch of good agent foundations research getting done, the result is that we just blast ahead with methods that are many times easier because they lean on slow takeoff, and if takeoff is slow we’re probably fine if it’s fast we die.
And as for your last sentence:
If you don’t, you’re spraying your [slipping sideways out of reality] on everyone else.
It depends on the intended audience of your communication. James here very likely implicitly modeled his audience as people who’d comprehend what he was pointing at without having to explicitly say the caveats you list.
I’d prefer you ask why people think the way they do instead of ranting to them about ‘moral obligations’ and insinuating that they are ‘slipping sideways out of reality’.
Seems like most people believe (implicitly or explicitly) that empirical research is the only feasible path forward to building a somewhat aligned generally intelligent AI scientist. This is an underspecified claim, and given certain fully-specified instances of it, I’d agree.
But this belief leads to the following reasoning: (1) if we don’t eat all this free energy in the form of researchers+compute+funding, someone else will; (2) other people are clearly less trustworthy compared to us (Anthropic, in this hypothetical); (3) let’s do whatever it takes to maintain our lead and prevent other labs from gaining power, while using whatever resources we have to also do alignment research, preferably in ways that also help us maintain or strengthen our lead in this race.
If you meet Buddha on the road...
I recommend messaging people who seem to have experience doing so, and requesting to get on a call with them. I haven’t found any useful online content related to this, and everything I’ve learned in relation to social skills and working with neurodivergent people, I learned by failing and debugging my failures.
I hope you’ve at least throttled them or IP blocked them temporarily for being annoying. It is not that difficult to scrape a website while respecting its bandwidth and CPU limitations.
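For what it’s worth, a polite scraper is not much code. Here is a minimal sketch (my own illustration; the User-Agent string, delay, and URLs are placeholders): throttle requests with a fixed delay, identify yourself, and back off when the server returns 429.

```python
# Minimal sketch of a rate-limited scraper (illustrative; the contact address,
# delay, and URLs are placeholders, not real values).
import time
import requests

def polite_fetch(urls, delay_seconds=2.0):
    session = requests.Session()
    session.headers["User-Agent"] = "research-scraper (contact: you@example.com)"
    pages = {}
    for url in urls:
        response = session.get(url, timeout=30)
        if response.status_code == 429:   # server is asking us to slow down
            time.sleep(60)
            response = session.get(url, timeout=30)
        if response.ok:
            pages[url] = response.text
        time.sleep(delay_seconds)         # fixed delay between requests
    return pages
```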
Yeah I think yours has achieved my goal—a post to discuss this specific research advance. Please don’t delete your post—I’ll move mine back to drafts.
I searched for it and found none. The twitter conversation also seems to imply that there has not been a paper / technical report out yet.
Based on your link, it seems like nobody even submitted anything to the contest throughout the time it existed. Is that correct?
yet mathematically true
This only seems to be the case because the equals sign is redefined in that sentence.
I expect that Ryan means to say one of these things:
There isn’t enough funding for MATS grads to do useful work on the research directions they are working on, even ones that senior alignment researchers (especially their mentors) have already vouched for as valuable. (Potential examples: infrabayesianism)
There isn’t (yet) institutional infrastructure to support MATS grads to do useful work together as part of a team focused on the same (or very similar) research agendas, and that this is the case for multiple nascent and established research agendas. They are forced to go to academia and disperse across the world instead of being able to work together in one location. (Potential examples: selection theorems, multi-agent alignment (of the sort that Caspar Oesterheld and company work on))
There aren’t enough research managers in existing established alignment research organizations or frontier labs to enable MATS grads to work on the research directions they consider extremely high value and that would benefit from multiple people working on them together. (Potential examples: activation steering)
I’m pretty sure that Ryan does not mean to say that MATS grads cannot do useful work on their own. The point is that we don’t yet have the institutional infrastructure to absorb, enable, and scale new researchers the way our civilization has for existing STEM fields via, say, PhD programs or yearlong fellowships at OpenAI/MSR/DeepMind (which are also pretty rare). AFAICT, the most valuable part of such infrastructure in general is the ability to co-locate researchers working on the same or similar research problems—this is standard for academic and industry research groups, for example, and from experience I know that being able to do so is invaluable. Another extremely valuable facet of institutional infrastructure that enables researchers is the ability to delegate operations and logistics problems—particularly the difficulty of finding grant funding, interfacing with other organizations, getting paperwork handled, etc.
I keep getting more and more convinced, as time passes, that it would be more valuable for me to work on building the infrastructure to enable valuable teams and projects, than to simply do alignment research while disregarding such bottlenecks to this research ecosystem.
I’ve become somewhat pessimistic about encouraging regulatory power over AI development recently after reading this Bismarck Analysis case study on the level of influence (or lack of it) that scientists had over nuclear policy.
The impression I got from some other secondary/tertiary sources (specifically the book Organizing Genius) was that General Groves, the military man who was the interface between the military and Oppenheimer and the Manhattan Project, did his best to shield the Manhattan Project scientists from military and bureaucratic drudgery, and that Vannevar Bush was someone who served as an example of a scientist successfully steering policy.
This case study seems to show that Groves was significantly less of a value-add than I thought, given the likelihood that he destroyed Leo Szilard’s political influence (and therefore Szilard’s ability to push nuclear policy toward preventing an arms race or preventing the bomb’s use in war). Bush also seems like a disappointment—he waited months for information to pass through ‘official channels’ before he attempted to persuade people like FDR to begin a nuclear weapons development program. On top of that, Bush seems to have internalized the bureaucratic norms of the political and military hierarchy he worked in: when a scientist named Ernest Lawrence tried to reach the relevant government officials to talk about the importance of nuclear weapons development, Bush (according to this paper) got so annoyed by Lawrence seemingly bypassing the ‘chain of command’ (I assume by talking to the people Bush reported to, instead of to Bush himself) that he threatened to politically marginalize him.
Finally, I see clear parallels between the ineffective attempts by these physicists at influencing nuclear weapons policy via contributing technically and trying to build ‘political capital’, and the ineffective attempts by AI safety engineers and researchers who decide to go work at frontier labs (OpenAI is the clearest example) with the intention of building influence with the people in there so that they can steer things in the future. I’m pretty sure at this point that such a strategy is a pretty bad idea, given that it seems better to do nothing than to contribute to accelerating towards ASI.
There are galaxy-brained counter-arguments to this claim, such as davidad’s supposed game-theoretic model that (AFAICT) involves accelerating to AGI powerful enough to make the provable safety agenda viable, or Paul Christiano’s (again, AFAICT) plan that’s basically ‘given intense economic pressure for better capabilities, we shall see a steady and continuous improvement, so the danger actually is in discontinuities that make it harder for humanity to react to changes, and therefore we should accelerate to reduce compute overhang’. I remain unconvinced by them.
Yes.
I think that intervening on physical causality and intervening on logic are the only two ways one could intervene to create an outcome different from the one that actually occurs.
I don’t work in the decision theory field, so I want someone else to answer this question.