I don’t mean this as a criticism—you can both be right—but this is extremely correlated with the updates made by the average Bay Area x-risk-reduction enjoyer over the past 5-10 years, to the extent that it could almost serve as a summary.
It may be useful to know that if events all obey the Markov property (they are probability distributions, conditional on some set of causal parents), then the Reichenbach Common Cause Principle follows (by d-separation arguments) as a theorem. So any counterexamples to RCCP must violate the Markov property as well.
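To make the common-cause direction concrete, here's a minimal numpy sketch (my own toy example, not from the paper): X and Y each depend only on a common cause Z, so the Markov property holds; they are marginally correlated, but (approximately) conditioning on Z screens the correlation off, as the RCCP requires.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Common cause Z; X and Y each depend only on Z (the Markov property).
z = rng.normal(size=n)
x = z + rng.normal(size=n)
y = z + rng.normal(size=n)

# X and Y are marginally correlated...
print(np.corrcoef(x, y)[0, 1])              # roughly 0.5

# ...but within a thin slice of Z (approximating conditioning on Z),
# the correlation vanishes, as the common cause principle predicts.
mask = np.abs(z) < 0.05
print(np.corrcoef(x[mask], y[mask])[0, 1])  # roughly 0
```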
There’s also a lot of interesting discussion here.
The idea that “agents are systems that would adapt their policy if their actions influenced the world in a different way” works well on mechanised CIDs, whose variables are neatly divided into object-level and mechanism nodes: we simply check for a path from a utility function F_U to a policy Pi_D. But to apply this to a physical system, we would need a way to obtain such a partition of its variables. Specifically, we need to know (1) what counts as a policy, and (2) whether any of its antecedents count as representations of “influence” on the world (and antecedents A of the policy can only be ‘representations’ of the influence, because in the real world the agent’s actions cannot influence themselves through some D->A->Pi->D loop). Does a spinal reflex count as a policy? Does an ant’s decision to fight come from a representation of a desire to save its queen? How accurate does its belief about the forthcoming battle have to be before this representation counts? I’m not sure the paper answers these questions formally, nor am I sure that it’s even possible to do so. These questions don’t seem to have objectively right or wrong answers.
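For concreteness, here is how I'd sketch the path check on a hand-coded mechanised CID (the graph and node names are my own illustrative choices, not the paper's):

```python
import networkx as nx

# Toy mechanised CID: mechanism nodes F_U (utility function) and Pi_D (policy),
# plus object-level nodes A (observation), D (decision), U (utility).
cid = nx.DiGraph([
    ("F_U", "U"),     # the utility function parameterises the utility node
    ("F_U", "Pi_D"),  # ...and the policy responds to it (the mechanism-level edge)
    ("Pi_D", "D"),    # the policy determines the decision
    ("A", "D"),       # the decision also depends on an observation
    ("D", "U"),       # the decision influences utility
])

# The agency test as described above: the policy would adapt if its actions
# influenced the world differently, i.e. there is a directed path F_U -> Pi_D.
print(nx.has_path(cid, "F_U", "Pi_D"))  # True -> the system is treated as agentic
```

The formal check is trivial once the partition into object-level and mechanism nodes is given; the hard part is obtaining that partition for a physical system in the first place.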
So we don’t really have any full procedure for “identifying agents”. I do think we gain some conceptual clarity. But on my reading, this clear definition serves to crystallise how hard it is to identify agents, more so than it shows practically how it can be done.
(NB. I read this paper months ago, so apologies if I’ve got any of the details wrong.)
Nice. I’ve previously argued similarly that if going for tenure, AIS researchers might favour places that are strong in departments other than their own, for inter-departmental collaboration. This would have similar implications to your thinking about recruiting students from other departments. I also suggested we should favour capital cities, for policy input, and EA hubs, to enable external collaboration. But tenure may be somewhat less attractive for AIS academics than usual: given our abundant funding, we might have reason to favour top-5 postdocs over top-100 tenure.
Feature suggestion. Using highlighting for higher-res up/downvotes and (dis)agreevotes.
Sometimes you want to indicate what part of a comment you like or dislike, but can’t be bothered writing a comment response. In such cases, it would be nice if you could highlight the portion of text that you like/dislike, and for LW to “remember” that highlighting and show it to other users. Concretely, when you click the like/dislike button, the website would remember what text you had highlighted within that comment. Then, if anyone ever wants to see that highlighting, they could hover their mouse over the number of likes, and LW would render the highlighting in that comment.
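A rough sketch of what the stored record might look like (field names are purely illustrative, not an actual LW schema):

```python
from dataclasses import dataclass
from typing import Literal, Optional

@dataclass
class HighlightedVote:
    comment_id: str
    voter_id: str
    kind: Literal["karma", "agreement"]
    direction: int                        # +1 or -1
    # Character offsets of the highlighted span in the comment body,
    # or None for an ordinary vote with no highlight attached.
    span: Optional[tuple[int, int]] = None
```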
The benefit would be that readers can conveniently give more nuanced feedback, and writers can better understand how readers feel about their content. It would cut down on the nagging “why was this downvoted?” questions, and hopefully reduce the extent to which people talk past each other when arguing.
The title suggests (weakly, perhaps) that the estimates themselves are peer-reviewed. It would be clearer to write “building on a peer-reviewed argument”, or similar.
Hi Orellanin,
In the early stages, I had in mind that the more info any individual anon-account revealed, the more easily one could infer what time period they spent at Leverage, and therefore their identity. So while I don’t know for certain, I would guess that I created anonymoose to disperse this info across two accounts.
When I commented on the Basic Facts post as anonymoose, it was not my intent to contrive a fake conversation between two entities with separate voices. I think this is pretty clear from anonymoose’s comment, too—it’s in the same bulleted and dry format that throwaway uses, so it’s an immediate possibility that throwaway and anonymoose are one and the same. I don’t know why I used anonymoose there; maybe due to carelessness, or maybe because I had lost access to throwaway. (I know that at one point an update to the forum login interface did rob me of access to my anon-account, but I’m not sure if this was when that happened.)
“A Russian nuclear strike would change the course of the conflict and almost certainly provoke a “physical response” from Ukraine’s allies and potentially from the North Atlantic Treaty Organization, a senior NATO official said on Wednesday.
Any use of nuclear weapons by Moscow would have “unprecedented consequences” for Russia, the official said on the eve of a closed-door meeting of NATO’s nuclear planning group on Thursday.
Speaking on condition of anonymity, he said a nuclear strike by Moscow would “almost certainly be drawing a physical response from many allies, and potentially from NATO itself”.” -Reuters
https://news.yahoo.com/russian-nuclear-strike-almost-certainly-144246235.html
I have heard talk that the US might instead arm Ukraine with tactical nukes of its own, although I think that would be at least as risky as military retaliation.
The reasoning is that retaliation is US doctrine—they generally respond to hostile actions in kind, to deter them. If Ukraine got nuked, the level of outrage would place intense pressure on Biden to do something, and the hawks would become a lot louder than the doves, similar to after the 9/11 attacks. In the case of Russia, the US has already exhausted most non-military avenues. And the US is a very militaristic country—it has bombed countries (Syria, Iraq, Afghanistan, Libya) many times for much less. So military action just seems very likely (involving all of NATO or not, as michel says).
I think your middle number is clearly too low. The risk scenario does not necessarily require that NATO trigger Article 5, just that they carry out a strategically significant military response, like eliminating Russia’s Black Sea Fleet, nuking, or creating a no-fly zone. And Max’s 80% makes more sense than your 50% for the union of these possibilities, because it is hard to imagine that the US would stand down without penalising the use of nukes.
I would be at maybe .2*.8*.15=.024 for this particular chain of events leading to major US-Russia nuclear war.
All of these seem to be good points, although I haven’t given up on liquidity subsidy schemes yet.
Some reports are not publicised in order not to speed up timelines. And ELK is a bit rambly—I wonder if it will get subsumed by much better content within 2yr. But I do largely agree.
It would be useful to have a more descriptive title, like “Chinchilla’s implications for data bottlenecks” or something.
It’s noteworthy that the safety guarantee relies on the “hidden cost” (:= proxy_utility - actual_utility) of each action being bounded above. If it’s unbounded, then the theoretical guarantee disappears.
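In symbols (my paraphrase of the condition): the guarantee needs $c(a) := U_{\text{proxy}}(a) - U_{\text{actual}}(a) \le B$ for every action $a$ and some finite $B$; if $\sup_a c(a) = \infty$, the bound becomes vacuous.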
For past work on causal conceptions of corrigibility, you should check out this post by Jessica Taylor; it’s quite similar.
It seems like you’re saying that the practical weakness of forecasters vs experts is their inability to make numerous causal forecasts. Personally, I think the causal issue is the main one, whereas you think it is that the predictions are so numerous. But they are not always numerous—sometimes you can effect big changes by intervening at a few pivot points, such as elections. And the idea that you can avoid dealing with causal interventions by conditioning on every parent is usually not practical, because conditioning on every parent/confounder means that you have to make too many predictions, whereas you could just run one RCT.
You could test this to some extent by asking the forecasters to predict more complicated causal questions. If they lose most of their edge, then you may be right.
I don’t think the capital being locked up is such a big issue. You can just invest everyone’s money in bonds, and then pay the winner their normal return multiplied by the return of the bonds.
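For example (numbers purely illustrative): a trader stakes $100, the market resolves in their favour at a payout of $150, and the bonds returned 4% over the lock-up period, so they receive $150 × 1.04 = $156, roughly compensating for the time their capital was locked up.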
A bigger issue is that you seem to be describing only conditional prediction markets, rather than ones that truly estimate causal quantities, like P(outcome|do(event)). To see the difference, note that the economy will go down IF Biden is elected, whereas it is not decreased much by causing Biden to be elected. The issue is that economic performance causes Biden to be unpopular to a much greater extent than Biden shapes the economy. To eliminate confounders, you need to randomise the action (the choice of president), or deploy careful causal identification strategies (such as regression discontinuity analysis, or controlling for certain variables, given knowledge of the causal structure of the data-generating process). I discuss this a little more here.
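A toy simulation of that gap (my own made-up numbers, just to illustrate the confounding): the current economy drives both who gets elected and how the economy does next year, so the conditional probability a market estimates is much higher than the interventional one.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Confounder: is the economy currently bad?
econ_bad = rng.random(n) < 0.5

# A bad economy makes the challenger ("Biden") much more likely to win,
# while the winner has only a small effect on next year's economy.
biden = rng.random(n) < np.where(econ_bad, 0.8, 0.2)
downturn = rng.random(n) < 0.1 + 0.6 * econ_bad - 0.02 * biden

# Conditional quantity, P(downturn | Biden elected) -- what a simple
# prediction market estimates:
print(downturn[biden].mean())            # roughly 0.56

# Interventional quantity, P(downturn | do(Biden elected)) -- estimated
# here by randomising the "election" so the confounding path is broken:
biden_rand = rng.random(n) < 0.5
downturn_rand = rng.random(n) < 0.1 + 0.6 * econ_bad - 0.02 * biden_rand
print(downturn_rand[biden_rand].mean())  # roughly 0.38
```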
I would do thumbs up/down for good/bad, and tick/cross for correct/incorrect.
This covers pretty well the altruistic reasons for/against working on technical AI safety at a frontier lab. I think the main reason for working at a frontier lab, however, is not altruistic. It’s that it offers more money and status than working elsewhere—so it would be nice to be clear-eyed about this.
To be clear, on balance, I think it’s pretty reasonable to want to work at a frontier lab, even based on the altruistic considerations alone.
What seems harder to justify altruistically, however, is why so many of us work on, and fund, the same kinds of safety work that is done at frontier AI labs, but from outside those labs. After all, many of the downsides are the same: low neglectedness, safetywashing, shortening timelines, and benefiting (via industry grant programs) from the success of AI labs. Granted, it’s not impossible to get hired by a frontier lab later. But on balance, I’m not sure that the altruistic impact is so good. I do think, however, that it is a pretty good option on non-altruistic grounds, given the current abundance of funding.