Thanks; your confusion pointed out a critical typo. Indeed, the relatively large number of broken walls should make it more likely that the orcs were the culprits. The 1:20 should have been 20:1 (going from −10 dB to +13 dB).
Thanks! These are great points. I applied the correction you noted about the signs and changed the wording about the direction of evidence. I agree that the clarification about the 3 dB rule is useful; I've linked to your comment.
Edit: The 10 was also missing a sign. It should be −10 + 60 − 37. I also flipped the 1:20 to 20:1 posterior odds that the orcs did it.
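Spelling out the corrected arithmetic, with the usual convention that $x$ dB of evidence corresponds to odds of $10^{x/10}$:

$$-10\,\text{dB} + 60\,\text{dB} - 37\,\text{dB} = 13\,\text{dB}, \qquad 10^{13/10} \approx 20 \;\Rightarrow\; 20\!:\!1 \text{ posterior odds.}$$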
How surprising is this to professionals in the alignment community (e.g. people at MIRI, Redwood Research, or similar)? From an outside view, the volatility/flexibility and the movement away from pure growth and commercialization seem unexpected and could be to alignment researchers’ benefit (although it’s difficult to see the repercussions at this point). It makes sense that this surprises me, since I don’t know the inner workings of OpenAI, but I’m surprised that it seems similarly surprising to the LW/alignment community as well.
Perhaps the insiders are still digesting and formulating a response, or want to keep hot takes to themselves for other reasons. If not, I’m curious whether there is actually so little information flowing between alignment communities and companies like OpenAI that this would be as surprising to them as it is to an outsider. For example, there seem to be many people at Anthropic who are directly part of, or culturally aligned with, LW/rationality, and I expected the same to be true to a lesser extent for OpenAI.
I understood there was real distance between the groups, but still, I had a more connected model in my head, and it is challenged by this news and by the response over the first day.
I will be there at 3:30 or so.
I missed your reply, but thanks for calling this out. I’m nowhere near as close to EY as you are, so I’ll take your model over mine, since mine was constructed on loose grounds. I don’t even remember where my number came from, but my best guess is that the 90% came from EY giving 3/15/16 as the largest number he referenced in the timeline, and from some comments in the Death with Dignity post, but this seems like a bad read to me now.
Thanks for sharing! It’s nice to see plasticity, especially in stats, which seems to have more opinionated contributors than other applied maths. That said, it seems this ‘admission’ is not changing his framework so much as reinterpreting how ML is used so that it stays compatible with his framework.
Pearl’s texts talk about causal models that use the do(X) operator (e.g. P(Y|do(X))) to signify causal information. Now, with LLMs, he sees the text the model is conditioning on as sometimes being do(X) and sometimes plain X. I’m curious what else besides text would count as this. I’m not sure I recall this correctly, but at his third level you can use purely observational data to infer causality with things like instrumental variables. If I had an ML model that took purely numerical input, such as (tar, smoking status, got cancer, and various other health data), should it be able to predict counterfactual results?
I’m uncertain what the right answer is here, and how Pearl would view it. My guess is a naive ML model would be able to do this provided the data covered the counterfactual cases, which is likely in the smoking example. But it would not be as useful for out-of-sample counterfactual inferences where there is little or no coverage of the interventions and outcomes (e.g. if one of the inputs was ‘location’ and it had to predict the effects of smoking on the ISS, where no one smokes). However, if we kept adding more purely observational information about the universe, it feels like we might be able to get a causal model out of a transformer-like thing. I’m aware there are some tools that try to extract a DAG from the data as a primitive form of this approach, but that is at odds with the Bayesian stats approach of positing a DAG first and then checking whether the DAG holds with the data, or vice versa. Please share if you have some references that would be useful.
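To make the conditioning-versus-intervening distinction concrete, here is a toy sketch (my own construction, not Pearl’s example and not anyone’s actual pipeline; all probabilities are made up): a hidden confounder raises both smoking and cancer, so conditioning on smoking overstates what intervening on it would do.

```python
# Toy structural causal model: conditioning vs. intervening.
# All variable names and probabilities are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000

def simulate(do_smoking=None):
    # Hidden confounder raises both smoking propensity and cancer risk.
    u = rng.random(N) < 0.3
    if do_smoking is None:
        smoking = rng.random(N) < np.where(u, 0.8, 0.2)   # observational mechanism
    else:
        smoking = np.full(N, do_smoking, dtype=bool)      # do(X): override the mechanism
    tar = rng.random(N) < np.where(smoking, 0.9, 0.05)
    cancer = rng.random(N) < 0.02 + 0.1 * tar + 0.1 * u
    return smoking, cancer

s, c = simulate()
print("P(cancer | smoking=1)    =", c[s].mean())   # conditioning: confounder is overrepresented among smokers
s, c = simulate(do_smoking=True)
print("P(cancer | do(smoking=1)) =", c.mean())      # intervening: confounder distribution unchanged
```

The two printed numbers differ, which is the gap a purely observational predictor would miss unless the data (or extra structure) pins down the intervention.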
Thanks! I’m aware of the resources mentioned but haven’t read deeply or frequently enough to have this kind of overview of the interaction between the cast of characters.
There are more than a few lists and surveys that state the CDFs for some of these people, which helps a bit. A big-as-possible list of evidence/priors would be one way to inspect the gap more closely. I wonder if it would be helpful to expand on the MIRI conversations and have a slow conversation between a >99% doom pessimist and a <50% doom ‘optimist’, with a moderator to prod them to exhaustively dig up their reactions to each piece of evidence and keep pulling out priors until we get to indifference. It would probably be an uncomfortable, awkward experiment with a useless result, but there’s a chance that some item on the list ends up being useful for either party to ask questions about.
That format would help me understand where things stand. Maybe something along these lines would eventually prompt a popular, widely read author like Harari or Bostrom to write about it (or even just to update the CDFs/evidence in Superintelligence). The general deep learning community probably needs to hear it mentioned and normalized on NPR and in a bestseller a few times (like all the other x-risks are) before they’ll start talking about it at lunch.
Thanks! Do you know of any arguments in a similar style to The Most Important Century that are as pessimistic as the EY/MIRI folks (>90% probability of AGI within 15 years)? The style looks good, but the time estimates for that one (a 2/3 chance of AGI by 2100) are significantly longer and aren’t nearly as surprising or urgent as the pessimistic view calls for.
I still don’t follow why EY assigns a seemingly <1% chance to non-earth-destroying outcomes within 10-15 years (I’m not sure it is actually 1%, but EY didn’t argue with the 0% comments mentioned in the “Death with dignity” post last year). This seems to treat fast takeoff as the inevitable path forward, implying unrestricted, fast, recursive designing of AIs by AIs. There are compute bottlenecks, which seem slow, and there may be other bottlenecks we can’t think of yet. That is just one obstacle; why isn’t there more probability mass on it? Surely there are more obstacles that aren’t obvious (and that we shouldn’t talk about).
It feels like we have a communication failure between different cultures. Even if EY thinks the top industry brass is incentivized to ignore the problem, there are a lot of (non-alignment-oriented) researchers who can grasp the ‘security mindset’ and could be won over. Both in this interview and in the Chollet response referenced, the arguments EY presents don’t always help the other party bridge from their view over to his; they go off on nerdy/rationalist-y tangents and idioms that end up as walls. Those aren’t very helpful for working on the main point, though they do show that EY is smart and knowledgeable about this field and others.
Are there any publicly digestible arguments out there for this level of confident pessimism that would be useful for the public and for industry folks? By publicly digestible, I’m thinking of the style of popular books like Superintelligence or Human Compatible.
There are many obstacles with no obvious solutions, and none that money can simply buy.
The claim that current AI is superhuman at just about any task we can benchmark is not correct. The problems being explored are chosen because the researchers think AI has a shot at beating humans at them. Think about how many real-world problems we pay other people to solve, problems we can benchmark, that aren’t being solved by AI. Think about why those problems still require humans right now.
My upper bound is much more than 15 years, because I don’t feel I have enough information. One thing I worry about is that this community tends to promote confidence, especially when there is news or a current event to react to and some leaders have stated their confidence. Sure, condition on new information. But when a new LLM comes out, I want to hear more integration of the opposite view beyond strawmanning it. It feels like every active voice on LW treats 10 or 15 years as the upper bound on when destructive AGI arrives, which is probably closer to the lower bound for most non-LW/rationality researchers working on LLMs or deep learning. I want to hear more about the discrepancy beyond ‘they don’t consider the problem the way we do, and we have a better bird’s-eye view’. I want to understand how the estimates are arrived at; I feel that if there were more explanation and more variance in the estimates, the folks on Hacker News would be able to understand and discuss the gap, rather than writing off the entire community as crazy, as they have here.
Another thing I’m still curious about is who the buyers of these box spreads are (assuming the legs are sold as a combination and not separately). The discussion says the arbitrage opportunity comes from the FDIC, but the FDIC does not directly buy the spread; it allows the CD to exist, and the buyers should have access to that CD as well. So do the buyers have access to options but not CDs, or do they have other advantages that I am missing?
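For concreteness, the way I understand it, buying a box is just lending cash at the rate implied by its price. A sketch with made-up strikes, price, and tenor (none of these numbers are from the post):

```python
# Toy illustration: a European box spread pays the strike width at expiry
# regardless of where the underlying ends up, so the buyer is effectively
# lending cash at the implied rate below. Numbers are made up.
K1, K2 = 4000, 5000   # strikes of the box
price = 980           # what the buyer pays today (per unit; contract multiplier ignored)
years = 0.5           # time to expiration
payoff = K2 - K1      # guaranteed payoff at expiry
implied_rate = (payoff / price) ** (1 / years) - 1
print(f"buyer effectively lends at ~{implied_rate:.2%} annualized")
```

So the question stands: who is willing to lend at that implied rate when they could instead hold the FDIC-insured CD?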
The risk is small, but so is the benefit, so this was not a trivial analysis for me. I doubt it’s low risk to redeposit just 20% at a 30% market drop, due to volatility (the recent intraday crash movements exceeded 10%, and you get a margin call at a 40% drop with the 250k/150k example). After mulling it over, I think I agree that it is worth it anyway, though.
Here are some more examples that I ran through. A better person would figure out the probabilities and the equations to integrate over to calculate the expected value.
Withdrawing 75% at 0 months should be the break-even point (you lose 0.625% * 112.5k ≈ 700 to penalties, but you get back a bit more than that with interest of 0.625% * 37.5k * 3 years ≈ 700).
If you don’t need to withdraw within the first 6 months, you are ahead. Let’s look at the 100% withdrawal case first to keep it simple. The actual break-even is withdrawing 100% at month 6, and any time before that you lose money. If it happens at day 0, you lose 0.625% * 150k, or 0.375% of your portfolio, as the worst-case scenario. So the interesting loss scenarios range from withdrawing 75% immediately to withdrawing 100% within 6 months.
I don’t have the probabilities of these events, but eyeballing it, they seem to happen with significantly lower probability than the cases with equal or greater gains. Still, they should reduce your expected value by a small amount.
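A quick sanity check of the numbers above in code. This uses my reading of the setup (penalty of 0.625% on the amount withdrawn, a 0.625%/yr interest edge on whatever stays in the CD, and the 150k/250k amounts from the example); the exact penalty and interest model is my assumption, not the post’s.

```python
# Sanity check of the break-even arithmetic above (assumed model: penalty is
# 0.625% of the withdrawn amount; benefit is 0.625%/yr of extra interest on
# whatever stays in the CD over the remaining ~3 years).
cd, portfolio = 150_000, 250_000
rate = 0.00625  # 0.625%

# Withdraw 75% at month 0.
withdrawn, kept = 0.75 * cd, 0.25 * cd
print("penalty:", rate * withdrawn)               # ~703
print("interest over 3 years:", rate * kept * 3)  # ~703 -> roughly break even

# Worst case: withdraw 100% at day 0, before any interest has accumulated.
loss = rate * cd
print("worst case:", loss, "=", f"{loss / portfolio:.3%} of portfolio")  # 937.5 = 0.375%
```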
Isn’t another risk that the market tanks within the first few months, so that you have to pay the early-withdrawal penalty on the CD out of pocket before the interest has accumulated? This risk seems proportionate to the benefit (given that we get more than one huge correction every 50 years, there is a >2% chance that the market has a huge correction in the first year).
You say that you are moderately confident that the risk analysis did not include this case, so I’m likely missing something (or you need to make a moderate update).
It seems the whole deal depends on margin interest rates, so I would appreciate more discussion of the margin interest rates available to retail investors and of the rates used in the simulations. I would also like more evidence for the statement in the comments that "the fact that a young investor with 100% equities does better *on the margin* by adding a bit of leverage is very robust" before taking it as fact, since it only seems true at certain rates that don't seem obviously available.
As one data point, my current broker’s retail margin rates are high: 8.625% for loans under $25k and 7.5% above $25k. Given that this is a period of relatively low interest rates (and I would expect margin rates to be higher in other periods), it does not seem like an indisputable fact to me that a young investor gets a better return by investing $1 on margin after interest. But I have no idea how my broker compares to others. (Unless you are considering leveraged funds, which I consider a different beast.)
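As a toy comparison of what “better on the margin” would require at those rates (the 7% expected equity return is an assumption I am plugging in for illustration, not a number from the post, and this ignores risk adjustment entirely):

```python
# Toy check: a marginal dollar of margin-financed equity only adds expected
# return if the expected equity return exceeds the margin rate.
# The 7% expected-return figure is an illustrative assumption.
expected_equity_return = 0.07
for margin_rate in (0.08625, 0.075):  # the retail rates quoted above
    edge = expected_equity_return - margin_rate
    print(f"margin rate {margin_rate:.3%}: expected edge per leveraged dollar = {edge:+.3%}")
```

At both quoted rates the edge comes out negative, which is the crux of my doubt about the “very robust” claim at retail rates.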
I really appreciate the way you have written this up. It seems that 2-7% of refusals do not respond to the unidimensional treatment. I’m curious whether you’ve looked at this subgroup in the same way as the global data, to see if it has another dimension for refusal, or if the statistics of the subgroup shed some other light on the stubborn refusals.
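For reference, this is roughly what I have in mind by the “unidimensional treatment”: projecting a single refusal direction out of the activations. This is a generic sketch with made-up shapes and names, not your actual code.

```python
# Generic sketch of single-direction ablation (illustrative shapes and names).
import numpy as np

def ablate_direction(acts: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Remove the component of each activation row along `direction`."""
    d = direction / np.linalg.norm(direction)
    return acts - np.outer(acts @ d, d)

acts = np.random.randn(8, 4096)      # e.g., residual-stream activations for 8 prompts
refusal_dir = np.random.randn(4096)  # the hypothesized single refusal direction
ablated = ablate_direction(acts, refusal_dir)
print(np.allclose(ablated @ (refusal_dir / np.linalg.norm(refusal_dir)), 0))  # True
```

My question is whether the stubborn 2-7% would yield a second direction if you ran the same kind of analysis on that subgroup alone.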