LessWrong dev & admin as of July 5th, 2022.
RobertM
Ah, does look like Zach beat me to the punch :)
I’m also still moderately confused, though I’m not that confused about labs not speaking up—if you’re playing politics, then not throwing the PM under the bus seems like a reasonable thing to do. Maybe there’s a way to thread the needle of truthfully rebutting the accusations without calling the PM out, but idk. Seems like it’d be difficult if you weren’t either writing your own press release or working with a very friendly journalist.
I hadn’t, but I just did and nothing in the article seems to be responsive to what I wrote.
Amusingly, not a single news source I found reporting on the subject has managed to link to the “plan” that the involved parties (countries, companies, etc) agreed to.
Nothing in that summary affirmatively indicates that companies agreed to submit their future models to pre-deployment testing by the UK AISI. One might even say that it seems carefully worded to avoid explicitly pinning the companies down like that.
EDIT: I believe I’ve found the “plan” that Politico (and other news sources) managed to fail to link to, maybe because it doesn’t seem to contain any affirmative commitments by the named companies to submit future models to pre-deployment testing by UK AISI.
I’ve seen a lot of takes (on Twitter) recently suggesting that OpenAI and Anthropic (and maybe some other companies) violated commitments they made to the UK’s AISI about granting them access for e.g. predeployment testing of frontier models. Is there any concrete evidence about what commitment was made, if any? The only thing I’ve seen so far is a pretty ambiguous statement by Rishi Sunak, who might have had some incentive to claim more success than was warranted at the time. If people are going to breathe down the necks of AGI labs about keeping to their commitments, they should be careful to only do it for commitments they’ve actually made, lest they weaken the relevant incentives. (This is not meant to endorse AGI labs behaving in ways which cause strategic ambiguity about what commitments they’ve made; that is also bad.)
Huh, that went somewhere other than where I was expecting. I thought you were going to say that ignoring letter-of-the-rule violations is fine when they’re not spirit-of-the-rule violations, as a way of communicating the actual boundaries.
Yeah, there needs to be something like a nonlinearity somewhere. (Or just preference inconsistency, which humans are known for, to say nothing of larger organizations.)
I’m not sure I personally endorse the model I’m proposing, but imagine a slightly less spherical AGI lab which has more than one incentive (profit maximization) driving its behavior. Maybe they care at least a little bit about not advancing the capabilities frontier as fast as possible. This can cause a preference ordering like:
don’t argmax capabilities, because there’s no open-source competition making it impossible to profit from current-gen models
argmax capabilities, since you need to stay ahead of open-source models nipping at your heels
don’t argmax capabilities; go bankrupt because open-source catches up to you (or gets “close enough” for enough of your customers)
ETA: But in practice most of my concerns around open-source AI development are elsewhere.
Take the wheel, Shoggoth! (LW frontpage algorithm experiments)
I think there might be many local improvements, but I’m pretty uncertain about important factors like elasticity of “demand” (for robbery) with respect to how much of a medication is available on demand. I.e., how many fewer robberies do you get if you can get at most a single prescription’s worth of some kind of controlled substance (and not necessarily any specific one), compared to “none” (the current situation) or “whatever the pharmacy has in stock”? (I’m not actually sure if the latter was the previous situation; maybe they had time delay safes for storing medication that wasn’t filling a prescription, and just didn’t store the filled prescriptions in the safes as well.)
Headline claim: time delay safes are probably much too expensive in human time costs to justify their benefits.
The largest pharmacy chains in the US, accounting for more than 50% of the prescription drug market[1][2], have been rolling out time delay safes (to prevent theft)[3]. Although I haven’t confirmed that this is true across all chains and individual pharmacy locations, I believe these safes are used for all controlled substances. These safes open ~5-10 minutes after being prompted.
There were >41 million prescriptions dispensed for Adderall in the US in 2021[4]. (Note that likely means ~12x fewer people were prescribed Adderall that year.) Multiply that by 5 minutes and you get >200 million minutes, or >390 person-years, wasted. Now, surely some of that time is partially recaptured by e.g. people doing their shopping while waiting, or by various other substitution effects. But that’s also just Adderall!
Seems quite unlikely that this is on the efficient frontier of crime-prevention mechanisms, but alas, the stores aren’t the ones (mostly) paying the costs imposed by their choices, here.
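For anyone who wants to check the arithmetic above, here is a minimal sketch. The prescription count and the 5-10 minute delay are the figures quoted in the take; the function name and the 365-day year are my own assumptions, and it deliberately ignores any time recaptured by shopping while waiting.

```python
# Back-of-the-envelope check of the time-delay-safe estimate.
# Assumption: a 365-day year, i.e. 525,600 minutes.
MINUTES_PER_YEAR = 365 * 24 * 60


def person_years_waiting(prescriptions: int, delay_minutes: float) -> float:
    """Total waiting time in person-years (hypothetical helper)."""
    return prescriptions * delay_minutes / MINUTES_PER_YEAR


prescriptions_2021 = 41_000_000  # lower bound quoted above
total_minutes = prescriptions_2021 * 5

low = person_years_waiting(prescriptions_2021, 5)    # 5-minute delay
high = person_years_waiting(prescriptions_2021, 10)  # 10-minute delay

print(f"{total_minutes:,} minutes at 5 min each")
print(f"~{low:.0f} to ~{high:.0f} person-years per year of prescriptions")
```

At the 5-minute end this gives 205 million minutes, or roughly 390 person-years; the 10-minute end simply doubles it.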
use spaces that your community already has (Lighthaven?), even if they’re not quite set up the right way for them
Not set up the right way would be an understatement, I think. Lighthaven doesn’t have an indoor space which can seat several hundred people, and trying to do it outdoors seems like it’d require solving maybe-intractable logistical problems (weather, acoustics, etc). (Also Lighthaven was booked, and it’s not obvious to me to what degree we’d want to subsidize the solstice celebration. It’d also require committing a year ahead of time, since most other suitable venues are booked up for the holidays quite far in advance.)
I don’t think there are other community venues that could host the solstice celebration for free, but there might be opportunities for cheaper (or free) venues outside the community (with various trade-offs).
Having said that, I would NOT describe this as asking “how could I have arrived at the same destination by a shorter route”. I would just describe it as asking “what did I learn here, really”.
I mean, yeah, they’re different things. If you can figure out how to get to the correct destination faster next time you’re trying to figure something out, that seems obviously useful.
Some related thoughts. I think the main issue here is actually making the claim of permanent shutdown & deletion credible. I can think of some ways to get around a few obvious issues, but others (including moral issues) remain, and in any case the current AGI labs don’t seem like the kinds of organizations which can make that kind of commitment in a way that’s both sufficiently credible and legible that the remaining probability mass on “this is actually just a test” wouldn’t tip the scales.
I am not covering training setups where we purposefully train an AI to be agentic and autonomous. I just think it’s not plausible that we just keep scaling up networks, run pretraining + light RLHF, and then produce a schemer.[2]
Like Ryan, I’m interested in how much of this claim is conditional on “just keep scaling up networks” being insufficient to produce relevantly-superhuman systems (i.e. systems capable of doing scientific R&D better and faster than humans, without humans in the intellectual part of the loop). If it’s “most of it”, then my guess is that accounts for a good chunk of the disagreement.
Curated. I liked that this post had a lot of object-level detail about a process that is usually opaque to outsiders, and that the “Lessons Learned” section was also grounded enough that someone reading this post might actually be able to skip “learning from experience”, at least for a few possible issues that might come up if one tried to do this sort of thing.
(We check for “downvoter count within window”, not all-time.)
Curated. This dialogue distilled a decent number of points I consider cruxes between these two (clusters of) positions. I also appreciated the substantial number of references linking back to central and generally high-quality examples of each argument being made; I think this is especially helpful when writing a dialogue meant to represent positions people actually hold.
I look forward to the next installment.
Here’s the editor guide section for spoilers. (Note that I tested the instructions for markdown, and that does indeed seem broken in a weird way; the WYSIWYG spoilers still work normally but only support “block” spoilers; you can’t do it for partial bits of lines.)
In this case I think a warning at the top of the comment is sufficient, given the context of the rest of the thread, so it’s up to you whether you want to try to reformat your comment around our technical limitations.
Foobar! :::spoiler This text would be covered by a spoiler block. ::: test more stuff on the same line.
EDIT: looks like habryka got there earlier and I didn’t see it.
https://www.lesswrong.com/posts/zXJfH7oZ62Xojnrqs/#sLay9Tv65zeXaQzR4
Intercom is indeed hidden on mobile (since it’d be pretty intrusive at that screen size).