JenniferRM comments on What’s the short timeline plan?

JenniferRM 6 Jan 2025 22:12 UTC
6 points
I think you’re overindexing on the phrase “status quo”, underindexing on “industry standard”, and missing a lot of practical microstructure.
Lots of firms or teams across industry have attempted to implement, e.g., multi-factor authentication or basic access control mechanisms or secure software development standards or red-team tests. Sony probably had some of that in some of its practices in some of its departments when North Korea 0wned them.
Google does not just “OR them together” and half-ass some of these things. It “ANDs together” reasonably high quality versions of everything. Then every year they anneal the culture a little bit more around small controlled probes of global adequacy.
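To make the OR-versus-AND distinction concrete, here is a minimal sketch. The control names and the two example teams are made up for illustration; they are not taken from Sony, Google, or the RAND document.

```python
# Hypothetical control names, for illustration only.
CONTROLS = ("mfa", "access_control", "secure_sdlc", "red_team")


def or_posture(team: dict) -> bool:
    """'OR them together': having any single control counts as 'secure'."""
    return any(team.get(c, False) for c in CONTROLS)


def and_posture(team: dict) -> bool:
    """'AND them together': every control must be in place."""
    return all(team.get(c, False) for c in CONTROLS)


# A made-up team with some practices in some departments.
patchy = {"mfa": True, "access_control": False, "secure_sdlc": True, "red_team": False}
# A made-up team with every control in place.
thorough = {c: True for c in CONTROLS}

print(or_posture(patchy), and_posture(patchy))      # True False
print(or_posture(thorough), and_posture(thorough))  # True True
```

The patchy team passes the OR test while failing the AND test, which is the gap an attacker gets to walk through.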
..
Also, in reading that RAND document, I would like to report another “thonk!” sound!
..
RAND’s author(s) seem to have entirely (like at a conceptual level) left out the possibility that AGI (during a training run or during QA with humans or whatever) would itself “become the attacker” and need to be defended against.
It is like they haven’t even seen Ex Machina, or read A Fire Upon The Deep or Daemon.
You don’t just have to keep bad guys OUT, you have to keep “the possible bad guy that was just created by a poorly understood daemon summoning process” IN, and that perspective doesn’t appear anywhere in any of the RAND document that I can see.
No results when I ^f for [demon], [summon], [hypno], [subvert], [pervert], [escape].
((“Subvert” was used once, but it was in a basic bitch paragraph like this (bold in original):
Most access control systems are either software systems or have significant software components. In addition to the more specialized ways of undermining such systems described above, an attacker could undermine them by finding code vulnerabilities and then subverting their behavior (without actually dealing with their cryptographic or core functionality at all). A major category of code vulnerabilities that undermine access control systems on a regular basis are privilege escalation vulnerabilities.
Duh.
))
The best thing in that RAND document is [infected] which occurs twice <3
Page 54 (bold not in original):
PyTorch’s prerelease Linux packages, PyTorch-nightly, were compromised via a malicious binary on PyPi.[122] The torchtriton dependency in the PyTorch-nightly build was replaced with a malicious package, leading to the extraction of sensitive data from infected systems.
Page 61 (bold not in original, footnote not a link in the original):
Agent.btz, the malware that infected the U.S. Department of Defense in 2008 and led to the creation of U.S. Cyber Command, came from a USB device that was found in a parking lot on a base in the Middle East.[164]
“Infection” is a really good metaphor and word for bringing extreme rigor to the tracing of actual causal processes, because it gets at the idea of “programs” (viral or microbial or fungal or whatever genomes (or genome products) spreading via arbitrary physical transmission methods). This is a pretty darn correct frame! :-)
Ever since COVID I’ve been beating the drum for BSL5 to handle scary diseases. The premise is that the real danger is NOT that “a known disease brought into a lab to be studied there merely infects a scientist (or everyone inside the facility for that matter)”, but that someone inside the BSL5 gets infected with something new that would never naturally evolve, then uses creativity to escape the BSL5 while infected, and that this causes a pandemic.
Currently the BSL system only goes up to 4, and every incremental increase in caution and safety is about protecting the lab worker more and more from what they are working with. All of it assumes good faith by the site administrator and by the security guards at the exit gate (which may or may not be distinct from the entry gate), who might also get infected, and so on.
I want a Bio-SL5 standard to exist in biology, and I want all gain-of-function (GoF) research restricted by an international treaty that ensures that GoF only happens inside BSL5 facilities.
Similarly, I think humanity also needs an analogous “CompSci-SL5 standard” that imagines that the entire computing facility (data center plus programmers plus janitors who have physical access and so on plus all the managers of everyone working inside the CSSL5 campus) has been subverted by a nascent superintelligence and is then used by that nascent superintelligence to try to subvert all of Earth.
There is no hint of anything at all like this as part of the threat modeling in the RAND report.
Also, if Google had such a thing back when I worked there, I didn’t hear about it. (Then again, maybe the existence of it would have been kept secret?)