Quinn

Karma: 1,296

https://quinnd.net

Quinn Apr 21, 2025, 12:30 AM
2 points
0
in reply to: Ariel’s comment on: How to end credentialism
I don’t know what legible/transferable evidence would be. I’ve audited a lot of courses at a lot of different universities. Anecdote, sorry.

Quinn Apr 20, 2025, 10:55 PM
3 points
−1
on: How to end credentialism
One thing I like about this is making the actual difficulty deltas between colleges more felt/legible/concrete (by anyone who takes the exams). What I might do in your system at my IQ level (which is pretty high outside of EA but pretty mediocre inside EA) is knock out a degree at an easy university to get warmed up then study for years for a degree at a hard school^[1].

In real life, I can download or audit courses from whatever university I want, but I don’t know what the grading curve is, so when ⁵⁄₆ exercises are too hard I don’t know if that’s because I’m dumb or if ¹⁄₆ is B+ level performance. This is a way that the current system underserves a credential-indifferent autodidact. It’s really hard to know how difficult a course is supposed to be when you’re isolated from the local conditions that make up the grading curve!

Another thing I like about your system is tutoring markets separated from assessment companies. Why is it that we bundle gatekeeping/assessment with preparation? Unbundling might help maintain objective standards, get rid of problems that look like “the professor feels too much affection for the student to fail them”.

This is all completely separate for why your proposal is a hard social problem / a complete nonstarter, which is that I don’t think the system is “broken” right now. There’s an idea you might pick up if you read the smarter leftists, which is that credentialism especially at elite levels preserves privilege and status as a first class use case. This is not completely false today, not least because the further you go back in time in western universities the truer it is.
1. ↩︎
  my prior, 15 years ago, looked like “stanford has a boating scholarship, so obviously selectivity is a wealth/status thing and not reflective of scholarship or rigor”, so the fact that I now believe “more selective colleges have harder coursework” means I’ve seen a lot of evidence. It pains me, believe me, but reality doesn’t care :)

March-April 2025 Progress in Guaranteed Safe AI

QuinnApr 20, 2025, 7:00 PM

6 points

0 comments4 min readLW link

(gsai.substack.com)

Quinn Apr 15, 2025, 1:26 AM
19 points
0
on: Quinn’s Shortform
I get pretty intense visceral outrage at overreaches in immigration enforcement, just seems the height of depravity. Ive looked for a lot of different routes to mental coolness over the last decade (since Trump started his speeches), they mostly amount to staying busy and distracted. Just seems like a really cost ineffective kind of activism to get involved in. Bankrolling lawyers for random people isn’t really in my action space and if it was i’d have opportunity cost to consider.

Quinn Mar 19, 2025, 6:02 PM
2 points
0
in reply to: Quinn’s comment on: Quinn’s Shortform
seems like there’s more prior literature than I thought https://en.wikipedia.org/wiki/Role-based_access_control

Quinn Mar 19, 2025, 5:49 PM
2 points
0
on: Quinn’s Shortform
are SOTA configuration languages sufficient for AI proliferation?

My main aim is to work on “hardening the box” i.e. eliminating software bugs so containment schemes don’t fail for preventable reasons. But in the famous 4o system card example, the one that looks a little like docker exfiltration, the situation arose from user error, wild guess in compose.yaml or the shell script invoking docker run.

In a linux machine

Here’s an example nix file
```
users.users =
    let
      authorized-key-files = [
        “${keyspath}/id_server_ed25519.pub”
        “${keyspath}/id_qd_ed25519.pub”
      ];
    in
    {
      unpermissioneduser = { 
        isNormalUser = false;
        extraGroups = [ “docker” ];
        description = “AgentID=claude-0x0000″;
      };
      coreuser = {
        isNormalUser = true;
        extraGroups = [
          “wheel”
          “networkmanager”
          “docker”
          “video”
        ];
        home = ”/home/coreuser”;
        description = “Core User (delegator of unpermissioneduser)”;
        shell = pkgs.fish;
        openssh.authorizedKeys.keyFiles = authorized-key-files;
      };
      root = {
        openssh.authorizedKeys.keyFiles = authorized-key-files;
        shell = pkgs.fish;
      };
    };
```
You can see that unpermissioneduser has less abilities than coreuser. So you can imagine I just say that unpermissioneduser is an agent and coreuser is the human delegator.

Nix is simply a fully declarative way to do standard linux permissioning (a feature not in the snippet is allocating chmod/chown information for particular users to particular parts of the filesystem). There’s no conceptual leaps from the status quo.

agents and delegation

is linux all that great for when you want to keep track of who’s a delegatee and who’s a delegator? do we need a more graph flavored version of linux userspace/permissions? I’m talking about once we’re reasoning about proliferating agents and their permissions on various machines. Linux groups do not support inheritance, but a user can be a member of many groups. So you could in principle MVP a graph based permissions DSL (perhaps in Nix) on top of the existing Linux user/group ontology, 80% confident, but it could be hairier than making a new ontology. idk.

Quinn Mar 17, 2025, 5:28 PM
2 points
0
on: AI Tools for Existential Security

Examples of promising risk-targeted applications

This section reeks of the guaranteed safe AI agendas, a lot of agreement. For example, using formal methods to harden any box we try to put the AI in is a kind of defensive acceleration that doesn’t work (too expensive) until certain pre-ASI stages of development. I’m working on formal verification agents along these lines right now.

Quinn Mar 1, 2025, 8:34 PM
2 points
0
on: Plausibly Factoring Conjectures
@Tyra Burgess and I wrote down a royalty-aware payout function yesterday:

For a type $B$ , let $L (B)$ be the “left closure under implication” or the admissible antecedents. I.e., the set of all the antecedents A in the public ledger such that $A \to B$ . $p : T y p e \to M o n e y$ is the price that a proposition was listed for (admitting summing over duplicates). Suppose player $1, . . ., k$ have previously proven $B_{1}, . . ., B_{k}$ and $L (A)$ is none other than the set of all $B_{i}$ from $1$ to $k$ .

We would like to fix an $ϵ$ (could be fairly big, like $\frac{1}{5}$ ) and say that the royalty-aware payout given epsilon of $A$ upon an introduction of $α : A$ to the database is $p (A) \times (1 - ϵ)$ such that, where $k = | L (A) |$ , $\frac{p (A) \times ϵ}{k}$ is paid out to each player $i \in 1, . . ., k$ .

This seems vaguely like it has some desirable properties, like the decay of a royalty with length in implications separating it from the currently outpaying type. You might even be able to reconcile it with cartesian-closedness / currying, where $A \times B \to C$ behaves equivalently to $A \to B \to C$ under the payout function.

I think to be more theoretically classy, royalties would arise from recursive structure, but it may work well enough without recursion. It’d be fun to advance all the way to coherence and incentive-compatible proofs, but I certainly don’t see myself doing that.

January-February 2025 Progress in Guaranteed Safe AI

QuinnFeb 28, 2025, 3:10 AM

15 points

1 comment8 min readLW link

(gsai.substack.com)

Quinn Feb 16, 2025, 9:18 PM
5 points
0
on: Quinn’s Shortform
I want a name for the following principle:

the world-spec gap hurts you more than the spec-component gap

I wrote it out much like this a couple years ago and Zac recently said the same thing.

I’d love to be able to just say “the <one to three syllables> principle”, yaknow?
What links here?
- January-February 2025 Progress in Guaranteed Safe AI by Quinn (Feb 28, 2025, 3:10 AM; 15 points)

Quinn Feb 16, 2025, 12:17 AM
2 points
0
in reply to: davekasten’s comment on: davekasten’s Shortform
I’m working on making sure we get high quality critical systems software out of early AGI. Hardened infrastructure buys us a lot in the slightly crazy story of “self-exfiltrated model attacks the power grid”, but buys us even more in less crazy stories about all the software modules adjacent to AGI having vulnerabilities rapidly patched at crunchtime.

Quinn Feb 16, 2025, 12:06 AM
71 points
35
on: Quinn’s Shortform
<standup_comedian> What’s the deal with evals </standup_comedian>

epistemic status: tell me I’m wrong.

Funders seem particularly enchanted with evals, which seems to be defined as “benchmark but probably for scaffolded systems and scoring that is harder than scoring most of what we call benchmarks”.

I can conjure a theory of change. It’s like, 1. if measurement is bad then we’re working with vibes, so we’d like to make measurement good. 2. if measurement is good then we can demonstrate to audiences (especially policymakers) that warning shots are substantial signals and not base it on vibes. (question: what am I missing?)

This is an at least coherent reason why dangerous capability evals pay into governance strats in such a way that maybe philanthropic pressure is correct. It relies on cruxes that I don’t share, like that a principled science of measurement would outperform vibes in a meme war in the first place, but it at least has a crux that works as a fulcrum.

Everything worth doing is at least a little dual use, I’m not attacking anybody. But it’s a faustian game where, like benchmarks, evals pump up races cuz everyone loves it when number go up. The primal urge to see number go up infects every chart with an x and y axis, in other words, evals come with steep capabilities externalities because they spray the labs with more charts that number hasn’t gone up on yet, daring and challenging the lab to step up their game. So the theory of change in which, in spite of this dynamic, an eval is differentially defensive just has to meet a really high standard.

A further problem: the theory of change where we can have really high quality / inarguable signals as warning shots instead of vibes as warning shots doesn’t even apply to most of the evals I’m hearing about from the nonprofit and independent sector. I’m hearing about evals that make me go, “huh, I wonder what’s differentially defensive about that?” and I don’t get good answers. Moreover, an ancient wisdom says “never ask a philanthropist for something capitalism gives you for free”. The case for an individual eval’s unlikeliness to be created by default lab incentives needs to be especially strong, cuz when it isn’t strong one is literally doing the lab’s work for them.

Quinn Feb 15, 2025, 11:32 PM
3 points
0
on: Quinn’s Shortform
The more I learn about measurement, the less seriously I take it

I’m impressed with models that accomplish tasks in zero or one shot with minimal prompting skill. I’m not sure what galaxy brained scaffolds and galaxy brained prompts demonstrate. There’s so much optimization in the measurement space.

I shipped a benchmark recently, but it’s secretly a synthetic data play so regardless of how hard people try in order to score on it, we get synthetic data out of it which leads to finetune jobs which leads to domain specific models that can do such tasks hopefully with minimal prompting effort and no scaffolding.

Quinn Feb 15, 2025, 11:22 PM
40 points
23
on: Quinn’s Shortform
$PERSON at $LAB once showed me an internal document saying that there are bad benchmarks—dangerous capability benchmarks—that are used negatively, so unlike positive benchmarks where the model isn’t shipped to prod if it performs under a certain amount, these benchmarks could block a model from going to prod that performs over a certain amount. I asked, “you create this benchmark like it’s a bad thing, and it’s a bad thing at your shop, but how do you know it won’t be used in a sign-flipped way at another shop?” and he said “well we just call it EvilBench and no one will want to score high on EvilBench”.

It sounded like a ridiculous answer, but is maybe actually true in the case of labs. It is extremely not true in the open weight case, obviously huggingface user Yolo4206969 would love to score high on EvilBench.

Quinn Feb 1, 2025, 6:04 PM
2 points
0
in reply to: Nora_Ammann’s comment on: In response to critiques of Guaranteed Safe AI
I’m surprised to hear you say that, since you write

Upfront, I want to clarify: I don’t believe or wish to claim that GSAI is a full or general panacea to AI risk.

I kinda think anything which is not a panacea is swiss cheese, that those are the only two options.

In a matter of what sort of portofolio can lay down slices of swiss cheese at what rate and with what uncorrelation. And I think in this way GSAI is antifragile to next year’s language models, which is why I can agree mostly with Zac’s talk and still work on GSAI (I don’t think he talks about my cruxes).

Specifically, I think the guarantees of each module and the guarantees of each pipe (connecting the modules) isolate/restrict the error to the world-model gap or the world-spec gap, and I think the engineering problems of getting those guarantees are straightforward / not conceptual problems. Furthermore, I think the conceptual problems with reducing the world-spec gap below some threshold presented by Safeguarded’s TA1 are easier than the conceptual problems in alignment/safety/control.

Quinn Jan 31, 2025, 5:09 AM
4 points
0
on: In response to critiques of Guaranteed Safe AI
I gave a lightning talk with my particular characterization, and included “swiss cheese” i.e. that gsai sources some layers of swiss cheese without trying to be one magic bullet. But if people agree with this, then really guaranteed-safe ai is a misnomer, cuz guarantee doesn’t evoke swiss cheese at all

Quinn Jan 30, 2025, 7:07 AM
6 points
0
on: Fertility Will Never Recover
For anecdata: id be really jazzed about 3 or 4, 5 might be a little crazy but somewhat open to that or more.

Ladies

Quinn Jan 27, 2025, 8:03 PM
5 points
0
in reply to: Ben Pace’s comment on: Ben Pace’s Shortform Feed
yeah last week was grim for a lot of people with r1′s implications for proliferation and the stargate fanfare after inauguration. Had a palpable sensation of it pivoting from midgame to endgame, but I would doubt that sensation is reliable or calibrated.

Quinn Jan 27, 2025, 4:48 AM
3 points
0
on: Tips and Code for Empirical Research Workflows

Tmux allows you to set up multiple panes in your terminal that keep running in the background. Therefore, if you disconnect from a remote machine, scripts running in tmux will not be killed. We tend to run experiments across many tmux panes (especially overnight).

Does no one use suffix & disown which sends a command to a background process that doesn’t depend on the ssh process, or prefix nohup which does the same thing? You have to make sure any logging that goes to stdout goes to a log file instead (and in this way tmux or screen are better)

Your remark about uv: forgot to mention that it’s effectively a poetry replacement, too

Like htop: btm and btop are a little newer, nicer to look at

Also for json: jq. cat file.json | jq for pretty printing json to terminal

Quinn Jan 23, 2025, 10:26 PM
15 points
2
on: Quinn’s Shortform
Feels like a MATS-like Program in india is a big opportunity. When I went to EAG in Singapore a while ago there were so many people underserved by the existing community building and mentorship organizations cuz of visa issues.

Quinn

March-April 2025 Progress in Guaran­teed Safe AI

are SOTA configuration languages sufficient for AI proliferation?

agents and delegation

Jan­uary-Fe­bru­ary 2025 Progress in Guaran­teed Safe AI

<standup_comedian> What’s the deal with evals </standup_comedian>

The more I learn about measurement, the less seriously I take it

March-April 2025 Progress in Guaranteed Safe AI

January-February 2025 Progress in Guaranteed Safe AI

`<standup_comedian>` What’s the deal with evals `</standup_comedian>`