What constitutes cooperation?
Realised my model of the cooperation pipeline:
- surface options (or find common ground)
- negotiate choices (agree a course of action)
- cooperate/enforce (counteract defections, actually do the joint good thing)

was missing an important preliminary step. For cooperation to happen, you also need to:
- identify potential coalitions (who could benefit from cooperating)!

(Could break this down further: identifying the candidates, establishing common knowledge among them, and securing initial prospective cooperative intent.)
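Put together, the amended pipeline is just an ordered sequence of stages. A trivial sketch, purely to fix the ordering (the identifier names and comments are mine):

```python
from enum import Enum

class CoopStage(Enum):
    """Amended cooperation pipeline, in order."""
    IDENTIFY_COALITIONS = 1   # new preliminary step: who could benefit from cooperating?
    SURFACE_OPTIONS = 2       # find common ground
    NEGOTIATE_CHOICES = 3     # agree a course of action
    COOPERATE_ENFORCE = 4     # do the joint good thing, counteract defections

print([stage.name for stage in CoopStage])
```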
In some cases, ‘identifying potential coalitions’ might be a large, even dominant part of the challenge of cooperation, especially when effects are diffuse!
That applies to global commons problems, and it applies when coordinating political action. What other cases?
‘Identifying potential coalitions’ is what a lot of activism is about, and it might also be a big part of what various cooperative memeplexes like tribes, religions, and political parties are doing.
This feels to me like another important part of the picture that new tech could potentially amplify!
Could we newly empower large groups of humans to cooperate by recognising and fulfilling the requirements of this cooperation pipeline?
‘Temporary MAP stance’ or ‘subjective probability matching’
‘Temporary MAP stance’ and ‘subjective probability matching’ are my names for a useful mental manoeuvre in research, especially when dealing with confusing, preparadigmatic, or otherwise non-crisp domains.
MAP is Maximum A Posteriori, i.e. your best guess after considering the evidence. Probability matching means choosing actions/guesses in proportion to your estimate of their being right (rather than always picking the single MAP choice).
By this manoeuvre I’m gesturing at a kind of behaviour where you are quite unsure about what’s best (e.g. ‘should I work on interpretability or on demystifying deception?’), and rather than letting that turn into analysis paralysis, you temporarily collapse some uncertainty and make concrete assumptions to get moving in one direction or another. Hopefully in doing so you a) make a contribution and b) grow your skills and collect new evidence to make better decisions/contributions next time.
It happens to correspond somewhat to a decent heuristic called Thompson Sampling, which is optimal under some conditions for some uncertain-duration sequential decision problems.
HT Evan Hubinger for articulating his take on this in discussions about research, and I’m certain I’ve read others discussing similar principles on LW or EAF but I don’t have references to hand.
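As a minimal sketch of the Thompson-sampling version of this manoeuvre: suppose each ‘research direction’ either pays off or not in a given round, and I keep Beta beliefs over each payoff rate. The directions, priors, and ‘true’ rates below are made-up illustrative numbers, not anything from the discussions referenced above.

```python
import numpy as np

rng = np.random.default_rng(0)

directions = ["interpretability", "demystifying deception"]
alpha = np.array([2.0, 1.0])  # pseudo-counts of past successes (made-up prior)
beta = np.array([3.0, 2.0])   # pseudo-counts of past failures (made-up prior)
true_rates = np.array([0.45, 0.60])  # hidden 'true' payoff rates, used only to simulate outcomes

for round_idx in range(10):
    # Thompson sampling: draw one plausible world from the posterior...
    sampled_rates = rng.beta(alpha, beta)
    # ...then act as if that draw were the truth (the temporary collapse of uncertainty).
    choice = int(np.argmax(sampled_rates))

    # Spend the round on that direction and observe whether it paid off.
    success = rng.random() < true_rates[choice]

    # Update beliefs and repeat: the collapse really is only temporary.
    alpha[choice] += success
    beta[choice] += 1.0 - success
    print(f"round {round_idx}: worked on {directions[choice]}, success={bool(success)}")
```

The point is the shape of the loop rather than the numbers: sample, commit for a while, learn, resample.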
If you want to be twice as profitable as your competitors, you don’t have to be twice as good as them. You just have to be slightly better.
I think AI development is mainly compute constrained (relevant for intelligence explosion dynamics).
There are some arguments against this, based on firms’ high spending on researcher and engineering talent. The claim is that this supports one or both of a) large marginal returns to having more (good) researchers, or b) steep power laws in researcher talent (implying large production multipliers from the best researchers).
Given that lab workforces remain fairly small, I think the spending naively supports (b) better.
But in fact I think there is another, even better explanation:
- Researchers’ taste (an AI production multiplier) varies more smoothly than a steep power law would suggest
  - (research culture/the collective intelligence of a team or firm may matter more)
- Marginal parallel researchers have sharply diminishing AI production returns (sometimes negative, when the added researchers have worse taste)
  - (and determining a researcher’s taste ex ante is hard)
- BUT firms’ utility is sharply convex in AI production
  - capturing more accolades and market share is basically the entire game
  - spending as much time as possible with a non-commoditised offering allows profiting off fast-evaporating margin
  - so firms are competing over getting cool stuff out first
    - time-to-delivery of non-commoditised (!) frontier models
  - and over getting loyal/sticky customer bases
    - ease of adoption of the product wrapping
    - sometimes differentiation of offerings
- this turns small differences in human capital/production multiplier/research taste into big differences in firm utility (toy numbers in the sketch below)
- so demand for the small pool of researchers with (legibly) great taste is very hot
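To make the convexity concrete, here is a deliberately crude race model; the functional form and every number in it are my own assumptions for illustration, not claims from this note.

```python
# Two firms race to ship each frontier capability. Development takes
# base_time / productivity, and the first to ship earns a high non-commoditised
# margin until the other catches up; after that both earn a thin commodity margin.
def race_profit(prod_a: float, prod_b: float, base_time: float = 12.0,
                premium_rate: float = 10.0, commodity_rate: float = 0.5) -> tuple[float, float]:
    """Profit per product cycle for firms A and B (arbitrary units)."""
    t_a, t_b = base_time / prod_a, base_time / prod_b
    lead = abs(t_a - t_b)                          # exclusive high-margin window for the faster firm
    commodity_profit = commodity_rate * base_time  # thin margin over the commoditised period
    leader_profit = premium_rate * lead + commodity_profit
    if t_a < t_b:
        return leader_profit, commodity_profit
    return commodity_profit, leader_profit

# Firm A is only 10% more productive, but captures the whole premium window:
print(race_profit(1.1, 1.0))  # ≈ (16.9, 6.0): nearly 3x the profit from a 10% edge
```

Under this kind of payoff, paying a large premium for a slightly better researcher is rational even if their marginal effect on production is modest.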
This also explains why it’s been somewhat ‘easy’ (but capital intensive) for a few new competitors to pop into existence each year, and why firms’ revealed-preference savings rate into compute capital is enormous (much greater than 100% of revenue, i.e. funded by outside capital).
We see token prices dropping incredibly sharply, which supports the non-commoditised-margin claim (though this is also consistent with a Wright’s Law effect from (runtime) algorithmic efficiency gains, which should certainly also be expected; a short illustration of that effect follows below).
A lot of engineering effort is being put into product wrappers and polish, which supports the customer base claim.
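On the Wright’s Law point above: the standard form says unit cost falls by a fixed fraction each time cumulative volume doubles. The 30% learning rate below is an arbitrary illustrative number, not an estimate for tokens.

```python
import numpy as np

def wrights_law_cost(cumulative_volume: np.ndarray, first_unit_cost: float,
                     learning_rate: float) -> np.ndarray:
    """Wright's Law: cost(n) = cost_1 * n**-alpha, where each doubling of cumulative
    volume cuts cost by `learning_rate` (e.g. 0.3 means a 30% drop per doubling)."""
    alpha = -np.log2(1.0 - learning_rate)
    return first_unit_cost * cumulative_volume ** -alpha

volumes = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
print(wrights_law_cost(volumes, first_unit_cost=1.0, learning_rate=0.3))
# values: 1.0, 0.7, 0.49, 0.343, 0.2401 -- steep price declines with no margin story needed
```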
The implications include: headroom above top human expert teams’ AI research taste could be on the small side (I think this is right for many R&D domains, because a major input is experimental throughput). So both quantity and quality of (perhaps automated) researchers should have steeply diminishing returns in AI production rate. But might they nevertheless unlock a practical monopoly (or at least an increasingly expensive barrier to entry) on AI-derived profit, by keeping the (more monetisable) frontier out of reach of competitors?
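On the diminishing-returns point, a deliberately extreme bottleneck toy (the functional form and all numbers are my assumptions): if compute-limited experiment slots are the binding input, extra researchers stop helping once the slots are saturated, and only taste or more compute moves the needle.

```python
def daily_progress(n_researchers: int, avg_taste: float,
                   experiments_per_researcher: int = 4,
                   compute_experiment_slots: int = 100) -> float:
    """Progress ~ experiments actually run, weighted by how well they are chosen (taste)."""
    designed = n_researchers * experiments_per_researcher
    run = min(designed, compute_experiment_slots)  # compute, not headcount, binds here
    return run * avg_taste

for n in (10, 25, 50, 100, 200):
    print(n, daily_progress(n, avg_taste=1.0))
# 10 -> 40, 25 -> 100, 50 -> 100, 100 -> 100, 200 -> 100:
# past ~25 researchers, added headcount contributes nothing in this toy
```

This only illustrates the quantity half of the claim; here taste enters as a plain multiplier rather than with diminishing returns of its own.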