What’s the FATE community? Fair AI and Tech Ethics?
We have conveniently just updated our database if anyone wants to investigate this further!
https://epochai.org/data/notable-ai-models
Here is a “predictable surprise” I don’t see discussed often: given the advantages of scale and centralisation for training, it does not seem crazy to me that some major AI developers will pool resources in the future and jointly train large AI systems.
I’ve been tempted to do this sometime, but I fear the prior is performing one very important role you are not making explicit: defining the universe of possible hypotheses you consider.
In turn, that universe of hypotheses determines what Bayesian updates look like. Here is a problem that arises when you ignore this: https://www.lesswrong.com/posts/R28ppqby8zftndDAM/a-bayesian-aggregation-paradox
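To make that concrete, here is a minimal Python sketch (with made-up likelihoods, not taken from the linked post) showing how the same evidence yields different posteriors depending on which hypotheses the prior admits:

```python
# Minimal sketch with made-up numbers: the posterior on H1 depends on which
# hypotheses the prior admits, even though the evidence is unchanged.

def posterior(priors, likelihoods):
    """Bayes' rule over a fixed, assumed-exhaustive set of hypotheses."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    z = sum(joint.values())
    return {h: p / z for h, p in joint.items()}

likelihoods = {"H1": 0.8, "H2": 0.2, "H3": 0.5}  # P(evidence | H), hypothetical

# Universe A: the prior only admits H1 and H2.
print(posterior({"H1": 0.5, "H2": 0.5}, likelihoods))            # H1 -> 0.80

# Universe B: H3 is also on the table; same evidence, different update.
print(posterior({"H1": 1/3, "H2": 1/3, "H3": 1/3}, likelihoods))  # H1 -> ~0.53
```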
shrug
I think this is true to an extent, but it needs to be backed up by a more systematic analysis.
For instance, I recall quantization techniques working much better past a certain scale (though I can’t seem to find the reference...). It also seems important to validate that performance-improving techniques still apply at large scales. Finally, note that the frontier of scale is growing very fast, so even if these discoveries were made with relatively modest compute compared to the frontier, that is still a tremendous amount of compute!
even a pause which completely stops all new training runs beyond current size indefinitely would only ~double timelines at best, and probably less
I’d emphasize that we currently don’t have a very clear sense of how algorithmic improvement happens, and it is likely mediated to some extent by large experiments, so I think a pause is likely to slow timelines more than this implies.
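As a toy illustration (purely made-up rates, not a forecast), the implied slowdown is very sensitive to whether algorithmic progress keeps its pace once large experiments are off the table:

```python
# Back-of-the-envelope sketch with made-up numbers (not a forecast).
compute_growth = 0.6   # OOM/year of training compute, assumed
algo_growth = 0.4      # OOM/year of algorithmic efficiency, assumed
remaining_ooms = 4.0   # effective-compute OOMs still needed, assumed

baseline_years = remaining_ooms / (compute_growth + algo_growth)

# A pause freezes training compute; how much it stretches timelines depends
# on whether algorithmic progress keeps its pre-pause pace.
for algo_during_pause in (0.4, 0.2, 0.1):
    paused_years = remaining_ooms / algo_during_pause
    print(f"algo rate {algo_during_pause} OOM/yr -> timelines stretch "
          f"x{paused_years / baseline_years:.1f}")
# If algorithms are unaffected, timelines stretch ~2.5x here; if large
# experiments are needed to sustain algorithmic progress, the factor grows fast.
```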
I agree! I’d be quite interested in looking at TAS data, for the reason you mentioned.
I think Tetlock and co. might have already done some related work?
Question decomposition is part of the superforecasting commandments, though I can’t recall off the top of my head if they were RCT’d individually or just as a whole.
ETA: This is the relevant paper (h/t Misha Yagudin). It was not about the 10 commandments. Apparently those haven’t been RCT’d at all?
We actually wrote a more up-to-date paper here
I cowrote a detailed response here
https://www.cser.ac.uk/news/response-superintelligence-contained/
Essentially, this type of reasoning proves too much, since it implies we cannot show any properties whatsoever of any program, which is clearly false.
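As a concrete counterexample to “we cannot show any properties of any program”, here is a minimal Lean 4 sketch (recent toolchain, no external libraries; the program and property are just illustrative) with a machine-checked proof of a property of a specific program:

```lean
-- A tiny, illustrative program.
def double (n : Nat) : Nat := n + n

-- A machine-checked proof that `double` always returns an even number.
theorem double_is_even (n : Nat) : ∃ k, double n = 2 * k :=
  ⟨n, by unfold double; omega⟩
```

Undecidability results like Rice’s theorem only rule out a general decision procedure for non-trivial semantic properties of arbitrary programs; they do not prevent proving specific properties of specific programs.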
Here is some data via Matthew Barnett and Jess Riedl:
The cumulative number of miles driven by Cruise’s autonomous cars is growing exponentially, at roughly 1 OOM per year.
https://twitter.com/MatthewJBar/status/1690102362394992640
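For intuition, a quick conversion of that growth rate (taking the ~1 OOM/year figure at face value):

```python
import math

# Sketch: what ~1 OOM/year of growth in cumulative miles implies.
ooms_per_year = 1.0                  # growth rate from the tweet above
growth_factor = 10 ** ooms_per_year  # ~10x per year
doubling_days = 365 * math.log10(2) / ooms_per_year
print(f"~{growth_factor:.0f}x per year, doubling roughly every {doubling_days:.0f} days")
# -> ~10x per year, doubling roughly every 110 days
```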
That is, to a very basic approximation, correct.
Davidson’s takeoff model illustrates this point: for some parameter settings a “software singularity” happens, because software progress is not constrained by capital inputs to the same degree.
I would point out, however, that our current understanding of how software progress happens is somewhat poor. Experimentation is definitely a big component of software progress, and its role is often understated on LW.
More research on this soon!
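In the meantime, here is a toy sketch (my own illustration, not Davidson’s actual model) of the qualitative mechanism: if software improvements feed back into the rate of further software progress with strong enough returns, the level blows up in finite time; with weaker returns it does not.

```python
# Toy illustration (not Davidson's model): software level s improves at a
# rate proportional to s**r, where r > 1 means strongly increasing returns.

def years_until_blowup(r, dt=0.01, t_max=20.0, blowup=1e12):
    s, t = 1.0, 0.0
    while t < t_max:
        s += dt * s ** r   # simple Euler step of ds/dt = s**r
        t += dt
        if s > blowup:
            return t
    return None

for r in (0.5, 1.0, 1.5):
    t = years_until_blowup(r)
    if t is None:
        print(f"r={r}: no blow-up by t=20")
    else:
        print(f"r={r}: blows up around t={t:.1f}")
# Only the increasing-returns case (r > 1) has a finite-time singularity;
# the others grow exponentially or slower. Capital constraints would damp this.
```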
algorithmic progress is currently outpacing compute growth by quite a bit
This is not right, at least in computer vision. They seem to be the same order of magnitude.
Physical compute has grown at 0.6 OOM/year and physical compute requirements have decreased at 0.1 to 1.0 OOM/year; see a summary here or an in-depth investigation here.
Another relevant quote
Algorithmic progress explains roughly 45% of performance improvements in image classification, and most of this occurs through improving compute-efficiency.
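Putting the two figures above together (taking the midpoint of the 0.1 to 1.0 OOM/year range as a rough point estimate):

```python
# Rough decomposition using the figures quoted above; the algorithmic range
# is taken at its midpoint, so treat this as a ballpark only.
hardware = 0.6                 # OOM/year of physical compute
algorithmic = (0.1 + 1.0) / 2  # OOM/year of algorithmic efficiency

total = hardware + algorithmic
print(f"effective compute: ~{total:.2f} OOM/year, "
      f"algorithms ~{algorithmic / total:.0%} of it")
# -> ~1.15 OOM/year, with algorithms contributing roughly half,
#    consistent with the ~45% figure quoted above.
```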
That superscript is not a transpose! It is the timestep t: we are raising to the t-th power.
Power laws in Speedrunning and Machine Learning
Thanks!
Our current best guess is that this includes costs other than the amortized compute of the final training run.
If no extra information surfaces we will add a note clarifying this and/or adjust our estimate.
Thanks Neel!
The difference between tf16 and FP32 comes to a ~15x factor IIRC. Though ML developers also seem to prioritise characteristics other than cost-effectiveness when choosing GPUs, such as raw performance and interconnect, so you can’t just multiply the top price-performance we showcase by this factor and expect it to match the cost-performance of the largest ML runs today.
More soon-ish.
Because there is more data available for FP32, it’s easier to study trends there.
We should release a piece soon about how the picture changes when you account for different number formats, and for the fact that most runs happen on hardware that is not the most cost-efficient.
The ability to pay liabilities is important to factor in, and this illustrates it well. For the largest prosaic catastrophes it might well be the dominant consideration.
For smaller risks, I suspect that in practice mitigation, transaction, and prosecution costs are what dominate the calculus of who should bear the liability, both in AI and more generally.