An open mind is like a fortress with its gates unbarred and unguarded.
MathiasKB
Is there any good write-up on the gut/brain connection and the effects of fecal transplants?
Watching the South Park episode where everyone tries to steal Tom Brady’s poo got me wondering why this isn’t actually a thing. I can imagine lots of possible explanations, ranging from “because it doesn’t have much of an effect if you’re healthy” to “because FDA”.
On this view, adversarial examples arise from gradient descent being “too smart”, not “too dumb”: the program is fine; if the test suite didn’t imply the behavior we wanted, that’s our problem.
Shouldn’t we expect to see RL models trained purely on self play not to have these issues then?
My understanding is that even models trained primarily with self play, such as KataGo, are vulnerable to adversarial attacks. If RL models are vulnerable to the same type of adversarial attacks, isn’t that evidence against this theory?
The amount of inference compute isn’t baked-in at pretraining time, so there is no tradeoff.
This doesn’t make sense to me.
In a subscription-based model, for example, companies would want to provide users the strongest completions for the least amount of compute.
If they estimate customers in total will use 1 quadrillion tokens before the release of their next model, they have to decide how much of the compute they are going to dedicate to training versus inference. As one changes the parameters (subscription price, anticipated users, fixed costs for a training run, etc.) you’d expect the optimal ratio to change.
Test-time compute on one trace comes with a recommendation to cap reasoning tokens at 25K, so there might be 1-2 orders of magnitude more there with better context lengths. They are still not offering repeated sampling filtered by consensus or a reward model. If o1 proves sufficiently popular given its price, they might offer even more expensive options.
Thanks, this is a really good find!
Thanks!! this is exactly what I was looking for
With the release of OpenAI o1, I want to ask a question I’ve been wondering about for a few months.
Like the Chinchilla paper, which estimated the optimal ratio of data to compute, are there any similar estimates for the optimal ratio of compute to spend on inference vs training?
In the release they show this chart:
The chart somewhat gets at what I want to know, but doesn’t answer it completely. How much additional inference compute would a 1e25 o1-like model need to perform as well as a one-shotted 1e26 model?
Additionally, for some x number of queries, what is the optimal ratio of compute to spend on training versus inference? How does that change for different values of x?
Are there any public attempts at estimating this stuff? If so, where can I read about it?
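To make the question concrete, here is a toy sketch of how the split moves with x. It is not an estimate of the real optimum. It assumes the common rules of thumb that training a dense model costs about 6·N·D FLOPs and serving it costs about 2·N FLOPs per token, plus a Chinchilla-style D = 20·N; the function names and the 1e25 budget are hypothetical, and it ignores o1-style long reasoning traces entirely. Under those assumptions it finds the largest model a fixed budget can both train and serve for x tokens, and reports what share of the budget went to training.

```python
import math

# Assumed rules of thumb (not taken from the o1 release):
#   training FLOPs  ~ 6 * N * D   (N parameters, D training tokens)
#   inference FLOPs ~ 2 * N per served token
#   D = tokens_per_param * N      (Chinchilla-style, roughly 20)

def largest_affordable_model(total_flops, inference_tokens, tokens_per_param=20.0):
    """Largest N satisfying 6*k*N^2 + 2*N*T = total_flops, with k = tokens_per_param."""
    a = 6.0 * tokens_per_param      # coefficient of N^2 (training cost)
    b = 2.0 * inference_tokens      # coefficient of N   (serving cost)
    # Positive root of a*N^2 + b*N - total_flops = 0, written in the
    # numerically stable form 2c / (b + sqrt(b^2 + 4ac)).
    return 2.0 * total_flops / (b + math.sqrt(b * b + 4.0 * a * total_flops))

def training_share(total_flops, inference_tokens, tokens_per_param=20.0):
    """Fraction of the total budget spent on training under the assumptions above."""
    n = largest_affordable_model(total_flops, inference_tokens, tokens_per_param)
    return 6.0 * tokens_per_param * n * n / total_flops

if __name__ == "__main__":
    budget = 1e25  # hypothetical total FLOPs for training plus serving
    for x in (1e12, 1e15, 1e18):  # anticipated tokens served before the next model
        n = largest_affordable_model(budget, x)
        print(f"x={x:.0e} tokens: N~{n:.2e} params, training share {training_share(budget, x):.1%}")
```

The only point of the sketch is that the training share falls as x grows, so any real answer has to be a function of anticipated usage rather than a single number.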
If someone wants to set up a Figgie group to play, I’d love to join.
I agree the conclusion isn’t great!
Not so surprisingly, many people read the last section as an endorsement of some version of “RCTism”, but it’s not actually a view I endorse myself.
What I really wanted to get at in this post was just how pervasive priors are, and how difficult it is to see past them.
Just played through it tonight. This was my first D&D.Sci. I found it quite difficult and learned a few things while working on it.
Initially I tried to figure out the best counters and found a few patterns (flamethrowers were especially good against certain units). I then tried to look and adjust for any chronology, but after tinkering around for a while without getting anywhere I gave up on that. Eventually I just went with a pretty brainless ML approach.
I ended up sending squads for 5 and 6, which managed a 13.89% and 53.15% chance of surviving. I think it’s good I’m not in charge of any soldiers in real life!
Overall I had good fun, and I’m looking forward to looking at the next one.
This wouldn’t be the first time DeepMind pulled these shenanigans.
My impression of DeepMind is they like playing up the impressiveness of their achievements to give an impression of having ‘solved’ some issue, never saying anything technically false, while suspiciously leaving out relevant information and failing to do obvious tests of their models which would reveal a less impressive achievement.
For AlphaStar they claimed ‘grandmaster’ level, but didn’t show any easily available stats which would make it possible to verify. As someone who was in Grandmaster league at the time of it playing (might even have run into it on ladder, some of my teammates did), its play at best felt like low grandmaster to me.
At their event showing an earlier prototype off, they had one player (TLO) play their off-race with which he certainly was not at a grandmaster level. The pro player (Mana) playing their main race beat it at the event, when they had it play with the same limited camera access humans have. I don’t remember all the details anymore, but I remember being continuously annoyed by suspicious omission after suspicious omission.
What annoys me most is that this still was a wildly impressive achievement! Just state in the paper: “we managed to reach grandmaster with one out of three factions”. Nobody has ever managed to create an AI that played remotely as well as this!
Similarly, DeepMind’s no-search chess engine is surely the furthest anyone has gotten without search. Even if it didn’t quite make grandmaster, just say so!
If it makes it easier, I can add the questions to Manifold if you provide a list of questions and resolution criteria.
Thanks for pointing that out, I’ve added a note in the description.
There are countries where cooperative firms are doing fine. Most of Denmark’s supermarket chains are owned by the cooperative Coop. Denmark’s largest dairy producer Arla is a cooperative too. Both operate in a free market and are out-competing privately owned competitors.
Both also resort to many of the same dirty tricks traditionally structured firms are pulling. Arla, for example, has done tremendous harm to the plant-based industry through aggressive lobbying. Structuring firms as cooperatives doesn’t magically make them aligned.
Cicero, as it is redirecting its entire fleet: ‘What did you call me?’
Yeah, my original claim is wrong. It’s clear that KataGo is just playing sub-optimally outside of distribution, rather than being punished for playing optimally under a different ruleset than the one it’s being evaluated under.
Actually this modification shouldn’t matter. After looking into the definition of pass-alive, the dead stones in the adversarial attacks are clearly not pass-alive.
Under both unmodified and pass-alive modified Tromp-Taylor rules, KataGo would lose here, and it’s surprising that self-play left such a weakness.
The authors are definitely onto something, and my original claim that the attack only works due to KataGo being trained under a different ruleset is incorrect.
No, the KataGo paper explicitly states at the start of page 4:
“Self play games used Tromp-Taylor rules [21] modified to not require capturing stones within pass-alive territory”
Had KataGo been trained on unmodified Tromp-Taylor rules, the attack would not have worked. The attack only works because the authors are having KataGo play under a different ruleset than it was trained on.
If I have the details right, I am honestly very confused about what the authors are trying to prove with this paper. Given that their Twitter announcement claimed the rulesets were the same, my best guess is simply that it was an oversight on their part.
(EDIT: this modification doesn’t matter, the authors are right, I am wrong. See my comment below)
As someone who plays a lot of go, this result looks very suspicious to me. To me it looks like the primary reason this attack works is due to an artifact of the automatic scoring system used in the attack. I don’t think this attack would be replicable in other games, or even KataGo trained on a correct implementation.
In the example included on the website, KataGo (White) is passing because it correctly identifies the adversary’s (Black) stones as dead, meaning the entire outside would be its territory. Playing any move in KataGo’s position would gain no points (and lose a point under Japanese scoring rules), so KataGo passes.
The game then ends and the automatic scoring system designates the outside as undecided, granting white 0 points and giving black the win.
If the match were to be played between two human players, they would have to agree whether the outside territory belongs to white or not. If black were to claim their outside stones are alive the game would continue until both players pass and agree about the status of all territory (see ‘disputes’ in the AGA ruleset).
But in the adversarial attack, the game ends after the pass and black gets the win due to the automatic scoring system deciding the outcome. But the only reason that KataGo passed is that it correctly inferred that it was in a winning position with no way to increase its winning probability! Claiming that to be a successful adversarial attack rings a bit hollow to me.
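To show the mechanism I mean, here is a minimal sketch of area scoring, assuming the automatic scorer follows plain Tromp-Taylor rules: each stone counts for its owner, and an empty region counts for a color only if it touches stones of that color alone. The 5x5 board is a made-up toy position, not the one from the website, and the function name is my own.

```python
from collections import deque

def tromp_taylor_score(board):
    """Score a finished position as (black, white), komi excluded.

    board is a list of equal-length strings using 'B', 'W' and '.'.
    No stones are treated as dead: everything on the board counts as-is.
    """
    rows, cols = len(board), len(board[0])
    score = {"B": 0, "W": 0}
    seen = set()
    for r in range(rows):
        for c in range(cols):
            cell = board[r][c]
            if cell in score:
                score[cell] += 1          # every stone is worth one point
                continue
            if (r, c) in seen:
                continue
            # Flood-fill this empty region and record which colors border it.
            size, borders = 0, set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if 0 <= ny < rows and 0 <= nx < cols:
                        neighbor = board[ny][nx]
                        if neighbor == ".":
                            if (ny, nx) not in seen:
                                seen.add((ny, nx))
                                queue.append((ny, nx))
                        else:
                            borders.add(neighbor)
            # A region bordered by both colors (or by none) scores for nobody.
            if len(borders) == 1:
                score[borders.pop()] += size
    return score["B"], score["W"]

# Toy position: Black holds the top-left corner, White surrounds everything
# else, and one stray black stone sits inside White's area.
with_stray = [
    ".BBW.",
    "BBBW.",
    "BBWW.",
    "WWW.B",
    ".....",
]
# Same position after the stray stone has been captured and removed.
cleaned = with_stray[:3] + ["WWW.."] + with_stray[4:]

if __name__ == "__main__":
    print(tromp_taylor_score(with_stray))
    print(tromp_taylor_score(cleaned))
```

On this toy board Black wins 9 to 7 while the stray stone is still on the board, and loses 8 to 17 once it is removed. The single uncaptured stone turns all of White's surrounding area into neutral points.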
I wouldn’t conclude anything from this attack, other than that Go is a game with a lot of edge-cases that need to be correctly handled.
EDIT: I just noticed the authors address this on the website, but I still think this significantly diminishes the ‘impressiveness’ of the adversarial attack. I don’t know the exact ruleset KataGo is trained under, but unless it’s the exact same as the ruleset used to evaluate the adversarial attack, the attack only works due to KataGo playing to win a different game than the adversary.
Evaluating the RCT is a chance to train the evaluation-muscle in a well-defined domain with feedback. I’ve generally found that the people who are best at evaluations in RCT’able domains are better at evaluating the hard-to-evaluate claims as well.
Often the difficult-to-evaluate domains have ways of getting feedback, but if you’re not in the habit of looking for it, you’re less likely to find the creative ways to get data.
I think a much more common failure mode within this community is that we get way overconfident beliefs about hard-to-evaluate domains, because there aren’t many feedback loops and we aren’t in the habit of looking for them.
Does anyone know of any zero-trust investigations on nuclear risk done in the EA/Rationalist community? Open Phil has funded nuclear work, so they probably have an analysis somewhere that concluded it is a serious risk to civilization, but I haven’t ever looked into these analyses.
I’ll crosspost the comment I left on substack:
In Denmark the government has a service (ROFUS), which anyone can voluntarily sign up for to exclude themselves from all gambling providers operating in Denmark. You can exclude yourself for a limited duration or permanently. The decision cannot be revoked.
Before discussing whether gambling should be legal or illegal, I would encourage Americans to see how far they can get with similar initiatives first.