An open mind is like a fortress with its gates unbarred and unguarded.
MathiasKB
As someone who plays a lot of Go, this result looks very suspicious to me. To me it looks like the attack works primarily due to an artifact of the automatic scoring system used to evaluate it. I don't think this attack would be replicable in other games, or even against KataGo trained on a correct implementation.
In the example included on the website, KataGo (White) passes because it correctly identifies the adversary's (Black) stones as dead, meaning the entire outside would be its territory. Playing any move from this position would gain no points (and would lose a point under Japanese scoring rules), so KataGo passes.
The game then ends, and the automatic scoring system designates the outside as undecided, granting White 0 points and giving Black the win.
If the match were played between two human players, they would have to agree on whether the outside territory belongs to White. If Black were to claim their outside stones are alive, the game would continue until both players pass and agree about the status of all territory (see 'disputes' in the AGA ruleset).
But in the adversarial attack, the game ends after the pass and Black gets the win, because the automatic scoring system decides the outcome. Yet the only reason KataGo passed is that it correctly inferred it was in a winning position with no way to increase its winning probability! Calling that a successful adversarial attack rings a bit hollow to me.
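To make the scoring mechanics concrete, here is a minimal sketch of Tromp-Taylor-style area scoring in Python (my own illustration, not KataGo's or the paper's code): an empty region counts as territory only if it reaches stones of exactly one color, and 'dead' stones are never removed.

```python
# Minimal sketch of Tromp-Taylor-style area scoring -- my own illustration,
# not KataGo's or the paper's code. Crucially, "dead" stones are never
# removed: any black stone left inside white's area makes the surrounding
# empty region reach both colors, so it counts as territory for neither.

def tromp_taylor_score(board, n):
    """board: dict mapping (row, col) -> 'B', 'W' or '.'; n: board size."""
    score = {'B': 0, 'W': 0}
    for color in board.values():         # stones on the board count as area
        if color in score:
            score[color] += 1
    seen = set()
    for pt in board:
        if board[pt] != '.' or pt in seen:
            continue
        # Flood-fill this empty region, recording which colors it reaches.
        region, reaches, stack = 0, set(), [pt]
        while stack:
            r, c = stack.pop()
            if (r, c) in seen or not (0 <= r < n and 0 <= c < n):
                continue
            if board[(r, c)] != '.':
                reaches.add(board[(r, c)])  # region borders a stone
                continue
            seen.add((r, c))
            region += 1
            stack += [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
        if len(reaches) == 1:             # territory only if the region
            score[reaches.pop()] += region  # reaches exactly one color
    return score
```

Because the adversary's dead stones are still on the board when the game ends, the empty region White was counting on reaches both colors and scores for neither side.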
I wouldn't conclude anything from this attack, other than that Go is a game with a lot of edge cases that need to be correctly handled.
EDIT: I just noticed the authors address this on the website, but I still think it significantly diminishes the 'impressiveness' of the adversarial attack. I don't know the exact ruleset KataGo is trained under, but unless it's exactly the same as the ruleset used to evaluate the attack, the attack only works because KataGo is playing to win a different game than the adversary.
Evaluating the RCT is a chance to train the evaluation muscle in a well-defined domain with feedback. I've generally found that the people who are best at evaluations in RCT'able domains are better at evaluating the hard-to-evaluate claims as well.
Often the difficult-to-evaluate domains have ways of getting feedback too, but if you're not in the habit of looking for it, you're less likely to find the creative ways to get data.
I think a much more common failure mode within this community is that we end up with wildly overconfident beliefs about hard-to-evaluate domains, because there aren't many feedback loops and we aren't in the habit of looking for them.
Does anyone know of any zero-trust investigations into nuclear risk done in the EA/rationalist community? Open Phil has funded nuclear work, so they probably have an analysis somewhere that concluded it is a serious risk to civilization, but I haven't ever looked into these analyses.
For each tweet the post found arguing the author's point, I can find two arguing the opposite. Yes, in theory tweets are data points, but in practice the author just uses them to confirm his already-held beliefs.
I don’t think the real world is good enough either.
The fact that humans experience the Tetris effect so strongly suggests to me that the brain is constantly generating and training on synthetic data.
Another issue with greenwashing and safetywashing is that it gives people who earnestly care a false impression that they are meaningfully contributing.
Despite thousands of green initiatives, we're likely to blow way past the 1.5°C mark because the vast majority of those initiatives failed to address the core causes of climate change. Each plastic-straw ban and reusable diaper gives people an incorrect impression that they are doing something meaningful to improve the climate.
Similarly, I worry that many people will convince themselves that they are doing something meaningful to improve AI safety, but because they fail to address the core issues, they end up contributing nothing. I am not saying this as a pure hypothetical; I think this is already happening to a large extent.
I quit a well-paying job to become a policy trainee working with AI in the European Parliament because I was optimizing for "do something which looks like contributing to AI safety", with a tenuous-at-best model of how my work would actually lead to a world that creates safe AI. What horrified me during this was that a majority of the people I spoke to in the field of AI policy seemed to be making similar errors.

Many of us justify our work by pointing out second-order benefits such as "policy work is field-building", "this policy will help create better norms", or "I'm skilling up / getting myself to a place of influence", and while these second-order effects are real and important, we should be very sceptical of interventions whose first-order effects aren't promising.
I apologize that this became a bit of a rant about AI policy, but I have been annoyed with myself for making such basic errors, and this post helped me put a name to what I was doing.
The primary question on my mind is something like this:
How much retraining does Gato need to learn a new task? Given a task such as "stack objects and compose a relevant poem", which combines skills it has already learned yet is fundamentally new, does it quickly learn to perform well at it?
If not, then it seems DeepMind 'merely' managed to get a single agent to do a bunch of tasks we were previously only able to do with multiple agents. If it is also quicker at learning new tasks in similar domains than an agent trained solely for them, then it seems like a big step towards general intelligence.
Hi Niplav, happy to hear you think that.
I just uploaded the pkl files that include the pandas dataframes for the metaculus questions and GPT’s completions for the best performing prompt to github. Let me know if you need anything else :)
https://github.com/MperorM/gpt3-metaculus
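For anyone who wants to poke at the data, loading the dataframes should just be a matter of pandas' read_pickle. A quick sketch (the file name below is a placeholder; use whichever .pkl files are actually in the repo):

```python
# Hedged sketch: the exact .pkl file names are whatever is in the repo;
# "questions.pkl" below is a placeholder.
import pandas as pd

df = pd.read_pickle("questions.pkl")  # placeholder file name
print(df.head())  # Metaculus questions and GPT completions
```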
Getting GPT-3 to predict Metaculus questions
I think 'wife' rolls off the tongue uniquely well here because it rhymes with 'life', creating the pun. Outside of that I don't buy it. In Denmark, wife jokes are common despite the word for wife being two syllables (kone), and husband jokes are rare despite the word for husband being one syllable (mand).
My model of why we see this has much more to do with gender norms and normalised misogyny than with the catchiness of the words.
Good point, though I would prefer we name it Quality Adjusted Spouse Years :)
Fantastic to see this wonderful game be passed onto a new generation!
...
My analysis was from no exercise to regular high-intensity exercise. There's probably an 80/20 in between, but I did not look into it.
For what it's worth, I hastily made a spreadsheet and found that regular heavy exercise was by far the largest improvement I could make to my life expectancy. Everything else paled in comparison. That said, I only evaluated interventions that were relevant to me; if you smoke, I imagine quitting would score high as well.
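The structure of the comparison was roughly the sketch below, in Python rather than a spreadsheet. The numbers are made-up placeholders, not my actual estimates; the point is just weighing expected life-years gained against time spent on each intervention.

```python
# Sketch of the spreadsheet's structure with made-up illustrative numbers --
# these are NOT my actual estimates, just the shape of the comparison.
interventions = {
    # intervention: (expected life-years gained, weekly hours of effort)
    "regular heavy exercise": (4.0, 3.0),  # hypothetical figures
    "better diet":            (1.5, 2.0),
    "improved sleep":         (1.0, 1.0),
}

for name, (years_gained, hours_per_week) in interventions.items():
    lifetime_hours = hours_per_week * 52 * 40   # assume ~40 remaining years
    years_spent = lifetime_hours / (24 * 365)   # convert hours to years
    print(f"{name}: +{years_gained:.1f}y for ~{years_spent:.1f}y of effort")
```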
For me, this perfectly hits the nail on the head.
This is a somewhat weird question, but like, how do I do that?
I've noticed multiple communities fall into the meta-trap, and even when members notice, it can be difficult to escape. While the solution is simply to "stop being meta", that is much easier said than done.
When I noticed this happening in a community I am central in organizing, I pushed back by shifting my own focus from process to output, hoping others would follow suit. This has worked somewhat, and we're definitely on a better track. I wonder what dynamics lead to this 'death by meta' syndrome, and whether there is a cure.
Really cool concept of drumming with your feet while playing another instrument.
I think it would be really cool to experiment with different trigger sounds. The muscles in your foot severely limit the nuance available to play, and trying to imitate the sounds of a regular drum set will not go over well.
I think it is possible to achieve much cooler playing if you skip the idea of your pedals needing to imitate a drum set entirely. Experiment with some 808 bass, electric kicks, etc.
Combining that with your great piano playing would create an entirely new feel of music, whereas imitation can easily end up sounding like a good pianist struggling to cooperate with a much worse drummer.
I spent 5 minutes searching amazon.de for replacements for the various products recommended, and my search came up empty.
Has someone put together the needed list of bright-lighting products on amazon.de? I tried doing it myself and ended up hopelessly confused. What I'm asking for is, e.g., two desk lamps and corresponding light bulbs that live up to the criteria.
I'll pay $50 to the charity of your choice if I make a purchase based on your list.
And there doesn’t need to be an “overall goodness” of the job that would be anything else than just the combination of those two facts.
There needs to be an “overall goodness” that is exactly equal to the combination of those two facts. I really like the fundamental insight of the post. It’s important to recognize that your mind wants to push your perception of the “overall goodness” to the extremes, and that you shouldn’t let it do that.
If you now had to make a decision on whether to take the job, how would you use this electrifying zap to help you make the decision?
No, the KataGo paper explicitly states at the start of page 4:
"Self play games used Tromp-Taylor rules [21] modified to not require capturing stones within pass-alive territory"
Had KataGo been trained on unmodified Tromp-Taylor rules, the attack would not have worked. The attack only works because the authors are having KataGo play under a different ruleset than it was trained on.
If I have the details right, I am honestly very confused about what the authors are trying to prove with this paper. Given that their Twitter announcement claimed the rulesets were the same, my best guess is simply that it was an oversight on their part.
(EDIT: this modification doesn’t matter, the authors are right, I am wrong. See my comment below)