Review Voting Thread
We’re halfway through the second annual review, and 121 posts have been nominated.
We’ve had more than twice as many individual nominations as last year, but on reviews we’re still playing catch-up. Last year we had 118 reviews, yet this year we’ve only had 51 so far.
When there are so many posts, it can be daunting to figure out which ones to review, so to help out, I’m making this thread. Every comment on this thread will be a post, and you should vote on which ones you would like to read a review of.
Ideally, a review puts the post in the context of a broader conversation and describes its key contributions, its strengths and flaws, and where more work can be done.
(Or something else. People who write self-reviews often give a different flavor of review. And I’ve read many great short reviews, e.g. Jameson Quinn and Zvi last year did a lot of short reviews that communicated their impression of the post quite clearly.)
So I’m going to leave 122 comments on this post. 121 comments will just be a post title, and the other one will be for thread meta. (Search “Meta Thread”.) I will remove my own votes from them, so they all start at zero.
Please vote on the comments to show how much you’d like to see reviews of different posts! Feel free to add a comment about what sort of review you’d like to see.
(Yes, I will probably get a lot of karma from this thread. Mwahaha you have fallen for my evil trap.)
(Also, my thanks to reviewers magfrump and Zvi with 5 each, johnswentworth with 6 reviews, and to fiddler with 10 (!), all thoughtful and valuable.)
What failure looks like by paulfchristiano.
I’m specifically interested in a review of this post by someone who found these scenarios novel.
Risks from Learned Optimization: Introduction by evhub, Chris van Merwijk, vlad_m, Joar Skalse, Scott Garrabrant.
Approval Extraction Advertised as Production by Benquo.
The Curse Of The Counterfactual by pjeby.
Gradient hacking by evhub.
Neural Annealing: Toward a Neural Theory of Everything (crosspost) by Michael Edward Johnson.
The Credit Assignment Problem by abramdemski.
Book summary: Unlocking the Emotional Brain by Kaj_Sotala.
This is an excellent review. One thing I thought it could do better is in the vein of epistemic spot checks, pointing out places where the authors’ conjectures are far ahead of the science. For instance, AFAICT, memory reconsolidation has only been proven for up to a few weeks in mouse models, but they talk about it being used to reconsolidate childhood memories in the book.
Autism And Intelligence: Much More Than You Wanted To Know by Scott Alexander.
Where to Draw the Boundaries? by Zack_M_Davis.
Heads I Win, Tails?—Never Heard of Her; Or, Selective Reporting and the Tragedy of the Green Rationalists by Zack_M_Davis.
No, it’s not The Incentives—it’s you by Zack_M_Davis.
I think there’s a lot of value in this post articulating a certain frame. Don’t know how this works since it’s a link post, but would love to see a review that more explicitly pointed at the frame, how it’s useful and not useful, and pointed out the ways the author could make the frame more explicit.
Book Review: The Secret Of Our Success by Scott Alexander.
Book Review: Secular Cycles by Scott Alexander.
Is Rationalist Self-Improvement Real? by Jacobian.
Thoughts on Human Models by Ramana Kumar, Scott Garrabrant.
Paper-Reading for Gears by johnswentworth.
Unconscious Economics by jacobjacob.
human psycholinguists: a critical appraisal by nostalgebraist.
Power Buys You Distance From The Crime by Elizabeth.
Integrity and accountability are core parts of rationality by habryka.
Mistakes with Conservation of Expected Evidence by abramdemski.
The Tale of Alice Almost: Strategies for Dealing With Pretty Good People by sarahconstantin.
Reason isn’t magic by Benquo.
[Answer] Why wasn’t science invented in China? by Ruby.
Rule Thinkers In, Not Out by Scott Alexander.
Firming Up Not-Lying Around Its Edge-Cases Is Less Broadly Useful Than One Might Initially Think by Zack_M_Davis.
Relevance Norms; Or, Gricean Implicature Queers the Decoupling/Contextualizing Binary by Zack_M_Davis.
I made a comment when that post first came out that I thought this was missing the mark on what contextualizing is actually trying to get at (in particular, focusing on language meanings rather than the consequences of language use). I think this whole sequence by Zack is at its best when it focuses on the upsides of decoupling, and at its worst when it tries to explain contextualizing, and would like to see a review that covers both those strengths and weaknesses.
The strategy-stealing assumption by paulfchristiano.
Soft takeoff can still lead to decisive strategic advantage by Daniel Kokotajlo.
Six AI Risk/Strategy Ideas by Wei_Dai.
The AI Timelines Scam by jessicata.
S-Curves for Trend Forecasting by mr-hire.
I would love to see someone review this post! In particular, there was a critique about “falsifiability”—would love to hear the exact problems with what’s unfalsifiable, and address them in a subsequent edit.
You Have About Five Words by Raemon.
Yes Requires the Possibility of No by Scott Garrabrant.
How to Ignore Your Emotions (while also thinking you’re awesome at emotions) by Hazard.
The Zettelkasten Method by abramdemski.
Meta Thread
I’m surprised by the presence of negative scores on some posts. Is this to be interpreted as “please do not review this post” or as “please get to the others first”?
All three seem to be good explanatory pieces in and of themselves; I wonder if there is a kind of performance penalty where if a post does not seem like it would benefit much from a review process, it gets pushed to the back of the line. This isn’t bad really, in fact it seems fairly efficient, I just didn’t expect it.
+1 was a bit surprised. Don’t think it matters too much. Except mildly think it increases the chance those posts get reviewed.
AI Safety “Success Stories” by Wei_Dai.
Blackmail by Zvi.
Less Competition, More Meritocracy? by Zvi.
Sequence introduction: non-agent and multiagent models of mind by Kaj_Sotala.
Understanding “Deep Double Descent” by evhub.
Specifically interested in reviews covering related work, replication, follow-up, etc.
AlphaStar: Impressive for RL progress, not for AGI progress by orthonormal.
Partial summary of debate with Benquo and Jessicata [pt 1] by Raemon.
The Schelling Choice is “Rabbit”, not “Stag” by Raemon.
Complex Behavior from Simple (Sub)Agents by moridinamael.
The Hard Work of Translation (Buddhism) by romeostevensit.
Some Thoughts on My Psychiatry Practice by Laura B.
Being the (Pareto) Best in the World by johnswentworth.
Instant stone (just add water!) by jasoncrawford.
The Forces of Blandness and the Disagreeable Majority by sarahconstantin.
Asymmetric Justice by Zvi.
Turning air into bread by jasoncrawford.
I (weakly) think this one is probably more important than the progress studies post on concrete.
The Amish, and Strategic Norms around Technology by Raemon.
Noticing Frame Differences by Raemon.
Make more land by jefftk.
The Parable of Predict-O-Matic by abramdemski.
Seeking Power is Often Robustly Instrumental in MDPs by TurnTrout, elriggs.
Coherent decisions imply consistent utilities by Eliezer Yudkowsky.
Gears vs Behavior by johnswentworth.
Bioinfohazards by Spiracular.
Why Subagents? by johnswentworth.
Selection vs Control by abramdemski.
Moloch Hasn’t Won by Zvi.
Excerpts from a larger discussion about simulacra by Benquo.
Humans Who Are Not Concentrating Are Not General Intelligences by sarahconstantin.
Propagating Facts into Aesthetics by Raemon.
[Part 2] Amplifying generalist research via forecasting – results from a preliminary exploration by jacobjacob, ozziegooen, Elizabeth, NunoSempere, bgold.
Mental Mountains by Scott Alexander.
Chris Olah’s views on AGI safety by evhub.
How do you assess the quality / reliability of a scientific study? by elityre.
Two explanations for variation in human abilities by Matthew Barnett.
But exactly how complex and fragile? by KatjaGrace.
Book Review: Design Principles of Biological Circuits by johnswentworth.
The Power to Teach Concepts Better by Liron.
I made some critiques of this sequence when it first came out, related to the implicit framing that specificity is always good, and generalist is always sloppy (this is a frame; I don’t think it’s stated explicitly). Similar to my comment about Zack’s sequence, I think this sequence is at its best when talking about the benefits of specificity, and at its worst when talking about the problems with non-specificity.
I’d like to see a review that highlights these merits while pointing out the missed perspectives. Bonus points if it relates to some of the notions about withholding specificity discussed in alkjash’s post on Babble, which was included in last year’s review.
Utility ≠ Reward by vlad_m.
Classifying specification problems as variants of Goodhart’s Law by Vika, Ramana Kumar.
The Real Rules Have No Exceptions by Said Achmiz.
The Costs of Reliability by sarahconstantin.
Everybody Knows by Zvi.
Simple Rules of Law by Zvi.
1960: The Year The Singularity Was Cancelled by Scott Alexander.
Rest Days vs Recovery Days by Unreal.
Alignment Research Field Guide by abramdemski.
What are the open problems in Human Rationality? by Raemon.
Book Summary: Consciousness and the Brain by Kaj_Sotala.
Reframing Superintelligence: Comprehensive AI Services as General Intelligence by rohinmshah.
In My Culture by Duncan_Sabien.
System 2 as working-memory augmented System 1 reasoning by Kaj_Sotala.
Building up to an Internal Family Systems model by Kaj_Sotala.
Steelmanning Divination by Vaniver.
The Power to Demolish Bad Arguments by Liron.
From Personal to Prison Gangs: Enforcing Prosocial Behavior by johnswentworth.
What determines the balance between intelligence signaling and virtue signaling? by Wei_Dai.
Gears-Level Models are Capital Investments by johnswentworth.
Evolution of Modularity by johnswentworth.
Healthy Competition by Raemon.
Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More by Ben Pace.
Rationality and Levels of Intervention by Geoff_Anders.
What is operations? by Swimmer963.
Integrating the Lindy Effect by lsusr.
The unexpected difficulty of comparing AlphaStar to humans by Richard Korzekwa.
How Much is Your Time Worth? by lynettebye.
Dual Wielding by Zvi.
Trauma, Meditation, and a Cool Scar by elriggs.
Forum participation as a research strategy by Wei_Dai.
Does it become easier, or harder, for the world to coordinate around not building AGI as time goes on? by elityre.
Do you fear the rock or the hard place? by Ruby.
No nonsense version of the “racial algorithm bias” by Yuxi_Liu.
Some Ways Coordination is Hard by Zvi.
Circle Games by sarahconstantin.
Coordination Surveys: why we should survey to organize responsibilities, not just predictions by Academian.
Dishonest Update Reporting by Zvi.
Literature Review: Distributed Teams by Elizabeth.
“Other people are wrong” vs “I am right” by Buck.
mAIry’s room: AI reasoning to solve philosophical problems by Stuart_Armstrong.
Megaproject management by ryan_b.
Total horse takeover by KatjaGrace.
Reframing Impact by TurnTrout.
Strategic implications of AIs’ ability to coordinate at low cost, for example by merging by Wei_Dai.
Book Review: The Structure Of Scientific Revolutions by Scott Alexander.