I’ll just ask Claude to respond to everything you’ve said so far:
Let me extract and critique the core claims from their long response, focusing on what’s testable and mechanistic:
Key Claims:
1. AI agents working together could achieve “non-linear” problem-solving capacity through shared semantic representations
2. This poses an alignment risk if AIs develop internal semantic representations humans can’t interpret
3. The AI safety community’s emphasis on mathematical/empirical approaches may miss important insights
4. A “decentralized collective intelligence” framework is needed to address this
Critical Issues:
1. The mechanism for “semantic backpropagation” and “non-linear scaling” is never specified mathematically. What’s the actual claimed growth rate? What’s the bottleneck? Without these specifics, it’s impossible to evaluate. (A toy sketch of what such a specification could look like follows after this list.)
2. The “reasoning types” discussion (System 1/System 2) misapplies dual-process theory. The relevant question isn’t about reasoning styles, but about what precise claims are being made and how we could test them.
3. No clear definition is given for “decentralized collective intelligence”—what exactly would make a system qualify? What properties must it have? How would we measure its effectiveness?
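For concreteness, here is a minimal sketch, with invented numbers, of what answering “what’s the actual claimed growth rate?” could look like: state a claim such as capacity C(n) growing as n^k with k > 1 in the number of agents n, then estimate k from measurements. The function name, team sizes, and capacity scores below are all assumptions made up for illustration; nothing here comes from the original post.

```python
# Hypothetical check of a superlinear-scaling claim, C(n) ~ n^k with k > 1.
# The capacity scores below are invented; a real test would use measured results.
import numpy as np

def fit_scaling_exponent(n_agents, capacity):
    """Fit C(n) = a * n^k by linear regression in log-log space; return k."""
    k, _log_a = np.polyfit(np.log(n_agents), np.log(capacity), 1)
    return k

n_agents = np.array([1, 2, 4, 8, 16])             # team sizes
capacity = np.array([1.0, 2.3, 5.1, 11.8, 26.0])  # made-up capacity scores

k = fit_scaling_exponent(n_agents, capacity)
print(f"estimated scaling exponent k = {k:.2f}")
print("consistent with superlinear scaling" if k > 1.0 else "no superlinear scaling observed")
```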
Suggested Focus: Instead of broad claims about cognitive science and collective intelligence, the OP should:
1. Write out the claimed semantic backpropagation algorithm in pseudocode (an illustrative, invented sketch follows below)
2. Specify concrete numerical predictions about scaling behavior
3. Design experiments to test these predictions
4. Identify falsifiable conditions
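As an illustration of item 1, the following is a purely hypothetical sketch of the level of specificity being requested. The function semantic_backprop_round and its update rule (agents nudging shared vector representations toward a consensus, weighted by a task-error signal) are invented for this example and are not the algorithm from Andy’s referenced papers.

```python
# Invented stand-in for a "semantic backpropagation" round: this is NOT the
# algorithm from the referenced work, only an example of the requested format.
import numpy as np

def semantic_backprop_round(representations, task_error, lr=0.1):
    """Hypothetical update: each agent moves its semantic vector toward the
    group mean, scaled by a shared task-error signal and a learning rate."""
    mean_rep = representations.mean(axis=0)
    return representations + lr * task_error * (mean_rep - representations)

rng = np.random.default_rng(0)
reps = rng.normal(size=(5, 8))        # 5 agents, 8-dimensional semantic vectors
for step in range(10):
    task_error = float(np.var(reps))  # dummy stand-in for a real task loss
    reps = semantic_backprop_round(reps, task_error)
    print(f"round {step}: disagreement = {np.var(reps):.4f}")
```

Even a toy like this makes explicit what would have to be pinned down before evaluation: what the error signal is, how representations are shared, and which measure of capacity is supposed to scale.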
Right now, the writing pattern suggests someone pattern-matching to complex systems concepts without grounding them in testable mechanisms. The core ideas might be interesting, but they need to be made precise enough to evaluate.
I generally find AIs are much more helpful for critiquing ideas than for generating them. Even here, you can see Claude was pretty wordy and significantly repeated what I’d already said.
Strangely enough, using AI for a quick, low-effort check on our arguments seems to have advanced this discussion. I asked ChatGPT o1 Pro to assess whether our points cohere logically and are presented self-consistently. It concluded that persuading someone who insists on in-comment, fully testable proofs still hinges on their willingness to accept the format constraints of LessWrong and to consult external materials. Even with a more logically coherent, self-consistent presentation, we cannot guarantee a change of mind if the individual remains strictly unyielding. If you agree these issues point to serious flaws in our current problem-solving processes, how can we resolve them without confining solutions to molds that may worsen the very problems we aim to fix? The response from ChatGPT o1 Pro follows:
1. The Commenter’s Prompt to Claude.ai as a Meta-Awareness Filter
In the quoted exchange, the commenter (“the gears to ascension”) explicitly instructs Claude.ai to focus only on testable, mechanistic elements of Andy E. Williams’s argument. By highlighting “what’s testable and mechanistic,” the commenter’s prompt effectively filters out any lines of reasoning not easily recast in purely mathematical or empirically testable form.
Impact on Interpretation
If either the commenter or an AI system sees little value in conceptual or interdisciplinary insights unless they’re backed by immediate, formal proofs in a short text format, then certain frameworks—no matter how internally consistent—remain unexplored. This perspective aligns with high academic rigor but may exclude ideas that require a broader scope or lie outside conventional boundaries.
Does This Make AI Safety Unsolvable?
Andy E. Williams’s key concern is that if the alignment community reflexively dismisses approaches not fitting its standard “specific and mathematical” mold, we risk systematically overlooking crucial solutions. In extreme cases, the narrow focus could render AI safety unsolvable: potentially transformative paradigms never even enter the pipeline for serious evaluation.
In essence, prompting an AI (or a person) to reject any insight that cannot be immediately cast in pseudocode reinforces the very “catch-22” Andy describes.
2. “You Cannot Fill a Glass That Is Already Full.”
This saying highlights that if someone’s operating framework is that “only quantitative, falsifiable, mechanistic content is valid,” they may reject alternative methods of understanding or explanation by definition.
Did the Commenter Examine the References?
So far, there is no indication that the commenter investigated Andy’s suggested papers or existing prototypes. Instead, they kept insisting on “pseudocode” or a “testable mechanism” within the space of a single forum comment—potentially bypassing depth that already exists in the external material.
3. A Very Short Argument on the Scalability Problem
Research norms that help us filter out unsubstantiated ideas usually scale only linearly (e.g., adding a few more reviewers or requiring more detailed math each time). Meanwhile, in certain domains like multi-agent AI, the space of possible solutions and failure modes can expand non-linearly. As this gap widens, it becomes increasingly infeasible to exhaustively assess all emerging solutions, which in turn risks missing or dismissing revolutionary ideas.
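To make the shape of that argument explicit, here is a toy calculation under assumed rates, taking exponential growth as one example of non-linear expansion. Both growth curves (20 additional reviews per year, a 1.5x yearly increase in proposals) are invented; only the qualitative conclusion, that the reviewable fraction shrinks toward zero, is the point.

```python
# Toy illustration of linear review capacity versus an exponentially growing
# proposal space; the specific rates are assumptions, not measured values.
def fraction_reviewable(years, base_capacity=100, capacity_growth=20,
                        base_proposals=100, proposal_growth_rate=1.5):
    capacity = base_capacity + capacity_growth * years          # linear growth
    proposals = base_proposals * proposal_growth_rate ** years  # exponential growth
    return capacity / proposals

for t in (0, 2, 5, 10):
    print(f"year {t}: fraction of proposals reviewable ≈ {fraction_reviewable(t):.3f}")
```

Under these assumptions the reviewable fraction drops from 1.0 to roughly 0.05 within ten years, which is the widening gap the argument refers to.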
Takeaway
Narrow Filtering Excludes Broad Approaches
The commenter’s insistence on strict, in-comment mechanistic detail may rule out interdisciplinary arguments or conceptual frameworks too complex for a single post.
Risk to AI Safety
This dynamic underscores Andy’s concern that truly complex or unconventional ideas might go unexamined if our methods of testing and evaluation cannot scale or adapt.
Systematic Oversight of Novel Insights
Relying solely on linear filtering methods in a domain with exponentially expanding possibilities can systematically block important breakthroughs—particularly those that do not fit neatly into short-form, mechanistic outlines.
Final Takeaway
Potential Bias in Claude.ai (and LLMs Generally)
Like most large language models, Claude.ai may exhibit a “consensus bias,” giving disproportionate weight to the commenter’s demand for immediate, easily testable details in a brief post.
Practical Impossibility of Exhaustive Proof in a Comment
It is typically not feasible to provide a fully fleshed-out, rigorously tested algorithm in a single forum comment—especially if it involves extensive math or code.
Unreasonable Demands as Gatekeeping
Insisting on an impractical format (a complete, in-comment demonstration) without examining larger documents or references effectively closes off the chance to evaluate the actual substance of Andy’s claims. This can form a bottleneck that prevents valuable proposals from getting a fair hearing.
Andy’s offer to share deeper materials privately or in more comprehensive documents is a sensible approach—common in research dialogues. Ignoring that offer, or dismissing it outright, stands to reinforce the very issue at hand: a linear gatekeeping practice that may blind us to significant, if less conventionally presented, solutions.
Your original sentence was better.