My top interest is AI safety, followed by reinforcement learning. My professional background is in software engineering, computer science, machine learning. I have degrees in electrical engineering, liberal arts, and public policy. I currently live in the Washington, DC metro area; before that, I lived in Berkeley for about five years.
David James
I wonder if you underestimate the complexity of brokering, much less maintaining, a lasting peace, whether via superior persuasive abilities or vast economic resource advantages. If you are thinking more along the lines of domination so complete that any violent resistance seems minuscule and pointless, that’s a different category for me. When I think of “long-term peace”, I usually don’t think of simmering grudges that remain dormant because of a massive power imbalance. I will grant that perhaps the ultimate form of “persuasion” would involve removing even the mental possibility of resistance.
As I understand it, the phrase “passing the buck” often carries a sense of abdicating responsibility. I don’t think that is what the author means here. I would suggest alternative phrasings that convey delegating implementation according to some core principles, combined with the idea of passing the torch to more capable actors.
Note: this comment should not be taken to suggest that I necessarily agree or disagree with the article itself.
To clarify: the claim is that the Shapley value is the only attribution method guaranteed to satisfy all four properties simultaneously: Efficiency, Symmetry, Linearity, and Null player. Other metrics can satisfy proper subsets of these properties.
Hopefully, you have gained some intuition for why Shapley values are “fair” and why they account for interactions among players.
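For concreteness, here is a minimal sketch (using a hypothetical three-player game, not anything from the article) that computes Shapley values by averaging each player’s marginal contribution over all join orders. The output illustrates three of the four properties directly; Linearity would show up when combining two such games.

```python
from itertools import permutations

# Hypothetical three-player cooperative game (illustration only):
# v maps each coalition (a frozenset of players) to its value.
# Players 1 and 2 interact (together they create extra value); player 3 is a null player.
v = {
    frozenset(): 0,
    frozenset({1}): 10, frozenset({2}): 10, frozenset({3}): 0,
    frozenset({1, 2}): 30, frozenset({1, 3}): 10, frozenset({2, 3}): 10,
    frozenset({1, 2, 3}): 30,
}

def shapley_values(players, v):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            totals[p] += v[coalition | {p}] - v[coalition]
            coalition = coalition | {p}
    return {p: totals[p] / len(orders) for p in players}

print(shapley_values([1, 2, 3], v))
# {1: 15.0, 2: 15.0, 3: 0.0}
# Efficiency: the shares sum to v({1, 2, 3}) = 30.
# Symmetry: players 1 and 2 are interchangeable, so they receive equal shares.
# Null player: player 3 adds nothing to any coalition, so its share is 0.
```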
The article fails to make a key point: in political economy and game theory, there are many definitions of “fairness” that seem plausible at face value, especially when considered one at a time. Even if one puts normative questions to the side, there are mathematical limits and constraints as one tries to satisfy various combinations simultaneously. Keeping these in mind, you can think of this as a design problem; it takes some care to choose metrics that reinforce some set of desired norms.
Had the bill been signed, it would have created pressure severe enough to “do more with less”, focusing effort on building better and better abstractions once the limits were hit.
Ok, I see the argument. But even without such legislation, the costs of large training runs create major incentives to build better abstractions.
Does this summary capture the core argument? Physical constraints on the human brain contributed to its success relative to other animals, because it had to “do more with less” by using abstraction. Analogously, constraints on AI compute or size will encourage more abstraction, increasing the likelihood of “foom” danger.
Though I’m reasonably sure Llama license (sic) isn’t preventing viewing the source
This is technically correct but irrelevant. Meta doesn’t provide any source code, by which I mean the full set of precursor steps (including the data and how to process it).
Generally speaking, a license defines usage rights; it says nothing about whether or how the thing (e.g. source code) is made available.
As a contrived example, one could publish a repository containing a license but no source code. The license would have no power to mandate that the code be released; that is a separate concern.
To put it another way, a license does not obligate the owner to release or share anything, whether it be compiled software, source code, weights, etc. A license simply outlines the conditions under which the thing (e.g. source code), once released, can be used or modified.
The paper AI Control: Improving Safety Despite Intentional Subversion is a practical, important step in the right direction. It demonstrates protocols that aim for safety even when models are intentionally subversive and know they are suspected of being dangerous.
Ryan Greenblatt, Buck Shlegeris, Kshitij Sachan, Fabien Roger. Proceedings of the 41st International Conference on Machine Learning, PMLR 235:16295-16336, 2024.
As large language models (LLMs) become more powerful and are deployed more autonomously, it will be increasingly important to prevent them from causing harmful outcomes. To do so, safety measures either aim at making LLMs try to avoid harmful outcomes or aim at preventing LLMs from causing harmful outcomes, even if they try to cause them. In this paper, we focus on this second layer of defense. We develop and evaluate pipelines of safety techniques (protocols) that try to ensure safety despite intentional subversion—an approach we call AI control. We investigate a setting in which we want to solve a sequence of programming problems without ever submitting subtly wrong code, using access to a powerful but untrusted model (in our case, GPT-4), access to a less powerful trusted model (in our case, GPT-3.5), and limited access to high-quality trusted labor. We investigate a range of protocols and red-team them by exploring strategies that the untrusted model could use to subvert them. We find that using the trusted model to edit untrusted-model code or using the untrusted model as a monitor substantially improves on simple baselines.
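As a rough, hypothetical sketch (not the paper’s actual implementation), the following illustrates the shape of a trusted-monitoring protocol: the untrusted model proposes code, the trusted model scores suspicion, the most suspicious submissions are escalated to limited trusted labor, and the protocol falls back to the trusted model once the audit budget is exhausted. Every function below is a placeholder, not a real API.

```python
# Hypothetical sketch of a "trusted monitoring" control protocol, loosely inspired
# by the AI Control paper. All model/labor calls below are placeholders.

AUDIT_BUDGET = 0.02  # fraction of submissions we can afford to audit with trusted labor

def untrusted_solve(problem: str) -> str:
    """Placeholder: ask the powerful but untrusted model for a solution."""
    raise NotImplementedError

def trusted_solve(problem: str) -> str:
    """Placeholder: ask the weaker trusted model for a solution."""
    raise NotImplementedError

def trusted_suspicion_score(problem: str, code: str) -> float:
    """Placeholder: trusted model rates how suspicious the code looks, in [0, 1]."""
    raise NotImplementedError

def audit_with_trusted_labor(problem: str, code: str) -> bool:
    """Placeholder: expensive high-quality review; True means the code is judged safe."""
    raise NotImplementedError

def run_protocol(problems: list[str], suspicion_threshold: float = 0.9) -> list[str]:
    """Trusted monitoring with deferral: audit the most suspicious submissions,
    and defer to the trusted model once the audit budget runs out."""
    accepted = []
    audits_remaining = int(AUDIT_BUDGET * len(problems))
    for problem in problems:
        code = untrusted_solve(problem)
        if trusted_suspicion_score(problem, code) < suspicion_threshold:
            accepted.append(code)                    # looks benign: submit as-is
        elif audits_remaining > 0:
            audits_remaining -= 1
            if audit_with_trusted_labor(problem, code):
                accepted.append(code)                # audited and judged safe
            else:
                raise RuntimeError("Untrusted model caught submitting suspect code")
        else:
            accepted.append(trusted_solve(problem))  # out of budget: defer to trusted model
    return accepted
```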
Related Video by Robert Miles: I highly recommend Using Dangerous AI, But Safely? released on Nov. 15, 2024.
From Is Deep Learning Actually Hitting a Wall? Evaluating Ilya Sutskever’s Recent Claims:
But what we seem to be seeing is a bit different from deep learning broadly hitting a wall. More specifically it appears to be: returns to scaling up model pretraining are plateauing.
I agree, but I’m not sure how durable this agreement will be. (I reversed my position while drafting this comment.)
Here is my one-sentence summary of the argument above: if Omega can make a fully accurate prediction in a universe without backwards causality, this implies a deterministic universe.
The Commission recommends: [...] 1. Congress establish and fund a Manhattan Project-like program dedicated to racing to and acquiring an Artificial General Intelligence (AGI) capability.
As mentioned above, the choice of Manhattan Project instead of Apollo Project is glaring.
Worse, there is zero mention of AI safety, AI alignment, or AI evaluation in the Recommendations document.
Lest you think I’m expecting too much, the report does talk about safety, alignment, and evaluation … for non-AI topic areas! (see bolded words below: “safety”, “aligning”, “evaluate”)
“Congress direct the U.S. Government Accountability Office to investigate the reliability of safety testing certifications for consumer products and medical devices imported from China.” (page 736)
“Congress direct the Administration to create an Outbound Investment Office within the executive branch to oversee investments into countries of concern, including China. The office should have a dedicated staff and appropriated resources and be tasked with: [...] Expanding the list of covered sectors with the goal of aligning outbound investment restrictions with export controls.” (page 737)
“Congress direct the U.S. Department of the Treasury, in coordination with the U.S. Departments of State and Commerce, to provide the relevant congressional committees a report assessing the ability of U.S. and foreign financial institutions operating in Hong Kong to identify and prevent transactions that facilitate the transfer of products, technology, and money to Russia, Iran, and other sanctioned countries and entities in violation of U.S. export controls, financial sanctions, and related rules. The report should [...] Evaluate the extent of Hong Kong’s role in facilitating the transfer of products and technologies to Russia, Iran, other adversary countries, and the Mainland, which are prohibited by export controls from being transferred to such countries;” (page 741)
I am not following the context of the comment above. Could you help me understand the connection? The main purpose of my comment was to disagree with this sentence two levels up:
The frenzy to couple everything into a single tangle of complexity is driven by the misunderstanding that complacency is the only reason why your ideology is not the winning one
… in particular, I don’t think it captures the dominant driver of “coupling” or “bundling”.
Does the comment one level up disagree with my claims? I’m not following the connection.
The frenzy to couple everything into a single tangle of complexity is driven by…
In some cases, yes, but this is only one factor of many. Others include:
- Our brains are often drawn to narratives, which are complex and interwoven. Hence the tendency to bundle up complex logical interdependencies into a narrative.
- Our social structures are guided/constrained by our physical nature and technology. For in-person gatherings, bundling of ideas is often a dominant strategy.
For example, imagine a highly unusual congregation: a large unified gathering of monotheistic worshippers with considerable internal diversity. Rather than “one track” consisting of shared ideology, they subdivide their readings and rituals into many subgroups. Why don’t we see much of this (if any) in the real world? Because ideological bundling often pairs well with particular ways of gathering.
P.S. I personally welcome gathering styles that promote both community and rationality (spanning a diversity of experiences and values).
Right. Some such agreements are often called social contracts. One catch is that a person born into them may not understand their historical origin or practical utility, much less agree with them.
Durable institutions find ways to survive. I don’t mean survival merely in terms of legal continuity; I mean fidelity to their founding charter. Institutions not only have to survive past their first leader; they have to survive their first leader themself! The institution’s structure and policies must protect against the leader’s meandering attention, whims, and potential corruption. In Musk’s case, given his mercurial history, I would not bet on him agreeing to the requisite policies.
they weren’t designed to be ultra-robust to exploitation, or to make serious attempts to assess properties like truth, accuracy, coherence, usefulness, justice
There are notable differences among these properties. Usefulness and justice stand apart from the other three (truth, accuracy, coherence). Usefulness (suitability for a purpose, non-prescriptive as to the underlying norms) differs from justice (defined by some normative ideal). Coherence requires fewer commitments than truth or accuracy.
Ergo, I could see various instantiations of a library designed to satisfy different levels: Level 1 values coherence; Level 2 adds truth and accuracy; Level 3 adds usefulness; Level 4 adds justice.
I like having a list of small, useful things to do that tend to pay off in the long run, like:
- go to the grocery store to make sure you have fresh fruits and vegetables
- meditate for 10 minutes
- do push-ups and sit-ups
- journal for 10 minutes
When my brain feels cluttered, it is nice to have a list of time-boxed simple tasks that don’t require planning or assessment.
Verify human designs and automatically create AI-generated designs which provably cannot be opened by mechanical picking.
Such a proof would only be as strong as its definition of “mechanical picking” and the accuracy of its physics model. (For example, would an electronically controllable key-shaped object with adjustable cut depths and pressure sensors qualify as a “pick”?)
I don’t dispute the value of formal proofs for safety. If accomplished, they move the conversation to “is the proof correct?” and “are we proving the right thing?”. Both are steps in the right direction, I think.
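To make that concrete, here is a toy sketch (entirely hypothetical; not a real lock model or verification tool). The “theorem” is established by exhausting a formal space of picks, so any attack not representable in that space, such as the adjustable sensor-driven key above, falls outside the guarantee.

```python
from itertools import product

# Toy illustration only: a "lock" is a tuple of pin heights, and a "mechanical pick"
# is formalized as any single rigid key profile over those heights.
PIN_LEVELS = range(4)

def opens(lock: tuple, key: tuple) -> bool:
    """In this toy model, a rigid key opens the lock only if every cut matches exactly."""
    return lock == key

def provably_unpickable(lock: tuple, secret_key: tuple) -> bool:
    """'Proof' by exhaustion: no key profile other than the secret opens the lock.

    The guarantee is only as strong as the model: a device with adjustable,
    sensor-driven cut depths is not representable as a single rigid profile,
    so it is simply outside the scope of this 'theorem'.
    """
    for key in product(PIN_LEVELS, repeat=len(lock)):
        if key != secret_key and opens(lock, key):
            return False
    return True

lock = (2, 0, 3, 1)
print(provably_unpickable(lock, secret_key=lock))  # True, but only within this toy model
```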
I find this article confusing, so I return to fundamentals of computer science: greedy algorithms and the conditions under which they are optimal. Would anyone care to build a bridge from that terminology to what the author is trying to convey?
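For concreteness, here is the kind of textbook example I have in mind (not drawn from the article): interval scheduling, where the greedy rule “pick the earliest finishing interval” is provably optimal because the problem has the greedy-choice property and optimal substructure.

```python
def max_non_overlapping(intervals):
    """Greedy activity selection: repeatedly take the interval that finishes earliest.

    This greedy rule is provably optimal because an earliest-finishing interval is
    always part of some optimal solution (greedy-choice property) and the remaining
    subproblem has the same structure (optimal substructure). Many superficially
    similar problems lack these properties, and the same greedy instinct then fails.
    """
    chosen = []
    last_end = float("-inf")
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if start >= last_end:
            chosen.append((start, end))
            last_end = end
    return chosen

print(max_non_overlapping([(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9), (6, 10), (8, 11)]))
# [(1, 4), (5, 7), (8, 11)]
```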