I’m excited to have this written up so clearly, nice work! I think this is important for alignment work in two ways: discourse and thinking about alignment are affected by powerful cognitive biases that this hypothesis explains; and, as you point out, we might build AGI that works like this, since it’s so effective for human cognition.
I’m very curious whether this “rings true” to other readers based on their introspection and observation of others’ thinking patterns. I think this is both true and important. I’d arrived at this conclusion over the course of a research career studying dopamine and higher cognition. When we started researching cognitive biases, this came together, and I think this ubiquitous valence effect is the source of the most important cognitive biases. It goes by the names motivated reasoning, confirmation bias, and the halo effect, which have overlapping behavioral definitions. I think they’re the major stumbling block to humans behaving rationally.
I think this hypothesis is consistent with a vast array of empirical work on dopamine function and related cognitive function. But the evidence isn’t adequate to firmly establish that dopamine signals valence. That’s part of why I’d never written this up adequately; the other part is that hypotheses this broad are outside the scope of standard neuroscience funding.
I’m looking forward to the rest of the series, and hoping the posts addressing cognitive biases generate some discussion about how those biases affect alignment discussions. I think the combination of motivated reasoning, confirmation bias, and the halo/horns effects creates powerful polarization that’s a big obstacle to rational discussions of alignment.