can you say more about your reasoning for this?
Matt Goldenberg
Excellent work! Thanks for what you do
fwiw while it’s fair to call this “heavy nudging”, this mirrors exactly what my prompts for agentic workflows look like. I have to repeat things like “Don’t DO ANYTHING YOU WEREN’T ASKED” multiple times to get them to work consistently.
I found this post incredibly useful for getting a deeper sense of Logan’s work on naturalism.
I think his work on naturalism is a great and unusual example of original research happening in the rationality community, and of what actually investigating rationality looks like.
Emailed you.
In my role as Head of Operations at Monastic Academy, every person in the organization is on a personal improvement plan that addresses the personal responsibility level, and each team in the organization is responsible for process improvements that address the systemic level.
In the weekly performance improvement meetings, my goal is to constantly bring them back to the level of personal responsibility. Any time they start saying they couldn’t meet their improvement goal because of X event or Y person, I bring it back: what could THEY have done differently, what internal psychological patterns prevented them from doing that, and what can they do to shift those patterns this week?
Meanwhile, each team also chooses process improvements weekly. In those meetings, my role is to do the exact opposite and bring it back to the level of process. Any time they’re examining a team failure and come to the conclusion “we just need to prioritize it more, or try harder, or the manager needs to hold us to something,” I bring it back to the level of process: how can we change the order or way we do things, or the incentives involved, such that it’s not dependent on any given person’s ability to work hard or remember or be good at a certain thing?
Personal responsibility and systemic failure are different levels of abstraction.
If you’re within the system and doing horrible things while saying, “🤷 It’s just my incentives, bro,” you’re essentially allowing the egregore to control you, letting it shove its hand up your ass and pilot you like a puppet.
At the same time, if you ignore systemic problems, you’re giving the egregore power by pretending it doesn’t exist—even though it’s puppeting everyone. By doing so, you’re failing to claim your own power, which lies in recognizing your ability to work towards systemic change.
Both truths coexist:
There are those perpetuating evil by surrendering their personal responsibility to an evil egregore.
There are those perpetuating evil by letting the egregore run rampant and denying its existence.
The solution requires addressing both levels of abstraction.
I think the model of “Burnout as shadow values” is quite important and load-bearing in my own model of working with many EAs/Rationalists. I don’t think I first got it from this post, but I’m glad to see it written up so clearly here.
An easy, quick way to test this is to offer some free coaching in this method.
Can you say more about how you’ve used this personally or with clients? What approaches did you try that didn’t work, and how has this changed, if at all, to become more effective over time?
There’s a lot here that’s interesting, but it’s hard for me to tell from just your description how battle-tested this is.
What would the title be?
I still don’t quite get it. We already have an Ilya Sutskever who can make type 1 and type 2 improvements, and we don’t see the sort of jumps in days you’re talking about (I mean, maybe we do, and they just look discontinuous because of the release cycles?).
Why do you imagine this? I imagine we’d get something like one Einstein from such a regime, which would maybe speed up timelines relative to existing AI labs by 1.2x or something? Eventually this gain compounds, but I imagine that could be relatively slow and smooth, with the occasional discontinuous jump when something truly groundbreaking is discovered.
Right, and per the second part of my comment—insofar as consciousness is a real phenomenon, there’s an empirical question of whether whatever frame-invariant definition of computation you’re using is the correct one.
Do you think wants that arise from conscious thought processes are equally valid to wants that arise from feelings? How do you think about that?
while this paradigm of ‘training a model that’s an agi, and then running it at inference’ is one way we get to transformative agi, i find myself thinking that it probably WON’T be the first transformative ai, because my guess is that there are lots of tricks using lots of compute at inference to get not-quite-transformative ai to transformative ai.
my guess is that getting to that transformative level is gonna require ALL the tricks and compute, and will therefore eke out being transformative BY utilizing all those resources.
one of those tricks may be running millions of copies of the thing in an agentic swarm, but i would expect that to be merely a form of inference-time scaling, and therefore wouldn’t expect ONE of those things to be transformative agi on its own.
and i doubt that these tricks can funge against train-time compute, as you seem to be assuming in your analysis. my guess is that you hit diminishing returns for various types of train compute, then diminishing returns for various types of inference compute, and that we’ll get to a point where we need to push both of them to that point to get transformative ai.
This seems arbitrary to me. I’m bringing in bits of information on multiple layers when I write a computer program to calculate the thing and then read the result off the screen.
Consider: if the transistors on the computer chip were moved around, would it still process the data in the same way and yield the correct answer?
Yes under some interpretation, but no from my perspective, because the right answer is about the relationship between what I consider computation and how I interpret the results I’m getting.
But the real question for me is—under a computational perspective of consciousness, are there features of this computation that actually correlate to strength of consciousness? Does any interpretation of computation get equal weight? We could nail down a precise definition of what we mean by consciousness that we agreed on that didn’t have the issues mentioned above, but who knows whether that would be the definition that actually maps to the territory of consciousness?
For me the answer is yes. There’s some way of interpreting the colors of grains of sand on the beach as they swirl in the wind that would perfectly implement the Miller-Rabin primality test algorithm. So is the wind + sand computing the algorithm?
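For concreteness, since the comment leans on Miller-Rabin as its example algorithm, here is a minimal sketch of what that algorithm actually computes (this is a standard textbook implementation, not anything from the original discussion; the function name and round count are my own choices):

```python
import random

def miller_rabin(n: int, rounds: int = 20) -> bool:
    """Probabilistic primality test: False means n is definitely
    composite; True means n is prime with high probability."""
    if n < 2:
        return False
    # Handle small primes and obvious composites directly.
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2^s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)   # random base in [2, n-2]
        x = pow(a, d, n)                  # modular exponentiation
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True
```

The point of the thought experiment is that this procedure is just a pattern of state transitions, so the question is what licenses us to say silicon implements it while swirling sand does not.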
No, people really do see it; that wispiness can be crisp and clear.
I’m not the most visual person, but occasionally when I’m reading I’ll start seeing the scene. I then get jolted out of it when I realize I don’t know how I’m seeing the words, as they’ve been replaced with the imagined visuals.
can you say what types of problems they are?