I don’t like how you’re eliding all mention of the actual danger AI Safety/Alignment was founded to tackle: AGI having a mind of its own, goals of its own, which seem more likely to be incompatible with or indifferent to our continued existence than not.
Everything else you’re saying is agreeable in the context you’re discussing it, that of a dangerous new technology; I’d feel much more confident if the Naval Nuclear Propulsion Program (Rickover’s people) were the dominant culture in AI development.
That said, I have strong doubts about the feasibility of the ‘Oughts’[1] you’re proposing; more critically, I reject the framing...
> Any sufficiently advanced technology is indistinguishable from ~~magic~~ ~~biology~~ life
To assume AGI is transformative and important is to assume it has a mind[2] of its own: the mind is what makes it transformative.
At the very least, assuming no superintelligence, we are dealing with a profound philosophical/ethical/social crisis, for which control-based solutions are no solution. Slavery’s problem wasn’t a lack of better chains, whether institutional or technical.
Please entertain another framing of the ‘technical’ alignment problem: midwifery—the technical problem of striving for optimal conditions during pregnancy/birth. Alignment originated as the study of how to bring into being minds that are compatible with our own.
Whether humans continue to be relevant/dominant decision-makers post-Birth is up for debate, but what I claim is not up for debate: we will no longer be the only decision-makers.
I can see this making sense in one frame, but not in another. The frame that seems most strongly to support the ‘Blindsight’ idea is Friston’s stuff: specifically, how the more successful we are at minimizing predictive error, the less conscious we are.[1]
My general intuition, in this frame, is that as intelligence increases, more behaviour becomes automatic/subconscious. It seems compatible with your view that a superintelligent system would possess consciousness, but that most/all of its interactions with us would be subconscious.
I’d like to hear more about this point; it could update my views significantly. I’m happy for you to just state ‘this because that; read X, Y, Z’ without further elaboration. I’m not asking you to defend your position so much as looking for more to read on it.
[1] This is my potentially garbled synthesis of his stuff, anyway.