thinking abt how to make:
1. buddhist superintelligence
2. a single, united nation
3. wiki of human experience
more here.
more of what i’ve made, here: jokerman.site
the AGIs which survive the most will model and prioritize their own survival
have any countries ever tried to do inflation instead of income taxes? seems like it’d be simpler than all the bureaucracy required for individuals to file tax returns every year
has anyone seen a good way to comprehensively map the possibility space for AI safety research?
in particular: a map from predictive conditions (eg OpenAI develops superintelligence first, no armistice is reached with China, etc) to strategies for ensuring human welfare in those conditions.
most good safety papers I read map one set of conditions to one or a few strategies. the map would juxtapose all these conditions so that we can evaluate/bet on their likelihoods and come up with strategies based on a full view of SOTA safety research.
for format, im imagining either a visual concept map or at least some kind of hierarchical collaborative outlining tool (eg Roam Research)
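a rough sketch of the data model such a map might sit on top of (names, the likelihood number, and the sort are all illustrative placeholders, not a proposal for the actual tool):

```python
from dataclasses import dataclass, field

@dataclass
class Condition:
    description: str   # e.g. "OpenAI develops superintelligence first"
    likelihood: float  # community-estimated probability, 0..1 (placeholder)

@dataclass
class Strategy:
    description: str
    conditions: list[Condition] = field(default_factory=list)  # worlds it applies to

safety_map = [
    Strategy(
        "international compute governance",
        [Condition("no armistice is reached with China", 0.5)],
    ),
]

# juxtaposing conditions lets you rank strategies by the likelihood
# of the worlds they cover:
ranked = sorted(safety_map, key=lambda s: -max(c.likelihood for c in s.conditions))
```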
made a simpler version of Roam Research called Upper Case Notes: uppercasenotes.org. Instead of [[double brackets]] to demarcate concepts, you simply use Capital Letters. Simpler to learn for someone who doesn’t want to use special syntax, but does require you to type differently.
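a naive sketch of how capital-letter concept extraction could work (hypothetical illustration, not the site’s actual code; note the sentence-initial-word ambiguity this design has to live with):

```python
import re

# treat runs of Capitalized Words as concept links
CONCEPT = re.compile(r"\b[A-Z][a-z]+(?:\s[A-Z][a-z]+)*\b")

def extract_concepts(note: str) -> list[str]:
    """Return every Capitalized Phrase in a note, e.g. 'Direct Democracy'."""
    return CONCEPT.findall(note)

print(extract_concepts("Met with Alice about Direct Democracy and Living Essays."))
# ['Met', 'Alice', 'Direct Democracy', 'Living Essays']
# ('Met' is a false positive: sentence-initial words look like concepts)
```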
I think you do a good job at expanding the possible set of self conceptions that we could reasonably expect in AIs.
Your discussion of these possible selves inspires me to go farther than you in your recommendations for AI safety researchers. Stress-testing safety ideas across multiple different possible “selves” is good. But if an AI’s individuality/self determines its behavior and growth to a great degree, then safety research as a whole might be better conceived as an effort to influence AIs’ self-conceptions rather than control their resulting behavior. E.g., create seed conditions that make it more likely for AIs to identify with people, to include people within their “individuality,” than to identify only with other machines.
“If the platform is created, how do you get people to use it the way you would like them to? People have views on far more than the things someone else thinks should concern them.”
If people are weighted equally, ie if the influence of each person’s written ballot is equal and capped, then each person is incentivized to emphasize the things which actually affect them.
Anyone could express views on things which don’t affect them, it’d just be unwise. When you’re voting between candidates (as in status quo), those candidates attempt to educate and engage you about all the issues they stand for, even if they’re irrelevant to you. A system where your ballot is a written expression of what you care about suffers much less from this issue.
the article proposes a governance system that synthesizes individuals’ freeform preferences into collective legislative action.
internet platforms allow freeform expression, of course, but don’t do that synthesis.
made a platform for writing living essays: essays which you scroll thru to play out the author’s edit history
made a silly collective conversation app where each post is a hexagon tessellated with all the other posts: Hexagon
Made a simplistic app that displays collective priorities based on individuals’ priorities linked here.
Hypotheses for conditions under which the self-other boundary of a survival-oriented agent (human or AI) blurs most, ie conditions where blurring is selected for:
1. Agent thinks very long term about survival.
2. Agent’s hardware is physically distributed.
3. Agent is very intelligent.
4. Agent benefits from symbiotic relationships with other agents.
“Democracy is the theory that the common people know what they want and deserve to get it good and hard.”
Yes, I think this is too idealistic. Ideal democracy (for me) is something more like “the theory that the common people know what they feel frustrated with (and we want to honor that above everything!) but mostly don’t know the collective best means of resolving that frustration.”
For example, people can have a legitimate complaint about healthcare being inaccessible for them, and yet the suggestion many would propose will be something like “government should spend more money on homeopathy and spiritual healing, and should definitely stop vaccination and other evil unnatural things”.
Yes. This brings to mind a general piece of wisdom for startups collecting product feedback: that feedback expressing painpoints/emotion is valuable, whereas feedback expressing implementation/solutions is not.
The ideal direct-democratic system, I think, would do exactly this: divide comments like “My cost of living is too high” (valuable) from “Taxes need to go down because my cost of living is too high” (possibly valuable, but an incomplete extrapolation).
This parsing seems possible in principle. I could imagine a system where feedback per person is capped, which would incentivize people to express the core of their issues rather than extraneous solution details (unless they happen to be solution-level experts).
I think beliefs, habits, and memories are pretty closely tied to the semantics of the word “identity”.
In America/Western culture, I totally agree.
I’m curious whether alien/LLM-based minds would adopt these semantics too.
There are plenty of beings striving to survive, so preserving that isn’t a big priority outside of preserving the big three.
I wonder under what conditions one would make the opposite statement—that there’s not enough striving.
For example, I wonder if being omniscient would affect one’s view of whether there’s already enough striving or not.
My motivation w/ the question is more to predict self-conceptions than prescribe them.
I agree that “one’s criteria on what to be up to are… rich and developing.” More fun that way.
I made it! One day when I was bored on the train. No data is saved rn other than leaderboard scores.
“Therefore, transforming such an unconscious behavior into a conscious one should make it much easier to stop in the moment”
At this point I thought you were going to proceed to explain that the key was to start to bite your nails consciously :)
Separately, I like your approach, thx for writing.
important work.
what’s more, relative to more controlling alignment techniques which disadvantage the AI from an evolutionary perspective (eg distract it from focusing on its survival), I think there’s a chance Self-Other boundary blurring is evolutionarily selected for in ASI. intuition pump for that hypothesis here:
A simple poll system where you can sort the options/issues by their personal relevance… might unlock direct democracy at scale. Relevance could mean: semantic similarity to your past lesswrong writing.
Such a sort option would (1) surface more relevant issues to each person and so (2) increase community participation, and possibly (3) scale indefinitely. You could imagine a million people collectively prioritizing the issues that matter to them with such a system.
Would be simple to build.
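a minimal sketch of the relevance sort, assuming off-the-shelf sentence embeddings (the model name and the example issues/writing are placeholders, not a spec):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

issues = ["housing costs in cities", "AI safety research funding", "open borders"]
my_writing = "I mostly post about alignment, interpretability, and AI governance."

issue_vecs = model.encode(issues)   # (n_issues, dim)
me = model.encode([my_writing])[0]  # (dim,)

# cosine similarity between my past writing and each issue
sims = issue_vecs @ me / (np.linalg.norm(issue_vecs, axis=1) * np.linalg.norm(me))

# show each person the issues most relevant to them first
for score, issue in sorted(zip(sims, issues), reverse=True):
    print(f"{score:.2f}  {issue}")
```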