Drew the shoggoth and named notkilleveryoneism.
Tetraspace
I’d like beta access. My main use case is that I intend to write up some thoughts on alignment (Manifold gives 40% that I’ll end up proud of a write-up; I’d like that number to be higher), and this would be helpful for literature review and for finding relevant existing work. Especially so because a lot of the public agent foundations work is old and migrated from the old Alignment Forum, where it’s low-profile compared to more recent posts.
AI isn’t dangerous because of what experts think, and the arguments that persuaded the experts themselves are not “experts think this”. It would have been a misleading argument for Eliezer in 2000, when he was among the first people to think about it in the modern way, or for people who weren’t already rats in, say, 2017, before GPT was in the news and when AI x-risk was very niche.
I also have objections to its usefulness as an argument; “experts think this” doesn’t give me any inside view of the problem by which I can come up with novel solutions that the experts haven’t thought of. I think this especially comes up if the solutions might have to be precise or extreme: if I were an alignment researcher, “experts think this” would tell me nothing about what math I should be writing, and if I were a politician, “experts think this” would be less likely to get me to come up with solutions that I think would work, rather than solutions that compromise between the experts’ coalition and my other constituents.
So, while it is evidence (experts aren’t anticorrelated with the truth), there’s better reasoning available that’s more entangled with the truth and gives more precise answers.
I learned this lesson looking at the conditional probabilities of candidates winning given that they were nominated in 2016, where the candidates with less than about a 10% chance of being the nominee had conditional probabilities that were just noise, anywhere between 0 and 100%. And this was on the thickly traded real-money markets of Betfair! I personally engage in, and also recommend, just kinda throwing out any conditional probabilities that look like this, unless you have some reason to believe it’s not just noise.
Another place this causes problems is in the infinitely-useful-if-they-could-possibly-work decision markets, where you want to be able to evaluate counterfactual decisions; except these are counterfactuals, so the decision doesn’t get made, so there’s no liquidity and the price can take any value.
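For what it’s worth, here is a minimal numeric sketch of the failure mode (the probabilities and the noise model are made up, and real market microstructure is messier): the conditional probability is read off as a ratio of two prices, and when the denominator is only a few percent, ordinary price noise swamps it.

```python
# Toy sketch: implied P(win | nominee) = price(win AND nominee) / price(nominee).
# With a ~2-point wobble on each price, a 5% denominator makes the ratio garbage.
import random

random.seed(0)

p_nominee = 0.05          # true probability of being the nominee
p_win_and_nominee = 0.02  # true probability of being the nominee AND winning
                          # (so the true conditional is 40%)

def noisy_price(p, spread=0.02):
    """A market price wandering within ~2 percentage points of the true value."""
    return min(1.0, max(0.001, p + random.uniform(-spread, spread)))

for _ in range(5):
    implied = noisy_price(p_win_and_nominee) / noisy_price(p_nominee)
    print(f"implied P(win | nominee) = {implied:.0%}")
# The printed values scatter all over the place (they can even exceed 100%),
# even though the noise on each individual price is small.
```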
Obeying it would only be natural if the AI thinks that the humans are more correct than the AI itself would ever be, even after gathering all available evidence, where “correct” is judged by the standards of the goal the AI actually has, which arguendo is not what the humans are eventually going to pursue (otherwise you have reduced the shutdown problem to solving outer alignment, and the shutdown problem is only being considered under the theory that we won’t solve outer alignment).
It’s anti-natural for an agent to hold the belief state that, even given all available information, it will still want to do something other than the action it will then think is best; a utility maximiser would just want to take that action.
This is discussed on Arbital as the problem of fully updated deference.
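As a toy illustration of the point (my own framing with made-up numbers, not the Arbital formalism): an expected-utility maximiser compares “do what I think is best under my own utility function” against “defer to the humans”, and deferring only wins if the humans’ pick already scores highest under that same utility function.

```python
# Toy illustration: after updating on all the evidence it can get, the agent
# evaluates both options under its OWN utility function U.
actions = ["continue", "shutdown"]

# Hypothetical values of U, as the AI itself evaluates them post-update.
U = {"continue": 10.0, "shutdown": 3.0}

ai_choice = max(actions, key=lambda a: U[a])
human_choice = "shutdown"  # what the humans would tell it to do

print("AI's own best action:", ai_choice)
print("Deferring is U-optimal?", U[human_choice] >= U[ai_choice])
# Deferring only comes out on top if U already ranks the humans' choice highest,
# i.e. if outer alignment is already solved - which is the point of the comment above.
```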
This ends up being pretty important in practice for decision markets (“if I choose to do X, will Y happen?”), where by default you might, e.g., only make a decision if it’s a good idea (as evaluated by the market), and therefore all traders will condition on the market having a high probability, which is obviously quite distortionary.
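A simplified sketch of the underlying selection effect (not a full model of trader behaviour, and the threshold and noise level are made up): if the decision is only taken when the market’s estimate is high, the estimates you actually act on systematically overstate how often Y happens.

```python
# Selection-effect sketch: act only when the noisy estimate clears a threshold,
# then compare the estimates you acted on with the realised outcomes.
import random

random.seed(0)
THRESHOLD = 0.7
executed_estimates, executed_outcomes = [], []

for _ in range(100_000):
    true_p = random.random()                                    # true P(Y | do X) for this case
    estimate = min(1.0, max(0.0, true_p + random.gauss(0, 0.15)))  # noisy market estimate
    if estimate > THRESHOLD:                                     # decision only made when the market says "good idea"
        executed_estimates.append(estimate)
        executed_outcomes.append(random.random() < true_p)

print("mean market estimate when executed: ", sum(executed_estimates) / len(executed_estimates))
print("actual frequency of Y when executed:", sum(executed_outcomes) / len(executed_outcomes))
# The first number comes out noticeably higher than the second; traders who know
# the rule will try to price this in, which is the distortion described above.
```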
I replied on Discord that I feel there’s maybe something more formalisable that’s like:
- reality runs on math because, and is the same thing as, there’s a generalised-state-transition function
- because reality has a notion of what happens next, realityfluid has to give you a notion of what happens next, i.e. it normalises
- the idea of a realityfluid that doesn’t normalise only comes to mind at all because you learned about R^n first in elementary school instead of S^n

which I do not claim confidently, because I haven’t actually generated that formalisation, and am posting here because maybe another LessWronger’s eyes on it will go “ah, but...”.
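One minimal way the “normalises” step might be written down, purely as my sketch (S, T, and μ are assumed notation, not an existing formalisation):

```latex
% Sketch only: realityfluid \mu as a normalised measure over states S.
\[ \mu : S \to [0,1], \qquad \sum_{s \in S} \mu(s) = 1 \]
% "Reality has a notion of what happens next": a generalised transition
% function T(s' \mid s), itself normalised, pushes \mu forward one step.
\[ (T_*\mu)(s') = \sum_{s \in S} T(s' \mid s)\,\mu(s), \qquad \sum_{s'} T(s' \mid s) = 1 \]
% Normalisation is then preserved automatically:
\[ \sum_{s'} (T_*\mu)(s') = \sum_{s \in S} \mu(s) \sum_{s'} T(s' \mid s) = 1 \]
```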
Not unexpected! I think we should want AGI to, at least until it has some nice coherent CEV target, explain at each self-improvement step exactly what it’s doing, to ask for permission for each part of it, to avoid doing anything in the process that’s weird, to stop when asked, and to preserve these properties.
Even more recently I bought a new laptop. This time, I made the same sheet, down-weighted the score from the hard drive (512 GB is enough for anyone, and that seemed intuitively like how much I prioritised extra hard drive space compared to RAM and processor speed), and then looked at the best laptop before sharply diminishing returns set in; this happened to be the HP ENVY 15-ep1503na 15.6″ Laptop—Intel® Core™ i7, 512 GB SSD, Silver. This is because I have more money now, so I was aiming to maximise consumer surplus rather than minimise the amount I was spending.[1]
Surprisingly, it came with a touch screen! That’s just the kind of nice thing that laptops do nowadays, because as I concluded in my post, everything nice about laptops correlates with everything else so high/low end is an axis it makes sense to sort things on. Less surprisingly, it came with a graphics card, because ditto.
Unfortunately this high-end laptop is somewhat loud; probably my next one will be less loud, up to and including an explicit penalty for noise.
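In spreadsheet-free form, the scoring method above is roughly the following (the laptops, component scores, weight, and prices are all made-up placeholders):

```python
# Illustration only: score each component, down-weight storage, sum, then pick
# the best option before the score-per-pound figure falls off sharply.
HARD_DRIVE_WEIGHT = 0.5  # placeholder for "how much I intuitively prioritise extra storage"

laptops = {
    "budget":   {"cpu": 5, "ram": 6, "ssd": 4, "price": 600},
    "mid":      {"cpu": 7, "ram": 7, "ssd": 6, "price": 900},
    "high_end": {"cpu": 9, "ram": 9, "ssd": 9, "price": 1600},
}

def score(spec):
    return spec["cpu"] + spec["ram"] + HARD_DRIVE_WEIGHT * spec["ssd"]

for name, spec in laptops.items():
    per_thousand = score(spec) / spec["price"] * 1000
    print(name, score(spec), f"({per_thousand:.1f} points per £1000)")
# Consumer-surplus maxxing = take the highest raw score before diminishing
# returns set in, rather than the cheapest laptop that clears some bar.
```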
- ^
It would have been predictable, however, at the time that I bought that new laptop, that I would have had that much money at a later date. Which means that I should have just skipped straight to consumer surplus maxxing.
It would be evidence at all. Simple explanation: if we did observe a glitch, that would pretty clearly be evidence we were in a simulation. So by conservation of expected evidence, non-glitches are evidence against.
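A worked example with made-up numbers, just to make the conservation-of-expected-evidence step concrete:

```python
# If a glitch would raise P(simulation), then each non-glitch must lower it a little.
p_sim = 0.10              # prior P(simulation) (arbitrary)
p_glitch_if_sim = 0.05    # chance of seeing a glitch if simulated (arbitrary)
p_glitch_if_real = 0.001  # chance of a glitch-looking observation if not (arbitrary)

p_glitch = p_sim * p_glitch_if_sim + (1 - p_sim) * p_glitch_if_real
p_sim_given_glitch = p_sim * p_glitch_if_sim / p_glitch
p_sim_given_no_glitch = p_sim * (1 - p_glitch_if_sim) / (1 - p_glitch)

print(p_sim_given_glitch)     # ~0.85: a glitch would be strong evidence for simulation
print(p_sim_given_no_glitch)  # ~0.096: so a non-glitch is (weak) evidence against
# Weighted by P(glitch), the two posteriors average back to the 0.10 prior:
# that's conservation of expected evidence.
```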
I don’t think it’s quite that; a more central example I think would be something like a post about extrapolating demographic trends to 2070 under the UN’s assumptions, where then justifying whether or not 2070 is a real year is kind of a different field.
argmax_x U(x), as a mathematical structure, is smarter than god and perfectly aligned to U; the value of x it returns will never actually be the one that maximises some other V, whether because V is more objectively rational, or because you made a typo and it knows you meant to say V; and no matter how complicated the mapping is from x to U(x), it will never fall short of giving the x that gives the highest value of U(x).
Which is why in principle you can align a superior being, like argmax_x U(x), or maybe like a superintelligence.
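A toy version of the same point (the candidate set and the utility functions are invented for illustration):

```python
# argmax is "perfectly aligned" to whatever U you hand it: it returns the x with
# the highest U(x), regardless of what any other V would prefer.
candidates = ["paperclip", "staple", "flourishing"]

U = {"paperclip": 1.0, "staple": 2.0, "flourishing": 3.0}  # the U you actually wrote down
V = {"paperclip": 9.0, "staple": 2.0, "flourishing": 1.0}  # some "more objectively rational" V

best = max(candidates, key=lambda x: U[x])
print(best)  # "flourishing": the U-maximiser, no matter what V says, and no matter
             # how complicated the mapping from x to U(x) happens to be.
```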
“The AI does our alignment homework” doesn’t seem so bad. I don’t have much hope for it, but that’s because it’s a prosaic alignment scheme, so someone trying to implement it can’t constrain where Murphy shows up, rather than because it’s an “incoherent path description”.
A concrete way this might be implemented is:
A language model is trained on a giant text corpus to learn a bunch of adaptations that make it good at math, and then fine-tuned for honesty. It’s still being trained at a safe and low level of intelligence where honesty can be checked, so this gets a policy that produces things that are mostly honest on easy questions and sometimes wrong and sometimes gibberish and never superhumanly deceptive.[1]
It’s set to work producing conceptually crisp pieces of alignment math, things like expected utility theory or logical inductors, slowly, on inspectable scratchpads and so on, with the dumbest model that can actually factor scientific research[1], and with human research assistants to hold its hand if that lets you make the model dumber. It does this rather than engineering because this kind of crisp alignment math is fairly uniquely pinned down, so it can be verified, and it’s easier to generate than any strong pivotal engineering task, where you’d be competing against humans on their own ground and so would need to be smarter than humans; so while it’s operating in a more dangerous domain, it’s using a safer level of intelligence.[1]
The human programmers then use this alignment math to make a corrigible thingy with dangerous levels of intelligence that does difficult engineering and doesn’t know about humans, this time knowing what they’re doing. Getting the crisp alignment math from parallelisable language models helps a lot and gives them a large lead time, because a lot of it is the alignment version of backprop: something that would otherwise have taken a surprising amount of time to discover.
This all happens at safe-ish, low-ish levels of intelligence (such a model would probably be able to autonomously self-replicate on the internet, but probably not reverse protein folding, which means that all the ways it could be dangerous are “well, don’t do that”s as long as you keep the code secret[1]), with the actual dangerous levels of optimisation being done by something the humans build using pieces of alignment math that are constrained down to a tiny number of possibilities.
EDIT 2023-07-25: A longer debate between Holden Karnofsky (pro) and Nate Soares (against), about the model that leads to it being an incoherent path description, which I think is worth reading, is here; I hadn’t read it as of writing this.
- ^
Unless it isn’t; it’s a giant pile of tensors, how would you know? But this isn’t special to this use case.
The alignment, safety, and interpretability work is continuing at full speed, but if all the efforts of the alignment community are only sufficient to avoid the destruction of the world by 2042, and AGI is created in 2037, then at the end you get a destroyed world.
It might not be possible in real life (List of Lethalities: “we can’t just decide not to build AGI”), and even if possible it might not be tractable enough to be worth focusing any attention on, but it would be nice if there were some way to make sure that AGI happens only after alignment, proceeding at full speed, is sufficient (EDIT: or, failing that, to make AGI happen later, so that if alignment goes quickly, that takes the world from bad outcomes to good outcomes instead of from bad outcomes to bad outcomes).
80,000 Hours’ job board lets you filter by city. As of the time of writing, roles in their AI Safety & Policy tag are 61⁄112 San Francisco, 16⁄112 London, 35⁄112 other (including remote).
There are about 8 billion people, so your 24,000 QALYs should be 24,000,000.
I don’t mean to say that it’s an additional reason to respect him as an authority or accept his communication norms above what you would have done for other reasons (and I don’t think people here particularly are doing that), just that it’s the meaning of that jokey aside.
Maybe you got into trouble for talking about that because you are rude and presumptuous?
I think this is just a nod to how he’s literally Roko, for whom googling “Roko simulation” gives a Wikipedia article on what happened last time.
What, I wonder, shall such an AGI end up “thinking” about us?
IMO: “Oh look, undefended atoms!” (Well, not in that format. But maybe you get the picture.)
You kind of mix together two notions of irrationality:
(1-2, 4-6) Humans are bad at getting what they want (they’re instrumentally and epistemically irrational)
(3, 7) Humans want complicated things that are hard to locate mathematically (the complexity of value thesis)
I think only the first one is really deserving of the name “irrationality”. I want what I want, and if what I want is a very complicated thing that takes into account my emotions, well, so be it. Humans might be bad at getting what they want, they might be mistaken a lot of the time about what they want and constantly step on their own toes, but there’s no objective reason why they shouldn’t want that.
Still, when up against a superintelligence, I think that both value being fragile and humans being bad at getting what they want count against humans getting anything they want out of the interaction:
Superintelligences are good at getting what they want (this is really what it means to be a superintelligence)
Superintelligences will have whatever goal they have, and I don’t think that there’s any reason why this goal would be anything to do with what humans want (the orthogonality thesis; the goals that a superintelligence has are orthogonal to how good it is at achieving them)
This together adds up to: a superintelligence sees humans using resources that it could be using for something else (and it would want to use them for something else, not just for what the humans are trying to do but for more, because it has its own goals), and because it’s good at getting what it wants, it gets those resources, which is very unfortunate for the humans.
Fast/Slow takeoff