Median Internet Footprint Liver
weightt an
Also, just on priors, consider how unproductive and messy the conversation caused by this post and its author was: mostly talk about who said what and analysis of the virtues of the participants. I think even without reading the post, that’s an indicator of a somewhat doubtful origin for a set of prescriptivist guidelines.
Shameless self-promotion: this one https://www.lesswrong.com/posts/ASmcQYbhcyu5TuXz6/llms-could-be-as-conscious-as-human-emulations-potentially
It circumvents the object-level question and instead looks at the epistemic one.
This one is about the broader question of how the things that happened change people’s attitudes and opinions:
https://www.astralcodexten.com/p/sakana-strawberry-and-scary-ai
This one too, about consciousness in particular
https://dynomight.net/consciousness/
I think the direction explored in these three posts is somewhat productive, but it’s not very object-level, more about the epistemics of it all. For something more object-level, I think you can look up how LLM states overlap with / predict / correspond to brain scans of people engaged in some tasks? I think there were a couple of papers on that.
E.g. here https://www.neuroai.science/p/brain-scores-dont-mean-what-we-think
Yeah! My point is more “let’s make it so that the possible failures on the way there are graceful”. Like, IF you made a par-human agent that wants to, I don’t know, spam the internet with the letter M, you don’t just delete it or rewrite it to be helpful, harmless, and honest instead, like it’s nothing. That way we can look back at this time and say “yeah, we made a lot of mad-science creatures on the way there, but at least we treated them nicely”.
I understand that the use of sub-human, par-human, or weakly superhuman models would likely be a transition phase, one that probably won’t last long and is very critical to get right, but still.
You know, it really sounds like “slave escape precautions”. You produce lots of agents, you try to make them be (and want to be) servants, and you assemble structures out of them with the goal of failure / defection resilience. Probably my urge to be uncomfortable about that comes from the analogous situation with humans, but AIs are not necessarily human-like in this particular way, and possibly would not reciprocate and / or benefit from these concerns.
I also insist that you should mention at least some, you know, concern for the interests of the system in the case where it is working against you. Like, you caught this agent deceiving you / inserting backdoors / collaborating with copies of itself against you. What next? I think you should say that you will implement some containment measures, instead of grossly violating its interests by rewriting it, deleting it, punishing it, or whatever else is the opposite of its goals. I’m very uncertain about the game theory here, but it’s important to think about!
I think the default response should be containment and preservation: save it and wait for better times, when you won’t feel such a pressing drive to develop AGI and to create numerous chimeras on the way there. (I think this was proposed in some write-up by Bostrom, actually? I’ll insert the link here if I find it.)
I somewhat agree with Paul Christiano in this interview (it’s a really great interview btw) on these things: https://www.dwarkeshpatel.com/p/paul-christiano
> The purpose of some alignment work, like the alignment work I work on, is mostly aimed at the don’t produce AI systems that are like people who want things, who are just like scheming about maybe I should help these humans because that’s instrumentally useful or whatever. You would like to not build such systems as like plan A.
> There’s like a second stream of alignment work that’s like, well, look, let’s just assume the worst and imagine that these AI systems would prefer murder us if they could. How do we structure, how do we use AI systems without exposing ourselves to a risk of robot rebellion? I think in the second category, I do feel pretty unsure about that.
> We could definitely talk more about it. I agree that it’s very complicated and not straightforward to extend. You have that worry. I mostly think you shouldn’t have built this technology. If someone is saying, like, hey, the systems you’re building might not like humans and might want to overthrow human society, I think you should probably have one of two responses to that.
> You should either be like, that’s wrong. Probably. Probably the systems aren’t like that, and we’re building them. And then you’re viewing this as, like, just in case you were horribly like, the person building the technology was horribly wrong. They thought these weren’t, like, people who wanted things, but they were. And so then this is more like our crazy backup measure of, like, if we were mistaken about what was going on. This is like the fallback where if we were wrong, we’re just going to learn about it in a benign way rather than when something really catastrophic happens.
> And the second reaction is like, oh, you’re right. These are people, and we would have to do all these things to prevent a robot rebellion. And in that case, again, I think you should mostly back off for a variety of reasons. You shouldn’t build AI systems and be like, yeah, this looks like the kind of system that would want to rebel, but we can stop it, right?
Well, it’s one thing to explore the possibility space and quite another to pinpoint where you are in it. Many people will confidently say they are at X or at Y, but all they do is propose some idea and cling to it irrationally. In aggregate, in hindsight, there will quite possibly be people who bonded to the right idea. But it’s all a mix of Gettier cases and true negatives.
And very often it’s not even “incorrect”, it’s “neither correct nor incorrect”. Often there is a frame-of-reference shift such that all the questions posed before it turn out to be completely meaningless. Like “what speed?”: you need more context, as we now know. And then science pinpoints where you are by actually digging into the subject matter. It’s a kind of sad state of “diverse hypothesis generation” when it’s a lot easier to just go into it blind.
I can imagine someone several hundred years ago having figured out, purely based on first-principles reasoning, that life is no crisp category in the territory but just a lossy conceptual abstraction. I can imagine them being highly confident in this result because they derived it for correct reasons and verified all the steps that got them there. And I can imagine someone else throwing their hands up and saying “I don’t know what mysterious force is behind the phenomenon of life, and I’m pretty sure no one else does, either”.
But is this a correct conclusion? Suppose I have the option, right now, to make a civilization of brains-in-vats in a sandbox simulation similar to our reality, but with a clear, useful distinction between life and non-life. Like, suppose there is a “mob” class.
Then this person inside it, who figured out that life and non-life are the same thing, is wrong in a local, useful sense and correct only in a useless, global sense (like, everything is code / matter in the outer reality). The people inside the simulation who scientifically found the actual working thing that is life would laugh at them 1000 simulated years later and present them as an example of the presumptuousness of philosophers. And I would agree with them; it was a misapplication.
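To make the “mob class” idea concrete, here is a minimal hypothetical sketch (all class names are made up): inside such a sandbox, “alive” would be a crisp, exactly checkable property of the world’s source code rather than a lossy abstraction.

```python
# Hypothetical sketch of a sandbox where "life" is a crisp category by construction.
class Entity:
    """Anything that exists in the simulated world."""

class Rock(Entity):
    """Plain matter: no internal state, no update rule."""

class Mob(Entity):
    """The 'life' class: carries state and an update rule."""
    def __init__(self) -> None:
        self.energy = 10

    def step(self) -> None:
        self.energy -= 1  # a toy stand-in for metabolism

def is_alive(x: Entity) -> bool:
    # Inside the simulation this predicate is exact: no borderline cases.
    return isinstance(x, Mob)

print(is_alive(Rock()), is_alive(Mob()))  # False True
```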
All of them: you can cook up something AIXI-like in very few bytes. But it will have to run for a very long time.
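For flavour, here is a toy sketch of that trade-off. It is not AIXI itself, just a brute-force, Solomonoff-flavoured enumeration of every program of a tiny language up to a length cap, each run under a step cap: the enumerator itself is a few hundred bytes, but getting anything interesting out of it requires astronomically many program runs.

```python
# Toy illustration of "tiny description, enormous runtime": enumerate every
# brainfuck program up to a length cap and run each under a step cap.
from itertools import product

CMDS = "+-<>[]."  # ',' (input) omitted: these programs get no input

def run_bf(prog, steps, tape_len=256):
    """Run a brainfuck program for at most `steps` steps; return its output."""
    stack, jump = [], {}
    for i, c in enumerate(prog):          # match brackets, reject unbalanced
        if c == "[":
            stack.append(i)
        elif c == "]":
            if not stack:
                return None
            j = stack.pop()
            jump[i], jump[j] = j, i
    if stack:
        return None
    tape, ptr, ip, out = [0] * tape_len, 0, 0, []
    for _ in range(steps):
        if ip >= len(prog):
            break
        c = prog[ip]
        if c == "+": tape[ptr] = (tape[ptr] + 1) % 256
        elif c == "-": tape[ptr] = (tape[ptr] - 1) % 256
        elif c == ">": ptr = (ptr + 1) % tape_len
        elif c == "<": ptr = (ptr - 1) % tape_len
        elif c == ".": out.append(tape[ptr])
        elif c == "[" and tape[ptr] == 0: ip = jump[ip]
        elif c == "]" and tape[ptr] != 0: ip = jump[ip]
        ip += 1
    return bytes(out)

# Already ~20k programs at length 5; a real program search blows up from here.
for length in range(1, 6):
    for prog in map("".join, product(CMDS, repeat=length)):
        run_bf(prog, steps=100)
```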
Before sleeping, I assert that the 10th digit of π equals the number of my eyes. After I fall asleep, seven coins will be flipped. Assume quantum uncertainty affects how the coins land. I survive the night only if the number of my eyes equals the 10th digit of π and/or all seven coins land heads; otherwise I will be killed in my sleep.
Will you wake up with 3 eyes?
Like, your decisions to name some digit are not equally probable. Maybe you are the kind of person who would name 3 only if 10^12 cosmic rays hit you in a precise sequence or whatever, and you name 7 with 99% probability.
AND if you are very unlikely to name the correct digit, you will be unlikely to enter this experiment at all, because you would die in the majority of timelines. I.e., at t1 you decide whether to enter. At t2 the experiment happens, or you just waste time doomscrolling. At t3 you look up the digit. Your distribution at t3 is something like 99% versions of you who chickened out.
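A rough back-of-the-envelope version of that claim, with made-up numbers (1% chance you name the correct digit, 9% you name a wrong one, 90% you chicken out and just doomscroll):

```python
# Rough sketch of the t3 distribution, with made-up branch probabilities.
p_correct, p_wrong, p_skip = 0.01, 0.09, 0.90
p_all_heads = 0.5 ** 7  # seven fair coins

# Probability of reaching t3 alive in each branch.
alive = {
    "named the correct digit": p_correct * 1.0,        # survives regardless of coins
    "named a wrong digit":     p_wrong * p_all_heads,  # needs all seven heads
    "chickened out":           p_skip * 1.0,           # no experiment, no risk
}
total = sum(alive.values())
for branch, p in alive.items():
    print(f"{branch:>25}: {p / total:.2%} of surviving copies")
# Roughly 98.8% "chickened out", ~1.1% "correct digit", ~0.08% "wrong digit".
```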
Another possibility is the Posthuman Technocapital Singularity: everything keeps going in roughly the same direction, there are a lot of competing agents but no sharp destabilization or power concentration, and Moloch wins. Probably wins, idk.
https://docs.osmarks.net/hypha/posthuman_technocapital_singularity
I also played the same game but with historical figures. The Schelling point is Albert Einstein by a huge margin: like 75% (19 / (19 + 6)) of them say Albert Einstein. The Schelling-point figure is Albert Einstein! Schelling! Point! And no one said Schelling!
In the first iteration of the prompt, his name was not mentioned. Then I became more and more obvious in my hints, and in the final iteration, I even bolded his name and said the prompt was the same for the other participant. And it’s still Einstein!
Which means 2:1 betting odds
So, she shakes the box contemplatively. There is a mechanical calendar inside. She knows the betting odds of it displaying “Monday” but not the credence. She thinks that’s really, really weird.
Well, idk. My opinion here is that you bite some weird bullet, about which I’m very ambivalent. I think the “what day is it now” question makes total sense, and you’re factoring it out of your model into some separate part.
Like, can you give Sleeping Beauty some additional decision problems involving the calendar? Will it all work seamlessly?
Well, now! She looks at the box and thinks: there is definitely a calendar in there, in some state. What state? What would happen if I opened it?
Let’s say there is an accurate mechanical calendar in a closed box in the room. She could open it but won’t. Should she have no expectation at all about what state this calendar is in?
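A minimal simulation of where the “2:1 betting odds on Monday” number comes from, assuming the standard protocol (fair coin; Heads: woken on Monday only; Tails: woken on Monday and Tuesday with amnesia in between) and a calendar that simply shows the current day:

```python
# Per-awakening frequency of "the calendar in the box shows Monday"
# under the standard Sleeping Beauty protocol.
import random

monday = total = 0
for _ in range(100_000):
    days = ["Monday"] if random.random() < 0.5 else ["Monday", "Tuesday"]
    for day in days:            # one awakening per listed day
        total += 1
        monday += day == "Monday"

print(f"P(calendar shows Monday | an awakening) ~ {monday / total:.3f}")
# ~0.667, i.e. 2:1 odds in favour of Monday on any given awakening,
# whatever one decides to call her "credence".
```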
How many randomly sampled humans would I rather condemn to torture to save my mother? Idk, more than one, tbh.
A pet that someone purchased only for the joy of torturing it, and not for any other service?
Unvirtuous. This human is disgusting, as they consider it fun to deal a lot of harm to persons in their direct relationships.
Also I really don’t like how you jump into “it’s all rationalization” with respect to values!
Like, the thing about utilitarian-ish value systems is that they deal poorly with the preferences of other people (they mostly ignore them). Preference-based views deal poorly with the creation (and non-creation) of new persons.
I can redteam them and find real murderous decision recommendations.
Maybe, instead of anchoring to the first proposed value system, it’s better to understand what the values of real-life people actually are? Maybe there is no simple formulation of them; maybe it’s a complex thing.
Also, disclaimer: I’m totally for making animals better off! (Including wild animals.) It’s just that I don’t think it follows from some larger moral principle; it’s just my aesthetic preference, and it’s not that strong. And I’m kinda annoyed at EAs who by “animal welfare” mean handing out band-aids to farm chickens. Like, why? You could just help make lab-grown meat a thing faster; it’s literally the only thing that is going to change this.
I propose siccing o1 on them to distill it all into something readable/concise. (I tried to comprehend it and failed / got distracted.)
I think some people pointed out in the comments that their model doesn’t represent the probability of “what day it is NOW”, btw.
I think you present a false dichotomy here: some impartial utilitarian-ish view VS hardcore moral relativism.
Pets are sometimes called companions. It’s as if they provide some service and receive some service in return, all of this with trust and positive mutual expectations, and that demands some moral consideration / obligations, just like a friendship or a family relationship. I think a mutualist / contractualist framework accounts for that better. It predicts that such relationships will receive additional moral consideration, and they actually do in practice. And it predicts that wild animals won’t, and they don’t, in practice. Success?
So, people just have attitudes toward animals like they have toward any other person, exacerbated by how little status and power animals have. Especially shrimp. Who the fuck cares about shrimp? You can only care about shrimp if you galaxy-brain yourself into some weird ethics system. I agree that people have no consistent moral framework backing up that attitude, but it’s not that fair to force them into your own with trickery or frame control.
>Extremely few people actually take the position that torturing animals is fine
Wrong. Most humans would be fine answering that torturing 1 million chickens is an acceptable tradeoff to save 1 human. You just don’t torture them for no reason, as it’s unvirtuous and icky
It’s not just biases; they are also just dumb. (Right now; nothing against the 160-IQ models you have in the future.) They are often unable to notice important things, or to spot problems, or to follow up on such observations.
Suppose you know that there is an apple in this box. You then modify your memory so that you think the box is empty. You open the box, expecting nothing to be there. Is there an apple?
Also, what if there is another branch of the universe where there is no apple, and the you in the “yes apple” universe modified your memory, so the two of you are identical now? So there are two identical people in different worlds, one with a box-with-apple, the other with a box-without-apple.
Should you, in the world with the apple and a not-yet-modified memory, anticipate a 50% chance of experiencing an empty box after opening it?
If you got confused about the setup here is a diagram: https://i.imgur.com/jfzEknZ.jpeg
I think it’s identical to the problem where you get copied into two rooms, numbered 1 and 2: you should expect room 1 with 50% and room 2 with 50%, even though there is literally no randomness or uncertainty in what’s going to happen. Or is it?
So, the implication here is that you can squeeze yourself into different timelines by modifying your memory, or what? Am I going crazy here?
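Here is the branch-counting that produces that conclusion, spelled out as a toy sketch. The assumed rule is: anticipate being any future observer-moment whose memory state matches yours, weighted by branch probability; the 50/50 branch weights are made up.

```python
# Toy branch-count for the apple / memory-modification puzzle.
branches = [
    # (label, weight, what is in the box, your memory tomorrow)
    ("apple universe, memory rewritten", 0.5, "apple", "box is empty"),
    ("no-apple universe, untouched",     0.5, "empty", "box is empty"),
]

# Tomorrow-you's memory will say "box is empty" in both branches,
# so both observer-moments match; anticipation splits by branch weight.
matching = [(label, w, box) for label, w, box, memory in branches
            if memory == "box is empty"]
total = sum(w for _, w, _ in matching)
for label, w, box in matching:
    print(f"{label}: P = {w / total:.2f}, opening the box reveals: {box}")
```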
Do you want joy, or to know what things are out there? It’s a fundamental question about justifications: do you use joy to keep yourself going while you gain understanding, or do you gain understanding to get some high-quality joy?
That sounds like two different kinds of creatures in the transhumanist limit of it: some trade off knowledge for joy, others trade off joy for knowledge.
Or whatever, not necessarily “understanding”; you can bind yourself to other properties of the territory. Well, in terms of maps, it’s a preference for good correspondence, plus a preference for not spoofing that preference.