In another life I wrote Gemini Song and The Coven series. Now I’m working on the pause.
Bridgett Kay
“(I typically closed my eyes briefly and generated a small pleasurable feeling.)”—could you explain a little further how to do this? I’m not good at this at all and I think it would be an extremely useful skill for me. Apologies if you’ve answered this elsewhere.
“Future progress is a part of current human values,” of course- the danger lies in the “future” always being just that: the future. One would naturally hope it wouldn’t go this way, but a possible outcome is that the future is continually put off, because now is always the present. It can be a struggle even with current models to get them to generate novel ideas, because of a stubborn refusal to say anything for which there is not yet evidence.
Thank you for that criticism- I hadn’t necessarily given that point enough thought, and I think I am starting to see where the weaknesses are.
Yeah- calling myself a failed scifi writer really was half in jest- I had some very limited success as an indie writer for a good number of years, and necessity has recently made me shift direction. Thank you for the encouragement, though!
“If your conclusion is that we don’t know how to do value alignment, I and I think most alignment thinkers would agree with you. If the conclusion is that AGI is useless, I don’t think it is at all.”
Sort of- I worry that it may be practically impossible for current humans to align AGI to the point of usefulness.
“If we had external help that allowed us to focus more on what we truly want—like eliminating premature death from cancer or accidents, or accelerating technological progress for creative and meaningful projects—we’d arrive at a very different future. But I don’t think that future would be worse; in fact, I suspect it would be significantly better.”
That’s my intuition and hope- but I worry that these things are causally entangled with things we don’t anticipate. To use your example: suppose we only ask an aligned and trusted AGI to cure premature death from disease and accidents- which wouldn’t greatly conflict with most people’s values the way radical life extension would- and the sudden loss of an entire healthcare and insurance industry triggers an economic collapse so total that vast swaths of people starve. (I don’t think this would actually happen, but it’s an example of the kind of unforeseen consequence that a suddenly granted wish may cause, when you ask an instruction-following AGI for something without a greater intelligence to project and weigh all of the consequences.)
I also worry about the phrase “a human you trust.”
Again- this feels like cynicism, if not the product of a catastrophizing mind (which I know I have). I think you make a very good argument- I’m probably indulging too much in black-and-white thinking- that there’s a way to fulfill these desires quickly enough that we relieve more suffering than we would have if left to our own devices, but still slowly enough to monitor for unforeseen consequences. Maybe the bigger question is just whether we will.
Does anyone have any generally helpful advice for someone who doesn’t really get vibes? Should I just continue to be more timid than normal, or is there a helpful heuristic I can use (aside from the ‘don’t talk to strangers, don’t join MLMs, be wary of things that seem too good to be true’ advice our parents give us at a young age)?
1991/1992, actually (Harry Potter was born in July 1980, and the story takes place during the school year after his 11th birthday).
Seems to me that the only winning move is not to play.
This might be our third, fourth, fifth… nth chance.
Thank you.
This is legitimate- the definition of weirdness was kept open-ended. I intended weirdness to be any behavior that is divergent from what most in a certain group considers to be the status quo, but even within a group, each member may have a different definition of what weird behavior is, and a consensus will be difficult to pin down.
I would consider rudeness to be weird behavior under this definition. It is a social behavior that comes with the cost of disrupting social cohesion. What counts as rude, versus frank and straightforward, will vary from person to person even within a group, and may change over time as people in the group weigh the harm of the behavior against the social cost of ostracizing the individual who engages in it. For example, cursing was considered much more rude by my parents’ generation than by the current one. It took time and discourse for the status quo to change, and for people to decide that cursing is less harmful than was once imagined.
As for whether I’m trying to excuse my character flaws, that may well be the case. In learning how to more effectively examine the costs and benefits of my behavior, I hope to recognize what is a flaw, and what is not, and to mend the former.
We don’t know how to align asteroids’ trajectories, so it’s important to use smaller asteroids to align larger ones- like a very large game of amateur billiards.
I love this! But I find myself a little disappointed there’s not a musical rendition of the “I have been a good bing” dialogue.
As one scales up a system, any small misalignment within that system becomes more apparent- more skewed. I use shooting an arrow as an example. Say you shoot at a target from only a few feet away. If your aim is only a few degrees off from the bullseye, your arrow will still land very close to it. However, if you shoot at a target many yards away with the same degree of error, your arrow will land much, much farther from the bullseye.
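The arithmetic behind the analogy is simple: the miss distance grows linearly with range, offset = distance × tan(error angle). Here is a minimal sketch of that relationship- the 2° error and the distances are illustrative numbers of my own, not anything from the post:

```python
import math

# Miss distance for a fixed aiming error at increasing ranges:
# offset = distance * tan(error_angle)
error_degrees = 2.0  # "a few degrees off" from the bullseye line

for distance_ft in (5, 30, 100, 300):
    offset_ft = distance_ft * math.tan(math.radians(error_degrees))
    print(f"{distance_ft:>4} ft -> misses by {offset_ft:5.2f} ft")
```

At 5 feet a 2° error misses by about two inches; at 300 feet the very same error misses by more than 10 feet.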
So if you get a less powerful AI aligned with your goals to the degree that everything looks fine, and then assign it the task of aligning a much more powerful AI, any small flaw in the less powerful AI’s alignment will go far more askew in the more powerful one. What’s worse- since you assigned the less powerful AI the task of aligning the larger AI, you won’t be able to see exactly what the flaw was until it’s too late, because if you’d been able to see the flaw, you would have aligned the larger AI yourself.
That seems fairly consistent with what happened to me. I did not experience my entire life in the dream- just the swim meet and the aftermath- and my memories were things I summoned in the moment, like coming up with small pieces of a story in real time. The thing that disturbed me most wasn’t living another life- though that was disturbing enough- but the fact that a character in the dream knew a truth that “I” did not.
I have a similar trick I use with pirouettes- if I can turn and turn without stopping, then it is a dream. Of course, in this dream, I was not a dancer and had never danced, so I didn’t even think of it.
Lately I’ve been appreciating, more and more, something I’m starting to call “Meta-Alignment.” With everything that touches AI, we have to make sure that thing is aligned well enough that it won’t mess up or “misalign” the alignment project itself. For example, we need to be careful about the discourse surrounding alignment, because we might give the wrong idea to people who will vote on policy or work in AI or AI-adjacent fields themselves. Policy, too, needs to be carefully aligned, so it doesn’t create misaligned incentives that mess up the alignment project; the same goes for the policies of companies that work with AI. This is probably a statement of the obvious, but it is a daunting prospect the more I think about it.
I was just wondering, on the subject of research debt, whether there is any sort of system that lets people “adopt” the posts of others. Say someone posts an interesting idea that they don’t have the time to polish or expand upon- they could post it somewhere for people who can.
Yeah- the experience really shook me. I’m prone to fairly vivid and interesting dreams, but this was definitely the strangest.
But this was the final trick, for as soon as Maxwell accepted the two million dollars, the simulation ended.
Thank you! I’ll work on that and see if I have any other questions.