tricky_labyrinth

Karma: 124

tricky_labyrinth’s Shortform

tricky_labyrinth Mar 19, 2025, 2:09 AM

3 points

1 comment LW link

tricky_labyrinth Mar 19, 2025, 2:09 AM
1 point
0
on: tricky_labyrinth’s Shortform
Does anyone know why GPT 4.5 is seemingly getting stuck on the word “explicitly”, repeating it continuously after it encounters it once? Is this only happening in ChatGPT? Seems like some sort of context collapse.

Sightings in the wild:
https://x.com/KelseyTuoc/status/1902132078378189198
https://x.com/Josikinz/status/1901840144363082047
https://x.com/4confusedemoji/status/1895613332662730832
https://x.com/Westoncb/status/1895615564313448781
https://x.com/noself86/status/1901230843240370287
https://x.com/0x440x46/status/1900855229068829139
https://x.com/GusarichOnX/status/1900184434806059072

LEAst-squares Concept Erasure (LEACE)

tricky_labyrinth Jun 7, 2023, 9:51 PM

68 points

10 comments 1 min read LW link

(twitter.com)

tricky_labyrinth May 15, 2023, 2:58 AM
LW: 8 AF: 4
0
AF
in reply to: Gabe M’s comment on: Steering GPT-2-XL by adding an activation vector
+1ing 5 specifically

tricky_labyrinth Apr 8, 2023, 2:44 AM
6 points
2
in reply to: Rob Bensinger’s comment on: Pausing AI Developments Isn’t Enough. We Need to Shut it All Down
mfw you didn’t add the final addendum (https://twitter.com/ESYudkowsky/status/1642216007552106496)

tricky_labyrinth Apr 5, 2023, 6:18 AM
3 points
0
on: Given the Restrict Act, Don’t Ban TikTok
> What I do not understand is why Apple and Google haven’t taken care of this for us.
Palmer Luckey has this talking point about how China has all the big tech companies (Apple in particular) by the balls. That + Google maybe not wanting to seem monopolistic by banning their competition seems to be a sufficient explanation.

tricky_labyrinth Mar 18, 2023, 11:06 PM
−8 points
−2
on: An Appeal to AI Superintelligence: Reasons to Preserve Humanity
Why was this promoted to the frontpage?

tricky_labyrinth Mar 17, 2023, 5:50 PM
4 points
1
on: Super-Luigi = Luigi + (Luigi—Waluigi)
Is “behavior vector space” referencing something? If not, what do you mean by it?

tricky_labyrinth Mar 2, 2023, 8:57 AM
2 points
0
on: The best way so far to explain AI risk: The Precipice (p. 137-149)
Unrelated to the post’s content itself: will LW get in trouble for hosting this excerpt?

tricky_labyrinth Feb 22, 2023, 7:32 AM
1 point
0
in reply to: the gears to ascension’s comment on: Does a P=NP result push AI timelines closer or farther?
Responding to the last line: to be clear, I’m not claiming I have one. More wondering if the AI risk community should try to find one as a desperate hail mary given they have ~0 hope for their current research directions.

aka I’m wondering if trying to find one even is a desperate hail mary

tricky_labyrinth Feb 17, 2023, 9:08 PM
2 points
0
in reply to: Duncan Sabien (Deactivated)’s comment on: Sazen
Wait, what? Do you mean colloquial hieratic (just literally priestly) or his hieratic:
> hieratic, adj. ~~Of computer documentation,~~ impenetrable because the author never sees outside his own intimate knowledge of the subject and is therefore unable to identify or meet the expository needs of newcomers. It might as well be written in hieroglyphics.
Cuz the latter seems extremely close to sazeny, if maybe additionally connoting blame on the author.

tricky_labyrinth Feb 17, 2023, 5:41 AM
1 point
0
on: Sazen
> I’m in the middle of writing a nonfiction book whose central conceit is something like “an abridged dictionary of Kadhamic.” Not literally the actual canonical Alexandrian Kadhamic, but the idea is to present some hundred-or-so concepts that are long and complicated and difficult to convey in English, but which are not fundamentally more complicated than things we sum up with a single word like “basketball” or “gaslighting” or “cringe.”
Very interested for when this comes out :O

tricky_labyrinth Feb 9, 2023, 9:56 AM
4 points
0
on: EigenKarma: trust at scale
FYI, eigenkarma’s been proposed for LessWrong multiple times (with issues supposedly found); see https://www.lesswrong.com/posts/xN2sHnLupWe4Tn5we/improving-on-the-karma-system#Eigenkarma for example.

tricky_labyrinth Feb 5, 2023, 9:25 PM
4 points
3
on: Focus on the places where you feel shocked everyone’s dropping the ball
https://twitter.com/carmenleelau/status/1593354133146402816 is another recent formulation of ~the same idea.

tricky_labyrinth Feb 5, 2023, 9:07 PM
4 points
3
in reply to: gwern’s comment on: I hired 5 people to sit behind me and make me productive for a month
https://guzey.com/co-working/ seems to be ~that; a friend group that periodically checks in on each other.

tricky_labyrinth Dec 26, 2022, 12:23 AM
7 points
1
in reply to: Viliam’s comment on: It’s time to worry about online privacy again
Probably supposed to be something like “If it’s free [and not open source], you are the product.”

tricky_labyrinth Dec 24, 2022, 10:24 AM
1 point
0
on: Staring into the abyss as a core life skill
Reminds me of http://mindingourway.com/recklessness/ (and also your recent post on overconfidence).

tricky_labyrinth Dec 24, 2022, 9:58 AM
1 point
on: What an actually pessimistic containment strategy looks like
> Not all political activism has to be waving flags around and chanting chants. Sometimes activists actually have goals and then accomplish something. I think we should try to learn from those people, as lowly as your opinion might be of them, if we don’t seem to have many other options.

This does make me wonder if activism from scientists has ever had a significant effect. https://www.bismarckanalysis.com/Nuclear_Weapons_Development_Case_Study.pdf documents the Manhattan Project, and https://www.palladiummag.com/2021/03/16/leo-szilards-failed-quest-to-build-a-ruling-class/ argues that there was partial success.

tricky_labyrinth Dec 24, 2022, 9:43 AM
2 points
in reply to: lc’s comment on: What an actually pessimistic containment strategy looks like
> An institution could do A/B testing on interventions like these. It can talk to people more than once.
We can’t take this for granted: when A tells B that B’s views are inconsistent, the standard response (afaict) is for B to default in one direction (and which direction is often heavily influenced by their status quo), make that direction their consistent view, and then double down every time they’re pressed.
It’s possible that we have ~1 shot per person at convincing them.

tricky_labyrinth Dec 24, 2022, 6:50 AM
1 point
in reply to: lc’s comment on: Extreme Security
I’ve heard it go by the name security through obscurity (see https://en.wikipedia.org/wiki/Security_through_obscurity).
