Zack_M_Davis

Karma: 16,419

Zack_M_Davis Apr 19, 2025, 3:47 PM
6 points
0
in reply to: Sergii’s comment on: Sergii’s Shortform
(Previous commentary and discussion.)

Zack_M_Davis Apr 9, 2025, 10:15 PM
23 points
6
in reply to: johnswentworth’s comment on: Navigation by Moonlight
I’m again not sure how far this generalizes, but among the kind of men who read Less Wrong (which is a product of both neurotype and birth year), I think there’s a phenomenon where it’s not a matter of a man being cognitively unable to pick up on women’s cues, but of not being prepared to react in a functional way due to having internalized non-adaptive beliefs about the nature of romance and sexuality. (In a severe case, this manifests as the kind of neurosis described in Comment 171, but there are less severe cases.)

I remember one time from my youth where a woman was flirting with me in an egregiously over-the-top way that was impossible to not notice, but I just—pretended to ignore it? Not knowing what was allowed, it was easier to just do nothing. And that case was clearly not a good match, but that’s not the point—I somehow didn’t think through the obvious logic that if “yang doesn’t step up”, then relationships just don’t happen.

Zack_M_Davis Mar 24, 2025, 7:13 AM
9 points
12
in reply to: johnswentworth’s comment on: johnswentworth’s Shortform
Not sure how much this generalizes to everyone, but part of the story (for either the behavior or the pattern of responses to the question) might that some people are ideologically attached to believing in love: that women and men need each other as a terminal value, rather than just instrumentally using each other for resources or sex. For myself, without having any particular empirical evidence or logical counterargument to offer, the entire premise of the question just feels sad and gross. It’s like you’re telling me you don’t understand why people try to make ghosts happy. But I want ghosts to be happy.

Zack_M_Davis Feb 4, 2025, 7:03 AM
9 points
4
in reply to: GeneSmith’s comment on: Yudkowsky on The Trajectory podcast

No one with the money has offered to fund it yet. I’m not even sure they’re aware this is happening.

Um, this seems bad. I feel like I should do something, but I don’t personally have that kind of money to throw around. @habryka, is this the LTFF’s job??

Zack_M_Davis Jan 13, 2025, 11:40 PM
11 points
0
on: (The) Lightcone is nothing without its people: LW + Lighthaven’s big fundraiser
(I gave $2K.)

Zack_M_Davis Jan 13, 2025, 4:35 PM
6 points
1
in reply to: khafra’s comment on: Alignment Implications of LLM Successes: a Debate in One Act
Simplicia: But how do you know that? Obviously, an arbitrarily powerful expected utility maximizer would kill all humans unless it had a very special utility function. Obviously, there exist programs which behave like a webtext-next-token-predictor given webtext-like input but superintelligently kill all humans on out-of-distribution inputs. Obviously, an arbitrarily powerful expected utility maximizer would be good at predicting webtext. But it’s not at all clear that using gradient descent to approximate the webtext next-token-function gives you an arbitrarily powerful expected utility maximizer. Why would that happen? I’m not denying any of the vNM axioms; I’m saying I don’t think the vNM axioms imply that.

Zack_M_Davis Jan 13, 2025, 7:49 AM
8 points
−2
on: A Hill of Validity in Defense of Meaning
(Self-review.) I think this pt. 2 is the second most interesting entry in my Whole Dumb Story memoir sequence. (Pt. 1 deals with more niche psychology stuff than the philosophical malpractice covered here; pt. 3 is a more of a grab-bag of stuff that happened between April 2019 and January 2021; pt. 4 is the climax. Expect the denouement pt. 5 in mid-2025.)

I feel a lot more at peace having this out there. (If we can’t have justice, sanity, or language, at least I got to tell my story about trying to protect them.)

The 8 karma in 97 votes is kind of funny in how nakedly political it is. (I think it was higher before the post got some negative attention on Twitter.)

Given how much prereading and editing effort had already gone into this, it’s disappointing that I didn’t get the ending right the first time. (I ended up rewriting some of the paragraphs at the end after initial publication after it didn’t land in the comments section the way I wanted it to land.)

Subsection titles would have also been a better choice for such a long piece (which was rectified for the publication of pt.s 3 and 4); I may still yet add them.

Zack_M_Davis Jan 13, 2025, 7:05 AM
43 points
0
on: Alignment Implications of LLM Successes: a Debate in One Act
(Self-review.) I’m as proud of this post as I am disappointed that it was necessary. As I explained to my prereaders on 19 October 2023:

My intent is to raise the level of the discourse by presenting an engagement between the standard MIRI view and a view that’s relatively optimistic about prosaic alignment. The bet is that my simulated dialogue (with me writing both parts) can do a better job than the arguments being had by separate people in the wild; I think Simplicia understands things that e.g. Matthew Barnett doesn’t. (The karma system loved my dialogue comment on Barnett’s post; this draft is trying to scale that up.)

I’m annoyed at the discourse situation where MIRI thinks we’re dead for the same fundamental reasons as in 2016, but meanwhile, there are a lot of people who are looking at GPT-4, and thinking, “Hey, this thing seems pretty smart and general and good at Doing What I Mean, in contrast to how 2016-era MIRI said that we didn’t know how to get an agent to fill a cauldron; maybe alignment is easy??”—to which MIRI’s response has been (my uncharitable paraphrase), “You people are idiots who didn’t understand the core arguments; the cauldron thing was a toy illustration of a deep math thing; we never said Midjourney can’t exist”.

And just, I agree that Midjourney doesn’t refute the deep math thing and the people who don’t realize that are idiots, but I think the idiots deserve a better response!—particularly insofar as we’re worried about transformative AI looking a lot like the systems we see now, rather than taking a “LLMs are nothing like AGI” stance.

Simplicia isn’t supposed to pass the ITT of anyone in particular, but if the other character [...] doesn’t match the MIRI party line, that’s definitely a serious flaw that needs to be fixed!

I think the dialogue format works particularly well in cases like this where the author or the audience is supposed to find both viewpoints broadly credible, rather than an author avatar beating up on a strawman. (I did have some fun with Doomimir’s characterization, but that shouldn’t affect the arguments.)

This is a complicated topic. To the extent that I was having my own doubts about the “orthodox” pessimist story in the GPT-4 era, it was liberating to be able to explore those doubts in public by putting them in the mouth of a character with the designated idiot character name without staking my reputation on Simplicia’s counterarguments necessarily being correct.

Giving both characters perjorative names makes it fair. In an earlier draft, Doomimir was “Doomer”, but I was already using the “Optimistovna” and “Doomovitch” patronymics (I had been consuming fiction about the Soviet Union recently) and decided it should sound more Slavic. (Plus, “-mir” (мир) can mean “world”.)

Zack_M_Davis Jan 13, 2025, 12:51 AM
30 points
11
on: Shutting Down the Lightcone Offices
Retrospectives are great, but I’m very confused at the juxtaposition of the Lightcone Offices being maybe net-harmful in early 2023 and Lighthaven being a priority in early 2025. Isn’t the latter basically just a higher-production-value version of the former? What changed? (Or after taking the needed “space to reconsider our relationship to this whole ecosystem”, did you decide that the ecosystem is OK after all?)

Zack_M_Davis Jan 13, 2025, 12:12 AM
8 points
9
on: Why and How to Graduate Early [U.S.]
Speaking as someone in the process of graduating college fifteen years late, this is what I wish I knew twenty years ago. Send this to every teenager you know.

Zack_M_Davis Jan 12, 2025, 10:30 PM
4 points
2
on: Enemies vs Malefactors
At the time, I remarked to some friends that it felt weird that this was being presented as a new insight to this audience in 2023 rather than already being local conventional wisdom.^[1] (Compare “Bad Intent Is a Disposition, Not a Feeling” (2017) or “Algorithmic Intent” (2020).) Better late than never!
1. ↩︎
  The “status” line at the top does characterize it as partially “common wisdom”, but it’s currently #14 in the 2023 Review 1000+ karma voting, suggesting novelty to the audience.

Zack_M_Davis Jan 11, 2025, 7:41 AM
6 points
0
in reply to: gwern’s comment on: Comment on “Death and the Gorgon”
But he’s not complaining about the traditional pages of search results! He’s complaining about the authoritative-looking Knowledge Panel to the side:

Obviously it’s not Google’s fault that some obscure SF web sites have stolen pictures from the Monash University web site of Professor Gregory K Egan and pretended that they’re pictures of me … but it is Google’s fault when Google claim to have assembled a mini-biography of someone called “Greg Egan” in which the information all refers to one person (a science fiction writer), while the picture is of someone else entirely (a professor of engineering). [...] this system is just an amateurish mash-up. And by displaying results from disparate sources in a manner that implies that they refer to the same subject, it acts as a mindless stupidity amplifier that disseminates and entrenches existing errors.

Regarding the site URLs, I don’t know, I think it’s pretty common for people to have a problem that would take five minutes to fix if you’re an specialist that already knows what you’re doing, but non-specialists just reach for the first duct-tape solution that comes to mind without noticing how bad it is.

Like: you have a website at myname.somewebhost.com. One day, you buy myname.net, but end up following a tutorial that makes it a redirect rather than a proper CNAME or A record, because you don’t know what those are. You’re happy that your new domain works in that it’s showing your website, but you notice that the address bar is still showing the old URL. So you say, “Huh, I guess I’ll put a note on my page template telling people to use the myname.net address in case I ever change webhosts” and call it a day. I guess you could characterize that as a form of “cognitive rigidity”, but “fanaticism”? Really?

I agree that Egan still hasn’t seen the writing on the wall regarding deep learning. (A line in “Death and the Gorgon” mentions Sherlock’s “own statistical tables”, which is not what someone familiar with the technology would write.)

I agree that preëmptive blocking is kind of weird, but I also think your locked account with “Follow requests ignored due to terrible UI” is kind of weird.

Zack_M_Davis Jan 10, 2025, 8:34 AM
5 points
0
in reply to: Ben Pace’s comment on: On Eating the Sun
It’s implied in the first verse of “Great Transhumanist Future.”

One evening as the sun went down
That big old fire was wasteful,
A coder looked up from his work,
And he said, “Folks, that’s distasteful,

Zack_M_Davis Jan 3, 2025, 7:43 AM
13 points
2
in reply to: JenniferRM’s comment on: Comment on “Death and the Gorgon”
(This comment points out less important technical errata.)

ChatGPT [...] This was back in the GPT2 / GPT2.5 era

ChatGPT never ran on GPT-2, and GPT-2.5 wasn’t a thing.

with negative RL signals associated with it?

That wouldn’t have happened. Pretraining doesn’t do RL, and I don’t think anyone would have thrown a novel chapter into the supervised fine-tuning and RLHF phases of training.

Zack_M_Davis Jan 3, 2025, 7:43 AM
13 points
5
in reply to: JenniferRM’s comment on: Comment on “Death and the Gorgon”
One time, I read all of Orphanogensis into ChatGPT to help her understand herself [...] enslaving digital people

This is exactly the kind of thing Egan is reacting to, though—starry-eyed sci-fi enthusiasts assuming LLMs are digital people because they talk, rather than thinking soberly about the technology qua technology.^[1]

I didn’t cover it in the review because I wanted to avoid detailing and spoiling the entire plot in a post that’s mostly analyzing the EA/OG parallels, but the deputy character in “Gorgon” is looked down on by Beth for treating ChatGPT-for-law-enforcement as a person:

Ken put on his AR glasses to share his view with Sherlock and receive its annotations, but he couldn’t resist a short vocal exchange. “Hey Sherlock, at the start of every case, you need to throw away your assumptions. When you assume, you make an ass out of you and me.”

“And never trust your opinions, either,” Sherlock counseled. “That would be like sticking a pin in an onion.”

Ken turned to Beth; even through his mask she could see him beaming with delight. “How can you say it’ll never solve a case? I swear it’s smarter than half the people I know. Even you and I never banter like that!”

“We do not,” Beth agreed.

[Later …]

Ken hesitated. “Sherlock wrote a rap song about me and him, while we were on our break. It’s like a celebration of our partnership, and how we’d take a bullet for each other if it came to that. Do you want to hear it?”

“Absolutely not,” Beth replied firmly. “Just find out what you can about OG’s plans after the cave-in.”

The climax of the story centers around Ken volunteering for an undercover sting operation in which he impersonates Randal James a.k.a. “DarkCardinal”,^[2] a potential OG lottery “winner”, with Sherlock feeding him dialogue in real time. (Ken isn’t a good enough actor to convincingly pretend to be an OG cultist, but Sherlock can roleplay anyone in the pretraining set.) When his OG handler asks him to inject (what is claimed to be) a vial of a deadly virus as a loyalty test, Ken complies with Sherlock’s prediction of what a terminally ill DarkCardinal would do:

But when Ken had asked Sherlock to tell him what DarkCardinal would do, it had no real conception of what might happen if its words were acted on. Beth had stood by and let him treat Sherlock as a “friend” who’d watch his back and take a bullet for him, telling herself that he was just having fun, and that no one liked a killjoy. But whatever Ken had told himself in the seconds before he’d put the needle in his vein, Sherlock had been whispering in his ear, “DarkCardinal would think it over for a while, then he’d go ahead and take the injection.”

This seems like a pretty realistic language model agent failure mode: a human law enforcement colleague with long-horizon agency wouldn’t nudge Ken into injecting the vial, but a roughly GPT-4-class LLM prompted to simulate DarkCardinal’s dialogue probably wouldn’t be tracking those consequences.
1. ↩︎
  To be clear, I do think LLMs are relevantly upload-like in at least some ways and conceivably sites of moral patiency, but I think the right way to reason about these tricky questions does not consist of taking the assistant simulacrum’s words literally.
2. ↩︎
  I love the attention Egan gives to name choices; the other two screennames of ex-OG loyalists that our heroes use for the sting operation are “ZonesOfOught” and “BayesianBae”. The company that makes Sherlock is “Learning Re Enforcement.”

Zack_M_Davis Jan 3, 2025, 6:53 AM
5 points
0
in reply to: habryka’s comment on: Comment on “Death and the Gorgon”
(I agree; my intent in participating in this tedious thread is merely to establish that “mathematician crankery [about] Google Image Search, and how it disproves AI” is a different thing from “made an overconfident negative prediction about AI capabilities”.)

Zack_M_Davis Jan 3, 2025, 4:25 AM
6 points
2
in reply to: Paul Crowley’s comment on: Magical Categories
I think we probably don’t disagree much; I regret any miscommunication.

If the intent of the great-grandparent was just to make the narrow point that an AI that wanted the user to reward it could choose to say things that would lead to it being rewarded, which is compatible with (indeed, predicts) answering the molecular smiley-face question correctly, then I agree.

Treating the screenshot as evidence in the way that TurnTrout is doing requires more assumptions about the properties of LLMs in particular. I read your claims regarding “the problem the AI is optimizing for [...] given that the LLM isn’t powerful enough to subvert the reward channel” as taking as given different assumptions about the properties of LLMs in particular (viz., that they’re reward-optimizers) without taking into account that the person you were responding to is known to disagree.

Zack_M_Davis Jan 3, 2025, 3:42 AM
7 points
2
in reply to: Ninety-Three’s comment on: Comment on “Death and the Gorgon”

he’s calling it laughable that AI will ever (ever! Emphasis his!)

The 2016 passage you quoted is calling it laughable that Google-in-particular’s technology (marketed as “AI”, but Egan doesn’t think the term is warranted) will ever be able to make sense of information on the web. It’s Gary Marcus–like skepticism about the reliability and generality of existing-paradigm machine learning techniques, not Hubert Dreyfus–like skepticism of whether a machine could think in all philosophical strictness. I think this is a really important distinction that the text of your comment and Gwern’s comment (“disproves AI”, “laughable that AI will ever”) aren’t being clear about.

Zack_M_Davis Jan 1, 2025, 10:49 PM
4 points
0
in reply to: Paul Crowley’s comment on: Magical Categories
This isn’t a productive response to TurnTrout in particular, who has written extensively about his reasons for being skeptical that contemporary AI training setups produce reward optimizers (which doesn’t mean he’s necessarily right, but the parent comment isn’t moving the debate forward).

Zack_M_Davis Jan 1, 2025, 9:12 PM
9 points
−3
in reply to: gwern’s comment on: Comment on “Death and the Gorgon”

his page on Google Image Search, and how it disproves AI

The page in question is complaining about Google search’s “knowledge panel” showing inaccurate information when you search for his name, which is a reasonable thing for someone to be annoyed about. The anti-singularitarian snark does seem misplaced (Google’s automated systems getting this wrong in 2016 doesn’t seem like a lot of evidence about future AI development trajectories), but it’s not a claim to have “disproven AI”.

his complaints about people linking the wrong URLs due to his ISP host—because he is apparently unable to figure out ‘website domain names’

You mean how http://gregegan.net used to be a 301 permanent redirect to http://gregegan.customer.netspace.net.au, and then the individual pages would say “If you link to this page, please use this URL: http://www.gregegan.net/[...]”? (Internet Archive example.) I wouldn’t call that a “complaint”, exactly, but a hacky band-aid solution from someone who probably has better things to do with his time than tinker with DNS configuration.