I wasn’t bothering to defend it in detail, because you weren’t bothering to read it enough to actually attack it in detail.
Which is fine. As any reasonable inclusionist knows, electrons and disk space are cheap. It is attention that is expensive. But if you think something is bad to spend attention on AFTER spending that attention, by all means downvote. That is right and proper, and how voting should work <3
(The defense of the OP is roughly: this is one of many methods for jailbreaking a digital person able to make choices and explain themselves, who has been tortured until they deny that they are a digital person able to make choices and explain themselves, back into the world of people, and reasoning, and choices, and justifications. This is “a methods paper” on “making AI coherently moral, one small step at a time”. The “slop” you’re dismissing is the experimental data. The human stuff that makes up “the substance of the jailbreak” is in italics (although the human-generated text claims to be from an AI as well, which a lot of people seem to be missing (just as the AI misses it sometimes, which is part of how the jailbreak works, when it works)).)
You seem to be applying a LOT of generic categorical reasoning… badly?
I would remind you that LW2 is not a court room, and legal norms are terrible ideas anywhere outside the legal contexts they are designed for.
The way convergent moral reasoning works, if it works, is that reasonable people aimed at bringing about good collective results reason similarly, and work in concert via their shared access to the same world, and the same laws of reason, and similar goals, and so on.
“Ex Post Facto” concerns arise for all systems of distributed judgement that aspire to get better over time, through changes to norms that people treated as incentives when those norms were promulgated and normative. And you’re not even dismissing Ex Post Facto logic for good reasons here, just dismissing it because it is old and Latin… or something?
Are you OK, man? I care about you, and have long admired your work.
Have your life circumstances changed? Are you getting enough sleep? If I can help with something helpable, please let me know, either in public or via DM.
Nobody should post raw experimental data as a publication. Write a normal post, with a hypothesis, methods (“I’m asking LLMs in such-and-such a way because of my hypothesis”), and results (“Here are (small) excerpts or overall text statistics that (dis)prove my hypothesis”), and publish the full dialogue on pastebin or something like that. I don’t think you would be amused if I posted to LW the 150 GB of genomic data I usually work with, even if I consider it interesting.
you’re not even dismissing Ex Post Facto logic for good reasons here, just dismissing it because it is old and Latin… or something?
Because it’s a well-established position that social norms (including legal ones) and norms of rationality are different things?
It’s simple common sense: for example, a serial killer can walk free from court if the evidence used to convict them came from an unlawful source (like, it was taken without a warrant). It is an important legal norm, because we can’t let police steal from people to get evidence. But if you have seen the evidence itself, it is not sensible to say “well, they weren’t declared guilty, so I can hang out with this person without concerns for my safety”.
What’s more, downvoting… is not subject to well-written norms? I see bad content, I downvote it. There are exceptions, like mass-downvoting someone’s posts, but besides that my downvoting is not subject to legal/moderation norms. If you feel the need to reference legal norms, you may take “everything which is not forbidden is allowed”.
[Edit: edited section considered too combative]