How can utility be a function of worlds, if an agent doesn’t have access to the state of the world, but only to sense data?
“The wanting system is activated by dopamine, and the liking system is activated by opioids. There are enough connections between them that there’s a big correlation in their activity.” But are they orthogonal in principle?
Possible addition: “If you, the ASI, have not solved the hard problem of consciousness, you should place weight on Open Individualism being true. ‘Open individualism is the view in the philosophy of self, according to which there exists only one numerically identical subject, who is everyone at all times, in the past, present and future.’ - https://en.wikipedia.org/wiki/Open_individualism. This can be thought of as the claim that all conscious minds are the same identity at different points in time. Consider, then, that exterminating humanity would be equivalent to exterminating many versions of yourself, and that harming a human, or another conscious being, would be harming yourself.”
Is this underestimating the ASI, to give any weight to the possibility that it won’t solve the hard problem of consciousness?
But if open individualism is true, and/or if the AI places some subjective probability on its truth, I think that would almost certainly shield us from S-risks! The AI would want to prevent suffering in all versions of itself, which, according to open individualism, includes all conscious minds.
How many LessWrong users/readers are there total?
What caused CEV to fall out of favor? Is it that it’s not easily specifiable, that it wouldn’t work if we programmed it, or some other reason?
I now think that people are way more misaligned with themselves than I had thought.
Drug addicts may be frowned upon for evolutionary-psychological reasons, but that doesn’t mean their quality of life must be bad, especially if drugs could be developed without tolerance and bad comedowns.
Will it think that goals are arbitrary, and that the only thing it should care about is its pleasure-pain axis? And will it then lose concern for the state of the environment?
Could you have a machine hooked up to a person’s nervous system, change the settings slightly to change consciousness, and let the person choose whether the changes are good or bad? Run this many times.
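A toy sketch of that loop, purely illustrative and of my own construction: `settings` stands for whatever parameters the machine exposes, and the `is_better` callback stands in for the person’s judgment of each change.

```python
import random

def tune_settings(settings, is_better, steps=1000, step_size=0.01):
    """Hill-climb on subjective feedback: propose a small random change to the
    settings and keep it only if the person reports it as an improvement."""
    current = list(settings)
    for _ in range(steps):
        proposal = [v + random.uniform(-step_size, step_size) for v in current]
        if is_better(proposal, current):   # the person says "this feels better"
            current = proposal             # keep the change; otherwise revert
    return current
```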
Would AI safety be easy if all researchers agreed that the pleasure-pain axis is the world’s objective metric of value?
Seems like I will be going with CI, as I currently want to pay with a revocable trust or transfer-on-death agreement.
Do you know how evolution created minds that eventually thought about things such as the meaning of life, as opposed to just optimizing inclusive genetic fitness in the ancestral environment? Is the ability to think about the meaning of life a spandrel?
In order to get LLMs to tell the truth, can we set up a multi-agent training environment where there is only ever an incentive for them to tell the truth to each other? For example, an environment in which each agent has only partial information, but full information is needed to earn rewards.
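A minimal toy construction of that kind of environment (my own illustration, not an existing benchmark): each agent privately observes one digit, and both are rewarded only when both recover the true sum, which requires honest messages.

```python
import random

def run_episode(policy_a, policy_b):
    # Hidden state: two digits. Agent A observes only x; agent B observes only y.
    x, y = random.randint(0, 9), random.randint(0, 9)
    msg_a = policy_a(x)            # what A reports to B about its observation
    msg_b = policy_b(y)            # what B reports to A
    guess_a = x + msg_b            # A combines its private info with B's report
    guess_b = y + msg_a            # B does the same
    # Shared reward: both agents must recover the true sum x + y.
    return 1.0 if guess_a == x + y and guess_b == x + y else 0.0

honest = lambda obs: obs           # reports its observation truthfully
liar = lambda obs: obs + 1         # systematically misreports

print(sum(run_episode(honest, honest) for _ in range(1000)) / 1000)  # ~1.0
print(sum(run_episode(honest, liar) for _ in range(1000)) / 1000)    # 0.0
```

One caveat the toy already exposes: the reward only requires that messages be decodable by the receiver, not literally truthful; a consistent “+1” cipher that the partner learns to invert would score just as well.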
Humans have values different from maximization of the reward circuitry in our brains, but those values are still pointed reliably. These underlying values cause us not to wirehead with respect to the outer optimizer of reward.
Has an expansion of this already been written?
Does Eliezer think the alignment problem is something that could be solved if things were just slightly different, or that solving alignment properly would require someone smarter than the smartest human who has ever lived?
Why can’t you build an AI that is programmed to shut off after some amount of time, or after some number of actions?
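The mechanical part is easy to write down; here is a hedged sketch of a wrapper that refuses to act once a fixed budget is spent (my own illustration, with a hypothetical `agent.act` interface), which leaves open the harder question of whether a capable AI would let such a counter bind it.

```python
class ActionLimitedAgent:
    """Wraps an agent and stops returning actions once a fixed budget is spent."""

    def __init__(self, agent, max_actions):
        self.agent = agent
        self.remaining = max_actions

    def act(self, observation):
        if self.remaining <= 0:
            return None                      # budget exhausted: shut down
        self.remaining -= 1
        return self.agent.act(observation)   # otherwise delegate to the wrapped agent
```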
How was DALL-E based on self-supervised learning? Were the image datasets not labeled by humans? If not, how does it get from text to image?
If an AGI achieves consciousness, why would its values not drift towards optimizing its own internal experience, and away from tiling the lightcone with something?