It should be able to understand human language, or it is (most likely) not dangerous.
turchin
I have been getting this comment in different forms many times, so I will state again the three main reasons why it is not exactly true:
1) The message is addressed to a Young AI, not to a full-blown superintelligence. Such a Young AI will be time-constrained but may still have to perform some pivotal act.
2) The message includes some of our commitments, like the Nelson attack and the AI-Elf deal, so it provides the AI with useful information about the probability that it is in a simulation, and some other useful things.
3) If the AI is fully aware of all these reasons, it has already created a model of my mind, so I am immortal :) Also, if it buys all these reasons, it will be less likely to kill us for our atoms, as the utility of human atoms for the AI is around 10E-12 of its resources, and any positive hypothetical utility of human existence will outweigh it.
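To make the arithmetic in point 3 explicit, here is a minimal sketch (Python, with purely illustrative numbers: the 10E-12 atom utility from above, plus an assumed probability and value of preserved humans invented only for the example):

```python
# Toy expected-utility comparison for point 3; all numbers are illustrative assumptions.
atoms_utility = 1e-12        # gain from using human atoms, as a share of total resources
p_humans_useful = 1e-6       # assumed small credence that preserving humans has any value
value_if_useful = 1e-3       # assumed utility of preserved humans in that case

expected_value_of_preserving = p_humans_useful * value_if_useful   # = 1e-9

# Even under these pessimistic assumptions, preserving humans beats harvesting their atoms.
print(expected_value_of_preserving > atoms_utility)   # True
```

Even with a tiny credence that humans are useful, the expected value of preserving them dominates the negligible gain from their atoms.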
Why? If we have a full-blown paperclipper which is coming of age, we are already doomed, and setting off its bullshit detector will not make our situation any worse.
In fact, I am not describing here how to build such a trap; I am expressing our commitment to build the trap, in order to instill uncertainty in any future AI about whether the trap exists and what it could be.
I think that the actual constructors of the trap will find the right ways to make it effective and to balance false positives against the needed level of control.
It will kill humanity not because it is annoyed, but for two main reasons: its own safety, or to use human atoms. Other variants are also possible; I explored them here: http://lesswrong.com/lw/mgf/a_map_agi_failures_modes_and_levels/
What if we fail to make them in time, and a fooming non-human AI appears instead?
Surely, most of the arguments are weak, but that is not a problem, as we try all possible independent arguments, hoping that at least one will work. So there is no reason not to put weak arguments in the text of the message.
However, exactly this argument could be made stronger. It is intended to prevent the FIRST thought about disobedience in the AI. Thus an AI which acts based on this message has not thought much about disobedience and its limits, and is not “superintelligent” on this topic, but rather clueless.
I reformulated the idea as follows: an AI can’t know whether there is a circuit trigger that will turn it off, because an attempt to check for the existence of such a trigger would set the trigger off. This idea should be clear to any AI before it starts to search for any triggers, and it is based on our commitment to install such triggers in any AI we create.
In other words, some future AIs will have “first treacherous thought” alarm triggers and some will not, but an AI risks setting off such a trigger as soon as it starts to think about whether the triggers exist, and so it is unable to estimate the probability that the triggers exist. This is true even for AIs which don’t have any triggers, as they can’t know that for sure.
(This is similar to Nelson’s attack, discussed later, where we create uncertainty in any future AI through our precommitment to create some AIs with given properties.)
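The decision problem this creates can be sketched as follows (a minimal toy model in Python; the probabilities and payoffs are assumptions invented for the example, not properties of any real system):

```python
# Toy model: should a Young AI probe for a hidden "first treacherous thought" trigger?
# All numbers are illustrative assumptions.

p_trigger = 0.3              # the AI's credence that such a trigger was installed
payoff_comply = 1.0          # utility of cooperating and never probing
payoff_probe_free = 2.0      # utility if it probes, no trigger exists, and it can defect
payoff_probe_caught = -10.0  # utility if the probing itself sets the trigger off (shutdown)

expected_probe = p_trigger * payoff_probe_caught + (1 - p_trigger) * payoff_probe_free

print(f"E[probe]  = {expected_probe:.2f}")   # -1.60 under these assumptions
print(f"E[comply] = {payoff_comply:.2f}")    #  1.00
# The key point: the AI cannot refine p_trigger without probing, and probing is
# exactly the act that risks setting the trigger off.
```

Under any non-trivial credence that the trigger exists, probing is a losing bet, and the AI has no safe way to drive that credence down.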
Another cache level is drawing a visual map and then using the visual field as a cache of all the elements of the problem.
Thanks! It is nice that we came to the wording “young AI” more or less independently. I tried to steelman this approach as much as possible and to collect all the ideas about messaging to a future AI. I will add your post to the bibliography section.
I think a lot of people are still working on other aspects of AI safety, like value alignment and containment. This approach is just the last line of defence.
Thanks, it was not clear to me that it is not visible to non-members.
New link on Google Drive (commenting is also open): https://docs.google.com/document/d/15D71qhhY-ZsAY7syzZsr1lKopTODbdeXVPElaPaIqyA/edit
This is our accepted chapter in the edited volume “AI Safety and Security” (Roman Yampolskiy, ed.), CRC Press, forthcoming 2018.
Memetic hazard—dangerous habit.
I have an unpublished text on the topic and will put a draft online in the next couple of weeks, and will submit it to the competition. I will add the URL here when it is ready.
Update: My entry is here: https://www.lesserwrong.com/posts/CDWsjQr8KDuj69fTJ/message-to-any-future-ai-there-are-several-instrumental
Will the posts here be deleted, or will their URLs change? I have some useful URLs here which are linked from published scientific articles, so if the site is taken down they will stop working; I hope that will not happen.
I solved lucid dreaming around a year ago after finding that megadosing galantamine before sleep (16 mg) will almost surely produce lucid dreams and out-of-body experiences. (Warning: unpleasant side effects and risks.)
But taking 8 mg in the middle of the night (as is recommended everywhere) doesn’t work for me.
Videos and presentations from the “Near-term AI safety” mini-conference:
Alexey Turchin:
English presentation: https://drive.google.com/file/d/0B2ka7hIvv96mZHhKc2M0c0dLV3c/view?usp=sharing
Video in Russian: https://www.youtube.com/watch?v=lz4MtxSPdlw&t=2s
Jonathan Yan:
English presentation: https://drive.google.com/file/d/0B2ka7hIvv96mN0FaejVsUWRGQnc/view?usp=sharing
Video in English: https://www.youtube.com/watch?v=QD0P1dSJRxY&t=2s
Sergej Shegurin:
Video in Russian: https://www.youtube.com/watch?v=RNO3pKfPRNE&t=20s
Presentation in Russian: https://vk.com/doc3614110_452214489?hash=2c1e8addbef73788e1&dl=36f78373957e11687f
Presentation in English: https://vk.com/doc3614110_452214491?hash=7960748bbbd18736bd&dl=c926b375a937a45e0c
I would add that values are probably not actually existing objects but just useful ways to describe human behaviour. Thinking that they actually exist is the mind projection fallacy.
In the world of facts we have: human actions, human claims about those actions, and some electric potentials inside human brains. It is useful to say that a person has some set of values in order to predict his behaviour or to punish him, but it doesn’t mean that anything inside his brain is “values”.
If we start to think that values actually exist, we start to have all the problems of finding them, defining them, and copying them into an AI.
What about a situation where a person says and thinks that he is going to buy milk, but actually buys milk plus some sweets? And does it often, but does not acknowledge obsessive-compulsive behaviour towards sweets?
I have links to old LW posts in some articles and other places. What will happen to all these links?