
Value Drift

Tag. Last edit: 20 Nov 2024 21:18 UTC by Dakara

Value drift is the change over time in the values or goals of a person or an AI system, often in ways that weren’t originally intended.

For humans, this might happen as life experiences, personal growth, or external influences cause someone’s beliefs to evolve.

For AI, it could occur if the system starts to interpret its goals differently as it learns and interacts with the world.

Schelling fences on slippery slopes

Scott Alexander, 16 Mar 2012 23:44 UTC
592 points
250 comments, 6 min read, LW link

Understanding and avoiding value drift

TurnTrout, 9 Sep 2022 4:16 UTC
48 points
11 comments, 6 min read, LW link

Straight-edge Warning Against Physical Intimacy

Raphaëll, 23 Nov 2020 21:35 UTC
17 points
42 comments, 5 min read, LW link

Would I think for ten thousand years?

Stuart_Armstrong, 11 Feb 2019 19:37 UTC
25 points
13 comments, 1 min read, LW link

Let Values Drift

Gordon Seidoh Worley, 20 Jun 2019 20:45 UTC
4 points
19 comments, 8 min read, LW link

Predicting Parental Emotional Changes?

jefftk, 6 Jul 2022 13:50 UTC
39 points
11 comments, 2 min read, LW link
(www.jefftk.com)

Some disjunctive reasons for urgency on AI risk

Wei Dai, 15 Feb 2019 20:43 UTC
36 points
24 comments, 1 min read, LW link

[Question] Is value drift net-positive, net-negative, or neither?

MarisaJurczyk, 5 May 2019 2:37 UTC
5 points
3 comments, 1 min read, LW link

REACH Meetup – Value Drift

Raemon, 30 May 2018 4:53 UTC
4 points
0 comments, 1 min read, LW link

Upcoming stability of values

Stuart_Armstrong, 15 Mar 2018 11:36 UTC
15 points
15 comments, 2 min read, LW link

Gandhi, murder pills, and mental illness

erratio, 13 Oct 2010 9:16 UTC
34 points
16 comments, 1 min read, LW link

Mahatma Armstrong: CEVed to death.

Stuart_Armstrong, 6 Jun 2013 12:50 UTC
33 points
62 comments, 2 min read, LW link

New Hackathon: Robustness to distribution changes and ambiguity

Charbel-Raphaël, 31 Jan 2023 12:50 UTC
11 points
3 comments, 1 min read, LW link