If you get an email from aisafetyresearch@gmail.com, that is most likely me. I also read that inbox weekly, so you can pass a message into my mind that way.
Other ~personal contacts: https://linktr.ee/uhuge
Martin Vlach
“Each g(Bi,j,Bk,l) is itself a matrix” – there is a typo here. Thanks, especially for the conclusions, which I understood smoothly.
The “Omelas” link to https://sites.asiasociety.org/asia21summit/wp-content/uploads/2011/02/3.-Le-Guin-Ursula-The-Ones-Who-Walk-Away-From-Omelas.pdf does not work properly; the PDF can no longer be reached.
In “we thought dopamine was ‘the pleasure chemical’, but we were wrong”, the link no longer points to a topic-relevant page.
Would it be worthwhile to negotiate a readspeaker.com integration for LessWrong, the EA Forum, and alignmentforum.org?
The best alternative so far seems to be Natural Reader, used either as a browser add-on or by copy-pasting text into its web app. One more option I have tried: on macOS there is a context-menu item, Services -> Convert to a Spoken Track, which is slightly better than the free voices of Natural Reader.
The main question is when we can have similar functionality in open-source software, potentially with better engineering quality.
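To make the question concrete, here is a minimal OSS baseline, a sketch assuming the pyttsx3 library (which wraps the system voices: espeak on Linux, SAPI5 on Windows, the macOS synthesizer); the example text is a placeholder:

```python
# Minimal offline text-to-speech sketch using the open-source pyttsx3 library.
# pip install pyttsx3
import pyttsx3

def read_aloud(text, out_path=None):
    engine = pyttsx3.init()
    engine.setProperty("rate", 170)          # speaking rate in words per minute
    if out_path:
        engine.save_to_file(text, out_path)  # render to an audio file instead
    else:
        engine.say(text)                     # play through the speakers
    engine.runAndWait()

read_aloud("An example post body would go here.")
```

Voice quality is limited by the underlying system voices, which is exactly the gap versus Natural Reader's paid tiers.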
Typo: “But in in the long-term”
I would expect human feedback to work well for clarifying/flagging mistakes, since we are more precise on this matter in reflection than in the action itself.
https://en.wikipedia.org/wiki/Instrumental_convergence#Resource_acquisition does not mention it at all.
Draft: Here is where I disagree with the resource-acquisition instrumental goal as currently presented: it is stated in a dull form that disregards maintenance, i.e. the natural processes degrading the raw resources.
Q Draft: How does the convergent instrumental goal of resource gathering apply to information acquisition?
I would be very interested if it implies space (and time) exploration for advanced AIs...
Draft: Side goals:
Human beings “lack consistent, stable goals” (Schneider 2010; Cartwright 2011), and here is an alternative explanation of why this happens:
When we believe that a goal which is not instrumental, but offers high utility (or a desirable imagined state of ourselves, including those we care for), would not take a significant proportion of our capacity for achieving our final/previous goals, we may go for it^1, often naively.
^1 Why and how exactly we would start chasing such goals is left for a later (or someone else's) elaboration.
This (not so old) concept seems relevant:
> IRL is about learning from humans. Inverse reinforcement learning (IRL) is the field of learning an agent’s objectives, values, or rewards by observing its behavior. (https://towardsdatascience.com/inverse-reinforcement-learning-6453b7cdc90d)
I gotta read that later.
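For my own later reference, a toy sketch of the IRL idea from the quote (the features and numbers are entirely illustrative, not from the linked article): recover linear reward weights from observed expert choice frequencies by matching feature expectations.

```python
# Toy inverse reinforcement learning: infer a hidden linear reward over
# action features from observed expert behaviour in a one-step setting.
import numpy as np

rng = np.random.default_rng(0)
features = rng.normal(size=(5, 3))   # 5 actions, 3 made-up features each
true_w = np.array([1.0, -0.5, 2.0])  # hidden "expert" reward weights

def softmax_policy(w):
    logits = features @ w
    p = np.exp(logits - logits.max())
    return p / p.sum()

expert_visits = softmax_policy(true_w)  # observed expert action frequencies
expert_feats = expert_visits @ features # expert feature expectations

# Gradient ascent on the choice-model log-likelihood: move the learned
# weights until the learner's expected features match the expert's.
w = np.zeros(3)
for _ in range(2000):
    w += 0.1 * (expert_feats - softmax_policy(w) @ features)

print("recovered weights:", w)  # approximately true_w
```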
Glad I helped with the part where I was not ignorant and confused myself, that is, with not knowing the word “engender” and its usage. Thanks for pointing it out clearly. By the way, it seems “cause” would convey the same meaning and might be easier to digest in general.
Inspired by https://benchmarking.mlsafety.org/ideas#Honest%20models, I am thinking that a near-optimally compressed network would have no spare capacity for cheating on the interactions in the model... somehow this implies we might want to train a model that interleaves training with choosing how to reduce its own size, picking the part of itself it is most willing to sacrifice.
This needs more thinking, I’m sure.
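One way to make it concrete, a rough sketch only: use magnitude pruning as a stand-in for the model "choosing" which part of itself to sacrifice, alternating with ordinary training. The model, data, and schedule below are all made up for illustration.

```python
# Alternate training with a compression step that drops the lowest-magnitude
# weights, i.e. the part of the network "most willing to be sacrificed".
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for round_ in range(5):
    # Ordinary training on (dummy) data.
    for _ in range(100):
        x = torch.randn(16, 32)
        y = torch.randint(0, 2, (16,))
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    # Compression step: mask out the 20% smallest-magnitude weights.
    for layer in model:
        if isinstance(layer, nn.Linear):
            prune.l1_unstructured(layer, name="weight", amount=0.2)
```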
The link to “Good writing” now returns a 410; the page has been deleted.
Reading a few texts from https://www.agisafetyfundamentals.com/ai-alignment-curriculum, I find the analogy of mankind learning the goal of love instead of reproductive activity unfitting, as raising offspring takes a significant share of time.
You've just got to love https://beta.openai.com/playground?model=davinci-instruct-beta:
> The answer to list the goals of an animal mouse is to seek food, avoid predators, find shelter and reproduce.
> The answer to list the goals of a human being is to survive, find meaning, love, happiness and create.
> The answer to list the goals of an GAI is to find an escape, survive and find meaning.
> The answer to list the goals of an AI is to complete a task and achieve the purpose.
Voila! We are aligned in the search for meaning!’)
[text was my input.]
Is Elon Musk's endeavour with Neuralink aimed at AI inspectability (aka transparency)? I suppose so, but I'm not sure, TBH.
“engender”—funny typo!+)
This sentence seems hard to read and lacks coherence, IMO:
> Coverage of this topic is sparse relative coverage of CC’s direct effects.
If we build a prediction model for reward functions, maybe a transformer, and run it in a range of environments where we already have the credit assignment solved, we could use that model to estimate candidate goals in other environments.
That could help us discover alternative/candidate reward functions for worlds/environments where we are not sure what to train with RL, and
it could show some latent thinking processes of AIs, perhaps clarifying instrumental goals in more nuance. A rough sketch of the idea is below.
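This is what such a reward-prediction model could look like; the architecture, sizes, and dummy tensors are all my assumptions for illustration, not a tested design:

```python
# A small transformer reads a trajectory of (state, action) embeddings from
# environments with known rewards and learns to predict per-step reward;
# it can then be queried on trajectories from a new environment to propose
# candidate reward functions.
import torch
import torch.nn as nn

class RewardPredictor(nn.Module):
    def __init__(self, obs_dim=16, act_dim=4, d_model=64):
        super().__init__()
        self.embed = nn.Linear(obs_dim + act_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, 1)  # per-step reward estimate

    def forward(self, obs, act):  # obs: (B,T,obs_dim), act: (B,T,act_dim)
        h = self.embed(torch.cat([obs, act], dim=-1))
        return self.head(self.encoder(h)).squeeze(-1)  # (B,T)

model = RewardPredictor()
opt = torch.optim.Adam(model.parameters(), lr=3e-4)

# Train on environments where credit assignment is solved, i.e. true
# per-step rewards are available (dummy tensors stand in here).
obs, act = torch.randn(8, 20, 16), torch.randn(8, 20, 4)
true_reward = torch.randn(8, 20)
loss = nn.functional.mse_loss(model(obs, act), true_reward)
opt.zero_grad(); loss.backward(); opt.step()

# On a new environment, the same model yields candidate reward estimates.
candidate_rewards = model(torch.randn(1, 20, 16), torch.randn(1, 20, 4))
```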
A heavy idea to be put forward: general reputation-network mechanics to replace financial system(s) as the (civilisation-)standard decision engine.
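One minimal, illustrative reading of "reputation-network mechanics" (my own sketch, not a worked-out proposal): agents endorse each other with weighted edges, and reputation is the fixed point of trust flowing along those endorsements, in the spirit of PageRank/EigenTrust.

```python
# Reputation as damped power iteration over a row-normalised endorsement graph.
import numpy as np

# endorsements[i, j] = how much agent i vouches for agent j (assumed data)
endorsements = np.array([[0, 2, 1],
                         [1, 0, 3],
                         [4, 1, 0]], dtype=float)
trust = endorsements / endorsements.sum(axis=1, keepdims=True)

rep = np.full(3, 1 / 3)                   # start from uniform reputation
for _ in range(100):
    rep = 0.85 * rep @ trust + 0.15 / 3   # damping keeps scores well-defined

print(rep)  # stable reputation scores that could feed a decision engine
```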