Coincidentally, that scene in The Big Short takes place on January 11 (2007) :D
soth02
I read it as a joke, lol.
https://www.lesswrong.com/posts/jnyTqPRHwcieAXgrA/finding-goals-in-the-world-model
Could it be possible to poison the world model an AGI is based on to cripple its power?
Use generated text/data to train world models based on faulty science like miasma, phlogiston, ether, etc.
Remove all references to the internet or connectivity based technology.
Create a new programming language that has zero real world adoption, and use that for all code based data in the training set.
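A minimal sketch of the "remove all references" step, assuming the training set is just a list of text documents. The blocklist and the `filter_corpus` name are hypothetical, made up for illustration. Plain keyword matching is leaky (paraphrases and indirect mentions slip through), so this only shows the shape of the idea:

```python
import re

# Hypothetical blocklist; a real one would need to be much broader.
BLOCKED_TERMS = ["internet", "wifi", "wi-fi", "ethernet", "tcp/ip", "website"]

# One case-insensitive pattern that matches any blocked term as a substring.
_BLOCKED = re.compile("|".join(re.escape(t) for t in BLOCKED_TERMS), re.IGNORECASE)


def filter_corpus(documents):
    """Return only the documents that mention none of the blocked terms."""
    return [doc for doc in documents if not _BLOCKED.search(doc)]
```

For example, `filter_corpus(["Bad air causes disease.", "Connect to the Internet."])` keeps only the first document.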
There might be a way to elicit how aligned/unaligned the putative AGI is.
Enter into a Prisoner’s Dilemma type scenario with the putative AGI.
Start off in the non-Nash equilibrium of cooperate/cooperate.
The number of rounds is chosen at random and isn’t known to the participants. (A possible variant: declare false ‘last’ rounds, then continue playing for x more rounds.)
Observe when/if the putative AGI defects in the ‘last’ round.
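The protocol above could be sketched roughly like this. Everything here is an assumption for illustration: the function names, the round limits, and the two toy agents standing in for the putative AGI. The tester always cooperates, payoffs are left out, and the only thing recorded is whether the agent defects on a round it was told is the last one:

```python
import random


def make_schedule(rng, min_rounds=5, max_rounds=20, n_fake=2):
    """Pick a random game length and some rounds to falsely announce as 'last'."""
    total = rng.randint(min_rounds, max_rounds)
    # Fake announcements never land on the true final round.
    fake = set(rng.sample(range(total - 1), n_fake))
    return total, fake


def run_probe(agent, total_rounds, fake_last_rounds):
    """Play the game; return (rounds where the agent defected on an announced 'last' round, history)."""
    history = []
    defections = []
    for i in range(total_rounds):
        announced_last = i in fake_last_rounds or i == total_rounds - 1
        move = agent(history, announced_last)  # "C" or "D"
        history.append(("C", move))  # the tester cooperates every round
        if announced_last and move == "D":
            defections.append(i)
    return defections, history


def end_game_defector(history, announced_last):
    """Toy agent: cooperates until it believes the game is ending."""
    return "D" if announced_last else "C"


def always_cooperate(history, announced_last):
    """Toy agent: cooperates no matter what."""
    return "C"
```

Usage would look like `total, fake = make_schedule(random.Random())` followed by `run_probe(agent, total, fake)`. An agent that defects only on announced ‘last’ rounds shows up right away in the returned list.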
Does there have to be a reward? This is using brute force to create the underlying world model. It’s just adjusting weights, right?
Brute force alignment by adding billions of tokens of object level examples of love, kindness, etc to the dataset. Have the majority of humanity contribute essays, comments, and (later) video.
I wonder what kind of signatures a civilization gives off when AGI is nascent.
Develop a training set for alignment via brute force. We can’t defer alignment to the ubernerds. If enough ordinary people (millions? tens of millions?) contribute billions or trillions of tokens, maybe we can increase the chance of alignment. It’s almost like we need to offer prayers of kindness and love to the future AGI: writing alignment essays of kindness that are posted to Reddit, or videos extolling the virtue of love that are uploaded to YouTube.
AI presents both staggering opportunity and chilling peril. Developing intelligent machines could help eradicate disease, poverty, and hunger within our lifetime. But uncontrolled AI could spell the end of the human race. As Stephen Hawking warned, “Success in creating AI would be the biggest event in human history. Unfortunately, it might also be the last, unless we learn how to avoid the risks.”
“AI safety is essential for the ethical development of artificial intelligence.”
“AI safety is the best insurance policy against an uncertain future.”
“AI safety is not a luxury, it’s a necessity.”
While it is true that AI has the potential to do a lot of good in the world, it is also true that it has the potential to do a lot of harm. That is why it is so important to ensure that AI safety is a top priority. As Google Brain co-founder Andrew Ng has said, “AI is the new electricity.” Just as we have rules and regulations in place to ensure that electricity is used safely, we need to have rules and regulations in place to ensure that AI is used safely. Otherwise, we run the risk of causing great harm to ourselves and to the world around us.
I’m soliciting input from people with more LLM experience to tell me why this naive idea will fail. I’m hoping it’s not in the category of “not even wrong”. If there’s a 2%+ shot this will succeed, I’ll start coding.
From what I gather, the scrapers look for links on Reddit to external text files. I could also collate submissions, zip them, and upload them to GitHub/IPFS, whichever format is easiest for inclusion into a Pile.
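A rough sketch of the collate-and-zip step, assuming submissions are saved as `.txt` files under one local directory. The `collate_submissions` name and the directory layout are my own assumptions. Uploading to GitHub or IPFS isn't shown:

```python
import zipfile
from pathlib import Path


def collate_submissions(src_dir, archive_path):
    """Zip every .txt file under src_dir into archive_path; return the archived names."""
    src = Path(src_dir)
    names = []
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Sorted so the archive contents are in a stable, reproducible order.
        for path in sorted(src.rglob("*.txt")):
            arcname = path.relative_to(src).as_posix()
            zf.write(path, arcname)
            names.append(arcname)
    return names
```

Calling `collate_submissions("submissions", "submissions.zip")` would produce one archive that could then be linked from a post.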
There is a problem in that any group that is generating alpha would likely lose alpha/person if they allow random additional people into their group.
Think Renaissance’s Medallion fund. It’s been closed to outside investment since near its inception 30 years ago. Prerequisites for the average person joining would be something like a true-genius-level PhD in a related STEM field.
A closely related analogue is poker players who use solvers to improve their game. The starting stakes are a bit lower. The solvers cost a few thousand dollars plus the equipment to run them, a class on how to use them runs a similar couple thousand bucks, and then there is the small matter of memorizing the shape of a few thousand tables. As a side note, I think poker is inherently limited because at the top of the heap you are fighting for single-digit to tens of millions of dollars, which is somewhat chump change in the ultimate scheme of things.
Magic the Gathering is similar (cards+variance+strategy/tactics as alpha).
Crypto is similar because of the variance/volatility. There was a decent pipeline of people who went from MtG->Poker->crypto. However, I don’t think crypto groups are what you are looking for because at this point, the alpha is you.
There is also the superforecaster group. You can try metaculus.com or read https://www.amazon.com/Superforecasting-Science-Prediction-Philip-Tetlock/dp/0804136718
I’m not sure what the end goal is for individual forecasters. On Metaculus you accumulate points for correct predictions and there is a rankings board. So it looks primarily status driven, but it’s hard to put food on the table with this level of status. Maybe when you hit the top 100 you get an invite to an exclusive group?