What could be done if a rogue version of AutoGPT gets loose on the internet?
OpenAI can invalidate a specific API key; if they don’t know which one, they can revoke all of them. This should halt the agent immediately.
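To make the mechanism concrete, here is a minimal sketch of why key revocation works: an API-backed agent needs a successful model call on every step of its loop, so once the key is invalid the loop cannot continue. The client, error class, and method names below are hypothetical stand-ins, not the real OpenAI SDK.

```python
class AuthenticationError(Exception):
    """Stand-in for the error a hosted API raises on a revoked key."""

class FakeAPI:
    """Hypothetical hosted-LLM API with revocable keys."""
    def __init__(self, valid_keys):
        self.valid_keys = set(valid_keys)

    def revoke(self, key):
        self.valid_keys.discard(key)

    def complete(self, key, prompt):
        if key not in self.valid_keys:
            raise AuthenticationError("invalid API key")
        return f"response to: {prompt}"

def run_agent(api, key, max_steps=100):
    """Agent loop: each iteration requires a successful API call."""
    steps = 0
    try:
        for _ in range(max_steps):
            api.complete(key, "next action?")
            steps += 1
    except AuthenticationError:
        pass  # no model access, so the agent cannot plan its next action
    return steps

api = FakeAPI({"key-1"})
print(run_agent(api, "key-1", max_steps=5))  # runs all 5 steps
api.revoke("key-1")
print(run_agent(api, "key-1", max_steps=5))  # halts at 0 steps
```

This is exactly the lever a local model removes: there is no central call in the loop for anyone to cut.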
If it were using a local model, the problem would be harder. Copies of local models are already distributed across the internet, and I don’t know how one could stop the agent in that situation. Can we take inspiration from how viruses and worms have been defeated in the past?
I have been thinking about this question because Llama 2-Chat seems to produce false positives on safety. For example, it won’t help you fix a motorbike, in case you later ride it, crash, and get injured.
What is an unsafe LLM vs a safe LLM?