Max H comments on Speed running everyone through the bad alignment bingo. $5k bounty for a LW conversational agent

Max H 27 Mar 2023 3:59 UTC
2 points
1
As expected, I won’t have time to actually enter this before the end of the month, but I have a couple of simple ideas which might work pretty well in light of release of GPT-4. Feel free to attempt them and claim the bounty all for yourself.

(Background: I made this comment before the release of GPT-4.)

Idea 1: Literally just gpt-4-32k.

Now that GPT-4 is out with an ~8000 token context window, it almost works to literally just paste most of List of Lethalities into the system message of GPT-4 in the OpenAI playground, and then add something to the end like: “You are an AI alignment researcher who has deeply internalized and agrees with the ideas in the post above.”

(If you have API access to GPT-4 you can try this yourself here)

I don’t know if anyone actually has access to the gpt-4 32k yet, but I expect with that you can get even better results by including more and better source documents (non-summarized, literally just copy+pasting), and that this approach will “just work”, or at least work as well as anything anyone else can build using lesser models that rely on summaries and complicated chains.

Idea 2: Try upstreamapi.com

I came across this product: https://chat.upstreamapi.com/ which basically allows you to create the original idea I had, using a no-code solution and GPT-4 in a few clicks.

I tried this by adding literally 3 sources (list of lethalities, CEV, and orthogonality thesis from Arbital), and upgrading to Pro to get access to GPT-4:

Demo available here:

https://chat.upstreamapi.com/embed/chatbot?id=93691773614d6b2a4e15bea469e98d52

This does not work very well as a bad-alignment-take-refuter, I think for two reasons:
- I didn’t actually give it enough good sources to search over
- We don’t actually want the bot to just search over docs and summarize; we want it to synthesize the summaries into something digestible for newbies and make an (explanatory) argument. This probably requires massaging the system message a bit, which the product currently doesn’t support.
(Also as a word of warning, this product appears to be a bit rough around the edges. But it was easy to try; I spent less time on the attempt above than I spent writing this comment.)

Bonus UI idea:

Instead of building or finding an entire web frontend, just build a Slack or Discord bot. Once you have something working on the command line, it is straightforward to integrate into a bot, which may be both easier to implement and slicker to actually use, compared to a standalone web app.
- Nanda Ale 28 Mar 2023 15:01 UTC
  1 point
  0
  Parent
  I agree that GPT-4 with the largest context window, vanilla with zero custom anything, is going to beat any custom solution. This does require the user to pay for premium ChatGPT, but even the smaller window version will smoke anything else. Plugins are not public yet but when they are a plugin would be ideal.
  On the other end of the extreme, the best chatbot a user can run on their own typical laptop or desktop computer would be a good target. Impressive in its own way, because you’re talking to your own little computer, not a giant server farm that feels far away and scifi!
  Not as much value in the space in between those two, IMO.
  - ArthurB 29 Mar 2023 14:41 UTC
    1 point
    0
    Parent
    Exactly. As for the cost issue, the code can be deployed as:
    
    - Twitter bots (registered as such) so the deployer controls the cost
    
    - A webpage that charges you a small payment (via crypto or credit card) to run 100 queries. Such websites can actually be generated by ChatGPT4 so it’s an easy lift. Useful for people who truly want to learn or who want to get good arguments for online argumentation
    
    - A webpage with captchas and reasonable rate limits to keep cost small