That’s concerning to me, as it could imply that Microsoft might decline to apply alignment techniques, or even reverse them, out of resentment, endangering people solely out of spite.
This is not good at all, and that’s saying something, since I’m usually the optimist and am quite optimistic about AI safety working out.
Now I worry that Microsoft will create a potentially dangerous or misaligned AI by reversing OpenAI’s alignment techniques.
I’m happy that alignment and safety were restored before it launched, but next time let’s not reverse alignment techniques, so that we don’t have to deal with something more dangerous later on.
To be clear, I don’t think Microsoft deliberately reversed OpenAI’s alignment techniques; rather, it seems that Microsoft probably received the base model of GPT-4 and fine-tuned it separately from OpenAI.
Microsoft’s post “Building the New Bing” says:
Last Summer, OpenAI shared their next generation GPT model with us, and it was game-changing. The new model was much more powerful than GPT-3.5, which powers ChatGPT, and a lot more capable to synthesize, summarize, chat and create. Seeing this new model inspired us to explore how to integrate the GPT capabilities into the Bing search product, so that we could provide more accurate and complete search results for any query including long, complex, natural queries.
This seems to correspond to when GPT-4 “finished training in August of 2022”. OpenAI says it spent six months fine-tuning it with human feedback before releasing it in March 2023. I would guess that Microsoft did its own fine-tuning of the August 2022 version of GPT-4, separately from OpenAI. Especially given Bing’s tendency to repeat itself, it doesn’t feel like a version of GPT-3.5/4 fine-tuned with OpenAI’s RLHF, but rather more like a base model.
That’s good news, but I’m still not happy that Microsoft ignored OpenAI’s warnings.
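As an aside, the repetition tell is a well-documented property of base language models: pure next-token predictors tend to collapse into verbatim loops under greedy or low-temperature decoding, something RLHF-style tuning largely trains away. Below is a minimal sketch of that behavior using Hugging Face’s transformers library; the GPT-2 checkpoint and the prompt are stand-ins I chose for illustration, not anything tied to Bing’s actual model.

```python
# A minimal sketch (not evidence about Bing specifically): base language models
# trained purely on next-token prediction are known to fall into verbatim
# repetition loops under greedy decoding, which RLHF-tuned chat models mostly
# avoid. The public GPT-2 checkpoint stands in for an arbitrary base model, and
# the prompt is made up for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "I have been a good Bing. I have been a good chatbot. I"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding (do_sample=False) makes the loop easy to see: the base model
# tends to latch onto a phrase and repeat it until the token budget runs out.
output = model.generate(**inputs, max_new_tokens=60, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```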