I didn’t have o1 in mind; those exact results seem consistent. Here’s an example I had in mind:
Claude 3.5 Sonnet (old) scores 48% on ProtocolQA and 7.1% on BioLP-bench.
GPT-4o scores 53% on ProtocolQA and 17% on BioLP-bench.
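To make the mismatch concrete, here's a quick back-of-the-envelope sketch in Python using only the scores quoted above (nothing else is assumed):

```python
# Scores quoted above: how differently do the two benchmarks separate the models?
scores = {
    "Claude 3.5 Sonnet (old)": {"ProtocolQA": 48.0, "BioLP-bench": 7.1},
    "GPT-4o": {"ProtocolQA": 53.0, "BioLP-bench": 17.0},
}

for bench in ["ProtocolQA", "BioLP-bench"]:
    claude = scores["Claude 3.5 Sonnet (old)"][bench]
    gpt4o = scores["GPT-4o"][bench]
    print(f"{bench}: GPT-4o / Claude = {gpt4o / claude:.2f}x")

# Output:
# ProtocolQA: GPT-4o / Claude = 1.10x
# BioLP-bench: GPT-4o / Claude = 2.39x
```

So the models look nearly tied on one benchmark while one is more than twice as strong on the other, even though both evals are meant to measure similar capabilities.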
Good post.
The craziest thing for me is that the results of different evals, like ProtocolQA and my BioLP-bench, that are supposed to evaluate similar things, are highly inconsistent. For example, two models can have similar scores on ProtocolQA, but one answers twice as many questions correctly on BioLP-bench as the other. This means we might not be measuring the things we think we measure. And no one knows what causes this difference in the results.
This is an amazing overview of the field. Even if it doesn’t collect tons of upvotes, it’s super important, and it saved me many hours. Thank you.
I tried to use exact quotes when describing the things they sent me, because it would be easy for me to misrepresent their actions, and I don’t want that to happen.
Totally agree. But in other cases, when the agent was discouraged from deceiving, it still did it.
Thanks for your feedback. It’s always a pleasure to see that my work is helpful to people. I hope you will write articles that are way better than mine!
Thanks for your thoughtful answer. It’s interesting how I just describe my observations, and people draw conclusions from them that I hadn’t thought of.
For me, it was quetiapine, a medication for my bipolar disorder.
Thanks. I got a bit clickbaity in the title.
Thanks for sharing your experience. I hope you stay strong.
The meaninglessness comes from an idea akin to “why bother with anything if AGI will destroy everything?”
Read the Feynman quote at the beginning. It describes his feelings about the atomic bomb, which are relevant to some people’s thoughts about AGI.
Thanks
Your comment is somewhat along the lines of Stoic philosophy.
Hi
In this post, you asked people to leave the names of therapists familiar with alignment.
I am such a therapist. I live in the UK. Here’s my website.
I recently wrote a post about my experience as a therapist with clients working on AI safety. It might serve as indirect proof that I really have such clients.
This is tricky. Might it exacerbate your problems?
Anyway, if there’s a chance I can be helpful to you, let me know.
These problems are not unique to AI safety, but they come up far more often with my clients working on AI safety than with my other clients.
Thanks. I am not a native English speaker, and I use GPT-4 to help me catch mistakes, but it seems like it’s not perfect :)
Thanks for sharing your experience. My experience is that talking with people outside AI safety is similar to conversations about global warming: if someone tells me about it, I agree it’s an important issue, but I honestly don’t invest much effort in fighting it.
This is my experience, and yours might be different.
I totally agree that it might be good to have such a fire alarm as soon as possible, and seeing how quickly people are making GPT-4 more and more powerful makes me think this is only a matter of time.
And to be frank, I’m not sure the experts are comparable. Due to financial limitations, I used graduate students for BioLP-bench, while the authors of LAB-bench used PhD-level scientists.