Govind Pimpale

Karma: 307

Forecasting Frontier Language Model Agent Capabilities

Govind Pimpale, Axel Højmark, Jérémy Scheurer and Marius Hobbhahn

Feb 24, 2025, 4:51 PM

35 points

0 comments, 5 min read, LW link

(www.apolloresearch.ai)

Do models know when they are being evaluated?

Govind Pimpale, Giles, Joe Needham and Marius Hobbhahn

Feb 17, 2025, 11:13 PM

59 points

3 comments, 12 min read, LW link

Current safety training techniques do not fully transfer to the agent setting

Simon Lermen and Govind Pimpale

Nov 3, 2024, 7:24 PM

158 points

9 comments, 5 min read, LW link

~80 Interesting Questions about Foundation Model Agent Safety

RohanS and Govind Pimpale

Oct 28, 2024, 4:37 PM

46 points

4 comments, 15 min read, LW link

Analyzing DeepMind’s Probabilistic Methods for Evaluating Agent Capabilities

Axel Højmark, Govind Pimpale, Arjun Panickssery, Marius Hobbhahn and Jérémy Scheurer

Jul 22, 2024, 4:17 PM

69 points

0 comments, 16 min read, LW link