Discussion about this post

Avi · May 4 (edited)

Hi Ryan, thanks for this breakdown. To your open questions on compute trends:

- I'm dubious that RL is well described as merely eliciting capabilities latent in the base model. Maybe there are reasons to think this that I just haven't heard, but it seems to run counter to (a) the prior success of RL in models before frontier LMs, and (b) the scale of compute investment inside OAI. I doubt RL progress caps out entirely, but a bottleneck or slowdown seems very possible.

- On the speed of scale-up for RL environments: I'm not privy to plans from the likes of Mechanize, Scale, or the frontier labs, but my guess is there will be a gold rush on domains that are economically attractive and have tight feedback loops, like software engineering, or else a race on R&D speedup while other capabilities lag behind and models become more specialized. It seems plausible to me that we get top-human-expert AI researchers while other abilities are only marginally better than the current SOTA. If the pretraining paradigm gives way to an RL paradigm, I'd expect more specialized models with a much wider disparity in ability across domains, but I'm not too sure.

A$AL

Thanks bro
