I was under the impression that PPO was a recently invented algorithm? Wikipedia says it was first published in 2017, which if true would mean that all pre-2017 talk about reinforcement learning was about other algorithms than PPO.
Wikipedia says:
PPO was developed by John Schulman in 2017,[1] and had become the default reinforcement learning algorithm at American artificial intelligence company OpenAI.
Wikipedia says: