Though I generally use Stable Baselines algorithms for training locomotion tasks, I am sometimes drawn back to evolutionary algorithms, especially MAP-Elites, which has now been extended to incorporate a policy gradient.
What attracts me to MAP-Elites is its archive of behaviours: rather than keeping a single best policy, it keeps the best solution found for each distinct kind of behaviour.
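To make the archiving idea concrete, here is a minimal sketch of a plain MAP-Elites loop, assuming a 2-D behaviour descriptor in [0, 1] split into a fixed grid. The `evaluate` function is a hypothetical toy stand-in for a locomotion rollout, and names such as `cell_of`, `bins` and `sigma` are mine for illustration. It uses Gaussian mutation only and does not include the policy-gradient variation that PGA-MAP-Elites adds.

```python
import random


def evaluate(x):
    """Toy stand-in for a locomotion rollout (hypothetical).

    Returns (fitness, behaviour descriptor); the descriptor lies in [0, 1]^2.
    """
    fitness = -sum((xi - 0.5) ** 2 for xi in x)
    descriptor = (x[0], x[1])
    return fitness, descriptor


def cell_of(descriptor, bins):
    """Map a descriptor in [0, 1]^n to a grid cell index (1.0 goes in the last bin)."""
    return tuple(min(int(d * bins), bins - 1) for d in descriptor)


def map_elites(iterations=2000, dim=4, bins=5, sigma=0.1, seed=0):
    rng = random.Random(seed)
    archive = {}  # cell index -> (fitness, solution)
    for _ in range(iterations):
        if archive and rng.random() < 0.9:
            # Pick a random elite from the archive and mutate it.
            _, parent = rng.choice(list(archive.values()))
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, sigma))) for g in parent]
        else:
            # Occasionally (and always at the start) sample a fresh random solution.
            child = [rng.random() for _ in range(dim)]
        fitness, descriptor = evaluate(child)
        cell = cell_of(descriptor, bins)
        # Keep the child only if its cell is empty or it beats the current elite.
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, child)
    return archive


if __name__ == "__main__":
    archive = map_elites()
    coverage = len(archive) / 5 ** 2
    print(f"filled cells: {len(archive)}, coverage: {coverage:.2f}")
```

The dictionary is the whole point: each key is a region of behaviour space, and each value is the best solution seen so far that behaves that way. Summary numbers such as coverage (the fraction of cells filled) or the summed fitness of all elites can be read straight off it.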
PGA-MAP-Elites, built on top of QDGym, which tracks Quality Diversity, is probably worth a look.