Designing policy optimization algorithms for multi-agent reinforcement learning