[1] N. E. Fard and R. Selmic, “Consensus of Multi-agent Reinforcement Learning Systems: The Effect of Immediate Rewards”, J Robot Control (JRC), vol. 3, no. 2, pp. 115–127, Feb. 2022.