Abstract:
A local learning algorithm for multi-agent stochastic games is proposed, motivated by the fact that each individual in a group perceives and interacts locally. In the algorithm, every agent adopts a greedy policy to maximize its payoff when interacting with the environment. The Nash-Q learning algorithm is improved separately for zero-sum games and for general-sum games with either a single equilibrium or multiple equilibria. In addition, a method for modifying agent behavior is proposed, and it is proved that the algorithm converges and that its computational complexity is reduced.