[1] |
Busoniu L,Babuska R,De Schutter B.A comprehensive survey of multiagent reinforcement learning[J].IEEE Transactions on Systems,Man,and Cybernetics,Part C-Applications and Reviews,2008,38(2):156~172
|
[2] |
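Liu J C,Lin L.Elevator group-control scheduling method based on CMAC reinforcement learning control[J].Information and Control,2005,34(4):495~499 (in Chinese)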
|
[3] |
Jodogne S,Briquet C,Piater J H.Approximate policy iteration for closed-loop learning of visual tasks[A].Lecture Notes in Artificial Intelligence (vol.4212)[M].Berlin,Germany:Springer-Verlag,2006.210~221
|
[4] |
Lagoudakis M G,Parr R.Least-squares policy iteration[J].Journal of Machine Learning Research,2003,4(6):1107~1149
|
[5] |
Wang X S,Cheng Y H,Yi J Q.A fuzzy actor-critic reinforcement learning network[J].Information Sciences,2007,177(18):3764~3781
|
[6] |
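Duan Y,Xu X H.Reinforcement learning based on fuzzy neural networks and its application to robot navigation[J].Control and Decision,2007,22(5):525~529,534 (in Chinese)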
|
[7] |
Gao Y,Chen S F,Lu X.A survey of reinforcement learning research[J].Acta Automatica Sinica,2004,30(1):86~100 (in Chinese)
|
[8] |
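Cheng Y H,Wang X S,Yi J Q,et al.Q-learning in continuous spaces based on a self-organizing fuzzy RBF network[J].Information and Control,2008,37(1):1~8 (in Chinese)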
|
[9] |
Mahadevan S.Proto-value functions:developmental reinforcement learning[A].Proceedings of the International Conference on Machine Learning[C].New York,USA:ACM,2005.553~560
|
[10] |
Sugiyama M,Hachiya H,Towell C,et al.Value function approximation on non-linear manifolds for robot motor control[A].Proceedings of the IEEE International Conference on Robotics and Automation[C].Piscataway,NJ,USA:IEEE,2007.1733~1740
|
[11] |
Glaubius R,Namihira M,Smart W D.Speeding up reinforcement learning using manifold representations:Preliminary results[A/OL].IJCAI Workshop Reasoning with Uncertainty in Robotics[C].http://www.cs.wustl.edu/~rlg1/Papers/glaubius2005rense.pdf,2008-11-30
|
[12] |
Tenenbaum J B,de Silva V,Langford J C.A global geometric framework for nonlinear dimensionality reduction[J].Science,2000,290(5500):2319~2323
|
[13] |
Dijkstra E W.A note on two problems in connexion with graphs[J].Numerische Mathematik,1959,1(1):269~271
|
[14] |
Duan F D.A fast SPFA algorithm for the shortest path problem[J].Journal of Southwest Jiaotong University,1994,29(2):207~212 (in Chinese)
|
[15] |
Xu X,He H,Hu D.Efficient reinforcement learning using recursive least-squares methods[J].Journal of Artificial Intelligence Research,2002,16:259~292
|
[16] |
Ljung L,Söderström T.Theory and Practice of Recursive Identification[M].Cambridge,MA,USA:MIT Press,1983.
|