Abstract:
Common knowledge transfer methods are difficult to apply to tasks whose state transition probabilities change with the system parameters, because the knowledge obtained by extracting the common features of optimal policies is usually parameter-dependent. To solve this problem, this paper proposes a hierarchical option algorithm based on qualitative fuzzy networks. The algorithm learns a suboptimal policy defined by qualitative actions, extracts the common features of the suboptimal policies to obtain knowledge that is unrelated to the parameters, and thereby achieves knowledge transfer. Experimental results on an inverted pendulum system show that the qualitative fuzzy network can describe the common control rules of inverted pendulum systems with different parameter values, and that it extends the common knowledge transfer method from parameter-related tasks to parameter-unrelated ones.