Abstract: The ability of animals to learn behavioral decisions in a specific environment is an important basis for their survival. Accurately evaluating how animals balance the use of past experience against the valuation of future rewards in Markov decision-making tasks is therefore important for the study of animal behavior and psychology. A Markov decision task with probabilistic state transitions was designed, and pigeons were trained to choose between two options in different states while taking future benefits into account to maximize cumulative reward. After the experiment, the pigeons’ behavioral decisions were fitted with a Q-learning model: the learning rate was used to evaluate their ability to make choices based on past experience, and the discount factor was used to evaluate the weight they placed on future rewards. The results show that pigeons’ learning ability in Markov decision-making tasks, in terms of both using past experience and valuing future rewards, can be evaluated through the Q-learning model parameters.
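
For readers unfamiliar with the two parameters named in the abstract, the standard tabular Q-learning update can be sketched as follows. This is a minimal illustration of the textbook rule, not the authors' fitting procedure: the function name `q_update`, the table size (2 states, 2 options), and the values `alpha=0.5` and `gamma=0.9` are assumptions chosen only for the example.

```python
def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Apply one standard tabular Q-learning update in place.

    alpha: learning rate, how strongly a new outcome revises the stored value.
    gamma: discount factor, how much the best future value counts.
    """
    best_next = max(q[next_state])  # value of the best option in the next state
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return q[state][action]


# Hypothetical setup: 2 states, 2 options per state, all values start at 0.
q_table = [[0.0, 0.0], [0.0, 0.0]]

# Trial 1: in state 0, option 1 is rewarded and leads to state 1.
first = q_update(q_table, state=0, action=1, reward=1.0,
                 next_state=1, alpha=0.5, gamma=0.9)

# Trial 2: in state 1, option 0 is unrewarded but leads back to state 0,
# so its value rises only through the discounted future value.
second = q_update(q_table, state=1, action=0, reward=0.0,
                  next_state=0, alpha=0.5, gamma=0.9)
```

In this sketch, a learning rate near 0 means stored values change slowly with each new outcome, while a discount factor near 0 means choices are driven almost entirely by immediate reward.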