Q-Learning is a popular agent learning algorithm, but it has several weaknesses, such as slow convergence in large state spaces. Generalization methods that try to reduce the size of the state space may introduce a perceptual aliasing problem. In this paper we use only two states for a hunter that tries to catch a random or intelligent prey in a 10×10 square domain. We show that, because of perceptual aliasing, random action selection fares better than the strategies found by Q-learning when the prey acts randomly. Furthermore, a novel multi-step action-selection technique is introduced to decrease the number of exploration steps the hunter needs to catch the prey. Results show that the proposed algorithm reduces the number of actions taken to catch both intelligent and random prey by over 50%.
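For background, the learning rule referred to above is the standard tabular Q-learning update (a well-known formula from the literature, not a contribution of this paper):

\[
Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]
\]

where \(\alpha\) is the learning rate and \(\gamma\) the discount factor. The perceptual aliasing issue arises when the reduced (two-state) representation maps distinct underlying situations onto the same state \(s_t\), so the learned values \(Q(s_t, a_t)\) average over situations that would call for different actions.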