This paper discusses the problem of designing smart supplier agents for electricity markets using a reinforcement learning (RL) algorithm. These agents allow us to simulate competition among several suppliers of electric energy in forward electricity markets. The goal of each supply agent (SA) is to maximize its revenue over the entire trading period (e.g., 24 hours in a day-ahead market). We use a temporal difference (TD) method for each SA to learn the market environment and the opportunities that yield the maximum revenue. Each SA adjusts its strategy in each period using exploitation and by comparing the revenue of that period with the average revenue of the previous periods or with its target revenue. An IEEE 30-bus system with six supply agents is used for market simulation studies.
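As a rough illustration of the kind of temporal difference learning mentioned above, the following is a minimal sketch of a generic tabular TD(0) value update. It is not the specific method used in this paper. The function name `td0_update`, the choice of trading hours as states, hourly revenue as reward, and the values of `alpha` and `gamma` are all assumptions made for illustration.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular TD(0) update to the value estimate of `state`.

    values     : list of state-value estimates, indexed by state
    reward     : reward observed on the transition (assumed: revenue for the hour)
    alpha      : learning rate (assumed value)
    gamma      : discount factor (assumed value)
    Returns the TD error used for the update.
    """
    td_error = reward + gamma * values[next_state] - values[state]
    values[state] += alpha * td_error
    return td_error


# Hypothetical setup: states 0..23 are trading hours of a day-ahead market,
# and state 24 is a terminal state whose value stays at zero.
values = [0.0] * 25

# Illustrative transition: the agent earns a revenue of 100.0 in hour 0
# and moves on to hour 1.
td_error = td0_update(values, state=0, reward=100.0, next_state=1)
```

In this sketch the estimate for hour 0 moves a fraction `alpha` of the way toward the observed revenue plus the discounted estimate for the next hour; repeated over many simulated trading days, such estimates could guide which bidding strategy an agent favors.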