Trading robot optimization algorithms: an effective way to make a million in hindsight


I read a reputable book on trading strategies and wrote my trading robot. To my surprise, the robot does not make millions, even when trading virtually. The reason is obvious: a robot, like a racing car, needs “tuning”, the selection of parameters adapted to a specific market, a specific time period.

Since a robot has enough tuning parameters, it is too time-consuming to try all possible combinations in search of the best one. At
my time, when solving the optimization problem, I did not find a reasonable algorithm for searching for a quasi-optimal vector of trading robot parameters. So I decided to compare several algorithms myself…

Short formulation of the optimization problem

We have a trading algorithm. Input data – price history of one-hour interval for one year of observations. Output – P – profit or loss, a scalar quantity.

The trading algorithm has 4 adjustable parameters:

  1. Mf is the period of the “fast” moving average,
  2. the period of the “slow” moving average
  3. T – TakeProfit, the target level of profit on each individual trade,
  4. S – StopLoss, the target level of loss on each individual trade.

For each of the parameters we give a range and a fixed step of variation, a total of 20 values for each of the parameters.

Thus, we can search for the maximum profit (P) for one parameter on one array of input data:

  1. varying one parameter, for example P = f(Ms), producing up to 20 backtests,
  2. varying two parameters, e.g. P = f(Ms, T), producing up to 20 * 20 = 400 backtests,
  3. varying three parameters, e.g. P = f(Mf, Ms, T), producing up to 20 * 20 * 20 = 8,000 backtests,
  4. varying each of the parameters, P = f(Mf, Ms, T, S) and producing up to 20^4 = 160,000 backtests.

Most trading algorithms, however, require several orders of magnitude more time to run a single test. This leads us to the task of finding a quasi-optimal vector of parameters without the need to try all possible combinations of them.

More about trading and trading robots

Trading on the stock exchange, betting in a Forex dealing center, stray bets on “binary options”, speculation on cryptocurrencies – a kind of “diagnosis”, with several possible variants of the “disease”. It is a very common scenario when a player, having played with “intuition”, comes to automatic trading. Don’t get me wrong, I don’t want to put accents so that “technologically and mathematically” strict trading robots are opposed to naive and helpless “manual” trading. For example, I myself am convinced that any attempt to profit from an efficient market (read any transparent and liquid market) through speculation, whether discretionary or fully automated, is a priori doomed to failure. Unless, of course, the random luck factor is allowed.

Nevertheless, trading, and in particular, algo(rhythmic) trading, is a popular hobby for many people. Some program trading robots by themselves, others go even further and create their own platforms for writing, debugging and backtesting trading strategies, others download / buy ready-made electronic Expert Advisors. But even those who don’t write their own trading algorithms should have an idea of how to deal with this “black box”, in order to extract profit from it in accordance with the author’s idea. In order not to be unsubstantiated I will give further observations on the example of a simple trading strategy:

A simple trading robot

The trading robot analyzes the dynamics of the gold price, in dollars per troy ounce, and makes a decision to “buy” or “sell” a certain amount of gold. For simplicity, we will assume that the robot always trades one troy ounce.

For example, at the time of purchase, the value of a troy ounce of gold was 1075.00 USD. At the moment of subsequent sale (closing of the transaction), the price rose to 1079.00 USD. The profit on this transaction was 4 USD.

If the robot “sold” gold for 1075.00 USD, and subsequently completed (closed) the transaction by “buying back” gold at 1079 USD, the profit on this transaction will be negative – minus 4 USD. Actually, it does not matter for us how the robot sells gold, which it does not possess, in order to “buy back” it later. The broker/dealing center allows the trader to “buy” and “sell” the asset in one way or another, earning (or, more often, losing) on price differences.

The input data for the robot has been defined – it is, in fact, a time series of gold prices (quotes). If you think that my example is too simple, not realistic – I can assure you: most of trading robots (and traders, as well) in their trading are based solely on the statistics of the traded product price. In any case, when it comes to parametric optimization of a trading strategy, there is no fundamental difference between a trading robot based on a price vector and a robot that uses a terabyte array of assorted market analytics. The main thing is that both these robots can (must) be tested on historical data. The algorithms must be deterministic: that is, on the same input data (model time, if necessary, we can also take as a parameter), the trading robot must show the same result.

The black (thick) curve on the chart is the hourly XAUUSD price measurements. The two thin broken lines, red and blue – the average price values with averaging periods of 5 and 10, respectively. In other words, moving averages (Moving Average, MA) with periods of 5, 10. For example, to calculate the ordinate of the last (right) point of the red curve, I took the average of the last 5 price values. Thus, each moving average is not only “smoothed” relative to the price curve, but also lags relative to it by half of its period.

Rule for opening a trade

The robot has a simple rule for making buy/sell decisions:
– as soon as the moving average with a short period (the “fast” MA) crosses the moving average with a long period (the “slow” MA) from the bottom upwards, the robot buys the asset (gold).

As soon as the “fast” MA crosses the “slow” MA from the top downwards, the robot sells the asset. In the picture above, the robot will make 5 trades: 3 sales at time points 7, 31 and 50 and two purchases (at points 16 and 36).

The robot is allowed to open an unlimited number of trades. For example, at some point the robot can have several incomplete buys and sells at the same time.

Closing rule

The robot closes the deal as soon as:

  • the profit on the trade exceeds the specified percentage threshold – TakeProfit,
  • or the loss on the trade, in percent, exceeds the corresponding value – StopLoss.

Let’s assume that StopLoss is equal to 0.2%.
The deal – “sale” of gold at a price of 1061.50.
As soon as the gold price rises to 1061.50 + 1061.50 * 0.2% / 100% = 1063.12%, the loss on the deal will obviously be equal to 0.2% and the robot will close the deal automatically.

The robot makes all decisions to open/close a trade at discrete points in time – at the end of each hour, after the next XAUUSD quote has been published.

Yes, the robot is very simple. At the same time, it meets 100% of the requirements:

  1. the algorithm is deterministic: every time we simulate the robot’s work on the same price data, we will get the same result,
  2. has a sufficient number of adjustable parameters, namely: the period of the “fast” and “slow” moving average (natural numbers), TakeProfit and StopLoss – positive real numbers,
  3. Changes in each of the 4 parameters, in general, have a non-linear effect on the trading characteristics of the robot, in particular on its profitability,
  4. The profitability of the robot on the price history is considered elementary software code, and the calculation itself takes fractions of a second for a vector of thousands of quotes,
  5. Finally, which is irrelevant, the robot, for all its simplicity, in reality is no worse (though probably not better) than the “Grail” sold by the author on the Internet for an immodest sum.

Fast search for a quasi-optimal set of input parameters

Using the example of our simple robot, we can see that a complete enumeration of all possible parameter vectors of the robot is too costly even for 4 varying parameters. The obvious alternative to a complete search is to select the parameter vectors by a particular strategy. We consider only a fraction of all possible combinations in search of the best one, in which the TF approaches the highest (or the lowest, depending on what TF we have chosen and what result we are seeking) value.

We will consider three algorithms for finding a quasi-optimal TF value. For each algorithm we will set a limit of 40 tests (out of 400 possible combinations).

Monte Carlo method

or randomly choosing M uncorrelated vectors from a possible number of sets equal to N. The method is probably the simplest possible one. We will use it as a starting point for further comparison with other optimization methods.

Example 1

The graph shows the dependencies of the profit (P) of our trading robot trading EURUSD, obtained on the annual segment of the history of hourly price measurements, on the value of the parameter – the period of the “slow” moving average (M). All other parameters are fixed and are not subject to optimization.

TF (profit) reaches the maximum of 0.27 at the point M = 12. We will have to perform 20 test iterations to guarantee the maximum profit value. An alternative is to conduct fewer tests of the trading robot with a randomly selected value of M on the interval [9, 20]. For example, after 5 iterations (20% of all tests) we have found a quasi-optimal vector (obviously a one-dimensional vector) of parameters: M = 18 with TF value (M) equal to 0.18:

The remaining values in the graph were hidden from our optimization algorithm by the “fog of war”.

Optimization of one of four parameters of our trading robot when the other parameters are fixed does not let us see the whole picture. Maybe, a maximum value of TF equal to 0.27 is not the best indicator when we vary values of other parameters?

This is how the dependence of profit on the moving average period changes at various values of TakeProfit
parameters on the interval [0.2…0.8].

Monte Carlo method: optimization of two parameters

Dependence of the trading robot profit on two parameters can be graphically represented as a surface:

T (TakeProfit) and M (moving average period)
parameters are plotted on two axes, the third axis is the profit value.

Having performed 400 tests on one year data interval (~6000 hours of EUR/USD quotes) for our trading robot, we will obtain surface of the following form:

or, on a plane where TF values (profit, P) are represented by color:

Selecting arbitrary points in the plane, in this example the algorithm did not find the optimal value, but came pretty close to it:

How effective is the Monte Carlo method in finding the maximum TF? I performed 1,000 iterations to find the maximum TF on the original data from the example above and got the following statistics:

  • the average value of the maximum TF found during 1,000 optimization iterations (40 random parameter vectors [M, T] out of 400 possible combinations) was 0.231 or 95.7% of the global maximum TF (0.279).

Obviously, one sample is not an indicator when comparing methods of parametric optimization of trading robots. But this estimate is sufficient for the time being. Let’s move on to the next method – the gradient descent method.

Gradient descent method

Formally, as the name implies, the method is used to find the minimum of TF.
According to the method, we choose a starting point with coordinates [x0, y0, z0, …]. In the example of optimization of one parameter it can be a randomly chosen point

with coordinates [5] and a TF value equal to 148. Three simple steps follow:

  1. check the TF values in the vicinity of the current position (149 and 144)
  2. move to the point with the lowest TF value
  3. if there is no such an extremum, the local extremum is found, the algorithm is complete

To optimize the TF as a function of two parameters we apply the same algorithm. If before we calculated TF in two neighboring points , now we check 4 points:

The method is definitely good when the TF has only one extremum in the test space. If there are several extrema, we will have to repeat the search several times in order to increase the probability of finding the global extremum

In our example, we are looking for the maximum of the TF. In order to stay within the algorithm definition, we may consider that we are searching for the minimum “minus TF”. Same example, profit of a trading robot as a function of the moving average period and TakeProfit value, one iteration:

In this case a local extremum has been found far from the global maximum in TF. An example of several iterations of the search for TF extremum by the gradient descent method, the TF value has been calculated 40 times (40 points out of 400 possible):

Now let us compare the efficiency of search for global maximum of TF (profit) on our initial data by Monte Carlo and gradient descent algorithms. In each case there are 40 tests (TF calculations). We performed 1,000 iterations of optimization by each method:

Monte Carlo

gradient descent

the average of the obtained quasi-optimal value of the TF



the obtained value from the maximum of the TF



As we can see from the table, in this example the gradient descent method copes worse with its task – to find the global extremum of TF – the maximum of profit. But let us not be in a hurry to discount it.

Parametric stability of the trading algorithm

Finding the coordinates of the global maximum / minimum of TF is often not the goal of optimization. Suppose a “sharp” peak is encountered on the chart – a global maximum, the TF value in the vicinity of which is much lower than the peak value:

Suppose we have selected the trading robot settings corresponding to the maximum TF. If we only slightly change the value of just one of the parameters – moving average and / or TakeProfit – the trading robot’s profitability will sharply decrease (become negative). In real trading, we can at least expect that the market to be used by our trading robot will be significantly different from the period of history on which we have optimized our trading algorithm.

Therefore, when selecting “optimal” settings of a trading robot, it is worth getting an idea of how sensitive the robot is to the changes in the settings in the vicinity of the point of extremum of TF.

Obviously, the gradient descent method tends to give us TF values in the vicinity of an extremum. The Monte Carlo method, rather, beats the areas.

Multiple manuals for testing automatic trading strategies recommend to check the target values of the robot in the vicinity of the found parameter vector after optimization is completed. But these are additional tests. In addition, what if the profitability of the strategy decreases when the settings are changed insignificantly? Obviously, we will have to repeat the testing process.

It would be useful to have an algorithm that would enable us to evaluate the robustness of a trading strategy to the change of settings within a narrow range of the identified peaks simultaneously with the search of TF extremum. For example, to search not directly for the maximum of TF, but for the weighted average value that takes into account the neighboring values of the target function, where the weight is inversely proportional to the distance to the neighboring value (for optimization of two parameters x, y and the target function P).

In other words, when choosing a quasi-optimal vector of parameters, the algorithm will estimate a “smoothed” target function:

The “naval battle” method

Trying to combine the advantages of both methods (Monte Carlo and gradient descent method), I tried an algorithm similar to that of the sea battle game:

  • First, I do a few “strokes” over the entire area
  • then, in the places of “hits” I open massive fire.

In other words, the first N tests are performed on random uncorrelated vectors of input parameters. From them M best results are selected. In the vicinity of these tests (plus – minutes 0…L to each of the coordinates) another K tests are conducted.

For our example (400 points, 40 tests in total) we have:

Again let us compare the efficiency of 3 optimization algorithms:

Monte Carlo

gradient descent

“naval combat.”

Average value of found extremum of TF in percent of global value.
40 tests, 1 000 optimization iterations




The result is encouraging. Of course, the comparison was carried out on one particular data sample: one trading algorithm on one time series of the value of the euro against the U.S. dollar. But before comparing algorithms on more samples of initial data I’m going to tell you about another unexpectedly (unreasonably?) popular algorithm for optimizing trading strategies – genetic optimization algorithm (GA).