NN median target predictions and how M5 charts react to them

To me, the target predictions (in blue) look less like targets and more like support/resistance lines. They were taken from a NARX neural network predicting the median price of the H4 bars of this USDCAD data. Every 48 M5 closing points (one H4 closing point), the target price is updated with the network’s prediction.

I can’t yet see what else to use them for, but they could serve as support/resistance guides. Any suggestions?

Part #2 – a decision support system

The error rate with the latest NARX simulations was moderately low. It was measured by dividing the difference between y-actual and y-predicted for the median price by the average length of the candle.

With constructed H4 and H8 bars, the lowest errors relative to candle size were 18-21% for some USDCAD data and a very impressive 15% for EURSEK, which means that the average error between predicted and actual median was only around 15% of the average length of a bar.
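The error measure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the difference is taken as an absolute value, the candle length is taken as high minus low, and the function name and the toy prices are made up for the example.

```python
def relative_median_error(actual, predicted, highs, lows):
    """Mean absolute prediction error as a fraction of the mean candle length."""
    if not actual or not (len(actual) == len(predicted) == len(highs) == len(lows)):
        raise ValueError("all series must be non-empty and of equal length")
    # Assumption: the "difference" is the absolute gap between actual and predicted.
    mean_abs_err = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    # Assumption: the "length of the candle" is its high minus its low.
    mean_candle = sum(h - l for h, l in zip(highs, lows)) / len(highs)
    return mean_abs_err / mean_candle


# Toy bars (hypothetical prices, not taken from the simulations).
highs = [1.3050, 1.3080, 1.3060]
lows = [1.3010, 1.3030, 1.3020]
actual = [(h + l) / 2 for h, l in zip(highs, lows)]  # median price of each bar
predicted = [1.3036, 1.3049, 1.3047]

ratio = relative_median_error(actual, predicted, highs, lows)
print(f"error is {ratio:.1%} of the average candle length")
```

A value of 0.15 from this function would correspond to the 15% figure quoted for EURSEK.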

Part #2 starts now:

Using that target prediction to create a profitable decision support system.

The main algorithm that I’m about to try is:

1. use NN to predict target for either H4 or H8

2. when target prediction is obtained, monitor the progression of the price:

if price moves away from target median by x pips (and a confirmation from indicator is obtained, if any), enter market.

3. exit market if either:

TP is hit

SL is hit

Target price candle is closed.
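The steps above can be sketched as a small Python rule set. It is a rough sketch, not a tested strategy, and it makes several assumptions that the plan does not fix yet: the trade is opened towards the predicted target, the target itself is used as TP, the pip size is 0.0001, and the names `maybe_enter` and `exit_reason` are invented for the example.

```python
from dataclasses import dataclass

PIP = 0.0001  # assumed pip size for a 4-decimal pair such as USDCAD


@dataclass
class Trade:
    direction: int       # +1 for long, -1 for short
    entry: float
    take_profit: float
    stop_loss: float


def maybe_enter(price, target, x_pips, sl_pips, confirmed=True):
    """Open a trade towards the target once price is at least x_pips away from it."""
    gap = target - price
    if not confirmed or abs(gap) < x_pips * PIP:
        return None
    direction = 1 if gap > 0 else -1
    # Assumption: TP is the predicted target, SL sits sl_pips behind the entry.
    return Trade(direction, price, target, price - direction * sl_pips * PIP)


def exit_reason(trade, price, bars_left):
    """Return why the trade should be closed, or None to stay in the market."""
    if trade.direction * (price - trade.take_profit) >= 0:
        return "TP"
    if trade.direction * (price - trade.stop_loss) <= 0:
        return "SL"
    if bars_left <= 0:
        return "candle closed"  # the candle the target was predicted for is over
    return None


trade = maybe_enter(price=1.3000, target=1.3030, x_pips=20, sl_pips=15)
print(trade, exit_reason(trade, 1.3031, bars_left=10))
```

The `confirmed` flag stands in for the optional indicator confirmation; how that confirmation is computed is left open here.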

Data/Targets modification solutions

After some brainstorming I came up with a few possible solutions that might help in giving the input data more meaning and relation to the targets, and making the targets cleaner, more “interpret-able” and “guessable”.

I. Possible additions to the trading strategy

1. Long-period moving averages, such as SMA(100), SMA(200), SMA(300), to determine the overall trend and help filter out whipsaws. The slopes of those SMAs, and other characteristics that the NN can deduce on its own, are also possible inputs.

2. Stochastics, RSI, MACD, either left on their own or possibly combined into one indicator giving an output value in the [-1,+1] range. There can be several variants of each, not strictly limited to the defaults of RSI(14), Stochastics (8,3,3) or (5,3,3), or MACD (12,26,9). Of course I will need to create extra TA functions in MATLAB.

3. ADX (which relies on ATR) and StdDev. All of these need to be built as well.

ADX and the long-period SMAs are trend definers, while MACD, RSI and Stochastics are confirmation indicators and come second in order of importance. StdDev simply measures how far the price is straying, to help predict where the trend is going.
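As an illustration of the trend-definer idea, here is a minimal Python sketch of a simple moving average and the sign of its slope. The function names, the one-bar lookback and the toy closes are assumptions made for the example; a real filter would use the long periods mentioned above.

```python
def sma(values, period):
    """Simple moving average; None until `period` values are available."""
    out, running = [], 0.0
    for i, v in enumerate(values):
        running += v
        if i >= period:
            running -= values[i - period]  # drop the value leaving the window
        out.append(running / period if i >= period - 1 else None)
    return out


def slope_sign(ma, lookback=1):
    """+1 where the average rose over `lookback` bars, -1 where it fell, else 0."""
    signs = []
    for i, cur in enumerate(ma):
        prev = ma[i - lookback] if i >= lookback else None
        if cur is None or prev is None or cur == prev:
            signs.append(0)  # not enough history yet, or a flat average
        else:
            signs.append(1 if cur > prev else -1)
    return signs


closes = [1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0]  # toy data
trend = slope_sign(sma(closes, 3))
print(trend)
```

The resulting series of +1, 0 and -1 values could be fed to the network as one more input, or used as a plain filter on entries.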

A USDJPY H1 chart with SMA(100) in green, SMA(200) in yellow, with ADX, MACD, Stochastics and StdDev in windows below.

II. Possible changes to the target data

1. Form the target data into optimal TP/SL levels, so that the exit relies on hitting either of them rather than on an exit signal. Open problems are whether to use multiple entries, and how to build in a trailing stop as well.

2. Smooth the TP data to make it more prediction-friendly instead of pure noise. That means changing the output values in a way that does not distort their meaning but makes them more uniform.
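One possible way to smooth the TP targets is sketched below in Python. Both steps are assumptions rather than a settled method: extreme targets are capped at a chosen maximum, and the capped series is then passed through an exponential moving average. The cap of 250 pips and the smoothing factor are placeholder values.

```python
def clip_targets(targets, max_pips):
    """Cap very large targets so a few outliers do not dominate training."""
    return [min(t, max_pips) for t in targets]


def ema_smooth(targets, alpha=0.3):
    """Exponential smoothing: each value is pulled towards the previous result."""
    smoothed, prev = [], None
    for t in targets:
        prev = t if prev is None else alpha * t + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed


raw_tp = [5, 150, 300, 40, 20]  # toy TP values in pips
clean_tp = ema_smooth(clip_targets(raw_tp, max_pips=250), alpha=0.5)
print(clean_tp)
```

Smoothing across consecutive entries does blend unrelated trades together, so whether this keeps the meaning of the targets intact would have to be checked against the results.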

III. Changes to input methods

(By modified I mean that the data passes through several filters and functions before it is fed to the network, to bring out a meaning that cannot be clearly deduced from its raw value. By raw I mean that the data is presented as context-free numbers.)

Method 1: Raw Inputs + Raw Targets

Method 2: Raw Inputs + Modified Targets

Method 3: Modified Inputs + Raw Targets

Method 4: Modified Inputs + Modified Targets

I’m more inclined towards Method 4. I’m also planning to use a NN to help attach a meaning to a combination of indicators.

Early network training results

Described here are some of the early (still unsatisfactory) results that I have obtained using a two-layer neural network (one hidden layer and one output layer) with the data sets that I downloaded from dukascopy.com.

1. Network structure: feed-forward back-propagation, with the hidden layer containing between 20 and 50 neurons (anything much above 40 causes overfitting, or simply fails to fit at all).

2. Learning algorithms used: Levenberg-Marquardt, Scaled Conjugate Gradient, and some others in a few instances. The transfer function is tanh(x) or, as it is referred to in MATLAB, tansig.

3. Data sets used are 5000 hourly bars from USDJPY, 3700 daily bars from Nasdaq-100, and 30,000 minute bars from EURUSD. All contain Saturday and Sunday data. After cleaning up and isolating the crossing points of SMA(10)-SMA(20) (in the Nasdaq case it was SMA(3)-SMA(8)) along with the MACD and RSI data at those points, I ended up with around 193 data items for USDJPY, 142 data items for Nasdaq, and around 1200 data items for EURUSD. Another test on EURUSD with the crossing of SMA(50) and SMA(100) yielded around 200 data items: a smaller data pool, but far less noisy.

Around 70-80% of the data was used for training, 10-15% for validation and 10-15% for testing.
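For reference, a split along these lines can be sketched in plain Python. The 70/15/15 proportions are one choice within the ranges given, and the random shuffling, the seed and the function name are assumptions made for the example.

```python
import random


def split_indices(n, train=0.70, val=0.15, seed=0):
    """Shuffle the indices 0..n-1 and cut them into train/validation/test parts."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # fixed seed keeps the split repeatable
    n_train = int(n * train)
    n_val = int(n * val)
    # Whatever is left after the training and validation parts goes to testing.
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


train_idx, val_idx, test_idx = split_indices(193)  # 193 items, as in the USDJPY set
print(len(train_idx), len(val_idx), len(test_idx))
```

With only 193 items, the validation and test parts end up at around 30 samples each, which is one reason the measured error can swing a lot between runs.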

Results were similar after a few epochs of training, and better with around 40 nodes in the hidden layer. However, in the USDJPY test, where the elements of the target vector vary considerably, from 5 to 150 pips (with a few noisy values above 250-300 pips), the mean squared error was around 0.4-0.7, which, after taking the square root, works out to being 50-80 pips away from the targets. Still not an acceptable result.

The diagram below represents one of the tests:

Screenshot-Performance (plotperform)

However, for data that is completely random, and inputs that are not supposed to be very good at telling where the market trend is going, this is not bad at all. Of course, many improvements will need to take place in both data quality and network structure before a satisfactory result can be reached. I’m thinking of the following:

1. Smooth target data: make it less noisy, clip very high targets so that over-fitting does not occur.

2. Either increase the data pool or change the entry system rules so that I obtain more crossing points. The results where I had over 1000 items were considerably better than the cases with under 200 items.

3. Increase the number of inputs to the network. Right now I only have 3 elements as inputs: MACD histogram, MACD signal, and RSI. This is clearly not enough and also pretty weak in determining a trend. The point though was to try out a simulation. More quality technical analysis tools are needed, such as surrounding point moving averages, several RSIs, slow and fast stochastics, support and resistance, ADX and trend angles, Bollinger Bands, etc.

4. Also, according to previous research, approximation tasks are best handled with 2 hidden layers instead of one, with a higher number of inputs. A schema of 40-20-10-1 or 20-10-5-1 is preferred to the 3-40-1 I’m forced to use for now.
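To make the layer schemas above concrete, here is a small Python sketch of a feed-forward pass with tanh hidden layers, plus a count of the weights and biases each schema implies. It is an untrained toy: the random initial weights, the linear output layer and the function names are assumptions, and no training step is shown.

```python
import math
import random


def init_layers(sizes, seed=0):
    """Random weights and zero biases for a network such as [3, 40, 1]."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(layers, x):
    """tanh on every hidden layer; the output layer is left linear (an assumption)."""
    for depth, (weights, biases) in enumerate(layers):
        z = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
        is_output = depth == len(layers) - 1
        x = z if is_output else [math.tanh(v) for v in z]
    return x


def count_parameters(sizes):
    """Number of weights plus biases for a fully connected schema."""
    return sum((n_in + 1) * n_out for n_in, n_out in zip(sizes, sizes[1:]))


net = init_layers([3, 40, 1])
output = forward(net, [0.1, -0.2, 0.3])  # e.g. three scaled indicator inputs
for schema in ([3, 40, 1], [20, 10, 5, 1], [40, 20, 10, 1]):
    print(schema, count_parameters(schema))
```

The count is worth keeping in mind: the 3-40-1 schema already has 201 free parameters, about as many as the 193 USDJPY data items, so larger schemas will need a much larger data pool.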

With all those improvements, I expect to see more advanced results in the following simulations. I just need to import far more CSV data (and clean it up), learn more about creating custom neural networks (Chapter 12 of my 900+ page book), and either create or import the technical indicators that are unavailable in MATLAB (such as ADX). On second thought, they can be done with Qtstalker, if only I could find a nice way to export them.

MATLAB code to extract TP values and indicator data from an FTS


% moving prices to a matrix %
prc = fts2mat (TS);

% getting macd %
macdF = macd(TS);
macd_m = fts2mat (macdF);

% getting rsi %
rsiF = rsindex(TS);
rsi_m = fts2mat (rsiF);

% getting the MA's %
mov1F = tsmovavg(TS,'s',10);
mov2F = tsmovavg(TS,'s',20);
mov1_m = fts2mat (mov1F.CLOSE);
mov2_m = fts2mat (mov2F.CLOSE);

% MA's difference for crossing points %
mvdiff = mov1_m - mov2_m;

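% getting the 1's and 0's: pol(i)=1 where the MA difference changes sign %
[ro,co] = size(mvdiff);

% preallocated as a column, so no transpose is needed afterwards %
pol = zeros(ro,1);

for i=2:ro
    % hardlim(x) is 1 for x>=0, so a non-positive ratio marks a crossing %
    pol(i)=hardlim(mvdiff(i)/mvdiff(i-1)*-1);
end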

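% getting the optimal TP levels %
% prc(:,3) is used as the entry price, prc(:,4) as the low and prc(:,5) %
% as the high, as implied by the arithmetic below %
TP = zeros(ro,1); % rows without a crossing keep TP = 0 %

for j=1:ro-1 % the last bar has no following bar to measure against %
    if pol(j)==1 && mov1_m(j)<mov2_m(j) % downcrossing %
        k=j+1;
        TP(j)=prc(j,3)-prc(k,4);
        k=j+2;
        % scan forward up to and including the next crossing bar %
        while(k<=ro && pol(k-1)~=1)
            if ((prc(j,3)-prc(k,4))>TP(j))
                TP(j)=prc(j,3)-prc(k,4);
            end
            k=k+1;
        end
    elseif pol(j)==1 && mov1_m(j)>mov2_m(j) % upcrossing %
        k=j+1;
        TP(j)=prc(k,5)-prc(j,3);
        k=j+2;
        while(k<=ro && pol(k-1)~=1)
            if ((prc(k,5)-prc(j,3))>TP(j))
                TP(j)=prc(k,5)-prc(j,3);
            end
            k=k+1;
        end
    end
end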

% creating an unclean ready matrix %
unclean = [macd_m rsi_m TP];

% cleaning out zero rows: keep only the rows with a nonzero TP %
clean = unclean(unclean(:,4)~=0,1:4);

% getting inputs and outputs ready %
tinp = clean(:,1:3);
targ = clean(:,4);
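For readers without the MATLAB toolboxes, the crossing detection and best-excursion logic of the listing above can be cross-checked with a short Python sketch. It is not a line-by-line port: the moving averages are passed in as plain lists without missing values, a bar where the two averages are exactly equal is not treated as a crossing, and the function names and toy prices are assumptions.

```python
def crossing_flags(fast, slow):
    """1 where (fast - slow) changes sign relative to the previous bar, else 0."""
    flags = [0] * len(fast)
    for i in range(1, len(fast)):
        prev, cur = fast[i - 1] - slow[i - 1], fast[i] - slow[i]
        if prev * cur < 0:
            flags[i] = 1
    return flags


def best_excursions(flags, fast, slow, entry, highs, lows):
    """Largest favourable move from each crossing until the next crossing bar."""
    n = len(flags)
    tp = [0.0] * n
    for j in range(n - 1):  # the last bar has nothing after it to measure
        if not flags[j]:
            continue
        up = fast[j] > slow[j]  # upcrossing: measure highs; downcrossing: lows
        best = None
        for k in range(j + 1, n):
            move = highs[k] - entry[j] if up else entry[j] - lows[k]
            best = move if best is None else max(best, move)
            if flags[k]:  # the next crossing bar is included, then the scan stops
                break
        tp[j] = best
    return tp


fast = [1, 1, 3, 3, 1]
slow = [2, 2, 2, 2, 2]
entry = [10, 10, 12, 13, 11]   # toy entry (close) prices
highs = [11, 11, 13, 15, 12]
lows = [9, 9, 11, 12, 10]

flags = crossing_flags(fast, slow)
tp = best_excursions(flags, fast, slow, entry, highs, lows)
print(flags, tp)
```

Rows where `tp` is zero correspond to the rows that the MATLAB listing drops before building the input and target matrices.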