MetaTrader 5 Python User Group - how to use Python in Metatrader - page 83

 

Correctly prepared, incorrectly prepared...

Where can I read about this? I prepare the data for the neural network according to my own idea of what is important and what is not.

One thing puzzles me: should data of the same type be gathered together "in a pile", or appended in the order it arrives?

And from which end should the data be collected: from the "older" values or from the "newer" ones?

 
Сергей Таболин:

Correctly prepared, incorrectly prepared...

Where can I read about this? I prepare the data for the neural network according to my own idea of what is important and what is not.

One thing puzzles me: should data of the same type be gathered together "in a pile", or appended in the order it arrives?

And from which end should the data be collected: from the "older" values or from the "newer" ones?

Ask in the machine-learning thread, someone will answer there. This topic is about the connector.

 
Сергей Таболин:

The trouble is that normalization is a lost cause altogether!

Let me explain. Suppose there are some data A, B, C...

They differ in significance and so on. Everyone (Google) says that normalization should be done by columns (A-A-A, B-B-B, C-C-C), not by rows. That is logically understandable.

But when new data arrives for "prediction", HOW do you normalize it if it is only ONE row? And any value in that row can fall outside the normalization range of the training and test data.

And normalizing by rows has no effect!

Actually, it was after checking these nuances that I had this "cry of the soul" ))))

During normalization the coefficients are saved. To avoid going out of range, you take a big chunk of history, normalize it, and then apply those same coefficients to the new data.

A network will not learn on non-normalized data, or will learn poorly. That is just their nature.
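
A minimal sketch of that idea, assuming scikit-learn is available (the variable names and the random data here are purely illustrative, not from anyone's actual code): the scaler is fitted once on a large chunk of history, and its stored coefficients are then reused for every new row without refitting.

import numpy as np
from sklearn.preprocessing import StandardScaler

history = np.random.rand(10000, 40)        # large chunk of history (illustrative data)
scaler = StandardScaler().fit(history)     # the coefficients (per-column mean, std) are stored inside the scaler

new_rows = np.random.rand(5, 40)           # freshly arrived data
new_scaled = scaler.transform(new_rows)    # same coefficients applied, no refitting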
 
Maxim Dmitrievsky:

During normalization the coefficients are saved. To avoid going out of range, you take a big chunk of history, normalize it, and then apply those same coefficients to the new data.

A network will not learn on non-normalized data, or will learn poorly. That is just their nature.

All this is logical and understandable, but the network does get trained! Besides, from what I've read, non-normalized data makes training harder, but not critically so.

And how do you avoid going out of range? For example, take price. On the training and test data the price range is, say, 123-324. Then the price rises to 421. How is it supposed to fall into that same range?

But we are drifting away from the heart of the matter: why, when training and testing look fine, is the prediction such rubbish?

 

Dear friends, once again my skis won't slide... I'm asking for help.

I decided to sketch out a little tester to check the trained network's predictions.

# Load the data (flname_csv and row_signal are defined earlier in the script)
import numpy as np
import pandas as pd

df_full = pd.read_csv(flname_csv, header=None)
r, c = df_full.shape
border = c - row_signal                                    # split point between inputs and target columns
test_data = np.array(df_full.values[:, :border])           # input vectors
test_verification = np.array(df_full.values[:, border:])   # expected outputs
print(test_data[2], 'len =', len(test_data[2]))
print(test_verification[2], 'len =', len(test_verification[2]))

Everything's fine here.

[3.00000e+00 7.00000e+00 1.14656e+00 1.14758e+00 1.14656e+00 1.14758e+00
 3.00000e+00 7.00000e+00 1.27800e+03 1.27800e+03 3.00000e+00 7.00000e+00
 1.14758e+00 1.14857e+00 1.14758e+00 1.14857e+00 3.00000e+00 8.00000e+00
 2.93000e+02 6.20000e+02 3.00000e+00 8.00000e+00 1.14857e+00 1.14960e+00
 1.14821e+00 1.14960e+00 3.00000e+00 8.00000e+00 4.78000e+02 7.23000e+02
 3.00000e+00 8.00000e+00 1.14960e+00 1.14966e+00 1.14860e+00 1.14860e+00
 3.00000e+00 8.00000e+00 2.32100e+03 2.41100e+03] len = 40
[1. 0.] len = 2

And the next thing you know...

if num_check_rows > r or num_check_rows == 0:
    num_check_rows = r

model = tf.keras.models.Sequential()       # this empty model is immediately replaced by load_model below
model = tf.keras.models.load_model(flname)
model.summary()

for i in range(num_check_rows):
    b = model.predict(test_data[i])
    a = b[0]
    x = a[0]
    y = a[1]
    x = format(x, '.5f')
    y = format(y, '.5f')
    print(f'expected {test_verification[i]} >>> got [{x} {y}]')

It complains:

Model: "sequential_16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 20)                820       
_________________________________________________________________
dropout (Dropout)            (None, 20)                0         
_________________________________________________________________
dense_1 (Dense)              (None, 10)                210       
_________________________________________________________________
dense_2 (Dense)              (None, 2)                 22        
=================================================================
Total params: 1,052
Trainable params: 1,052
Non-trainable params: 0
_________________________________________________________________

Traceback (most recent call last):
  File "M:/PythonProgs/PycharmProjects/NNets/ТестерНС.py", line 79, in <module>
    b = model.predict(test_data[i])
.....
ValueError: Error when checking input: expected dense_input to have shape (40,) but got array with shape (1,)

What's wrong?
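
For what it's worth, the traceback points to a missing batch dimension: Keras predict() expects a 2-D array of shape (samples, features), so a 1-D row of 40 values is interpreted as 40 one-feature samples. A hedged sketch of the likely fix, assuming the model really was trained on 40-feature rows:

b = model.predict(test_data[i:i + 1])                     # slicing keeps the batch dimension, shape (1, 40)
# or, equivalently:
b = model.predict(np.expand_dims(test_data[i], axis=0))   # explicitly add the batch axis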

 

Having searched the web and re-read the article on which I based my code, I came to the disappointing conclusion that every author of an article "for beginners" is bound to forget to mention something important...

And here it turned out that StandardScaler is used when training the network. But the article does not say a word about what it is or why it is needed.

What's more, StandardScaler is standardization. And I would like to know how I am supposed to perform the same standardization for a single input vector, or even less.

Worse still, the "standardization" is performed column by column over the dataset! For plain statistics that's fine. But for forecasting it's "***hole"! When new data arrives, am I supposed to retrain the network just so the new data falls into the "standardization" range?

Bullshit!

By the time this "new network" is trained, the situation may have already changed drastically. So what the f*ck is the point of it?

So much for Python with its pile of "specialized" libraries....

I'd be very grateful if you could change my mind.


P.S. I just want to believe that I didn't waste my time on Python for nothing.
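
As an aside, with scikit-learn you do not refit the scaler on new data at all: the scaler fitted on the training set is kept (e.g. saved to disk) and applied to a single new row reshaped into a 1-sample batch. A rough sketch under those assumptions; the file name and the random data are illustrative:

import numpy as np
from joblib import dump, load
from sklearn.preprocessing import StandardScaler

# at training time
train = np.random.rand(5000, 40)                          # illustrative training matrix
scaler = StandardScaler().fit(train)
dump(scaler, 'scaler.bin')                                # persist the fitted coefficients

# later, for a single new input vector -- no retraining of anything
scaler = load('scaler.bin')
new_row = np.random.rand(40)
new_row_std = scaler.transform(new_row.reshape(1, -1))    # standardized with the training statistics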
 
Сергей Таболин:

Having searched the web and re-read the article on which I based my code, I came to the disappointing conclusion that every author of an article "for beginners" is bound to forget to mention something important...

And here it turned out that StandardScaler is used when training the network. But the article does not say a word about what it is or why it is needed.

What's more, StandardScaler is standardization. And I would like to know how I am supposed to perform the same standardization for a single input vector, or even less.

Worse still, the "standardization" is performed column by column over the dataset! For plain statistics that's fine. But for forecasting it's "***hole"! When new data arrives, am I supposed to retrain the network just so the new data falls into the "standardization" range?

Bullshit!

By the time this "new network" is trained, the situation may have already changed drastically. So what the f*ck is the point of it?

So much for Python with its pile of "specialized" libraries....

I'd be very grateful if you could change my mind.


P.S. I just want to believe that I didn't waste my time on Python for nothing.
Maybe that's because it is all only just beginning.
I recently became interested in Python's capabilities for MT myself, but nobody has answered whether what I wanted is even possible. On top of that, it won't work in MQL4. So I decided not to waste my time on it (maybe the situation will change for the better after a while).
 

(I barely managed to figure it out. )))

But now I have another question (the one I started all this for):

When I trained the network I got the following results

Score on train data     Score on test data      Test loss       Test acc
0.970960                0.968266                0.199544        0.981424

In other words, the result is excellent!

I ran my tester and got these results:

Model: "sequential_16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 15)                465       
_________________________________________________________________
dropout (Dropout)            (None, 15)                0         
_________________________________________________________________
dense_1 (Dense)              (None, 7)                 112       
_________________________________________________________________
dense_2 (Dense)              (None, 2)                 16        
=================================================================
Total params: 593
Trainable params: 593
Non-trainable params: 0
_________________________________________________________________
2021-01-18 17:59:04.495645: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
expected [0. 1.] >>> got [0.08348 0.08859]
expected [0. 1.] >>> got [0.08324 0.08838]
expected [0. 0.] >>> got [0.08667 0.09141]
expected [0. 0.] >>> got [0.08263 0.08784]
expected [0. 0.] >>> got [0.09200 0.09218]
expected [0. 0.] >>> got [0.08351 0.08861]
expected [0. 0.] >>> got [0.08944 0.09384]
expected [1. 0.] >>> got [0.08313 0.08828]
expected [1. 0.] >>> got [0.08432 0.08933]

Process finished with exit code 0

Now tell me, where exactly do you see here that the network is trained to 98% correct results????

 

Hello, reading a few pages of the discussion I didn't find anything concrete on the following question:


- Is there anything currently working, like the MetaTraderR or MetaTrader5 packages, for MT and R integration?


Cheers

 

Sorry, I'll continue my epic... )))

After picking up a little more knowledge from the same Google, I came to the following conclusions:

  1. The data should be "balanced". For example, if out of 1000 input rows only 100 produce signals, then at least 1000 - ((1000 - 100) / 3) = 700 "empty" rows (those that do not form signals) need to be removed.
  2. The rest should preferably be "normalized", i.e. brought into the range 0-1. But here there is the obvious difficulty of choosing the normalization "range": what to take as the minimum and what as the maximum? I don't know yet. For now I normalize each input array only by its own range... (a rough sketch of both steps follows below)
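
A rough sketch of those two steps, assuming a NumPy feature matrix X and a label vector y where 0 marks an "empty" (no-signal) row; the names and the random data are illustrative only:

import numpy as np

rng = np.random.default_rng(0)
X = rng.random((1000, 40))
y = (rng.random(1000) < 0.1).astype(int)            # roughly 100 signal rows, 900 empty rows

# 1. Balance: remove empty rows using the formula from the post, n_remove = N - (N - n_signal) / 3
n_total = len(y)
n_signal = int((y == 1).sum())
n_remove = int(n_total - (n_total - n_signal) / 3)  # 1000 - 900/3 = 700 for the example above
empty_idx = np.flatnonzero(y == 0)
drop = rng.choice(empty_idx, size=min(n_remove, len(empty_idx)), replace=False)
keep = np.setdiff1d(np.arange(n_total), drop)
X_bal, y_bal = X[keep], y[keep]

# 2. Normalize each input row into 0-1 by its own min/max, as described in the post
row_min = X_bal.min(axis=1, keepdims=True)
row_max = X_bal.max(axis=1, keepdims=True)
X_norm = (X_bal - row_min) / (row_max - row_min + 1e-12)   # small epsilon avoids division by zero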

By fulfilling these two conditions, I got a noticeable drop in the network's learning curve. In addition, I found that:

  1. Some loss functions that I had discarded earlier perform better.
  2. Results with fewer hidden layers began to dominate.
  3. It remains to check the number of training epochs.

Plus another question came up (a sketch of the output options follows the list below): what should the network's output be?

  1. One neuron (0-1-2) (no signal-buy-sell)
  2. Two neurons [0-0],[1-0],[0-1]
  3. Three neurons [0-0-1],[1-0-0],[0-1-0]
???
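
For reference, a hedged sketch of how these output options are commonly wired up in Keras; the layer sizes, activations and losses below are standard patterns, not anything confirmed in the thread:

import tensorflow as tf

# Option 2: two neurons with labels [0,0], [1,0], [0,1]
#   -> independent sigmoid outputs trained with binary_crossentropy
# Option 3: three mutually exclusive classes [1,0,0], [0,1,0], [0,0,1]
#   -> softmax output trained with categorical_crossentropy (the usual multi-class setup)
model = tf.keras.Sequential([
    tf.keras.Input(shape=(40,)),                     # 40 input features, as in the earlier posts
    tf.keras.layers.Dense(20, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax'),  # option 3 shown here
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Option 1 (a single neuron emitting 0/1/2) effectively turns the task into regression on
# class labels and is usually harder to train than options 2 or 3.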
