Machine learning in trading: theory, models, practice and algo-trading - page 3083

 
Lilita Bogachkova #:

I have a question for machine learning experts. If I use one symbol's data for training, another symbol's data for validation and a third symbol's data for testing, is this good practice?

Also, I get the following results on the test data: green cells are very good, yellow cells are good, red cells are average.


And also a question about modifying the training data. I noticed that the model has a hard time finding extrema, in my case values above 60 and values below 40.
So I find the values above 60 and below 40 in the training data and re-add them to the training set before feeding it to the model. The question is: can I improve the model's accuracy by enlarging the training data with extra copies of the extrema?

If the instruments are indistinguishable, then you can. Or force them into such a state by subtracting the difference.

On the second question - no, you probably can't. The model will become more wrong in other places, because the extrema will pull the gradient towards themselves. But it's all case-by-case.
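
For reference, a minimal sketch of the duplication step Lilita describes (the 40/60 thresholds come from her post; the function name and everything else are assumptions, not code from the thread):

import numpy as np

# Hypothetical sketch: duplicate training rows whose target lies in the
# extreme zones (above 60 or below 40) before fitting the model.
def oversample_extrema(X_train, y_train, low=40.0, high=60.0, copies=1):
    mask = (y_train > high) | (y_train < low)      # rows with extreme targets
    X_extra = np.repeat(X_train[mask], copies, axis=0)
    y_extra = np.repeat(y_train[mask], copies, axis=0)
    X_aug = np.concatenate([X_train, X_extra], axis=0)
    y_aug = np.concatenate([y_train, y_extra], axis=0)
    idx = np.random.permutation(len(y_aug))        # reshuffle so copies are not grouped
    return X_aug[idx], y_aug[idx]

As Maxim notes, this shifts the loss towards the duplicated rows, so the model may become less accurate in the mid-range.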
 
Maxim Dmitrievsky #:
If the instruments are indistinguishable, then you can. Or force them into such a state by subtracting the difference.

On the second question - no, you probably can't. The model will become more wrong in other places, because the extrema will pull the gradient towards themselves.

At the moment it does look that way.


However, before I give up on this idea, I'll see what I get from training the model on a mix of different instruments (symbols) and then building a dataset containing only the extreme values.

 
Lilita Bogachkova #:

At the moment it does look that way.


However, before I give up on this idea, I'll see what I get from training the model on a mix of different instruments (symbols) and then building a dataset containing only the extreme values.

Well, give it a try. I didn't see any difference between one symbol and several.
 
Maxim Dmitrievsky #:
If the instruments are indistinguishable, then you can. Or force them into such a state by subtracting the difference.

Using different symbols for training, validation and testing currently lets me improve the prediction accuracy. As a plus for this practice I can mention that there is no limit on the size of the data: you can supply as much as you want or need for validation or training.

When testing on a third symbol, you can immediately see whether the model is capable of finding universal patterns, rather than getting caught up in narrow market events specific to a particular symbol.
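
A minimal sketch of such a per-symbol split (the DataFrame layout, column names and symbol names are all hypothetical):

import pandas as pd

# Hypothetical split: train on one instrument, validate on a second,
# test on a third. Assumes a DataFrame with a 'symbol' column, feature
# columns and a 'target' column; every name here is illustrative.
def split_by_symbol(df, train_sym="EURUSD", val_sym="GBPUSD", test_sym="USDJPY"):
    features = [c for c in df.columns if c not in ("symbol", "target")]
    train = df[df["symbol"] == train_sym]
    val = df[df["symbol"] == val_sym]
    test = df[df["symbol"] == test_sym]
    return (train[features], train["target"],
            val[features], val["target"],
            test[features], test["target"])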

 
Lilita Bogachkova #:

Using different symbols for training, validation and testing currently lets me improve the prediction accuracy. As a plus for this practice I can mention that there is no limit on the size of the data: you can supply as much as you want or need for validation or training.

When testing on the third symbol, you can immediately see whether the model is capable of finding universal patterns rather than being driven by narrow market events.

Only if there is no large bias in the data. Different symbols have different feature variance, and the model can drift on them or get stuck in one position entirely. If the features at least keep their properties from symbol to symbol, it can work.
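
One common way to keep feature properties comparable across symbols (a standard normalization trick, not something spelled out in the thread) is to standardize each feature within its own symbol:

import pandas as pd

# Hypothetical per-symbol standardization: z-score each feature inside
# its own symbol so that level and scale differences between instruments
# do not look like feature drift. Column names are illustrative.
def standardize_per_symbol(df, feature_cols):
    out = df.copy()
    grouped = out.groupby("symbol")[feature_cols]
    out[feature_cols] = (out[feature_cols] - grouped.transform("mean")) / grouped.transform("std")
    return out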
 
I want to hear opinions on cleaning the training data by removing values that repeat many times in a row, for example values that repeat more than 4 times in a row:
seq = remove_repeating_values(seq, 5)
As far as I understand, runs of such equal values can reach several dozen during a flat market, which in my opinion hinders the training of the model.
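
remove_repeating_values is not shown in the thread; one plausible reading of it, given the "more than 4 times in a row" wording, is to drop every run of identical consecutive values whose length reaches the second argument:

from itertools import groupby

# Guess at the intended behaviour: remove whole runs of equal
# consecutive values once the run length reaches max_run.
def remove_repeating_values(seq, max_run):
    out = []
    for _, group in groupby(seq):
        run = list(group)
        if len(run) < max_run:
            out.extend(run)
    return out

print(remove_repeating_values([1, 5, 5, 5, 5, 5, 2, 3, 3], 5))  # [1, 2, 3, 3]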
 
Lilita Bogachkova #:
I would like to hear opinions on cleaning the training data by removing values that repeat many times in a row, for example values that repeat more than 4 times in a row.
As far as I understand, runs of such equal values can reach several dozen during a flat market, which in my opinion hinders model training.
Usually models draw samples randomly, not sequentially. And shuffling the sample is a sign of good judgement :) you can drop the LSTM and shuffle the samples.
 
Maxim Dmitrievsky #:
Usually models draw samples randomly, not sequentially.

Yes,

from sklearn.model_selection import train_test_split

# shuffle and split the deduplicated data into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(inputs_unique, outputs_unique,
                                                  test_size=test_size_value,
                                                  random_state=random_state_value)

but the large number of identical values makes me question the overall quality of the data.
Example: seq = [5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5] turns into the training pairs [5,5,5,5,5,5,5,5][5]; [5,5,5,5,5,5,5,5][5]; [5,5,5,5,5,5,5,5][5]; ...
I don't see the point of feeding the model such training data,

so I'm still sifting out all the data that isn't unique:

# keep only the unique input rows, and pick the matching targets
inputs_unique, indices = np.unique(inputs, axis=0, return_index=True)
outputs_unique = outputs[indices]

I could be wrong, but it also seems wrong to me to feed the model training data like the following, where the same input is mapped to different targets:

[1,2,3,4,5] [5];

[1,2,3,4,5] [6];

[1,2,3,4,5] [7];

[1,2,3,4,5] [8];

...
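
For what it's worth, the np.unique call above already collapses such conflicting pairs to the first occurrence; a tiny illustrative check (values made up for the demo):

import numpy as np

# four identical inputs with conflicting targets collapse to one pair
inputs = np.array([[1, 2, 3, 4, 5]] * 4)
outputs = np.array([5, 6, 7, 8])

inputs_unique, indices = np.unique(inputs, axis=0, return_index=True)
outputs_unique = outputs[indices]
print(inputs_unique)   # [[1 2 3 4 5]]
print(outputs_unique)  # [5]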

 

Hello everyone. I am trying to train the Expert Advisors from the long series of articles about neural networks on this site. I get the impression that they are untrainable. I tried asking the author questions under the articles, but unfortunately he practically never answers them... :(

So, a question to the forum members: please tell me how long a neural network should be trained before it starts to give some non-random result?

I tried all the EAs from article 27 up to the latest one - the result is the same: random. I ran from 300 to 1000 training epochs, as indicated by the author. If the Expert Advisor works with iterations instead, I did from 100,000 to 20,000,000 iterations, and so on for 2-3 passes - still random.

How long should they be trained? What is the size of a sufficient training sample (if it is created beforehand)?

P.S. I have read the basic information about neural networks on Google and am generally familiar with them. Everyone writes that 100-200 epochs should already give a result (on pictures, digits, classification).

 
Viktor Kudriavtsev #:

Hello everyone. I am trying to train the Expert Advisors from the long series of articles about neural networks on this site. I get the impression that they are untrainable. I tried asking the author questions under the articles, but unfortunately he practically never answers them... :(

So, a question to the forum members: please tell me how long a neural network should be trained before it starts to give some non-random result?

I tried all the EAs from article 27 up to the latest one - the result is the same: random. I ran from 300 to 1000 training epochs, as indicated by the author. If the Expert Advisor works with iterations instead, I did from 100,000 to 20,000,000 iterations, and so on for 2-3 passes - still random.

How long should they be trained? What is the size of a sufficient training sample (if it is created beforehand)?

P.S. I have read the basic information about neural networks on Google and am generally familiar with them. Everyone writes that 100-200 epochs should already give a result (on pictures, digits, classification).

Do you get no result even on the training sample?

That series of articles is not a ready-made out-of-the-box solution - nobody will give away the most valuable thing in machine learning, the predictors. So before trying the methods proposed there, you need to develop a set of predictors that can potentially describe price behaviour.
