Machine learning in trading: theory, models, practice and algo-trading - page 3190

 
Aleksey Vyazmikin #:

I understand.

I have another suggestion for you: what if, to make the forest-construction process more controllable, we take as the root of each tree a specific subsample from a selected quantum segment?

The depth should be 2-3 splits, so that each leaf holds no less than 1% of the examples of the class being classified.

I think the model will be more stable.

I.e. if you select 10 quanta/segments, you then train 10 trees on the examples from those segments? Seems simple enough to do.
As for stability on OOS - experiment will tell. In my case stability breaks when the size of the data window changes (2 months vs 4) and even when it shifts by 2% (training on a Tuesday instead of a Saturday). The resulting trees come out different.
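The scheme discussed above - one shallow tree per selected quantum segment - could be sketched roughly as follows. This is a minimal illustration under assumptions: segments are taken as hypothetical (feature, low, high) ranges, the data is synthetic, and sklearn trees stand in for whatever implementation would actually be used.

```python
# Minimal sketch: one shallow tree per quantum segment, depth limited to
# 2-3 splits, each leaf holding at least 1% of the segment's examples.
# The segments below are hypothetical (feature, low, high) ranges.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                                # toy features
y = (X[:, 0] + 0.3 * rng.normal(size=1000) > 0).astype(int)   # toy labels

segments = [(0, -1.0, 0.0), (0, 0.0, 1.0), (1, -0.5, 0.5)]    # assumed ranges

forest = []
for feat, lo, hi in segments:
    mask = (X[:, feat] >= lo) & (X[:, feat] < hi)
    min_leaf = max(1, int(0.01 * mask.sum()))   # the 1%-per-leaf constraint
    tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=min_leaf)
    tree.fit(X[mask], y[mask])
    forest.append((feat, lo, hi, tree))

def predict_proba_row(x):
    """Average class-1 probability over the trees whose segment contains x."""
    votes = []
    for feat, lo, hi, tree in forest:
        if lo <= x[feat] < hi:
            proba = tree.predict_proba(x.reshape(1, -1))[0]
            cls = list(tree.classes_)
            votes.append(proba[cls.index(1)] if 1 in cls else 0.0)
    return float(np.mean(votes)) if votes else 0.5   # fall back to the prior

print(predict_proba_row(X[0]))
```

Rows falling into several segments get an averaged vote; rows in none fall back to the prior - both choices are arbitrary and could be replaced.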

Aleksey Vyazmikin #:

I ran an experiment on the sample for which I published the GIFs - it already contains 47% ones (class 1) - and summarised the data in the table.

...
it turned out that the quality (usefulness) of these quantum segments is about 10 times worse (lower) than that of the original ones.

I expected a deterioration of this magnitude (several-fold) back when I discussed shuffling with fxsaber and his algorithm. On his data the difference is not that strong - apparently because his markup does not contain all bars in a row (or rows standing next to each other), but bars with large gaps between them. If your bars are close together, they have very similar pasts and futures, i.e. 20 examples of class 1 can occur in a row. By shuffling them you turn them on average into 0101010..., whereas you should really flip the whole run of 20 "1"s into 20 "0"s, since they are close together and can be counted as a single example. If that is not the case for you, it is for me (I label all bars in a row, which is where this idea came from).

In general, I think that with such a strong 10-fold difference there is no need to run 10,000 tests. The difference in the first 10 tests (all worse) is too clear to assume that another 10,000 would raise the result to parity with the original. If it were 3 worse, 3 better, 4 roughly equal - then yes, keep accumulating statistics.
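A quick back-of-envelope check supports this: if the shuffled segments really were equivalent to the originals, each of the 10 tests would be a fair coin flip, and 10 out of 10 coming out worse would already be very unlikely. A plain sign-test sketch:

```python
# If shuffled and original segments were equivalent, each of the 10 tests
# would be a coin flip; all 10 coming out worse has probability 0.5**10,
# about 0.001 - so piling on thousands more tests adds little.
from math import comb

def sign_test_p(k, n):
    """One-sided P(at least k of n tests come out worse purely by chance)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

print(f"all 10 worse by chance: {sign_test_p(10, 10):.5f}")
print(f"7 of 10 worse by chance: {sign_test_p(7, 10):.5f}")
```

The "3 worse, 3 better, 4 equal" case above corresponds to a p-value near 0.5, which is exactly when more statistics would be needed.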

If the data comes in runs, the problem is that somewhere in history a run of 20 "1"s has to be matched with a run of 20 "0"s with a similar past. That is where market randomisation lies - not in turning 111111111 into 010101010.

UPD: So I think Monte Carlo in the form of 01010101 won't work for market data (if it comes in runs). It's like cutting a rectangle and a square into identical small squares and then trying to determine which original figure each small square came from)).
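The difference between the two kinds of shuffling described above fits in a few lines. A toy stdlib-only sketch: element-wise shuffling turns long runs into 0101..., while shuffling whole runs keeps the serial structure intact.

```python
# Element-wise shuffling destroys run structure; run-level shuffling keeps it.
import random
from itertools import groupby

def runs(labels):
    """Split a label sequence into maximal runs: [1,1,0] -> [[1,1],[0]]."""
    return [list(g) for _, g in groupby(labels)]

random.seed(1)
labels = [1] * 20 + [0] * 5 + [1] * 7 + [0] * 20   # serial data: 4 long runs

naive = labels[:]
random.shuffle(naive)            # permutes single labels -> many short runs

blocks = runs(labels)
random.shuffle(blocks)           # permutes whole runs -> run lengths survive
runwise = [v for block in blocks for v in block]

print("original runs:", len(runs(labels)),
      "naive shuffle:", len(runs(naive)),
      "run-level shuffle:", len(runs(runwise)))
```

Both shuffles preserve the overall 0/1 counts, but only the run-level one preserves the "20 similar bars in a row count as one example" structure.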

 
Aleksey Vyazmikin #:

I wrote about the strict sequence just as an example, for clarity. And I wrote that solving this problem can improve the stability of the model. But the solution may take different forms.

Even without solving the problem mentioned above, selecting the right quantum table improves learning - I have tested this on dozens of samples.

Then I showed how you can quickly preprocess a sample for training, cleaning it of contradictory data. You can see from the GIFs that this method can even yield a model that stays profitable on new data.

In the end, the approach works, and developing it is my goal.

Therefore, to say that it doesn't work is to deny reality.

I do not believe that price is a pure random walk (SB) whose nature cannot be at least partially unravelled. If it is a pure random walk, then this whole thread is a mistake.

I think we should hold a conference of machine learners. Obviously with a buffet, and somewhere in the UAE. And there, in a formal and then informal atmosphere, discuss everything. Doing it through the forum is inconvenient.

The programme would look like this: one day of conference, one day everyone drinks, the next day everyone fights, grabbing each other by the lapels, then conference again, and so on round and round. On the fly :)

The sponsor and the main speaker would be Saber, then Aleksey Nikolayev, then everyone else :)
 
Aleksey Vyazmikin #:

What does profit have to do with preprocessing data for further classification?

What was the point of your numerous GIFs with their steeply rising balance curves, then? Maybe you just didn't understand the answer to your question?

 
Maxim Dmitrievsky #:
I think we should hold a conference of machine learners. Obviously with a buffet, and somewhere in the UAE. And there, in a formal and then informal atmosphere, discuss everything. Doing it through the forum is inconvenient.

The programme would look like this: one day of conference, one day everyone drinks, the next day everyone fights, grabbing each other by the lapels, then conference again, and so on round and round. On the fly :)

The sponsor and the main speaker would be Saber, then Aleksey Nikolayev, then everyone else :)

The idea of getting acquainted with his strategies at Saber's expense seems great and well thought out. I can't even imagine what could go wrong 🤔

 
Aleksey Nikolayev #:

The idea of getting acquainted with his strategies at Saber's expense seems great and well thought out. I can't even imagine what could go wrong 🤔

😀😀 Forgot to add - he'd be the main sponsor, as the most successful of us. But everyone needs to chip in.
I think people can be found to sponsor his talk.

The point of the conference is probably not to discuss specific strategies, but general approaches, philosophy, tools and so on.
 
Forester #:

I.e. if you select 10 quanta/segments, you then train 10 trees on the examples from those segments? Seems simple enough to do.
As for stability on OOS - experiment will tell. In my case stability breaks when the size of the data window changes (2 months vs 4) and even when it shifts by 2% (training on a Tuesday instead of a Saturday). The resulting trees come out different.

Yes, that's all true - the approach can of course be made more complex, but only if you want to.

Right now, if I remember correctly, a predictor in the tree is simply split in the middle of its range, without searching for the best split point?

As for the success of the idea - I absolutely agree, but then again, water doesn't flow under a lying stone.
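For reference, the difference between the two split rules in the question above can be shown on a toy one-predictor example: a split at the middle of the predictor's range versus an exhaustive search for the threshold that minimises Gini impurity. This is only an illustration, not the implementation being asked about.

```python
# Midpoint-of-range split vs. best-split search on one toy predictor.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 500)
y = (x > 7).astype(int)          # true boundary at 7, not at mid-range (5)

def gini(labels):
    if len(labels) == 0:
        return 0.0
    p = labels.mean()
    return 2 * p * (1 - p)       # Gini impurity for binary labels

def split_impurity(threshold):
    """Weighted impurity of the two children produced by x <= threshold."""
    left, right = y[x <= threshold], y[x > threshold]
    return (len(left) * gini(left) + len(right) * gini(right)) / len(y)

mid = (x.min() + x.max()) / 2                    # split in the middle
best = min(np.unique(x), key=split_impurity)     # search every candidate

print(f"midpoint {mid:.2f}: impurity {split_impurity(mid):.3f}")
print(f"best     {best:.2f}: impurity {split_impurity(best):.3f}")
```

When the class boundary happens to sit away from the middle of the range, the midpoint rule pays a visible impurity penalty that the search avoids.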

Forester #:

I expected a deterioration of this magnitude (several-fold) back when I discussed shuffling with fxsaber and his algorithm. On his data the difference is not that strong - apparently because his markup does not contain all bars in a row (or rows standing next to each other), but bars with large gaps between them. If your bars are close together, they have very similar pasts and futures, i.e. 20 examples of class 1 can occur in a row. By shuffling them you turn them on average into 0101010..., whereas you should really flip the whole run of 20 "1"s into 20 "0"s, since they are close together and can be counted as a single example. If that is not the case for you, it is for me (I label all bars in a row, which is where this idea came from).


In general, I think that with such a strong 10-fold difference there is no need to run 10,000 tests. The difference in the first 10 tests (all worse) is too clear to assume that another 10,000 would raise the result to parity with the original. If it were 3 worse, 3 better, 4 roughly equal - then yes, keep accumulating statistics.

If the data comes in runs, the problem is that somewhere in history a run of 20 "1"s has to be matched with a run of 20 "0"s with a similar past. That is where market randomisation lies - not in turning 111111111 into 010101010.

UPD: So I think Monte Carlo in the form of 01010101 won't work for market data (if it comes in runs). It's like cutting a rectangle and a square into identical small squares and then trying to determine which original figure each small square came from)).

Unfortunately, I made a mistake when processing the data (I was quickly reworking the script for these tests and overlooked one nuance); as a result, the table looks like this

The conclusion is that data can randomly fall into the ranges of the quantum tables and still pass the current stability test. Default settings/criteria were used - now I will try to tighten them and see the result.

However, I have written before that only about 30% of quantum cut-offs show their effectiveness on the other two samples, so the result was to be expected on the whole. It was precisely its strangeness that made me double-check everything. How to improve the selection result - that is the challenge.

However, the purpose of quantisation is to select a group with a probability shift. It is possible that a stable leaf can be found inside such a group via further splits, even though the group itself shifts towards a different target on new data.

In the sample on which I ran the experiment there is, I think, on average about 1 signal per day, so the bars are far apart.

I think it would be more interesting to look at the results of the experiment I proposed above - it should show how often randomly generated target labels fall into the selected quantum segments. These would be exactly the fixed, spaced-out "chests" that Aleksey Nikolayev proposed in his abstraction.

You can send me your sample and I will select quantum segments on it; then you can experiment with building a modified forest on those data. Or I can give you my sample.
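The proposed experiment could be sketched like this. Everything here is an assumption for illustration: a "segment" is taken to be a quartile bin of one synthetic feature, and "useful" means the in-segment share of class 1 deviates from the overall share by at least some margin - neither matches the actual selection criterion discussed above.

```python
# Sketch of the proposed experiment: how often do purely random targets make
# a fixed quantum segment look "useful"? Segment = quartile bin of one
# feature; "useful" = class-1 share shifted from the base rate by >= margin.
# Both definitions are illustrative assumptions, not the real criterion.
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
edges = np.quantile(x, [0.0, 0.25, 0.5, 0.75, 1.0])   # the fixed "chests"
margin = 0.05

def passing_segments(y):
    """Count bins whose class-1 share deviates from the base rate by >= margin."""
    base = y.mean()
    return sum(
        abs(y[(x >= lo) & (x <= hi)].mean() - base) >= margin
        for lo, hi in zip(edges[:-1], edges[1:])
    )

trials = 200
hits = sum(passing_segments(rng.integers(0, 2, n)) > 0 for _ in range(trials))
print(f"random targets produced a 'useful' segment in {hits} of {trials} trials")
```

The fraction of trials in which pure noise produces a "useful" segment is exactly the false-discovery baseline any real selection criterion has to beat.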

 
Maxim Dmitrievsky #:
I think we should hold a conference of machine learners. Obviously with a buffet, and somewhere in the UAE. And there, in a formal and then informal atmosphere, discuss everything. Doing it through the forum is inconvenient.

The programme would look like this: one day of conference, one day everyone drinks, the next day everyone fights, grabbing each other by the lapels, then conference again, and so on round and round. On the fly :)

The sponsor and the main speaker would be Saber, then Aleksey Nikolayev, then everyone else :)

A buffet sounds not bad, but as for the need for violence - well, I haven't noticed that in myself. It saddens me that I am not understood, but that in itself does not provoke such strong aggression in me.

 
Aleksey Vyazmikin #:

A buffet sounds not bad, but as for the need for violence - well, I haven't noticed that in myself. It saddens me that I am not understood, but that in itself does not provoke such strong aggression in me.

Violence only by mutual consent, and only when the arguments have run out - we are all civilised people here.
 
Maxim Dmitrievsky #:
I think we should hold a conference of machine learners. Obviously with a buffet, and somewhere in the UAE. And there, in a formal and then informal atmosphere, discuss everything. Doing it through the forum is inconvenient.
The programme would look like this: one day of conference, one day everyone drinks, the next day everyone fights, grabbing each other by the lapels, then conference again, and so on round and round. On the fly :)
The sponsor and the main speaker would be Saber, then Aleksey Nikolayev, then everyone else :)

I wanted to read about machine learning, and instead comedians are honing their skills here.

I would prefer to see humour, jokes and other off-topic material somewhere else.


Now on the topic.

You write that you consider the market random - what is this statement based on?

Do you have any solid grounds for asserting that market price movement is random?

 
Aleksandr Slavskii #:

I wanted to read about machine learning, and instead comedians are honing their skills here.

I would prefer to see humour, jokes and other off-topic material somewhere else.


Now on the topic.

You write that you consider the market random - what is this statement based on?

Do you have any solid grounds for asserting that market price movement is random?

From an information-theoretic point of view the market is random, if you compare the amount of information in a random walk (SB) and in quotes. I did that comparison a few years ago. From a layman's point of view - the market changes, patterns change over time.

This is not humour - I support such events and am ready to take part in them.
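The information comparison mentioned above can be sketched crudely with a general-purpose compressor: encode the sign of each return as one symbol and compare compressed sizes. The sketch below uses synthetic series only - an i.i.d. sign sequence standing in for a random walk, and a persistent one standing in for structured data - so it demonstrates the method, not the market conclusion.

```python
# Compare "amount of information" via compression of return signs.
# A pure random walk's signs are incompressible beyond the trivial
# 2-symbol coding; a series with persistence compresses noticeably more.
import random
import zlib

random.seed(0)
n = 20000

rw_signs = bytes(random.getrandbits(1) for _ in range(n))   # i.i.d. signs

persistent = [1]
for _ in range(n - 1):                     # 90% chance to repeat last sign
    keep = random.random() < 0.9
    persistent.append(persistent[-1] if keep else 1 - persistent[-1])
persistent = bytes(persistent)

ratio = lambda data: len(zlib.compress(data, 9)) / len(data)
print(f"random walk: {ratio(rw_signs):.3f}  persistent: {ratio(persistent):.3f}")
```

With real quotes one would substitute the actual sign-of-return sequence for the persistent series; a compression ratio close to the random-walk baseline would support the randomness claim.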