Discussion of article "Advanced resampling and selection of CatBoost models by brute-force method" - page 10
Which ones?
F1 and MCC seem better.
Here's the full list
https://catboost.ai/docs/concepts/loss-functions-classification.html
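For illustration, a minimal sketch of plugging those metrics into CatBoost (toy data and parameter values are placeholders, not from the article): evaluate on F1 and also report MCC, both of which appear in the linked list.

from catboost import CatBoostClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Toy data just to make the example self-contained
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

# Evaluate by F1 on the validation set and additionally track MCC
model = CatBoostClassifier(iterations=200, eval_metric='F1',
                           custom_metric=['MCC'], verbose=False)
model.fit(X_train, y_train, eval_set=(X_val, y_val))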
Once again.
Well, yeah, it makes sense sometimes.
A little more on stacking. Yeah, it makes sense to stack. It's still an open question as to how much.
Great article and titanic work!
I dropped the labels that come right after a change of direction from the dataset before feeding it into the mixture model.
From observation, more models give a positive result.
Results of the best test in the tester and in the terminal:
Overall, beautiful work. I used it to check all my targets and threw them in the bin )))
Yes, you can drop them before clustering.
Thanks for the feedback :)
P.S. So you can test all the models at once, averaged. Play around with it. I haven't written a parser for all the models yet, still in doubt. But sometimes an ensemble of several models really does improve things.
P.P.S. As a development of the theme, you can enumerate different combinations of the trained models by the same R2 metric and then keep the best ensemble. You could even do it through genetic search if there are many models.
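A rough sketch of that idea, for illustration only: average the predictions of every combination of already trained models and keep the combination with the best R2. The models list and its predict() interface are assumptions here, not the article's code.

from itertools import combinations
import numpy as np
from sklearn.metrics import r2_score

def best_ensemble(models, X_test, y_test, max_size=3):
    """Average predictions of every model combination and keep the best one by R2."""
    best_combo, best_r2 = None, -np.inf
    for k in range(1, max_size + 1):
        for combo in combinations(models, k):
            # average the predictions of this sub-ensemble
            pred = np.mean([m.predict(X_test) for m in combo], axis=0)
            score = r2_score(y_test, pred)
            if score > best_r2:
                best_combo, best_r2 = combo, score
    return best_combo, best_r2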
I took EURUSD H1 data from 2015 to 2020 and split it into three sets:
I have double-checked my code, and yet I might have done something wrong. Anyhow, you might have some idea about the results. Best regards, Rasoul
Can you tell me how I can upload my data via a CSV file?
I tried it this way, but it didn't load.
The format of the file is:
time,close
2020,11,15,1.3587
2020,11,16,1.3472
pr = pd.read_csv('pr.csv', sep=';')
i.e. this is an example of loading data that was exported from the terminal and saved to a file. Then you can use it in Colab.
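Regarding the format shown in the question: the comma there serves both as the field separator and inside the date, so pandas sees four columns per data row. A possible sketch of reassembling it (file name and column layout taken from the question, nothing else assumed):

import pandas as pd

# The header line 'time,close' has two fields while the data rows have four, so skip it
pr = pd.read_csv('pr.csv', header=None, skiprows=1,
                 names=['year', 'month', 'day', 'close'])
pr['time'] = pd.to_datetime(pr[['year', 'month', 'day']])
pr = pr[['time', 'close']].set_index('time')
print(pr.head())

Alternatively, export the date as a single field (e.g. 2020.11.15) and use ';' as the separator, and the read_csv call from the reply works unchanged.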
Hi, Rasoul. Try reducing the training set size. It can depend on various settings, but the key trick is that the smaller the training set, the better the generalisation on new data. In the next article I'll try to explain this effect.
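A minimal sketch of the kind of check meant here, comparing test accuracy for several training-set sizes on toy data (not the article's data or exact settings):

from catboost import CatBoostClassifier
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=15, random_state=1)

# Shrink the training fraction and watch how the out-of-sample accuracy behaves
for train_size in (0.7, 0.5, 0.3, 0.1):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=train_size, random_state=1)
    model = CatBoostClassifier(iterations=200, verbose=False, random_seed=1)
    model.fit(X_tr, y_tr)
    print(f'train_size={train_size}: test accuracy={accuracy_score(y_te, model.predict(X_te)):.3f}')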
That's good. It would be nice to see a remark in the article about the scope of applicability of this thesis, in particular to different machine learning methods. For some reason 70/30 is recommended for neural networks, while logically, IMHO, 50/50 should give more stable results.
There is active and passive learning. Passive learning consists of labelling the data manually and training on it; in that case there should be a lot of data, but there is the problem of labelling it correctly. That is, the "teacher" has to label the data so that it is, roughly speaking, from the same distribution and generalises well. In this respect, the train/test proportion makes almost no difference. It gives you almost nothing; it is just a check of the model, a check of how well you have labelled the data by hand.
In active learning, the model learns to label the data in an optimal way. The article is exactly such a case of labelling through GMM. That is, both supervised and unsupervised learning are used. Here the model learns from a small amount of labelled data and must label the remaining data itself in an optimal way. This is a relatively new approach (from around 2017), and I want to look at it in more detail in a follow-up article.
too much "data" in the sentences, I apologise for the tautology )