
I strongly overfitted the base model, as in the article. Two versions: before and after:
Now optimising not SL/TP, but the inputs by meta_labels (trade / don't trade):
Looks like noise optimisation
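For illustration, here is a minimal Python sketch of the meta_labels idea described above: a secondary classifier decides trade / don't trade on top of the primary model's entries. The function names, the gradient-boosting choice and the 0.5 threshold are my assumptions, not code from this thread:

```python
# Sketch of a meta-label filter over a primary model's entries (illustrative only).
from sklearn.ensemble import GradientBoostingClassifier

def fit_meta_model(X_entries, entry_pnl):
    """X_entries: features at the primary model's entry points.
    entry_pnl: realised profit of each entry; its sign gives the meta label."""
    meta_labels = (entry_pnl > 0).astype(int)      # 1 = trade, 0 = don't trade
    meta_model = GradientBoostingClassifier()
    meta_model.fit(X_entries, meta_labels)
    return meta_model

def filter_entries(meta_model, X_new, threshold=0.5):
    """Keep only the entries the meta model considers worth trading."""
    p_trade = meta_model.predict_proba(X_new)[:, 1]
    return p_trade > threshold                     # boolean mask over new entries
```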
Optimisation by volatility ranges. Where to trade and where not to trade.
How is this parameter calculated?
Is a single interval of volatility values searched, or several intervals?
One interval of fixed width. That is, optimisation of the interval boundaries.
Then an array with the best variants is saved, like in the MT5 optimiser, so you can choose among them.
There is even a filter for the minimum number of trades.
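As a rough illustration of that procedure, here is a Python sketch of optimising the boundaries of one fixed-width volatility interval, saving an array of the best variants and filtering by a minimum number of trades. The scoring metric, the step and all the names are my assumptions, not the actual code:

```python
import numpy as np

def optimise_vol_interval(trade_vol, trade_pnl, width, step, min_trades=30):
    """trade_vol: volatility at each trade's entry; trade_pnl: the trade results.
    Slides a fixed-width window over the volatility range and scores it."""
    results = []
    lo = float(trade_vol.min())
    while lo + width <= trade_vol.max():
        mask = (trade_vol >= lo) & (trade_vol < lo + width)
        n = int(mask.sum())
        if n >= min_trades:                       # filter on the minimum number of trades
            score = float(trade_pnl[mask].sum())  # could be profit factor, Sharpe, etc.
            results.append((score, lo, lo + width, n))
        lo += step
    # sorted array of the best variants, like the MT5 optimiser's results table
    return sorted(results, reverse=True)
```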
For anyone well versed in statistics, the question is which is better:
Response from Claude. The other models answered frankly poorly (except gpt-o1, which I don't have):
From the point of view of mathematical statistics, let's analyse both approaches:
1. Optimisation with n parameters:
- Pros:
* Direct search for optimal parameters
* Ability to set specific constraints
- Cons:
* The "curse of dimensionality" problem when n is large
* High risk of overfitting
* Difficulty in validating the results
2. Building an overfitted model, followed by a search for robust intervals:
- Pros:
* Better understanding of stable patterns.
* Ability to identify truly stable dependencies
* More reliable validation of the results
- Cons:
* Difficulty in defining robustness criteria
* Risk of missing important nonlinear interactions.
From the point of view of mathematical statistics, the second approach is preferable for the following reasons:
1. Statistical significance:
- Allows a better assessment of the statistical significance of the patterns found
- Gives the opportunity to apply hypothesis testing methods
2. Robustness of results:
- Robust intervals are by definition more stable under changing market conditions
- Reduces the risk of data snooping bias
3. Generalisability:
- The principle of parsimony (Occam's razor) works better
- Higher probability of discovering patterns that really work
4. Control of overfitting:
- Explicit separation into a model building phase and a stable area extraction phase
- Possibility of applying [cross-validation](https://ru.wikipedia.org/wiki/%D0%9A%D1%80%D0%BE%D1%81%D1%81-%D0%B2%D0%B0%D0%BB%D0%B8%D0%B4%D0%B0%D1%86%D0%B8%D1%8F)
In general, the second approach is more scientifically sound and better aligned with the principles of statistical inference, although it requires a more rigorous methodology.
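As a sketch of how the second approach could be validated in practice (build the model first, then keep only the intervals whose edge survives out-of-sample folds), here is a Python example. It assumes the trades already come from the fitted model and that "robust" means profitable in every fold; both assumptions are mine, not Claude's:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

def robust_intervals(feature, pnl, candidate_intervals, n_splits=5):
    """feature: e.g. volatility at each trade of the already-fitted model;
    pnl: the trade results; candidate_intervals: list of (lo, hi) pairs."""
    cv = TimeSeriesSplit(n_splits=n_splits)
    robust = []
    for lo, hi in candidate_intervals:
        fold_means = []
        for _, test_idx in cv.split(feature.reshape(-1, 1)):
            m = (feature[test_idx] >= lo) & (feature[test_idx] < hi)
            if m.any():
                fold_means.append(pnl[test_idx][m].mean())
        # keep intervals that are profitable in every fold, not just on average
        if fold_means and min(fold_means) > 0:
            robust.append((lo, hi))
    return robust
```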
Constructing an overfitted model followed by searching for robust intervals
Let's imagine that quotes consist of small intervals with patterns and large intervals of noise. Training on all of them together gives very weak pattern detection. So even if you find these intervals later, you'll have a shitty model there, far from the optimum you would get if the model had been built on those intervals in the first place.
So it's better to look for intervals first and then train on them. This is the third option.
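A minimal sketch of that third option, assuming we score fixed-size windows of history by in-window cross-validated accuracy and train the final model only on the windows that look learnable. The window size, threshold and classifier are arbitrary choices for illustration, not anything from this thread:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def select_pattern_windows(X, y, window=500, min_score=0.55):
    """X, y: full feature/label history. Returns indices of 'pattern' windows."""
    keep = []
    for start in range(0, len(X) - window + 1, window):
        sl = slice(start, start + window)
        if len(np.unique(y[sl])) < 2:
            continue                    # degenerate window, skip
        score = cross_val_score(LogisticRegression(max_iter=1000),
                                X[sl], y[sl], cv=3).mean()
        if score > min_score:           # the window looks learnable, not pure noise
            keep.extend(range(start, start + window))
    return np.array(keep)

# the final model is then trained only on X[idx], y[idx],
# where idx = select_pattern_windows(X, y)
```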