Machine learning in trading: theory, models, practice and algo-trading - page 3679

[Deleted]  

Searching for trading systems (TS) by enumerating features and running them through a model now looks like a dead end: there are too many uncertainties both in feature selection and in model training.

Also, a good cross-validation score does not promise profitability, because the metrics measure different things.

Prado's theories are good in an academic sense because they explain something, but in a practical sense nothing works for me.
 
SanSanych Fomenko #:

I can't agree with that at all - it is just a random selection of rows.

Let me clarify: in classical cross-validation the rows are shuffled before being split into folds, so you end up testing on rows that neighbour the ones the model was trained on. That is peeking. Alglib does it this way; I switched it off for myself.
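The difference can be sketched in Python (a minimal illustration, not Alglib's actual code): a shuffled k-fold almost always puts immediate neighbours of a test row into the training set, while an ordered walk-forward split never does.

```python
import random

def shuffled_kfold(n, k, seed=0):
    """Classical k-fold: rows are shuffled before being cut into folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    fold = n // k
    for i in range(k):
        test = idx[i * fold:(i + 1) * fold]
        test_set = set(test)
        train = [j for j in idx if j not in test_set]
        yield train, test

def walk_forward(n, k):
    """Ordered splits: train on the past, test strictly on the future."""
    fold = n // (k + 1)
    for i in range(1, k + 1):
        cut = i * fold
        yield list(range(cut)), list(range(cut, cut + fold))

# With shuffling, immediate neighbours of test rows end up in the train set,
# which is peeking for autocorrelated series; walk-forward never does that.
for train, test in walk_forward(100, 4):
    assert max(train) < min(test)
```

The walk-forward variant wastes some data (early folds train on little history), but for serial data that is the price of an honest test.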

SanSanych Fomenko #:

There is advice everywhere that you need to scale predictors, for example into the interval [0:1]. The formula for mapping to the interval is based on min and max values of the predictor.

Scaling is needed for neural networks; tree models don't need it.
SanSanych Fomenko #: The new predictor value will ALWAYS be in the interval [min : max]?
It won't, and that's one of the reasons I stopped using neural networks.
Although you can simply clamp everything outside the bounds to the boundary values.
The same happens in trees: if values fall outside the boundaries, they simply land in the outermost/boundary leaves. But the calculations there are more accurate.
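A min-max scaler with the clamping described above can be sketched like this (an illustrative snippet, not taken from any particular library):

```python
def fit_minmax(xs):
    """Remember the min and max seen on the training sample."""
    return min(xs), max(xs)

def transform(x, lo, hi, clip=True):
    """Map x into [0, 1] using the training bounds. New data can fall
    outside [lo, hi]; clipping it to the bounds is the analogue of a
    tree routing out-of-range values into its boundary leaves."""
    if clip:
        x = max(lo, min(hi, x))
    return (x - lo) / (hi - lo)

lo, hi = fit_minmax([1.0, 1.5, 2.0])
transform(3.0, lo, hi)              # clipped to the upper bound -> 1.0
transform(3.0, lo, hi, clip=False)  # out of the training range -> 2.0
```

Without clipping, a neural network fed values outside [0, 1] extrapolates in an uncontrolled way, which is the failure mode mentioned in the post.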
 
Maxim Dmitrievsky #:

The bot I posted in the group works without normalisation (just prices or an MA as input), and the model does not get stuck in one position (0 or 1) on new data.

It can also work on increments, but then you need many times more features, otherwise it generalises worse.

CV is not used in the evaluation, and in general almost nothing from the classics is used :)

Strange...
Do you really train on raw prices (e.g. 1.11000)? There will be many times fewer examples of a specific price than of increments.
For example, a price move from 1.1 to 1.11 may have occurred only 100 times in the history, but increments of 0.01 occur thousands of times (at 1.1, and 1.2, and 1.3, ...), i.e. there will be many examples, the estimate is more reliable, and there is something to generalise from (thousands of examples instead of 100).
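The counting argument is easy to check on a toy random walk (synthetic data, purely for illustration): every increment value repeats thousands of times, while any individual price level is visited far less often.

```python
import random
from collections import Counter

random.seed(1)

# Toy random walk on a 0.01 grid (synthetic data, for illustration only).
prices = [1.1]
for _ in range(5000):
    prices.append(round(prices[-1] + random.choice([-0.01, 0.01]), 2))

increments = [round(b - a, 2) for a, b in zip(prices, prices[1:])]

price_counts = Counter(prices)
inc_counts = Counter(increments)

# Each increment value repeats thousands of times, while each individual
# price level is far rarer - more examples per pattern to generalise from.
print(len(price_counts), "distinct price levels,",
      len(inc_counts), "distinct increments")
```

On real data the contrast is softer (increments take more than two values), but the direction is the same: increment patterns recur, price levels mostly do not.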

[Deleted]  
Forester #:

Strange...
Do you really train on raw prices (e.g. 1.11000)? There will be many times fewer examples of a specific price than of increments.
For example, a price move from 1.1 to 1.11 may have occurred only 100 times in the history, but increments of 0.01 occur thousands of times (at 1.1, and 1.2, and 1.3, ...), i.e. there will be many examples, the estimate is more reliable, and there is something to generalise from (thousands of examples instead of 100).

Yes, but I split the whole sample into states and train on each state separately. For example 10 states - 10 models. Then the best one is selected.

What exactly falls into each state is a rhetorical question.

There will be more examples of increments, but also more confusion in the predictions due to uncertainty. Therefore more features are needed. In the end, more features = more unique examples that no longer repeat, just as with raw prices.
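The "one model per state, then pick the best" scheme might be sketched like this. It is a toy stand-in: how states are defined and which learner is used are both assumptions here, since the post deliberately leaves the first open.

```python
import random
random.seed(0)

K = 2  # number of states (the post mentions e.g. 10)

# Toy rows (regime r, feature x, label y): the x->y rule flips with the
# regime, so no single global model fits both states well.
data = []
for _ in range(600):
    r, x = random.random(), random.random()
    data.append((r, x, 1 if (x > 0.5) != (r > 0.5) else 0))

# Assign rows to states by binning the regime feature. What should define
# a state is exactly the open question in the post; this is an assumption.
states = [[] for _ in range(K)]
for row in data:
    states[min(int(row[0] * K), K - 1)].append(row)

def train(sample):
    """Stand-in learner: majority label on each side of the x median."""
    xs = sorted(x for _, x, _ in sample)
    med = xs[len(xs) // 2]
    def majority(ys):
        return int(sum(ys) * 2 >= len(ys)) if ys else 0
    return (med,
            majority([y for _, x, y in sample if x > med]),
            majority([y for _, x, y in sample if x <= med]))

def accuracy(model, sample):
    med, hi, lo = model
    return sum((hi if x > med else lo) == y for _, x, y in sample) / len(sample)

# One model per state, each scored on its own hold-out; keep the best.
scored = []
for s, sample in enumerate(states):
    cut = len(sample) * 2 // 3
    scored.append((accuracy(train(sample[:cut]), sample[cut:]), s))
best_score, best_state = max(scored)
```

A single model trained on all 600 rows would score near 0.5 here, because the regimes cancel each other out; the per-state models do not have that problem.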

[Deleted]  
Forester #:

here is the first picture in the article and the description above it

Propensity score in causal inference
  • www.mql5.com
The article covers matching in causal inference. Matching is used to pair up similar observations in a dataset; this is needed to estimate causal effects correctly and to remove bias. The author explains how this helps in building machine-learning trading systems that stay more robust on new data they were not trained on. The central role is given to the propensity score, which is widely used in causal inference.
 
Forester #:

Using it for training, you know the future 16 hours ahead.

I didn't realise one could train on trades that overlap in time.

 
fxsaber #:

I didn't realise one could train on trades that overlap in time.

At its core it is not a reversal strategy - that is the kind that has no overlapping trades.
You can label every bar, and with a long TP/SL there will be dozens or even hundreds of trades live at the same time.
I started with this https://www.mql5.com/ru/code/903
Now I work with a modification, but the essence is the same - a lot of trades, then filter.

PS. The result is not yet an earning robot, but there are prospects.
I tested the reversal approach on ZigZag. It earns something only on strong moves, but those are few - I get 1-2 trades per month. That is no longer interesting; I am looking for more intensive trading.
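Labelling every bar by which level a virtual long hits first, TP or SL, can be sketched as follows (an illustrative labeller in the spirit of the description, not the i_Sampler code):

```python
def label_bars(close, tp=0.02, sl=0.01, horizon=50):
    """Open a virtual long at every bar and label it by which level the
    price path reaches first: 1 = take-profit, 0 = stop-loss, None =
    neither within the horizon. With a long TP/SL many of these virtual
    trades are open simultaneously - that is the overlap discussed above."""
    labels = []
    for i, entry in enumerate(close):
        label = None
        for price in close[i + 1:i + 1 + horizon]:
            if price >= entry + tp:
                label = 1
                break
            if price <= entry - sl:
                label = 0
                break
        labels.append(label)
    return labels

label_bars([1.00, 1.005, 1.03, 0.98])  # -> [1, 1, 0, None]
```

The filtering step the post mentions would then discard bars (e.g. the `None` ones, or low-confidence model predictions) before trading.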
Sampler
  • www.mql5.com
The i_Sampler indicator calculates ideal entries and is intended for training a neural network.
 
Maxim Dmitrievsky #:

Yes, but I split the whole sample into states and train on each state separately. For example 10 states - 10 models. Then the best one is selected.

What exactly falls into each state is a rhetorical question.

There will be more examples of increments, but also more confusion in the forecasts due to uncertainty. Therefore more features are needed. In the end, more features = more unique examples that no longer repeat, just as with raw prices.

I'll try training on pure prices sometime for comparison.

[Deleted]  
Forester #:

I'll try training on pure prices sometime for comparison.

+ a flat pair, to get a feel for it.

Prices there are less likely to go beyond the training range, if the TS is not adapted to training on prices.

On trending pairs you can fit a regression line over the whole history and extend it onto new data, then take the difference between price and the line. It will break down slowly, but the prices (features) will stay within certain ranges.
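Detrending by a regression line fitted on the history and extended onto new bars, as described, might look like this (a plain OLS-on-bar-index sketch):

```python
def fit_trend(prices):
    """Ordinary least squares of price on bar index, closed form."""
    n = len(prices)
    mx, my = (n - 1) / 2, sum(prices) / n
    cov = sum((i - mx) * (p - my) for i, p in enumerate(prices))
    var = sum((i - mx) ** 2 for i in range(n))
    b = cov / var
    return my - b * mx, b  # intercept, slope

def detrend(prices, a, b, start=0):
    """Difference between price and the extended trend line. For new data,
    pass start = number of bars the line was fitted on."""
    return [p - (a + b * (start + i)) for i, p in enumerate(prices)]
```

Fitting on the history and calling `detrend(new_prices, a, b, start=len(history))` keeps the residuals in roughly the training range, until the trend itself changes - which is the slow breakdown the post warns about.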

 
Maxim Dmitrievsky #:

+ a flat pair, to get a feel for it.

Prices there are less likely to go beyond the training range, if the TS is not adapted to training on prices.

On trending pairs you can fit a regression line over the whole history and extend it onto new data, then take the difference between price and the line. It will break down slowly, but the prices (features) will stay within certain ranges.

Or just use regression for training; it does not need normalisation of features with absolute values.

In this example, GMDH is an analogue of linear regression.
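As a quick check of the "regression needs no normalisation" point: for ordinary least squares, min-max scaling a feature is simply absorbed into the coefficients, and the predictions stay identical (a toy one-feature sketch with made-up numbers):

```python
def ols(xs, ys):
    """One-feature ordinary least squares: y = a + b*x, closed form."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

# Made-up numbers: raw prices as a feature vs the same feature min-max scaled.
xs = [1.10, 1.15, 1.25, 1.40, 1.30]
ys = [2.0, 2.1, 2.3, 2.6, 2.4]
a1, b1 = ols(xs, ys)

lo, hi = min(xs), max(xs)
a2, b2 = ols([(x - lo) / (hi - lo) for x in xs], ys)

# Scaling is absorbed into the coefficients; predictions are identical.
x_new = 1.2
pred_raw = a1 + b1 * x_new
pred_scaled = a2 + b2 * (x_new - lo) / (hi - lo)
```

The same invariance holds for any model that is linear in affinely transformed features; it is gradient-trained neural networks where the feature scale actually changes the optimisation.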