Machine learning in trading: theory, models, practice and algo-trading - page 2537

 
Aleksey Vyazmikin #:

What if the target is not randomly set?

How? By duplicating one of the inputs to the output? That would learn well. I think I've even tried it.

Aleksey Vyazmikin #:

Here was an experiment: I usually have a sample divided into 3 parts, so I combined them into one sample and trained a model of 100 trees, then saw which predictors were not used and blocked them. Then, as usual, I trained the model with early stopping on overfitting in the second part, and compared the results on the third part with the variant where I train without excluding predictors. It turned out that the results were better with the selected predictors, and I find it difficult to explain this effect other than by the thought that "different predictors get selected because the samples differ across intervals; by training on the whole sample we automatically select predictors that do not lose their significance over time."

Yes, you selected something that will have an influence in the future. It may even have had little influence in the past, but because of its strong influence in the future it was selected on the average over the entire sample.
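A minimal sketch of that predictor-filtering experiment with CatBoost (the synthetic data is a stand-in for the three sample parts; get_feature_importance, ignored_features and early_stopping_rounds are real CatBoost API):

```python
import numpy as np
from catboost import CatBoostClassifier

# Synthetic stand-in for the combined sample; in the experiment this is
# all 3 parts merged, with the first two recovered below for retraining.
rng = np.random.default_rng(0)
X_all = rng.normal(size=(3000, 50))
y_all = (X_all[:, 0] + rng.normal(size=3000) > 0).astype(int)
X_train, y_train = X_all[:2000], y_all[:2000]
X_val, y_val = X_all[2000:], y_all[2000:]

# 1) Train ~100 trees on the combined sample.
probe = CatBoostClassifier(iterations=100, verbose=False)
probe.fit(X_all, y_all)

# 2) Block every predictor the model never used (zero importance).
unused = [i for i, imp in enumerate(probe.get_feature_importance())
          if imp == 0.0]

# 3) Retrain as usual, stopping on overfitting in the second part,
#    with the unused predictors excluded.
model = CatBoostClassifier(iterations=1000, ignored_features=unused,
                           verbose=False)
model.fit(X_train, y_train, eval_set=(X_val, y_val),
          early_stopping_rounds=50)
```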

Aleksey Vyazmikin #:

However, does this mean that the larger the sample, the more robust the model over a long horizon? Can predictors be selected for training in this way, i.e., doesn't it encourage overfitting?

As they say, the market changes. New players come in, create new robots, switch off old ones, etc. I think that on a very large sample the model will arrive at a result averaged over all those changes, probably zero. I think you have to tune the training depth for maximum efficiency and retrain regularly. I am experimenting with a fixed scheme myself (e.g. a 2-year test, retraining on Saturdays, trying training data sizes from a few days to a year or two).
Theoretically, it would be better to automatically determine the training sample size for each retraining. But so far I don't know how.
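For what it's worth, a hedged sketch of one naive way to automate that choice: at each retrain, score several candidate history depths on the most recent held-out slice and keep the best. fit_fn and score_fn are placeholders for your actual model and metric:

```python
import numpy as np

def pick_training_depth(X, y, fit_fn, score_fn,
                        windows=(30, 90, 180, 365, 730), holdout=30):
    """Try several history windows (in bars) and return the one whose
    model scores best on the most recent `holdout` bars."""
    X_hold, y_hold = X[-holdout:], y[-holdout:]
    best_w, best_s = windows[0], -np.inf
    for w in windows:
        # train only on the last `w` bars preceding the holdout slice
        model = fit_fn(X[-holdout - w:-holdout], y[-holdout - w:-holdout])
        s = score_fn(model, X_hold, y_hold)
        if s > best_s:
            best_w, best_s = w, s
    return best_w
```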

Aleksey Vyazmikin #:
I've generally heard the recommendation from the CatBoost developers that you should find the model's hyperparameters and then simply train on the entire available sample before using the model in production.

The developers won't advise anything bad) I select hyperparameters with walk-forward (WF), because the data sequence is preserved, and you can also tune the depth of history used for training by how the influence of old data degrades. You could also use cross-validation (CV) if the data did not change over time, but that is not the case for markets.
After the selection, of course, you train up to the present moment and use the model for the same length of time as the forward segment in WF or CV.
By training on the same segment you test on, you fit the model and hyperparameters to that one test. By training 10-50 times with CV or WF, you find the best hyperparameters for a large stretch of history.
Maybe that's better, or maybe I'm just too lazy to reselect hyperparameters once a week)) Which is really better, practice will show.
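A minimal sketch of that WF-style hyperparameter selection with scikit-learn's TimeSeriesSplit (the grid and synthetic data are illustrative only):

```python
import numpy as np
from itertools import product
from sklearn.model_selection import TimeSeriesSplit
from catboost import CatBoostClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 20))
y = (X[:, 0] > 0).astype(int)

tss = TimeSeriesSplit(n_splits=10)   # 10 sequential train/forward pairs
best_params, best_score = None, -np.inf
for depth, lr in product([4, 6, 8], [0.03, 0.1]):
    scores = []
    for train_idx, fwd_idx in tss.split(X):   # past -> forward segment
        m = CatBoostClassifier(iterations=200, depth=depth,
                               learning_rate=lr, verbose=False)
        m.fit(X[train_idx], y[train_idx])
        scores.append(m.score(X[fwd_idx], y[fwd_idx]))
    if np.mean(scores) > best_score:
        best_params, best_score = (depth, lr), np.mean(scores)

# Then train with best_params on everything up to now and use the model
# for roughly the length of one forward segment.
```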

 
elibrarius #:
Theoretically, it would be better to somehow automatically determine the training sample size for each retraining. But so far I don't know how

TO DETERMINE THE MEAN

if the size of the general population is unknown, use the required sample size for a simple random sample (with replacement):

n = (t^2 * sigma^2) / delta_y^2

(for stratified and serial samples the formula is a little more complicated)

i.e. you need to fix the required confidence probability P and the corresponding confidence coefficient (degree of reliability), t = 2 for the 95% confidence level... put the allowable maximum margin of error of the mean into the divisor (it must be known to the industry expert, i.e. to you as a trader)... and the variance sigma^2, which is unknown but can be estimated from previous observations...
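A worked example of the formula with made-up numbers (t = 2 for ~95%, sigma estimated from past observations, delta_y chosen by the trader):

```python
t = 2.0          # confidence coefficient for ~95% confidence level
sigma = 0.8      # std. deviation, estimated from previous observations
delta_y = 0.1    # allowable margin of error of the mean

n = (t**2 * sigma**2) / delta_y**2
print(round(n))  # 256 observations needed
```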

In general, this is what I voiced my doubts about when I talked about a floating window [which in principle you could call "sample size"] and t-statistics to determine flat vs. trend and the probability of "where we are", in order to add on a rejection of RS or an absorption of RS...

Of course, provided that your feature is normally distributed and is essentially the main factor influencing the result (you may have already determined its dy/dx -> min)... this is not about a multifactor model (in that case, I guess, you can take the maximum of the calculated values... imho)

TO DETERMINE THE PROPORTION OF A TRAIT

the same, but instead of the margin of error and variance of the mean, use the margin of error of the proportion (delta_w) and the variance of the alternative trait, w(1-w)

if the frequency (w) is not known even approximately, the calculation uses the maximum value of the proportion variance: 0.5(1-0.5) = 0.25

CBOE, in its options-based asymmetry estimate, took the minutes before the expiration date of the 2 nearest K_opt (as the alternative attributes)...

or any other attributes to taste (if you are not using options)
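And the same calculation for the proportion, using the worst-case variance 0.25 when w is unknown (illustrative numbers again):

```python
t = 2.0          # confidence coefficient for ~95% confidence level
w = 0.5          # frequency unknown -> worst case, w * (1 - w) = 0.25
delta_w = 0.05   # allowable margin of error of the proportion

n = (t**2 * w * (1 - w)) / delta_w**2
print(round(n))  # 400 observations needed
```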

p.s. roughly like here

p.p.s. that is the logic of it; how to implement the calculation of sample adequacy in model building when the general population is unknown is a question of raw-data availability and logic... but 2 years also seems to me a normal range for the general population... imho

Определение объема выборки (Determining the sample size)
  • 2013.08.16
  • baguzin.ru
Earlier we considered methods for constructing a confidence interval for the mean of a general population. In each of those cases the sample size was fixed in advance, without taking the width of the confidence interval into account. In real problems, determining the sample size is rather difficult; among other things, it depends on the available financial resources...
 
JeeyCi #:

TO DETERMINE THE MEAN

if the size of the general population is unknown, estimate the required sample size as n = (t^2 * sigma^2) / delta_y^2 ...

For determining the average:
(High+Low)/2
 
Vladimir Baskakov #:
For determining the average:
(High+Low)/2

I don't mean to upset you, but (high+low)/2, strictly speaking, cannot be called an "average" at all; there are more academic names for such things. The moments of the events are unknown, unevenly spaced and irregular.

 
Maxim Kuznetsov #:

I don't mean to upset you, but (high+low)/2, strictly speaking, cannot be called an "average" at all; there are more academic names for such things. The moments of the events are unknown, unevenly spaced and irregular.

I think this is the most average of the averages
 
Maxim Kuznetsov #:

The moments of the events are unknown, unevenly spaced and irregular.

Indeed, out of habit I lose sight of "events" while I'm considering "attributes"... I keep forgetting... Thank you for reminding me of the word!.. that's where Bayes' theorem comes in, judging by the logic, I guess

 
Maybe it's silly, but I don't like using anything other than Close. When I have a series of Close observations (sorry), I always know that there is a fixed time interval between observations (it is always the same, stable, and known to me). And when I use Low/High and various calculations with them, I get... a random time interval between observations? which is always different from one observation to the next.
 
LenaTrap #:
Maybe it's silly, but I don't like using anything other than Close. When I have a series of Close observations (sorry), I always know that there is a fixed time interval between observations (it is always the same, stable, and known to me). And when I use Low/High and various calculations with them, I get... a random time interval between observations? which is always different from one observation to the next.

saying it is random and always different is, of course, a stretch... in fact the whole point of studying all this hullabaloo is to determine high/low more or less accurately in time and price :-)

 
LenaTrap #:
Maybe it's silly, but I don't like using anything other than Close. When I have a series of Close observations (sorry), I always know that there is a fixed time interval between observations (it is always the same, stable, and known to me). And when I use Low/High and various calculations with them, I get... a random time interval between observations? which is always different from one observation to the next.

If we approach this strictly mathematically, we should use Open, because only for it is the arrival moment of its tick a Markov moment (a stopping time): it is identified as the opening at the very moment it arrives (assuming an ideal clock and no missing quotes). A Close tick, at the moment it arrives, cannot be identified as the close until the timeframe interval has ended.

But it is more customary to work with Close. Probably a habit adopted from the days of working with daily quotes.

 
Aleksey Nikolayev #:

If we approach this strictly mathematically, we should use Open, because only for it is the arrival moment of its tick a Markov moment (a stopping time): it is identified as the opening at the very moment it arrives

Technically, Close is the only price with a reliable time: at the moment one bar is replaced by the next, the price is exactly equal to Close.

And if the first tick arrives 10 minutes after the bar change, then the Open will only appear at that moment.
