Machine learning in trading: theory, models, practice and algo-trading - page 2219

 
Maxim Dmitrievsky:

Have you tried any other clustering besides GMM?

 
mytarmailS:

Have you tried any other clustering besides GMM?

You don't need clustering, you need density estimation. An encoder or a GAN will do.

There are special techniques for working with tail distributions in ML, but I haven't quite gotten to them yet.

For example, there is this result: for a tail distribution (and price increments form exactly such a distribution), the training sample size has to be nearly infinite before anything works on new data. And that has been proven. What do you think?
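A quick toy illustration of why the tails are the problem (my own sketch, not something from the thread or the article): for a light-tailed sample the standard deviation settles almost immediately, while for a heavy-tailed one (Student-t with df = 2, whose variance is infinite) it never stabilizes, no matter how much data you add:

set.seed(1)
for (n in c(1e3, 1e4, 1e5, 1e6)) {
  cat("n =", n,
      "  sd(normal) =", round(sd(rnorm(n)), 3),          # light tails: quickly converges to 1
      "  sd(t, df = 2) =", round(sd(rt(n, df = 2)), 3),  # infinite variance: never settles
      "\n")
}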

 
Maxim Dmitrievsky:

It is not clustering that is needed, but density estimation. An encoder or a GAN will do.

There are special techniques for working with tail distributions in ML, but I haven't quite gotten to them yet. This is literally the newest thing there is.

I just can't figure it out.

I've trained the model on two clusters.

> gm <- ClusterR::GMM(X,gaussian_comps = 2)
> gm
$centroids
            [,1]       [,2]       [,3]
[1,] -0.24224591 -0.5103346  0.7653689
[2,]  0.07675401  0.1668665 -0.2967750

$covariance_matrices
         [,1]      [,2]      [,3]
[1,] 1.169446 0.5971381 0.5771400
[2,] 1.006148 0.7724611 0.8297428

$weights
[1] 0.2505878 0.7494122

$Log_likelihood
            [,1]      [,2]
 [1,]  -4.060188 -3.111429
 [2,]  -6.105358 -3.516479
 [3,]  -4.301979 -4.310115
 [4,]  -3.752352 -3.583401
 [5,]  -3.172447 -3.302278
 [6,]  -7.849530 -5.254127
 [7,]  -3.055816 -3.157801
 [8,]  -5.307695 -2.795444
 [9,] -11.721658 -6.764240
[10,] -10.575876 -5.565554
[11,]  -6.760511 -5.193087
[12,]  -3.978182 -5.066543
[13,]  -2.577926 -4.418768
[14,]  -4.398716 -3.614050
[15,]  -4.082245 -5.268694
[16,]  -2.918141 -2.901401
[17,]  -9.153176 -4.797331
[18,]  -5.678321 -3.599856
[19,]  -4.500670 -2.622113
[20,]  -2.965878 -4.415078
[21,]  -4.453389 -4.152286
[22,]  -5.365306 -4.368355
[23,]  -8.533327 -3.813763
[24,]  -4.142515 -2.811048
[25,]  -7.174136 -5.631351
[26,]  -5.063518 -3.491408
[27,]  -4.935992 -8.336194
[28,]  -4.210241 -5.869093
[29,]  -3.605818 -2.577456
[30,]  -3.670845 -5.686447
[31,]  -2.733389 -5.010803
[32,]  -3.730563 -2.646749
[33,]  -3.201767 -3.689452
[34,]  -4.879268 -3.111545

Which of these defines the distribution:

$centroids

or

$covariance_matrices

and how do I simulate from them (generate similar data)?
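If the goal is just to generate more data that looks like X, all three pieces are needed together: $weights to pick a component, $centroids for its means, and $covariance_matrices for its (diagonal) variances. A minimal sketch of such a sampler (my own helper, not part of ClusterR; it assumes the per-component variances are stored as rows of $covariance_matrices, which is how the fit above is printed):

sample_gmm <- function(gm, n) {
  k <- nrow(gm$centroids)
  comp <- sample(k, n, replace = TRUE, prob = gm$weights)     # pick a component for each new row
  t(sapply(comp, function(j)
    rnorm(ncol(gm$centroids),
          mean = gm$centroids[j, ],
          sd   = sqrt(gm$covariance_matrices[j, ]))))         # draw each feature from that component
}

new_X <- sample_gmm(gm, 1000)   # 1000 simulated rows from the fitted mixture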

 
Maxim Dmitrievsky:

And it's proven. What do you think?

It's a bomb.

Where did you read it?

 
mytarmailS:

Bomb.

Where did you read it?

I've come across it in some articles.

Check them out:

https://venturebeat.com/2020/08/14/how-to-improve-ai-economics-by-taming-the-long-tail-of-data/

How to improve AI economics by taming the long tail of data
  • 2020.08.14
  • Matt Bornstein, Andreessen Horowitz
  • venturebeat.com
As the CTO of one late-stage data startup put it, AI development often feels “closer to molecule discovery in pharma” than software engineering. This is because AI development is a process of experimenting, much like chemistry or physics. The job of an AI developer is to fit a statistical model to a dataset, test how well the model performs on...
 
mytarmailS:

I just can't figure it out.

I trained the model on two clusters

which of these defines the distribution, $centroids or $covariance_matrices, and how do I simulate from them (generate similar data)?

look for a package which allows you to sample from a trained model

 
Maxim Dmitrievsky:

look for a package that allows you to sample from a trained model

There are three distributions (rows)

'Normal Mixture' object   ``#9 Trimodal''
       mu sigma    w
[1,] -1.2  0.60 0.45
[2,]  1.2  0.60 0.45
[3,]  0.0  0.25 0.10

Is it supposed to look like this?

 
mytarmailS:

There are three distributions (rows)

Is it supposed to look like this?

Those are the parameters of the Gaussians: mean mu, standard deviation sigma, and the weight w of each component.
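That printout looks like the nor1mix package (its built-in Marron-Wand density ``#9 Trimodal'' has exactly these parameters). Assuming nor1mix really is the package in use, sampling from such a fitted object is a single call:

library(nor1mix)
obj <- MW.nm9                # the built-in "#9 Trimodal" mixture shown above
x   <- rnorMix(10000, obj)   # draw 10000 points from that mixture
hist(x, breaks = 100)        # the histogram reproduces the trimodal density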

 
Maxim Dmitrievsky:

It is not clustering that is needed, but density estimation. An encoder or a GAN will do.

There are special techniques for working with tail distributions in ML, but I haven't quite gotten to them yet.

For example, there is this result: for a tail distribution (and price increments form exactly such a distribution), the training sample size has to be nearly infinite before anything works on new data. And that has been proven. What do you think?

Well, it was exactly those tailed increments that were used to argue that the price series is similar to a random walk. )))) And the conclusion is that for anything to work you would have to see the whole series, i.e. the future part as well; or, if we accept that the series is infinite, the future part will eventually be recognized. In other words, an infinite series has an infinite number of variations, and we would have to train on all of them in order to have seen them.

It is useless in practice, but it is necessary for understanding.

P.S. And as for density: after estimating it, you can break the series down into sections.
 
Valeriy Yastremskiy:

Well, it was exactly those tailed increments that were used to argue that the price series is similar to a random walk. )))) And the conclusion is that for anything to work you would have to see the whole series, i.e. the future part as well; or, if we accept that the series is infinite, the future part will eventually be recognized. In other words, an infinite series has an infinite number of variations, and we would have to train on all of them in order to have seen them.

It is useless in practice, but it is necessary for understanding.

P.S. And as for density: after estimating it, you can break the series down into sections.

Divide it into sections and keep the most frequent examples; the rest can be thrown out as noise.

Or, on the contrary, you may want to single out the rare events.

As you can see from the article, this is a real-world problem, not just a forex one, and people are struggling with it in many different fields.
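One concrete way to do that split with the GMM fit from above (just a sketch of the idea; dens_gmm is my own helper, again assuming diagonal covariances stored as rows of $covariance_matrices): score every row of X by its density under the fitted mixture, keep the high-density rows as the "frequent" examples, or invert the filter to isolate the rare events.

dens_gmm <- function(gm, X) {
  dens <- 0
  for (j in 1:nrow(gm$centroids)) {
    # diagonal MVN density = product of univariate normal densities across features
    comp <- apply(X, 1, function(x)
      prod(dnorm(x, mean = gm$centroids[j, ], sd = sqrt(gm$covariance_matrices[j, ]))))
    dens <- dens + gm$weights[j] * comp
  }
  dens
}

score    <- dens_gmm(gm, X)
frequent <- X[score >= quantile(score, 0.25), ]   # keep the densest 75 %, drop the rest as noise
rare     <- X[score <  quantile(score, 0.05), ]   # or, on the contrary, single out the rarest 5 %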
