mytarmailS 2024.03.09 08:50 #34131

A couple more years and Alexey will understand that all his quantisation is a clumsy attempt at ordinary clustering )

Aleksey Vyazmikin 2024.03.09 08:58 #34132

Forester #:

k-means in Alglib is available in Include\Math\Alglib\dataanalysis.mqh. But it is better to feed it normalised data (all on one scale). Otherwise, for example, changes of 1000 units (e.g. volumes) will completely drown out changes of 0.01000 units (e.g. prices).

Yes, that's why I find the idea interesting - it should be relatively easy to port, in theory. Does Alglib support saving and applying models?

Regarding normalisation: in general it is needed, but leaves are binary, so the problem goes away by itself.
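
To make the scale point concrete, here is a minimal sketch in Python (the thread itself is about MQL5/Alglib; the rows and the helper names `zscore_columns` and `nearest_other` are made up for illustration). It z-scores each column and shows that the nearest neighbour of a row can change once volume and price are on one scale.

```python
import math

def zscore_columns(rows):
    """Scale each column to zero mean and unit standard deviation."""
    n = len(rows)
    n_cols = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(n_cols)]
    stds = []
    for j in range(n_cols):
        var = sum((r[j] - means[j]) ** 2 for r in rows) / n
        # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
        stds.append(math.sqrt(var) or 1.0)
    scaled = [[(r[j] - means[j]) / stds[j] for j in range(n_cols)] for r in rows]
    return scaled, means, stds

def nearest_other(rows, i):
    """Index of the row closest to rows[i] by squared Euclidean distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    others = [j for j in range(len(rows)) if j != i]
    return min(others, key=lambda j: sq_dist(rows[i], rows[j]))

# Hypothetical rows: [volume, price]
rows = [
    [1000.0, 1.10],
    [1010.0, 1.50],  # volume close to row 0, price far away
    [1015.0, 1.10],  # volume slightly further, price identical to row 0
]
scaled, means, stds = zscore_columns(rows)
print(nearest_other(rows, 0))    # raw units: the volume column decides
print(nearest_other(scaled, 0))  # after scaling: price counts as well
```

The means and standard deviations have to be kept and applied to every new point before it is compared with the centres. If all inputs are 0/1 leaf indicators, the columns already share one range, which is the case described above.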


Forester 2024.03.09 10:14 #34133

Aleksey Vyazmikin #:

Yes, that's why I find the idea interesting - it should be relatively easy to port, in theory. Does Alglib support saving and applying models?

Regarding normalisation: in general it is needed, but leaves are binary, so the problem goes away by itself.

//| k-means++ clusterisation                                           |
//| INPUT PARAMETERS:                                                  |
//|     XY       - dataset, array [0..NPoints-1,0..NVars-1].           |
//|     NPoints  - dataset size, NPoints>=K                            |
//|     NVars    - number of variables, NVars>=1                       |
//|     K        - desired number of clusters, K>=1                    |
//|     Restarts - number of restarts, Restarts>=1                     |
//| OUTPUT PARAMETERS:                                                 |
//|     Info     - return code:                                        |
//|                * -3, if task is degenerate (number of              |
//|                      distinct points is less than K)               |
//|                * -1, if incorrect                                  |
//|                      NPoints/NFeatures/K/Restarts has been passed  |
//|                *  1, if subroutine finished successfully           |
//|     C        - array[0..NVars-1,0..K-1], matrix whose columns      |
//|                store cluster's centres                             |
//|     XYC      - array[NPoints], which contains cluster              |
//|                indexes                                             |

Get the array C with cluster centres, then use your new point to find which of the centres it is closer to.
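
A rough sketch of that lookup, in Python rather than MQL5 and not taken from Alglib: it assumes the layout from the comment above (C has NVars rows and K columns, one centre per column) and plain Euclidean distance. The function name `nearest_cluster` and the example centres are hypothetical.

```python
def nearest_cluster(c, point):
    """Return the index of the centre (a column of c) closest to point.

    c is indexed as c[var][cluster]: NVars rows, K columns,
    each column holding the coordinates of one centre.
    """
    n_vars = len(c)
    k = len(c[0])
    if len(point) != n_vars:
        raise ValueError("point must have NVars coordinates")
    best_k, best_d = -1, float("inf")
    for j in range(k):
        # Squared distance is enough for picking the minimum; no sqrt needed.
        d = sum((point[i] - c[i][j]) ** 2 for i in range(n_vars))
        if d < best_d:
            best_k, best_d = j, d
    return best_k

# Hypothetical centres for NVars=2, K=3: (0,0), (5,5), (10,0)
C = [
    [0.0, 5.0, 10.0],  # variable 0 of each centre
    [0.0, 5.0, 0.0],   # variable 1 of each centre
]
print(nearest_cluster(C, [6.0, 4.0]))
```

The new point has to be prepared the same way as the training rows (same variables, same scaling), otherwise the distances are not comparable.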

There is something about clustering below, maybe there is a ready-made prediction function there. Figure it out.


Aleksey Vyazmikin 2024.03.09 11:02 #34134

Forester #:

Get the array C with cluster centres, then use your new point to find which of the centres it is closer to.

There is something about clustering below, maybe there is a ready-made prediction function there. Figure it out.

As I understand it, you need an array for each cluster holding the centroid's weights (elements); without them you can't compute anything on new data.

Forester 2024.03.09 11:04 #34135

Aleksey Vyazmikin #:

As I understand it, you need an array for each cluster holding the centroid's weights (elements); without them you can't compute anything on new data.

There are no weights there; C holds the coordinates of the centres.

Aleksey Vyazmikin 2024.03.09 11:11 #34136

Forester #:
There are no weights there; C holds the coordinates of the centres.

As I understand it, you need mu. It is different for each predictor, hence the vector/array.

Forester 2024.03.09 12:35 #34137

Aleksey Vyazmikin #:

As I understand it, you need mu. It is different for each predictor, hence the vector/array.

I think this formula is used in training, i.e. in finding the cluster centres. For prediction you just need to find the nearest centre in C[].
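
The formula itself is not reproduced in the thread, so the following is only a guess at what it refers to: if mu is the usual k-means centroid, it is recomputed during training as the per-variable mean of the points currently assigned to each cluster. A small Python sketch of that single update step (the name `update_centres` and the data are hypothetical, and centres are stored one per row here, unlike C in Alglib):

```python
def update_centres(points, labels, k):
    """One k-means update step: each centre becomes the mean of its points."""
    n_vars = len(points[0])
    sums = [[0.0] * n_vars for _ in range(k)]
    counts = [0] * k
    for p, lab in zip(points, labels):
        counts[lab] += 1
        for i in range(n_vars):
            sums[lab][i] += p[i]
    centres = []
    for j in range(k):
        if counts[j] == 0:
            centres.append(None)  # empty cluster: no mean can be computed
        else:
            centres.append([s / counts[j] for s in sums[j]])
    return centres

# Hypothetical data: two points in cluster 0, one point in cluster 1
points = [[1.0, 2.0], [3.0, 4.0], [10.0, 10.0]]
labels = [0, 0, 1]
print(update_centres(points, labels, 2))
```

Once training has finished, these means are fixed; classifying a new point then only needs the distance to each stored centre.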

mytarmailS 2024.03.09 15:19 #34138

What features would you include in a model meant to predict whether the trading system (TS) will make money on new data over the next n points?

Renat Akhtyamov 2024.03.09 15:31 #34139

Aleksey Vyazmikin #:

As I understand it, you need mu. It is different for each predictor, hence the vector/array.

As I understand it, mu is the midpoint of a segment, or of the cluster in this case.

If it were a circle, the formula would work.

Maxim Dmitrievsky 2024.03.09 16:12 #34140

Renat Akhtyamov #:

mu is the middle of a segment, a cluster in this case, I take it.

If it were a circle, the formula would work.

The wind of life is sometimes fierce
On the whole, however, life is good
And it is not terrible when the bread is black,
It is terrible when the soul is black.

Machine learning in trading: theory, models, practice and algo-trading - page 3414
