Machine learning in trading: theory, models, practice and algo-trading - page 3332

 
Aleksey Nikolayev #:
I think we already discussed his quantisation from all sides back in the day. I can only add to what I said earlier that I am happy for him that it brought him at least $200.

Is that your thread? Today you have produced a lot of philosophical and other off-topic musings (flooding) - and what do you call that, if not "not respecting your own principles"?
Besides, what would Kant and Diogenes, and perhaps Aristotle and Pythagoras, call a person who gets a kick out of humiliating, insulting and belittling another person's virtues and achievements?
 
Aleksey Nikolayev #:
I think we already discussed his quantisation from all sides back in the day. I can only add to what I said earlier that I am happy for him that it brought him at least $200.

Thank you, it's nice when someone is happy for my income - a rare occurrence!

The article is introductory - you are right; everything I wrote there is, I think, clear enough as it is.

The second part is under moderation; it is a bit more interesting. For now, though, I have put off describing my own method and came up with a simplified version, which showed a small effect in tests. It will be described in the second part.

That said, this is a case where new ideas take up no more than 5% of the text.

Come back to read and comment if you wish.

 
Andrey Dik #:

https://www.mql5.com/ru/forum/309270
Is that your thread? Today you have produced a lot of philosophical and other off-topic musings (flooding) - and what do you call that, if not "not respecting your own principles"?
Besides, what would Kant and Diogenes, and perhaps Aristotle and Pythagoras, call a person who gets a kick out of humiliating, insulting and belittling another person's virtues and achievements?

I don't know which Diogenes you mean, but when it comes to trolling we are all children compared with Diogenes of Sinope or Diogenes Laertius.

If you compare the dates of that thread, my registration on this site, and today's date, it will become clearer. Two years after registering there was still hope for constructive and useful dialogue on the forum; six and a half years on, there is almost none left. Now it's just for fun.

 
Aleksey Vyazmikin #:

Thank you, it's nice when someone is happy for my income - a rare occurrence!

The article is introductory - you are right; everything I wrote there is, I think, clear enough as it is.

The second part is under moderation; it is a bit more interesting. For now, though, I have put off describing my own method and came up with a simplified version, which showed a small effect in tests. It will be described in the second part.

That said, this is a case where new ideas take up no more than 5% of the text.

Come back to read and comment if you wish.

Thank you for the invitation. Naturally, I have read the first part and will read the second part too. If I have any thoughts on the text, I will share them.
 
Forester #:

Why randomly?
Loop over all points of one class and measure the distance to all points of the other class, taking the minimum distance.
Once you have them all, sort them and delete pairs up to the distance you need, one pair at a time. If a deleted point was used in another pair, find that point a new nearest point with a new minimum distance, re-sort and continue.
Maybe you can think of a better way. Perhaps without sorting - just delete everything up to the required distance.

Eh, I'm a little slow on the uptake, I guess:

  1. We build a distance matrix, with both its width and height equal to the number of examples in the sample.
  2. We build a new matrix, say a binary one, where ones mark the points that meet the "minimum distance" criterion.
  3. As I understand it, we then need to count the number of points in each conditional island (sum the ones in the rows), and if one island has more points than a neighbouring one while some points are shared between them, we assign those points to the pile (cluster) that has more of them. We record that such-and-such point belongs to point set #n and zero those points out in the matrix from step 2.
  4. We continue zeroing until no points are left.

Did I understand the prototype of the algorithm correctly?

I am returning to the topic after such a delay because I am rather taken with the idea that leaves in CatBoost models, and in other tree ensembles, can be strongly correlated in their activation, which distorts their confidence during training and leads to the leaf values being overestimated for the model as a whole.

 
Aleksey Vyazmikin #:

Eh, I'm a little slow on the uptake, I guess:

  1. We build a distance matrix, with both its width and height equal to the number of examples in the sample.
  2. We build a new matrix, say a binary one, where ones mark the points that meet the "minimum distance" criterion.
  3. As I understand it, we then need to count the number of points in each conditional island (sum the ones in the rows), and if one island has more points than a neighbouring one while some points are shared between them, we assign those points to the pile (cluster) that has more of them. We record that such-and-such point belongs to point set #n and zero those points out in the matrix from step 2.
  4. We continue zeroing until no points are left.

Did I understand the prototype of the algorithm correctly?

I am returning to the topic after such a delay because I am rather taken with the idea that leaves in CatBoost models, and in other tree ensembles, can be strongly correlated in their activation, which distorts their confidence during training and leads to the leaf values being overestimated for the model as a whole.

Clustering has nothing to do with it. It is just removing the closest points of different classes that contradict each other, i.e. noise. After that you can train on the result however you like - with clustering, with a tree, whatever.

1) You can use a matrix, but you don't have to - just find, for each class-0 point, the closest class-1 point straight away, i.e. we get point 2 immediately.
3) Don't count anything and don't deal with clusters - just remove the pairs of closest points whose distance is below the threshold; in that example the threshold would be 0.6. In other problems it will probably have to be tuned.
If a deleted class-1 point was also paired with another class-0 point, that point is left without a pair and has to find a new nearest class-1 point (either recompute, or use the matrix as you suggested in point 1, if memory allows - I think a 1,000,000 x 1,000,000 matrix won't fit into any memory; up to 100,000 points, perhaps).
4) Not until nothing is left, but up to the threshold distance. If the threshold is very large, only the points of one of the classes will remain - whichever was more numerous to begin with. Roughly as in the sketch below.
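
Something like this (a rough, brute-force sketch in Python just to fix the idea - the function name and the recomputation loop are mine, and it is untested):

import numpy as np

def remove_noise_pairs(X, y, threshold=0.6):
    # Remove mutually contradicting nearest pairs of opposite-class points.
    # Brute force: distances over the survivors are recomputed on every step,
    # so this is only practical up to tens of thousands of points.
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    keep = np.ones(len(X), dtype=bool)
    while True:
        i0 = np.flatnonzero(keep & (y == 0))
        i1 = np.flatnonzero(keep & (y == 1))
        if i0.size == 0 or i1.size == 0:
            break
        d = np.linalg.norm(X[i0][:, None, :] - X[i1][None, :, :], axis=2)
        a, b = np.unravel_index(np.argmin(d), d.shape)
        if d[a, b] >= threshold:
            break                             # nothing contradictory left within the threshold
        keep[i0[a]] = keep[i1[b]] = False     # delete both points of the noisy pair
    return keep                               # mask of the surviving points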

But as I wrote before, I don't think this noise removal is a good idea (see https://www.mql5.com/ru/forum/86386/page3324#comment_50171043). After all, you won't be able to remove that noise at prediction time. The tree itself will mark the noisy leaves by giving them a probability of around 50%; just take the non-noisy leaves, with a probability of one of the classes >80% (or whatever threshold you see fit) - for example, as in the sketch below.
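
For example (a sketch on toy data; the 0.8/0.2 cut-offs and the tree settings here are arbitrary):

import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))
y = (X[:, 0] > 0).astype(int)                 # toy labels, purely for illustration
tree = DecisionTreeClassifier(min_samples_leaf=20).fit(X, y)

proba = tree.predict_proba(X)[:, 1]
# act only on confident leaves; leaves near 50% are treated as noise and skipped
signal = np.where(proba > 0.8, 1, np.where(proba < 0.2, 0, -1))   # -1 = no trade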

 
Forester #:
Clustering has nothing to do with it. It is just removing the closest points of different classes that contradict each other, i.e. noise. After that you can train on the result however you like - with clustering, with a tree, whatever.

1) You can use a matrix, but you don't have to - just find, for each class-0 point, the closest class-1 point straight away, i.e. we get point 2 immediately.
3) Don't count anything and don't deal with clusters - just remove the pairs of closest points whose distance is below the threshold; in that example the threshold would be 0.6. In other problems it will probably have to be tuned.
If a deleted class-1 point was also paired with another class-0 point, that point is left without a pair and has to find a new nearest class-1 point (either recompute, or use the matrix as you suggested in point 1, if memory allows - I think a 1,000,000 x 1,000,000 matrix won't fit into any memory; up to 100,000 points, perhaps).
4) Not until nothing is left, but up to the threshold distance. If the threshold is very large, only the points of one of the classes will remain - whichever was more numerous to begin with.

But as I wrote before, I don't think this noise removal is a good idea (see https://www.mql5.com/ru/forum/86386/page3324#comment_50171043). After all, you won't be able to remove that noise at prediction time. The tree itself will mark the noisy leaves by giving them a probability of around 50%; just take the non-noisy leaves, with a probability of one of the classes >80% (or whatever threshold you see fit).

I can't quite get my head around it yet. Fine, it all happens in one space - in the metric of a single predictor - but how do we take the other predictors into account?

As for what to do when predicting - I was thinking of using two models: one detects what was removed, or confirms that the data falls in the "clumping" region, and the other works on what is left. Roughly as in the sketch below.
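
Something like this sketch (CatBoost is just an example here, and the keep mask is a random stand-in for the output of the noise filter - purely illustrative, untested):

import numpy as np
from catboost import CatBoostClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] > 0).astype(int)        # toy target
keep = rng.random(1000) > 0.3        # stand-in for the noise filter's mask

# model 1 detects whether a point falls in the cleaned ("clumping") region;
# model 2 is trained only on what the filter kept
detector = CatBoostClassifier(iterations=100, verbose=0).fit(X, keep.astype(int))
worker = CatBoostClassifier(iterations=100, verbose=0).fit(X[keep], y[keep])

in_region = detector.predict(X).astype(bool)
pred = np.where(in_region, worker.predict(X), -1)   # -1 = abstain, point looks filtered-out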

 
Aleksey Vyazmikin #:

I can't quite get my head around it yet. Fine, it all happens in one space - in the metric of a single predictor - but how do we take the other predictors into account?

As for what to do when predicting - I was thinking of using two models: one detects what was removed, or confirms that the data falls in the "clumping" region, and the other works on what is left.

https://www.mql5.com/ru/articles/9138

No one has cared about it for a year now.

I've written a dozen or two algorithms like this, and some have proven themselves well. The one in the article is not the best in terms of stability of results - it was the first attempt (the first pancake, as they say).

So there is nothing to discuss, because nothing better has appeared yet.


 
I am new to ML. I am working on several models, and in the last week a problem has appeared: none of the models will save to ONNX(((. Has anyone run into this?
WARNING:tf2onnx.tf_loader:Could not search for non-variable resources. Concrete function internal representation may have changed.
ERROR:tf2onnx.tf_utils:pass1 convert failed for name: "model_3/lstm_4/PartitionedCall/while"
op: "StatelessWhile"
input: "model_3/lstm_4/PartitionedCall/while/loop_counter"
input: "model_3/lstm_4/PartitionedCall/while/maximum_iterations"
input: "model_3/lstm_4/PartitionedCall/time"
input: "model_3/lstm_4/PartitionedCall/TensorArrayV2_1"
input: "model_3/lstm_4/zeros"
input: "model_3/lstm_4/zeros_1"
input: "model_3/lstm_4/PartitionedCall/strided_slice"
input: "model_3/lstm_4/PartitionedCall/TensorArrayUnstack/TensorListFromTensor"
input: "Func/model_3/lstm_4/PartitionedCall/input/_3"
input: "Func/model_3/lstm_4/PartitionedCall/input/_4"
input: "Func/model_3/lstm_4/PartitionedCall/input/_5"
attr {
  key: "T"
  value {
    list {
      type: DT_INT32
      type: DT_INT32
      type: DT_INT32
      type: DT_VARIANT
      type: DT_FLOAT
      type: DT_FLOAT
      type: DT_INT32
      type: DT_VARIANT
      type: DT_FLOAT
      type: DT_FLOAT
      type: DT_FLOAT
    }
  }
}
attr {
  key: "_lower_using_switch_merge"
  value {
    b: false
  }
}
attr {
  key: "_num_original_outputs"
  value {
    i: 11
  }
}
attr {
  key: "_read_only_resource_inputs"
  value {
    list {
    }
  }
}
attr {
  key: "body"
  value {
    func {
      name: "while_body_149241"
    }
  }
}
attr {
  key: "cond"
  value {
    func {
      name: "while_cond_149240"
    }
  }
}
attr {
  key: "output_shapes"
  value {
    list {
      shape {
      }
      shape {
      }
      shape {
      }
      shape {
      }
      shape {
        dim {
          size: -1
        }
        dim {
          size: 128
        }
      }
      shape {
        dim {
          size: -1
        }
        dim {
          size: 128
        }
      }
      shape {
      }
      shape {
      }
      shape {
        dim {
          size: 1
        }
        dim {
          size: 512
        }
      }
      shape {
        dim {
          size: 128
        }
        dim {
          size: 512
        }
      }
      shape {
        dim {
          size: 512
        }
      }
    }
  }
}
attr {
  key: "parallel_iterations"
  value {
    i: 32
  }
}
, ex=Could not infer attribute `_read_only_resource_inputs` type from empty iterator
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-50ef5b7ad3f4> in <cell line: 87>()
     85 
     86 # Convert the Keras model to ONNX format
---> 87 onnx_model = tf2onnx.convert.from_keras(model)
     88 
     89 # Save the model in ONNX format

8 frames
/usr/local/lib/python3.10/dist-packages/onnx/helper.py in make_attribute(key, value, doc_string, attr_type)
    874         value = list(value)
    875         if len(value) == 0 and attr_type is None:
--> 876             raise ValueError(
    877                 f"Could not infer attribute `{key}` type from empty iterator"
    878             )

ValueError: Could not infer attribute `_read_only_resource_inputs` type from empty iterator
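
For reference, a minimal stand-in for the failing conversion is below (the input shape and layer sizes are placeholders on my side; the 128/512 kernel shapes in the graph dump suggest an LSTM(128) layer):

import tensorflow as tf
import tf2onnx

# toy stand-in for the real model: the failing StatelessWhile op comes from
# the LSTM's internal loop (512 = 4 * 128 in the kernel shapes above)
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(128, input_shape=(60, 8)),   # (timesteps, features) are placeholders
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# the call that raises the ValueError; passing an explicit input signature and
# opset is often suggested for LSTM conversion issues, though untested here
spec = (tf.TensorSpec((None, 60, 8), tf.float32, name="input"),)
onnx_model, _ = tf2onnx.convert.from_keras(model, input_signature=spec, opset=15)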
 
Aleksey Vyazmikin #:

I can't quite get my head around it yet. Fine, it all happens in one space - in the metric of a single predictor - but how do we take the other predictors into account?

As for what to do when predicting - I was thinking of using two models: one detects what was removed, or confirms that the data falls in the "clumping" region, and the other works on what is left.

In the example there are 2 predictors, i.e. we measure the distance in 2-dimensional space (we calculate the hypotenuse). If there are 5000 features, you will measure the distance in 5000-dimensional space (for how to measure it, see the k-means code in Alglib - measuring distances is exactly the main task there; take it as a basis).
It is the root of the sum of the squared differences along all dimensions, d(a,b) = sqrt((a1-b1)^2 + ... + (an-bn)^2), see https://wiki.loginom.ru/articles/euclid-distance.html.

If you do go ahead with it - don't forget to scale the predictors first, so that, for example, volumes of 1...100000 don't swallow price deltas of 0.00001...0.01000 in the calculations, roughly as in the sketch below.
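
For example (a z-score sketch; min-max scaling would do just as well):

import numpy as np

def standardize(X):
    # zero mean / unit variance per predictor, so that large-magnitude features
    # do not dominate small ones in the distance calculation
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

def euclid(a, b):
    # Euclidean distance in n-dimensional space: root of the sum of squared differences
    return np.sqrt(np.sum((np.asarray(a, float) - np.asarray(b, float)) ** 2))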

How to detect it? That is the question. Especially on market data, where there will not be such a clean separation of the noisy region as in the example. Everything will be noisy - 90-99 per cent.

It may be easier to take ready-made packages for removing noisy rows - maybe they have a detector...
