Algorithm for combining ranges of a segment - help to create - page 6

 
Aleksey Vyazmikin:

What difference does it make whether the paths are long or short, or is it a matter of estimation (the length of the arrow in the analogy of the figure)?

In the example we want to step onto the two best paths; if fewer are available, then onto just one.

Please explain why this might be a problem.

If the set contains both long and short segments, then the path through an area with only long segments will be longer than the path through an area with short segments. For example, the beginning is as in your picture, and then there are two areas parallel to each other; in the first area the segments are 3 times shorter than in the second and take up 75 percent of the path.

 
Valeriy Yastremskiy:

If there are areas with short and long segments in the set, then if you get to the area with only long segments, the path will be longer than if you get to the area with short segments. For example, in the beginning, as in your drawing, and then there are two areas parallel to each other, and in the first area, the segments are 3 times shorter than in the second area and take up 75 percent of the path.

The movement will start from each segment, so you must go through those areas as well.

 
Aleksey Vyazmikin:

The movement will start from every segment, so it must go through those areas as well.

The movement can start from any segment, but it is clear that points with long segments are not needed. In your algorithm there are only relations with the nearest segments, not with arbitrary segments, and if you land on a point with long segments whose neighbouring points also have only long segments, that is not a good result.

 
Valeriy Yastremskiy:

The movement can start from any segment, but it is clear that points with long segments are not needed. In your algorithm there are only relations with the nearest segments, not with arbitrary segments, and if you land on a point with long segments whose neighbouring points also have only long segments, that is not the best result.

"Length" is relative here, until you get to a point you can't measure it.

Evaluating composite analogues is another matter: when one segment can be represented by two others, then yes, one segment can be dropped.

 
Aleksey Vyazmikin:

"Length" here is relative, until we get to a point we can't measure it.

Evaluating composite analogues is another matter: when one segment can be represented by two others, then yes, one segment can be dropped.

I don't get it. If the length/price can only be seen once you reach a point, the task is much more difficult. And without a sufficiently complete estimate of price/length, the result cannot be assessed reliably.

It's not clear about composite analogues.

 
Valeriy Yastremskiy:

I don't get it. If the length/price can only be viewed by hitting a point, it's a much more difficult task. And without a sufficiently complete price/length estimate, the result cannot be assessed reliably.

Yes, it is.

Valeriy Yastremskiy:

It's not clear about the composite analogues.

In the figure below we have two large segments and 5 small ones below them, but you can see that they are on the same range and therefore essentially describe a similar area.

The only question is which is better: the smaller bars, each of which can find its own correlating predictor and gives a more accurate cutoff, or the greater generalisation ability of the larger bar. I think the small ones are better, since their minimum size is already limited at the selection stage.
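A minimal sketch of how the "composite analogue" check could look, assuming each segment is just a (start, end) pair on the predictor's range. The name `covered_by` and the `tol` parameter are hypothetical, not something from this thread. It only tests whether a set of small segments covers a large one without gaps; it says nothing about which variant is better.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end) on the predictor's value range


def covered_by(big: Segment, smalls: List[Segment], tol: float = 0.0) -> bool:
    """Return True if the union of `smalls` covers `big`,
    allowing gaps no wider than `tol` (an assumed tolerance)."""
    reach = big[0]  # right edge of what has been covered so far
    for start, end in sorted(smalls):
        if start > reach + tol:
            break  # a gap wider than `tol`: coverage is broken
        reach = max(reach, end)
        if reach >= big[1] - tol:
            return True
    return reach >= big[1] - tol


big = (0.0, 10.0)
five_small = [(0, 2), (2, 4), (4, 6), (6, 8), (8, 10)]
print(covered_by(big, five_small))                           # full coverage
print(covered_by(big, [(0, 2), (2, 4), (6, 8), (8, 10)]))    # gap at 4..6
```

If the check passes, either the large segment or the group of small ones could be dropped as redundant, depending on which side of the accuracy/generalisation trade-off is preferred.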

 

Another thought: why not take the best x% of the segments and use them to fill the space in a first step, and then in a second step identify the gaps between those segments and look for segments that fit into these gaps?

The figure shows the two stages schematically.
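The two stages can be sketched roughly as follows. This is only an illustration under assumptions: each segment carries a precomputed `score` (higher is better), segments may not overlap, and the names `Seg`, `find_gaps`, `two_stage_fill` and the parameter `top_share` (the "x%") are made up for the example.

```python
from typing import List, NamedTuple, Tuple


class Seg(NamedTuple):
    start: float
    end: float
    score: float  # hypothetical quality estimate, higher is better


def overlaps(a: Seg, b: Seg) -> bool:
    # Segments that merely touch at an endpoint do not count as overlapping.
    return a.start < b.end and b.start < a.end


def find_gaps(chosen: List[Seg], lo: float, hi: float) -> List[Tuple[float, float]]:
    """Uncovered intervals of [lo, hi] left by the chosen segments."""
    out, cursor = [], lo
    for s in sorted(chosen):
        if s.start > cursor:
            out.append((cursor, s.start))
        cursor = max(cursor, s.end)
    if cursor < hi:
        out.append((cursor, hi))
    return out


def two_stage_fill(segs: List[Seg], lo: float, hi: float,
                   top_share: float = 0.2) -> List[Seg]:
    ranked = sorted(segs, key=lambda s: s.score, reverse=True)
    n_top = max(1, int(len(ranked) * top_share))
    chosen: List[Seg] = []
    # Stage 1: place the best x% of segments, skipping overlaps.
    for s in ranked[:n_top]:
        if not any(overlaps(s, c) for c in chosen):
            chosen.append(s)
    # Stage 2: find the gaps, then embed the remaining segments that fit.
    gap_list = find_gaps(chosen, lo, hi)
    for s in ranked[n_top:]:
        fits = any(g0 <= s.start and s.end <= g1 for g0, g1 in gap_list)
        if fits and not any(overlaps(s, c) for c in chosen):
            chosen.append(s)
    return sorted(chosen)


segs = [Seg(0, 3, 0.9), Seg(6, 10, 0.8), Seg(3, 6, 0.5),
        Seg(2, 5, 0.4), Seg(3, 4, 0.3)]
for s in two_stage_fill(segs, lo=0, hi=10, top_share=0.5):
    print(s)
```

Both stages are greedy by score, so the result depends on the order of candidates and is not guaranteed to be the best possible filling.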


 
Aleksey Vyazmikin:

Another thought: why not take the best x% of the segments and use them to fill the space in a first step, and then in a second step identify the gaps between those segments and look for segments that fit into these gaps?

In the figure I have shown the two stages schematically.


Well, that's what I'm trying to say: first estimate the lengths/values from the points, identify the sets of valuable and toxic segments, and then build a path based on the values of the segments and on how well they can fill the path without gaps.

The solution may not be the best, but it will at least be better than average.
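One way to illustrate "estimate values first, then build the path" is classic weighted interval scheduling. This is a swapped-in textbook technique, not the method proposed in the thread. The sketch assumes every segment already has a numeric `value`, that "toxic" simply means a non-positive value, and that segments on the path may not overlap. Gaps are only penalised indirectly, through the value that a filling segment adds.

```python
from bisect import bisect_right
from typing import List, Tuple

Seg = Tuple[float, float, float]  # (start, end, value); value <= 0 is treated as "toxic"


def best_path(segs: List[Seg]) -> Tuple[float, List[Seg]]:
    """Pick non-overlapping segments with the maximum total value."""
    # Drop toxic segments, then order the rest by their right edge.
    usable = sorted((s for s in segs if s[2] > 0), key=lambda s: s[1])
    ends = [s[1] for s in usable]
    # best[i] = (total value, chosen segments) over the first i usable segments
    best: List[Tuple[float, List[Seg]]] = [(0.0, [])]
    for i, seg in enumerate(usable):
        start, _, value = seg
        # j = how many earlier segments end at or before this one starts
        j = bisect_right(ends, start, 0, i)
        take = best[j][0] + value
        if take > best[i][0]:
            best.append((take, best[j][1] + [seg]))
        else:
            best.append(best[i])
    return best[-1]


segs = [(0, 4, 3.0), (0, 2, 2.0), (2, 4, 2.0), (4, 6, 1.0), (1, 5, -5.0)]
total, path = best_path(segs)
print(total, path)
```

Unlike a greedy pass, the dynamic programme considers all compatible combinations, but it still needs the values to be known in advance, which is exactly the difficulty discussed above.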

 

This question is off-topic and rather philosophical. Do you realise that classifying by dividing attributes into segments implies a discontinuous dependence of the outputs on the inputs? That is, a trade may open for one set of attribute values and not open for another set that is very, very close to the first (both are near the boundary, but on opposite sides of it). I'm not saying it's the wrong approach. I just want to ask: is there some kind of trader's intuition behind it, or is it an arbitrary choice?

As a possible alternative, one could use classification via logistic regression or the nearest-neighbour method. The output can then include an estimated probability of belonging to a class, which can be used, for example, to determine the trade volume. I'm not insisting on any particular algorithm; I'm just interested in the trading rationale for choosing a particular ML algorithm.
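To make the contrast concrete, here is a small sketch with a single attribute `x`. The boundary, the steepness and the function names are hypothetical, and the logistic curve uses fixed coefficients rather than a fitted logistic regression. It only shows how a segment-style rule jumps at the boundary while a probability-style output changes smoothly.

```python
import math


def hard_rule(x: float, boundary: float = 0.5) -> float:
    """Segment-style decision: full volume on one side of the boundary, none on the other."""
    return 1.0 if x >= boundary else 0.0


def logistic_volume(x: float, boundary: float = 0.5,
                    steepness: float = 10.0, max_volume: float = 1.0) -> float:
    """Volume scaled by a logistic 'probability' of the positive class.
    `boundary` and `steepness` stand in for coefficients a model would learn."""
    p = 1.0 / (1.0 + math.exp(-steepness * (x - boundary)))
    return max_volume * p


# Two attribute values very close to each other, on opposite sides of the boundary
for x in (0.499, 0.501):
    print(x, hard_rule(x), round(logistic_volume(x), 4))
```

The hard rule switches from no trade to a full trade between the two inputs, whereas the logistic output barely changes.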

 
Valeriy Yastremskiy:

That's what I'm trying to say: first estimate the lengths/values from the points, identify the sets of valuable and toxic segments, and then build a path based on the values of the segments and on how well they can fill the path without gaps.

The solution may not be the best, but it will at least be better than average.

The question here is how to identify the "valuable and toxic" segments, i.e. you need to determine their interchangeability, or do it in two passes, as I suggested earlier. Or do you have another option?
