What data organization to use for finding...

 
Hello,

In fact, I don't even know how to name this, so I'll try to provide a description of what I want to achieve.

I have a chart on which I draw 10s of thounds of objects of different types, iE trendlines, rectangles, vertical and horizontal lines, triangles... All have coordinates of type price and datetime.

I have a "list" of all objects drawn.

What I would like to do now is, given the coordinates of the mouse, converted to price and datetime, query this list efficiently (meaning not iterate the whole list) and get the object(s) that are coinciding with the mouse coordinates.

What type of data organization would allow me to query this "list" or container to return these objects?

The challenge is, I only have the coordinates for the describing edges of each object.

So my thought goes to some type of 2 dimensional map overlaying the complete chart, containing the objects, so that I could simply query this map for the coordinates.

But this map would mostly be empty, so I would like to have a data structure that only holds populated data fields.

My question is, how could I go about this, and how are such structures called?


 

Searching objects by given coordinates is a GIS territory. 

You can look at Spatial indexing, R-Tree indexing being most popular. 

Before that, did you measure simple list search? Is it that slow?

Maybe you can make simple indices by day, or by hour - make list of all periods where each period contains list of all objects that appear in that period, and then search by price in that list?  if you have objects that spans multiple periods, it will appear in all lists where it appears.

 
Drazen Penic #:

Searching objects by given coordinates is a GIS territory. 

You can look at Spatial indexing, R-Tree indexing being most popular. 

Before that, did you measure simple list search? Is it that slow?

Maybe you can make simple indices by day, or by hour - make list of all periods where each period contains list of all objects that appear in that period, and then search by price in that list?  if you have objects that spans multiple periods, it will appear in all lists where it appears.

Spatial indexing sounds good....

Well, yes, lists are to slow. It happens on every mouse move event and already 10k objects significantly impacts performance. Additionally, a Trendline has only two value pairs of coordinates, so if I would go by list, I would need to prefill for every pixel and maintain a map with additional overlapping objects. It gets really heavy on computation or memory or both.

That's why I am looking for some solution other than a "brute force" approach.

In the end, I want to "highlight" the object that's being hovered by the mouse pointer and display additional information.

I am displaying an ABC pattern with trendlines and there are complex calculations that lead up to this pattern. So when I hover one of the three trend lines that make up a pattern, I want to highlight that, and display additional text information in a separate box on the chart.

So the amount of data is already around 1.5GB in memory for displaying around 50k ABC patterns, round about 300k objects, with all labels and trend lines...

This is especially required for analysis and development of strategies, as well as verifying the correct workings of the ABC pattern algorithm.

All in all, currently, it is very time consuming to work with, and one of the major issues is the displaying of accumulated data. The user needs to do a lot of handwork to retrieve all relevant data.

So I would like to make this more automated.

An ABC pattern can be composed of multiple other ABC patterns and they are all relevant for analysis.

At the moment, these data is collected manually, but the user needs to have in depth knowledge about the underlying algorithm to know what correlate. Ant then query the database to gather relevant details.

But the algorithm has all these details already processed and generated, so the data is available in memory, but I need a way to enable the user to select the relevant data, he is looking at.

Since the association is by the objects on chart, at least for the user, I would need to make these objects be more interactive, and that's where my question comes in.

Your idea to splice the data into multiple chunks was also on my mind, but it escalades the memory usage greatly, or the computational impact, or a tradeof of both.

So I am looking for different ways to do it.

I was already looking into GIS as an approach. Maybe a vector database could help... But all need to be rewritten for mql, as there is no such solution.

I am looking for an optimal approach to this, still...


 
Wouldn't every new bar or a change in time frame require everything to be recalculated - does the idea still make sense then?
 

That amount of data looks like problem for spatial database. You might want look into PostgreSQL, it's open source and it has decent GIS capabilities. 

 
Dominik Christian Egert: get the object(s) that are coinciding with the mouse coordinates.

That is still unclear. What are you going to do with the few closest ones?
     How To Ask Questions The Smart Way. (2004)
          Be precise and informative about your problem
          The XY Problem

 
Drazen Penic #:

That amount of data looks like problem for spatial database. You might want look into PostgreSQL, it's open source and it has decent GIS capabilities. 

I think R-Trees seem to be a good compromise for retrieving/accessing the data.

But I would have to implement that in mql... I just implemented RBTrees, and it was very intense to get it to work properly.




 
William Roeder #:

That is still unclear. What are you going to do with the few closest ones?
     How To Ask Questions The Smart Way. (2004)
          Be precise and informative about your problem
          The XY Problem

The object closest to the mouse coordinates. I will use it's name, which has a UID part in it, identifying the associated dataset, and get access to that dataset.

I want to "highlight" that object and the correlating ones by changing the line-thikness from 1 to 2 while the mouse is hovering that particular object. Same goes for the correlated objects, which I can retrieve from the associated datasets.

The datasets are linked to each other using the UIDs.

This way I have a 1:1 linked list of all related objects, I can access them fast by using the RBTree structure.

The relevant data used to construct the ABC pattern will be then retrieved from the datasets associated to the objects, identified by the UIDs. Relevant Data will be rendered into a text field and displayed on chart.

Also, I want to highlight the parent and children ABC patterns as well.




 
Carl Schreiber #:
Wouldn't every new bar or a change in time frame require everything to be recalculated - does the idea still make sense then?
The coordinates of objects stay the same, their parameters do not change.

When new periods get pushed to the chart, the algorithm keeps working and as soon as a pattern is deterministic, it will be drawn.

All I had to care about is the pruning of objects, when the chart looses historic bars at the end, on the left side.

That's all fine and working as expected.

All that really changes are the datetime and price coordinates of the mouse and the visible chunk of chart, but a chart could be seen as an "infinite" plain. And since we do not know the price of tomorrow on the chart, I cannot use a fixed boundary approach.




 

That's an interesting problem.

I would go with some simplified algorithm based on R-Tree concept.

You would have to divide your problem to solve in small parts. If you have in total around 300K objects, you only need to deal with the ones currently displayed right ? How much is that ? 100 ? 500 ?

If it's still too much you can divide the chart in X parts. Then you build a structure to classify your objects by time intervals, and by prices intervals.

When the mouse is moving you have firstly to find the concerned objects. That would select the equivalent of a chart window number of objects. Once you have them you apply a loop to check the neighbour of the mouse coordinates.

 
Alain Verleyen #:

That's an interesting problem.

I would go with some simplified algorithm based on R-Tree concept.

You would have to divide your problem to solve in small parts. If you have in total around 300K objects, you only need to deal with the ones currently displayed right ? How much is that ? 100 ? 500 ?

If it's still too much you can divide the chart in X parts. Then you build a structure to classify your objects by time intervals, and by prices intervals.

When the mouse is moving you have firstly to find the concerned objects. That would select the equivalent of a chart window number of objects. Once you have them you apply a loop to check the neighbour of the mouse coordinates.

You suggest a hybrid solution, little bit like 3d rendering, where you only render what you see...


Maybe that's a way to go. And I could use a partial plane for the currently visible area as well as some additional frame around the visible area. Maybe even go as far as gradually adding and removing from the partial plane. It probably would need to span multiple visible areas in all directions.

...

Some additional challenges come up. What happens if zoom level is changed. What happens if price bar scale is changed....

...

I am not convinced this is feasible in a more efficient way than just going without taking a "viewport" into consideration.

I think, it will turn out to be an implementation of R*-Tree in mql....

But. - What about vectors? I am not very educated on vector math, but any shape could be seen as a vector collection. Isn't there some math that could be used to solve the task?

Especially because vectors are native implemented in mql.





Reason: