Discussion of article "Self-organizing feature maps (Kohonen maps) - revisiting the subject" - page 2
Save the trained network and post it together with the training data. Analysing them should show how this is possible or, alternatively, reveal what the bug is.
In short, we need a reproducible example.
Here are the resource file, the somnet file and a screenshot showing where I took the records for the resource file. Maybe it will help ;)
I have some ideas about searching for groups of items that are broadly similar, i.e. clustering. I found a method on the net, k-means, read the description and looked at some examples. What do you use to cluster data into groups?
The display of the results has some shortcomings, but even in this form it is a working version.
I decided to test the statistics and this is what I got:
Everywhere else the colours change smoothly from left to right or from right to left, following the colour palette drawn under the picture. Here, however, the colours jump.
I think the first and probably the main reason is that you have very little training data.
The second is that the number of nodes is 4 times smaller than the resolution.
The third is that, because of the large spread of values in the 2nd column, nodes from opposite ends of the scale happened to end up next to each other.
Taken together, this produced an arrangement with a sharp boundary.
However, I could not reproduce a boundary in the form of a clear hexagon. Your saved network does have a border, but it is not hexagonal.
Anyway, thanks for the development.
As for clustering: it depends on the task. There are many ways of clustering. Kohonen is a universal clustering tool, and a universal tool cannot be perfect for every particular task.
For example, if you need to cluster univariate data in the fastest and simplest way, K-means is fine, but I prefer clustering by modes rather than by means.
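For readers who have not seen K-means on univariate data, here is a minimal sketch of Lloyd's algorithm in plain Python. It is not taken from the article; the function name `kmeans_1d`, the deterministic initialisation and the sample values are all assumptions made for illustration.

```python
def kmeans_1d(values, k, iterations=100):
    """Cluster 1D values into k groups with Lloyd's algorithm (illustrative sketch)."""
    data = sorted(values)
    n = len(data)
    # Assumption: start the centres at evenly spaced positions in the sorted data.
    centres = [data[(2 * i + 1) * n // (2 * k)] for i in range(k)]

    def assign(current_centres):
        # Put every value into the cluster of its nearest centre.
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda j: abs(x - current_centres[j]))
            clusters[nearest].append(x)
        return clusters

    for _ in range(iterations):
        clusters = assign(centres)
        # Each centre moves to the mean of its cluster; an empty cluster keeps its centre.
        new_centres = [sum(c) / len(c) if c else centres[j]
                       for j, c in enumerate(clusters)]
        if new_centres == centres:
            break
        centres = new_centres
    return centres, assign(centres)


# Hypothetical sample with three visible groups.
sample = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8, 9.9, 10.2, 10.0]
centres, clusters = kmeans_1d(sample, k=3)
print(centres)
print(clusters)
```

The centres here are cluster means. A mode-based approach, as preferred above, would place the centres at density peaks instead; that variant is not shown.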
[Screenshot: MetaTrader 5, GBPUSD, H1, 2017.02.25, Alpari International Limited, Demo]
2) Where did the number 4 come from? Was the size of the picture divided by the number of nodes? I can't see the relationship. I chose 70x70 on purpose to make the picture clearer.
3) 849950 - 142695 = 707255. Can such a large difference affect the smaller differences in other columns?
4) Is it possible to display the numbers inside the picture instead of only drawing them at the side? Some numbers are not visible. Yes, the pictures are saved to files, but the number captions do not appear on the picture. Is this not implemented?
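On question 3, a small sketch may help show why the scale of one column matters. It compares plain Euclidean distances before and after min-max scaling. The rows are invented; only the range 142695..849950 comes from this thread. Whether the article's SOM implementation normalises its inputs is not stated here, so treat this as a general illustration rather than a description of that code.

```python
import math


def min_max_scale(rows):
    """Scale every column to the range [0, 1] (constant columns become 0)."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, lows, highs)]
            for row in rows]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_to_first(rows):
    """Index of the row closest to rows[0], excluding rows[0] itself."""
    return min(range(1, len(rows)), key=lambda i: euclidean(rows[0], rows[i]))


# Hypothetical rows: [small-range column, large-range column].
rows = [
    [1.0, 142695.0],   # A
    [9.0, 143000.0],   # B: far from A in column 1, close in column 2
    [1.1, 160000.0],   # C: close to A in column 1, farther in column 2
    [5.0, 849950.0],   # D: sets the upper end of the large range
]

raw_nearest = nearest_to_first(rows)                    # column 2 dominates
scaled_nearest = nearest_to_first(min_max_scale(rows))  # both columns count
print(raw_nearest, scaled_nearest)
```

On the raw values, B is the nearest neighbour of A because the second column swamps the first. After scaling, C becomes the nearest neighbour, since the difference in the first column is no longer negligible.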
[Screenshot: MetaTrader 5, GBPUSD, H1, 2017.02.25, Alpari International Limited, Demo]
1) reduced the number of samples to 10;
2) manually changed the second-column values in rows 2, 3 and 4.
What is this nonsense?
I found the following:
1) the maximum value for the second column is either calculated incorrectly or displayed incorrectly. If you sort all values in descending order, the programme shows the value from row #3 as the maximum rather than the value from row #2. I see this behaviour only in this column;
2) I slightly reduced the difference between the maximum and minimum values of the second column, and I let the three largest values in this column differ from each other by 1-1.8%. That is not much, is it? Judged by eye, they are almost identical compared with all the other values in this column.
I am attaching my files again.
Damn. I don't know. This is starting to feel like delusion or paranoia.
Note that the maps of all the other columns also show some kind of cluster in this place.
What I mean is that the result repeats consistently because that is the structure of the data.
It's just that in the second column this cluster with the minimum values is surrounded by, or adjacent to, the maximum values. That's why the boundary is so sharp.
SOM puts this data in a separate cluster next to the maxima because the maps are interconnected and this is the best location for the cluster.
If you tried to move them to different corners on the second map, you would have to move the nodes of the other maps to those positions as well.
In maps 1, 4, 6 and 8-12 these two clusters are very close in value. That is, in 8 of the 12 maps SOM has placed them next to each other. Naturally, the remaining 4 maps can then differ in whatever way happens to come out.
Or maybe I don't understand your problem.
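To make "the maps are interconnected" concrete: in a SOM every node holds one weight per input column, and the map for a given column is just that component read across the same grid. A minimal sketch follows, with an invented 2x2 grid and three columns. The names `component_plane` and `best_matching_unit` are illustrative and are not the article's API.

```python
# Hypothetical 2x2 grid; every node holds one weight per input column.
grid = [
    [[0.1, 0.9, 0.2], [0.2, 0.1, 0.3]],
    [[0.8, 0.5, 0.7], [0.9, 0.4, 0.9]],
]


def component_plane(grid, column):
    """The map for one column: the same nodes, viewed through one weight."""
    return [[node[column] for node in row] for row in grid]


def best_matching_unit(grid, sample):
    """Grid position of the node closest to the sample over ALL columns."""
    best_pos, best_dist = None, float("inf")
    for r, row in enumerate(grid):
        for c, node in enumerate(row):
            dist = sum((w - x) ** 2 for w, x in zip(node, sample))
            if dist < best_dist:
                best_pos, best_dist = (r, c), dist
    return best_pos


plane_1 = component_plane(grid, 1)   # the "second column" map
bmu = best_matching_unit(grid, [0.15, 0.85, 0.25])
print(plane_1, bmu)
```

Because a sample lands on a single grid position chosen from all columns together, a node cannot sit in one corner on the second map and in another corner on the rest.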
Yes, there is one problem. In the data file, the maximum value in the second column is 559000, but the gradient bar under the picture shows a maximum of 552000. The displayed maximum cannot be lower than the actual maximum of 559000.
552000 versus 559000: is this node data or pattern data?
The nodes do not have to match the training patterns one-to-one.
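A small sketch of why a node maximum can sit below the pattern maximum: a SOM update pulls each node part of the way toward a sample, so node weights end up as blends of several patterns. The 1D training loop below is a simplified assumption, not the article's code. Only the value 559000 comes from this thread; the other patterns, the node count and the decay schedules are invented.

```python
import math

# Hypothetical second-column patterns; only 559000 is taken from the thread.
patterns = [142000.0, 150000.0, 160000.0, 155000.0, 148000.0,
            559000.0, 551000.0, 553000.0, 149000.0, 152000.0]


def train_som_1d(patterns, n_nodes=4, epochs=200):
    """Train a one-dimensional chain of scalar nodes (illustrative sketch)."""
    lo, hi = min(patterns), max(patterns)
    # Start the nodes evenly spaced strictly inside the data range.
    weights = [lo + (i + 1) * (hi - lo) / (n_nodes + 1) for i in range(n_nodes)]
    for epoch in range(epochs):
        frac = epoch / epochs
        rate = 0.5 * (1.0 - frac) + 0.01 * frac   # learning rate: 0.5 -> 0.01
        sigma = 1.5 * (1.0 - frac) + 0.3 * frac   # neighbourhood width: 1.5 -> 0.3
        for x in patterns:
            # Best matching unit: the node whose weight is closest to the sample.
            bmu = min(range(n_nodes), key=lambda j: abs(x - weights[j]))
            for j in range(n_nodes):
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                # Move the node a fraction (at most 0.5) of the way toward x.
                weights[j] += rate * h * (x - weights[j])
    return weights


weights = train_som_1d(patterns)
max_pattern = max(patterns)
max_node = max(weights)
print(max_pattern, max_node)
```

Since each step moves a node only a fraction of the way toward a sample, a node that started inside the data range can never pass the largest pattern. A colour bar built from node weights would therefore show a maximum below 559000.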