Discussion of article "Data Science and Machine Learning (Part 08): K-Means Clustering in plain MQL5"

 

New article Data Science and Machine Learning (Part 08): K-Means Clustering in plain MQL5 has been published:

Data mining is crucial to both data scientists and traders because, very often, the data isn't as straightforward as we think it is. The human eye cannot pick out the subtle underlying patterns and relationships in a dataset; maybe the K-means algorithm can help us with that. Let's find out...

Clustering analysis is the task of grouping a set of objects in such a way that objects with similar attributes are placed within the same groups (clusters).

If you go to the mall, you will find similar items kept together, right? Someone did the work of grouping them. When a dataset isn't grouped, clustering analysis does just that: it groups the data values so that those in the same cluster are more similar (in some sense) to each other than to those in other clusters.

Clustering analysis itself is not a specific algorithm. The general task can be solved through various algorithms that differ significantly in terms of their understanding of what constitutes a cluster.
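
As one concrete notion of a cluster, K-means treats a cluster as the set of points closest to a given centroid. The following is a minimal Python sketch of that grouping step only, not the article's MQL5 code; the function names and the sample points are made up for illustration.

```python
import math

def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point` (Euclidean distance)."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))

def assign_clusters(points, centroids):
    """Group points by their nearest centroid; returns one list per centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        clusters[nearest_centroid(p, centroids)].append(p)
    return clusters

# Hypothetical sample data: two obvious groups and one centroid near each.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
centroids = [(1.0, 1.5), (8.5, 9.0)]
clusters = assign_clusters(points, centroids)
```

A full K-means run would repeat this assignment, recompute each centroid as the mean of its cluster, and stop once the assignments no longer change.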


There are three widely known types of clustering:
  1. Exclusive clustering
  2. Overlapping clustering
  3. Hierarchical clustering
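
To make the difference between the first two types concrete, here is a small Python sketch (an illustration under stated assumptions, not code from the article). Exclusive clustering gives each point exactly one cluster, as K-means does. Overlapping clustering gives each point a degree of membership in every cluster; the weights below follow the fuzzy c-means membership formula with fuzzifier m = 2, which is just one possible choice. Hierarchical clustering is not shown.

```python
import math

def hard_assignment(point, centroids):
    """Exclusive clustering: the point belongs to exactly one cluster."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))

def soft_memberships(point, centroids):
    """Overlapping clustering: membership weights that sum to 1.

    Uses inverse squared distances (fuzzy c-means with fuzzifier m = 2).
    """
    squared = [math.dist(point, c) ** 2 for c in centroids]
    if 0.0 in squared:
        # The point sits exactly on a centroid: full membership there.
        hit = squared.index(0.0)
        return [1.0 if i == hit else 0.0 for i in range(len(squared))]
    inverse = [1.0 / d for d in squared]
    total = sum(inverse)
    return [w / total for w in inverse]

# Hypothetical example: a point three times closer to the first centroid.
centroids = [(0.0, 0.0), (4.0, 0.0)]
point = (1.0, 0.0)
label = hard_assignment(point, centroids)
weights = soft_memberships(point, centroids)
```

Here the hard assignment picks the first cluster outright, while the soft memberships still leave a small share for the second one.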

Author: Omega J Msigwa

 
m_cols = Matrix.Cols();
n = Matrix.Rows(); //number of elements | Matrix Rows

InitialCentroids.Resize(m_clusters,m_cols);
vector cluster_comb_v = {};
matrix cluster_comb_m = {};
vector rand_v = {};

for (ulong i=0; i<m_clusters; i++)
  {
    rand_v = Matrix.Row(i * m_clusters);
    InitialCentroids.Row(rand_v,i);
  }
Print("Initial Centroids matrix\n",InitialCentroids);

Hi Omega J Msigwa, thanks for your very useful article.

Am I missing something, or do you mean DMatrix in the code above?

 
Mahdi Ebrahimzadeh #:

Hi Omega J Msigwa, thanks for your very useful article.

Am I missing something, or do you mean DMatrix in the code above?

I mean Matrix, as explained in the article, since this code is found inside the function:

void CKMeans::KMeansClustering(const matrix &Matrix, matrix &clustered_matrix,int iterations = 10)
 {
   m_cols = Matrix.Cols();
   n = Matrix.Rows(); //number of elements | Matrix Rows

   InitialCentroids.Resize(m_clusters,m_cols); //one row per centroid
   cluster_assign.Resize(n);

   clustered_matrix.Resize(m_clusters, m_clusters*n);
   clustered_matrix.Fill(NULL);

   vector cluster_comb_v = {};
   matrix cluster_comb_m = {};
   vector rand_v = {};
   //Matrix here is the function parameter: row i*m_clusters becomes the i-th initial centroid
   for (ulong i=0; i<m_clusters; i++)
     {
       rand_v = Matrix.Row(i * m_clusters);
       InitialCentroids.Row(rand_v,i);
     }
   Print("Initial Centroids matrix\n",InitialCentroids);
.... rest of the code
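
For readers following the loop above, the initialization simply copies row i * m_clusters of the input matrix into row i of InitialCentroids. Below is a rough Python sketch of that indexing, not the author's MQL5 code; the function name and the bounds check are my own additions. The check is there because the largest index used is (m_clusters - 1) * m_clusters, so the data needs more rows than that, and the quoted lines show no such check.

```python
def pick_initial_centroids(data, n_clusters):
    """Mirror the quoted loop: centroid i is row i * n_clusters of `data`."""
    last_index = (n_clusters - 1) * n_clusters
    if last_index >= len(data):
        # Added safeguard (not in the quoted code): avoid reading past the data.
        raise ValueError(
            f"need more than {last_index} rows for {n_clusters} clusters, "
            f"got {len(data)}"
        )
    return [list(data[i * n_clusters]) for i in range(n_clusters)]

# Hypothetical data: 7 rows, 2 columns.
data = [[i, i * 10] for i in range(7)]
initial = pick_initial_centroids(data, 3)  # uses rows 0, 3 and 6
```

Because the rows are taken at fixed positions, the starting centroids depend on how the data is ordered; randomly chosen rows are a common alternative.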