Discussing the article: "Gaussian Processes in Machine Learning (Part 1): Classification Model in MQL5"

MetaQuotes 2026.06.16 13:39

Check out the new article: Gaussian Processes in Machine Learning (Part 1): Classification Model in MQL5.

The article considers the classification model of Gaussian processes. We will start by studying its theoretical principles and then move on to the practical development of a GP library in MQL5.

We continue our acquaintance with a machine learning model: Gaussian processes (GP). In the previous article, we examined the regression problem in detail, where the main goal was to predict continuous values. Today we turn to a much more complex topic: classification. Its main difficulty is that inference for classification in Gaussian processes has no analytical solution, which requires approximate methods such as the Laplace approximation.

To effectively solve this complex problem, we will develop a modular library of Gaussian processes in MQL5. This approach will allow us to structure the code by separating the GP model into independent components and will provide a solid foundation for further improvements and extensions. This library will become a universal tool for both regression and classification tasks.

In the first part of the article, we will examine in detail the theory of GP classification, including the mathematics underlying the approximate methods. We will also introduce the main class of the library — GaussianProcess, which will unite all components of the model, as well as the GPOptimizationObjective class responsible for communication with the Alglib optimization library.

Author: Evgeniy Chernish

Stanislav Korotky 2025.07.19 16:20 #1

I haven’t had a proper read through it yet, but it seems I’ve already missed something.

Unlike methods such as ... decision trees, which output only a class label, GPs allow you to obtain a probabilistic prediction.

In my humble opinion, decision trees are excellent at predicting class probabilities.

For classification tasks where the targets are discrete class labels, Gaussian likelihood isn’t suitable.

It seems that ‘tree-based’ classification algorithms convert probabilities into continuous ‘log-odds’ values, and then classification effectively boils down to a regression problem on these continuous log-odds values. Why can’t this be applied to Gaussian likelihood, whatever that may be? Unfortunately, I haven’t come across this term anywhere other than in the Python manual, but I’m familiar with the Gaussian distribution, Gaussian mixtures, maximum likelihood and expectation-maximisation ;-).

Evgeniy Chernish 2025.07.19 18:06 #2

Stanislav Korotky #:

I haven’t read it through in detail yet, but it seems I’ve already missed something.

In my humble opinion, trees are excellent at predicting class probabilities.

Good afternoon!

Indeed, I had a look at scikit-learn; the trees return the class probability. For some reason, I thought that only ensemble methods returned probabilities. Well, you live and learn, as they say.

Now, regarding Gaussian likelihood and why it isn’t suitable for classification tasks.

Gaussian likelihood is the probability density of a normal distribution, parameterized by its mean and variance. In our case, the role of the mean in the likelihood is played by the latent function f, whilst the variance is the noise in the data.

How does the likelihood differ from a standard probability density function? In a standard probability density function, the parameter values are fixed; we substitute particular values of y and obtain the probability density at that y.

With likelihood, it is the other way round. Our y is fixed, whilst the distribution parameters vary. In other words, the likelihood is a function of the parameters. For example, the likelihood might tell us that, for parameters 0.2 and 1, the likelihood of our observed trajectory y is 0.06, whereas for parameters 0.8 and 1.2 it is 0.12. The second set of parameters therefore provides a more plausible description of the empirical data we are dealing with. Hence the name ‘likelihood’.
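To make the "y is fixed, the parameters vary" idea concrete, here is a minimal Python sketch. The observations are made up for illustration, and reading the two parameter pairs from the example as (mean, standard deviation) is my assumption. The absolute likelihood values will not match the 0.06 and 0.12 quoted above; only the comparison between the two parameter sets matters.

```python
import math

def gaussian_pdf(y, mean, std):
    """Density of a normal distribution N(mean, std^2) evaluated at y."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def gaussian_likelihood(ys, mean, std):
    """Likelihood of (mean, std) for fixed observations ys, assuming independence."""
    result = 1.0
    for y in ys:
        result *= gaussian_pdf(y, mean, std)
    return result

# Hypothetical observed data: these stay fixed throughout.
observed = [0.7, 1.1, 0.9]

# The same data scored under two candidate parameter sets (assumed: mean, std).
lik_a = gaussian_likelihood(observed, mean=0.2, std=1.0)
lik_b = gaussian_likelihood(observed, mean=0.8, std=1.2)

# The larger value marks the parameter set that describes the data more plausibly.
more_plausible = "b" if lik_b > lik_a else "a"
```

With these made-up observations the second parameter set scores higher, which is the same kind of comparison as in the example above.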

Now, why can’t we simply take the log-odds and apply a Gaussian likelihood to them? A Gaussian likelihood assumes that the observed data y follow a normal distribution. In other words, the y values are continuous.

In a GP model for classification, the latent function f(x) can be interpreted as the log-odds. But we predict this function; we do not observe it. What we do observe are the discrete labels y, and the likelihood is applied precisely to the observed data. Since our observed data are discrete, in the binary case they follow a Bernoulli distribution rather than a normal one.

For a classification problem, the likelihood must describe the probability of discrete labels; therefore, it is natural to choose a Bernoulli likelihood here, with the latent function mapped to a probability through the logistic function.
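As an illustration of a likelihood for discrete labels, here is a small Python sketch of a Bernoulli likelihood with a logistic (sigmoid) link. It assumes labels coded as 0/1 and treats f as log-odds; the latent values are hypothetical, and this is not the code from the article's MQL5 library.

```python
import math

def sigmoid(f):
    """Map a latent log-odds value f to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-f))

def bernoulli_log_likelihood(labels, latent):
    """Sum of log p(y | f) for labels y in {0, 1}, with p(y = 1 | f) = sigmoid(f)."""
    total = 0.0
    for y, f in zip(labels, latent):
        p = sigmoid(f)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total

# Observed discrete labels (fixed) and two hypothetical latent functions
# evaluated at the same three inputs.
labels = [1, 0, 1]
latent_agrees = [2.0, -1.5, 0.5]     # sign of f matches each label
latent_disagrees = [-2.0, 1.5, -0.5]  # sign of f contradicts each label

ll_agrees = bernoulli_log_likelihood(labels, latent_agrees)
ll_disagrees = bernoulli_log_likelihood(labels, latent_disagrees)
```

The latent values that agree with the labels get the higher log-likelihood. Note that f itself never appears as an observation; only the 0/1 labels do, which is why a Gaussian likelihood on y does not fit here.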

nevar 2025.07.21 21:05 #3

A very good article. I look forward to your future series on Gaussian processes.
