Machine learning in trading: theory, models, practice and algo-trading - page 3705

 
Maxim Dmitrievsky #:

The new probabilistic booster from Stanford University. Well, "new", it's six years old.


Looks like a parametric approach where the distribution is taken from some parametric family:

Our approach is to assume Pθ(y|x) is of a specified parametric form, then estimate the p parameters θ ∈ R^p of the distribution as functions of x.

I have not yet figured out how freely the family of distributions can be chosen.

In general, the parametric approach is good because it allows working with smaller samples. Its drawback is that it is less flexible, but this can be partially compensated by the freedom to choose the parametric family of distributions.
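As a minimal sketch (my own toy example, not from the paper; the class and distribution names are as in the NGBoost Python package), the family is simply passed in as a parameter, so the "freedom" amounts to picking a Dist class:

# Minimal NGBoost sketch: the parametric family is chosen via the Dist argument.
# Synthetic data, for illustration only.
import numpy as np
from ngboost import NGBRegressor
from ngboost.distns import Normal  # LogNormal, Exponential, etc. are also available

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = X[:, 0] + 0.5 * rng.normal(size=500)  # toy target

ngb = NGBRegressor(Dist=Normal, n_estimators=200).fit(X, y)

# Instead of a point forecast we get a full conditional distribution:
dist = ngb.pred_dist(X[:5])
print(dist.params["loc"], dist.params["scale"])  # per-point mean and std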

 
Aleksey Nikolayev #:

Thank you!

Are there any plans to do matrix/vector sorting someday?

It has been there for a long time: https://www.mql5.com/ru/docs/matrix/matrix_manipulations/matrix_sort

 
Renat Fatkhullin #:

It has been there for a long time: https://www.mql5.com/ru/docs/matrix/matrix_manipulations/matrix_sort

For some reason I (build 5100) get this:

void OnStart()
{
   vector v = {0, 1};
   matrix m = {{0, 1}, {2, 3}};
   v.Sort();
   m.Sort(0);
}

// 'Sort' is not a member of 'vector' type test_sort_matrix.mq5 4 6
// 'Sort' is not a member of 'matrix' type test_sort_matrix.mq5 5 6
// 2 errors, 0 warnings

 
Aleksey Nikolayev #:

For some reason I (build 5100) get this:

Forum on trading, automated trading systems and testing trading strategies

New version of MetaTrader 5 build 4620: bug fixes in MQL5 and new OpenBLAS methods

Rashid Umarov, 2024.10.15 16:41

According to information from the developers, the function was deliberately disabled during the move to the new compiler. It is not yet known when it will be brought back.

 
fxsaber #:
Thanks! Somehow didn't notice that (
 
Aleksey Nikolayev #:

It looks like a parametric approach where the distribution is taken from some parametric family:

Our approach is to assume Pθ(y|x) is of a specified parametric form, then estimate the p parameters θ ∈ R^p of the distribution as functions of x.

I have not yet figured out how freely the family of distributions can be chosen.

In general, the parametric approach is good because it allows working with smaller samples. Its drawback is that it is less flexible, but this can be partially compensated by the freedom to choose the parametric family of distributions.

For binary classification it is the Bernoulli distribution. A choice without any choosing :)

https://stanfordmlgroup.github.io/ngboost/1-useage.html

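A minimal classification sketch (mine, loosely following the NGBoost usage page) shows the point: the fitted Bernoulli carries exactly the class probability that predict_proba already gives:

# For binary classification the distribution is Bernoulli, so the
# "probabilistic" output is just the usual class probability.
import numpy as np
from ngboost import NGBClassifier
from ngboost.distns import Bernoulli

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 4))
y = (X[:, 0] + rng.normal(size=400) > 0).astype(int)

ngb = NGBClassifier(Dist=Bernoulli).fit(X, y)
dist = ngb.pred_dist(X[:3])              # Bernoulli distribution objects
print(ngb.predict_proba(X[:3])[:, 1])    # P(y=1|x), the familiar quantity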
 
Maxim Dmitrievsky #:

For binary classification it is the Bernoulli distribution. A choice without any choosing :)

https://stanfordmlgroup.github.io/ngboost/1-useage.html

))

Yes, for classification the probabilistic approach certainly gives nothing new. It matters only for regression, where the output is not a single numerical value but its distribution.

Imho, this approach fits well with the method where we look for deviations of the output's distribution from what it should be under a random walk (SB).

Well, in theory at least) In practice, of course, we are all waiting for the factory).
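A sketch of that idea (my reading of it, with assumed lag features and synthetic returns; not the author's code): fit a conditional Normal on returns and flag points where the predicted distribution deviates from the zero-mean random-walk benchmark:

# Under a pure random walk the predicted mean should be ~0 everywhere,
# so a large |mu|/sigma marks a "deviation" worth a closer look.
import numpy as np
from ngboost import NGBRegressor
from ngboost.distns import Normal

rng = np.random.default_rng(2)
returns = rng.normal(0, 0.01, size=2000)   # stand-in for real returns
X = np.column_stack([np.roll(returns, k) for k in range(1, 6)])[5:]  # lag features
y = returns[5:]

dist = NGBRegressor(Dist=Normal, n_estimators=300).fit(X, y).pred_dist(X)
mu, sigma = dist.params["loc"], dist.params["scale"]

z = np.abs(mu) / sigma
print("points with |mu|/sigma > 2:", (z > 2).sum())  # ~none here, by construction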

 
Aleksey Nikolayev #:

))

Yes, for classification the probabilistic approach certainly gives nothing new. It matters only for regression, where the output is not a single numerical value but its distribution.

Imho, this approach fits well with the method where we look for deviations of the output's distribution from what it should be under a random walk (SB).

Well, in theory at least) In practice, of course, we are all waiting for the factory).

Yeah, it's kind of scary to go looking for something like that via ML (classification or regression). Too many variables, and the models are too complex, giving different results every time.

I like clustering or HMM better.
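For completeness, a minimal HMM regime sketch with hmmlearn (my illustration of the preference above; the parameters and data are arbitrary):

# Two-state Gaussian HMM on synthetic returns: a calm and a volatile regime.
import numpy as np
from hmmlearn.hmm import GaussianHMM

rng = np.random.default_rng(3)
returns = np.concatenate([rng.normal(0, 0.005, 500),
                          rng.normal(0, 0.02, 500)]).reshape(-1, 1)

hmm = GaussianHMM(n_components=2, covariance_type="diag", n_iter=100,
                  random_state=0).fit(returns)
states = hmm.predict(returns)   # most likely regime for each bar
print(np.bincount(states))      # bars per regime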

 
Aleksey Nikolayev #:

For some reason I (build 5100) get this:

It turns out it was postponed but remained in the documentation.

Well, now we will definitely do it.

 
Alexey Burnakov:

Good afternoon, everyone,

I know there are machine learning and statistics enthusiasts on the forum. I propose to discuss this field in this thread (without flame wars), to share and build up our common knowledge base in this interesting area.

For beginners (and not only) there is a good theoretical resource in Russian: https://www.machinelearning.ru/.

A small review of literature on methods for selecting informative features: https://habrahabr.ru/post/264915/.

I propose problem number one; I will post its solution later. SanSanych has already seen it, so please do not give away the answer.

Introduction: to build a trading algorithm, one needs to know which factors will serve as the basis for predicting the price, the trend, or the direction in which to open a trade. Selecting such factors is not an easy task, and it can be arbitrarily hard.

Attached is an archive with an artificial CSV dataset that I made.

The data contains 20 variables with the prefix input_, and one rightmost variable output.

The output variable depends on some subset of the input variables (the subset can contain from 1 to 20 inputs).

Task: using any (machine learning) methods, select the input variables that can be used to determine the state of the output variable on the given data.

The solution can be posted here in the form: input_2, input_19, input_5 (an example). You can also describe the dependence you found between the inputs and the output variable.

Whoever manages it, well done ) From me, the ready solution and an explanation.

Alexey
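One possible first attack on the task (not the author's intended solution; the file name is hypothetical, and I assume output is categorical, otherwise swap in the regression counterparts):

# Rank input_1..input_20 by two quick relevance measures; inputs that score
# high on both are candidate answers.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import mutual_info_classif

df = pd.read_csv("dataset.csv")        # hypothetical file name from the archive
X = df.filter(like="input_")
y = df["output"]

mi = pd.Series(mutual_info_classif(X, y, random_state=0), index=X.columns)
rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)
imp = pd.Series(rf.feature_importances_, index=X.columns)

print(mi.sort_values(ascending=False).head(5))
print(imp.sort_values(ascending=False).head(5))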

Wow! Thank you for this, man!