Machine learning in trading: theory, models, practice and algo-trading - page 1709

 
Aleksey Nikolayev:

We can't do without experiments. The main idea, as far as I understand it, is simply to significantly narrow down the list of substances allowed into experiments. Here is a link to a more substantive write-up of this research in Russian, with an emphasis on the biology and without the ML details.

A good, scientific article, without the "bubbling" enthusiasm about the omnipotence of AI and the imminence of a panacea. It shows how "cunning" nature is and how naive man is to think he has already found the key to it and now...

The use of ML to find a suitable candidate in the huge "libraries" of compounds and of data on their effect on various strains has been successful. But this is an almost isolated result, and it does not guarantee similar victories in the future. Why? Because ML uses a statistical, probabilistic approach. Other applications of this kind of search may not be successful at all.

I would focus on learning the general principles of microbial replication and on creating a tool for selectively blocking it in particular strains. This is what distinguishes the intelligent approach from the statistical-probabilistic one (that is, a universal solution versus a particular one).

 
Peter Konow:

A good, scientific article, without the "bubbling" enthusiasm about the omnipotence of AI and the imminence of a panacea. It shows how "cunning" nature is and how naive man is to think he has already found the key to it and now...

The use of ML to find a suitable candidate in the huge "libraries" of compounds and of data on their effect on various strains has been successful. But this is an almost isolated result, and it does not guarantee similar victories in the future. Why? Because ML uses a statistical, probabilistic approach. Other applications of this kind of search may not be successful at all.

I would focus on learning the general principles of microbial replication and on creating a tool for selectively blocking it in particular strains. This is what distinguishes the intelligent approach from the statistical-probabilistic one (that is, a universal solution versus a particular one).

At the level of single DNA molecules, quantum effects are inevitable; they are inherently probabilistic in nature and cannot, in principle, be treated without probability theory and mathematical statistics. And at all higher levels, up to clinical drug trials, there is no way to do without these sciences either. So methods like the ones used in this study are in no way alien to biology; they have even given rise to the term in silico (by analogy with in vivo and in vitro).

 
Aleksey Nikolayev:

At the level of single DNA molecules, quantum effects are inevitable; they are inherently probabilistic in nature and cannot, in principle, be treated without probability theory and mathematical statistics. And at all higher levels, up to clinical drug trials, there is no way to do without these sciences either. So methods like the ones used in this study are in no way alien to biology; they have even given rise to the term in silico (by analogy with in vivo and in vitro).

Yes, I also noticed an article on Zen about quantum "fluctuations" in DNA generating its mutations. Certainly, ML is a good tool in many areas of research. But personally I have realized that ML is not AI and should not be confused with it. AI looks for an absolute solution, while ML looks for a particular one. They work by completely different methods, and ML will not "grow" into AI.

 
Good evening, can you advise a newbie -...


If I buy an EA (5 copies), will all subsequent updates be available? Will they be free for all 5 copies?

 
Aleksey Nikolayev:

What do you think of Hegel's Absolute Idea?)

Not familiar with it, or don't remember :) I'm more into Christianity now, solving puzzles
 
3565832:
Good evening, can you advise a newbie-...


If you buy an EA (5 copies), will all subsequent updates be available? Will they be free for all 5 copies?

Yes
 
Maxim Dmitrievsky:
Not familiar with it, or don't remember :) I'm more into Christianity now, solving puzzles

Are you and Alexander Toddler creating a doctrine of the Grail?)

 
elibrarius:

Aleksey, you analyze leaves, so you can probably answer... Or anyone else who does.

Here is the description of the splits of a depth-2 tree from CatBoost


What does "value" mean? Is it the leaf's answer? And what do the negative numbers mean?

If so, what is "value" for multiclass classification? Below are the splits of one of the trees trained on 3 classes.
In each leaf we see an array of 3 "value" entries. Which one is the answer? The largest? Then why store the two redundant values? And what do the negative values mean?

Interestingly, the sum of the three values is 0.

Yes, in binary classification it is the value of the probability of belonging to the "main" class.

I haven't done multiclass classification in CatBoost, but I think there it is the probability of belonging to a particular class.

The number has to be transformed to get the actual probability value; a logistic function is applied there.

The activated leaves of the model are summed up, which is, among other things, why the values can have different signs; it is a balancing process. The model can also be thinned out after it is built, discarding junk leaves and trees.
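
To make the mapping from summed leaf values to probabilities concrete, here is a minimal sketch. The logistic transform for the binary case is what is verified numerically later in the thread; the softmax for the multiclass case is an assumption, not something confirmed here, but it would also explain why all three per-leaf values are stored and why they can sum to zero.

    import math

    def sigmoid(raw_value):
        # Binary case: the summed leaf values (RawFormulaVal) are mapped to
        # the probability of class 1 with the logistic function.
        return 1.0 / (1.0 + math.exp(-raw_value))

    def softmax(raw_values):
        # Multiclass case (assumption, not confirmed in the thread): the
        # per-class raw sums are normalized into probabilities, so none of
        # the stored values is redundant.
        m = max(raw_values)
        exps = [math.exp(v - m) for v in raw_values]
        total = sum(exps)
        return [e / total for e in exps]

    print(sigmoid(-0.5202020202020202))   # ~0.3728, matches the check below
    print(softmax([0.9, -0.4, -0.5]))     # one probability per class, summing to 1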

 
Aleksey Vyazmikin:

Yes, in binary classification it is the value of the probability of belonging to the "main" class.

I haven't done multiclass classification in CatBoost, but I think there it is the probability of belonging to a particular class.

The number has to be transformed to get the actual probability value; a logistic function is applied there.

The activated leaves of the model are summed up, which is, among other things, why the values can have different signs; it is a balancing process. The model can also be thinned out after it is built, discarding junk leaves and trees.

Thank you. That's roughly what I thought.
It's just not quite clear how this value is calculated.
For example, I trained a single tree of depth 1:

    "left": {
      "value": -0.5202020202020202,
      "weight": 384
    },
    "right": {
      "value": -0.0019267822736030828,
      "weight": 507
    },
    "split": {
      "border": 12.587499618530273,
      "float_feature_index": 0,
      "split_index": 0,
      "split_type": "FloatFeature"
    }

When I ask the tree for a prediction I get:

cmodel.predict(X, prediction_type='RawFormulaVal') = -0.520202020202 - this is the value from the leaf description

cmodel.predict_proba(X) = 0.372805 - this is the probability of class 1.
I checked it with the formula:
x1 = -0.52020202
prob = math.exp(x1) / (1 + math.exp(x1)) = 0.3728049958676699

Calculated correctly.

There are 891 rows in total in the dataset.

I counted the number of occurrences of class 1 where
feature 0 < 12.587499618 (the split border)

In total I found 384 examples, which matches the leaf description (weight = 384), of which 89 are examples of class 1.

The probability of class 1 should therefore be
89 / 384 = 0.2317708

But the model gives a probability of 0.372805.

It turns out that some other algorithm is used there to get the probability.
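
A hedged sketch of how this check could be reproduced end to end; X is assumed to be a pandas DataFrame with the 891 rows and y the 0/1 target (neither the data nor the exact training call is given in the thread). The mismatch itself is plausibly because a boosted leaf value is a regularized, learning-rate-scaled optimization step added on top of the model's starting bias, not an in-leaf frequency estimate.

    import math
    from catboost import CatBoostClassifier

    # Assumed placeholders: X is a pandas DataFrame (891 rows), y a 0/1 Series.
    model = CatBoostClassifier(iterations=1, depth=1, verbose=False)
    model.fit(X, y)

    raw = model.predict(X, prediction_type='RawFormulaVal')  # summed leaf values
    proba = model.predict_proba(X)[:, 1]                     # probability of class 1

    # predict_proba should match the logistic transform of the raw value:
    check = [1.0 / (1.0 + math.exp(-r)) for r in raw]

    # The plain in-leaf frequency, by contrast, need not match that probability:
    mask = X.iloc[:, 0] < 12.587499618530273                 # split border from the JSON dump
    print(y[mask].sum(), '/', mask.sum())                    # e.g. 89 / 384, i.e. ~0.2318

    model.save_model('model.json', format='json')            # inspect the leaf values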

 
elibrarius:

It turns out that some other algorithm is used there to get the probability.

Yes, the results are strange. Could they be taking the probability from the test sample involved in training? But even then something doesn't seem to add up.

And how many ones (rows where the target is 1) are there in the sample in total?