Machine learning in trading: theory, models, practice and algo-trading - page 2396

 
Maxim Dmitrievsky:
Yes, you can take it from my articles. The tester is the easiest part. I can send you some examples of bots in Python.

Well, I have something of my own too, tailored for quick testing of my ideas. But somehow I've gotten into the habit of ultimately running things in the standard tester. Do you think it's worth getting rid of this habit? )

 
Aleksey Nikolayev:

Well, I have something of my own too, tailored for quick testing of my ideas. But somehow I've gotten into the habit of ultimately running things in the standard tester. Do you think it's worth getting rid of this habit? )

I like both, if it's possible to port the model quickly.
 

MacBooks with the new processors are blowing everything else away in deep learning tasks. The bad news is that CatBoost doesn't support the ARM architecture yet, but they seem to be working on it.

Conclusion

From these tests, it appears that

  • for training MLP and LSTM models, the M1 CPU is far faster than all the high-end servers tested
  • for training CNN, M1 is only slightly slower than the high-end servers tested

Of course, these metrics can only be considered for similar neural network types and depths as used in this test.

For big training runs and intensive computing lasting more than 20 minutes, I will still go for cloud-based solutions, as they provide cards built for such long, heavy loads and allow several jobs to be submitted simultaneously. But that scenario covers only specific research representing only 10% of my work, mostly professional usage in a particular business area.

As a machine learning engineer, for my daily personal research, the M1 Mac is clearly the best and the most cost-efficient option today.

https://towardsdatascience.com/benchmark-m1-part-2-vs-20-cores-xeon-vs-amd-epyc-16-and-32-cores-8e394d56003d

Benchmark M1 (part 2) vs 20 cores Xeon vs AMD EPYC, 16 and 32 cores
  • Fabrice Daniel
  • towardsdatascience.com
In the first part of the M1 Benchmark article I compared a MacBook Air M1 with an iMac 27" Core i5, an 8-core Xeon(R) Platinum, a K80 GPU instance and a T4 GPU instance on three TensorFlow models. While the GPU was not as efficient as expected, maybe because of the very early version of TensorFlow not yet entirely optimized for M1, it was...
 
So how are the neural-net folks doing? Have your nets become intelligent yet, or have you made a breakthrough in the technology? Tell us about it!
 

Is anyone familiar with this kind of "learning with a teacher"?

Where the teacher is not the labels but the person himself: the person clicks on the image he likes, and the ML algorithm tries to separate that image from everything else and find similar images with the same outcome in the data...


Does anyone know whether this kind of training exists and, if so, what it is called?

I know how to implement it myself, but maybe a ready-made solution already exists?
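For what it's worth, a minimal Python sketch of that kind of person-in-the-loop selection could look like the following. It assumes every image is already described by a precomputed feature vector; the cosine-similarity ranking and all names here are illustrative choices, not any particular library's API.

```python
# A minimal sketch of the "person as teacher" idea: the user clicks one example
# they like, and everything else is ranked by similarity to it.
# Features are assumed to be precomputed vectors; names are hypothetical.
import numpy as np

def rank_by_similarity(features: np.ndarray, clicked_idx: int, top_k: int = 10):
    """Return indices of the top_k samples most similar to the clicked one."""
    target = features[clicked_idx]
    # cosine similarity between the clicked sample and every sample
    norms = np.linalg.norm(features, axis=1) * np.linalg.norm(target) + 1e-12
    sims = features @ target / norms
    sims[clicked_idx] = -np.inf          # exclude the clicked sample itself
    return np.argsort(sims)[::-1][:top_k]

# usage: 1000 samples described by 16 precomputed features, user clicked sample 42
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 16))
print(rank_by_similarity(X, clicked_idx=42, top_k=5))
```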

 
I also wanted to think out loud, on the topic of non-stationarity. It is clear how to use the K-nearest-neighbors method under those conditions: we take the last N patterns in time, choose from them the K closest to the pattern currently forming, and make a decision based on those. The simplicity comes from there being essentially no training. I wonder whether there are other ML algorithms that are equally easy to use in a similar way.
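A minimal Python sketch of that scheme, assuming the patterns are feature vectors ordered in time with known outcomes; the window size N, the number of neighbors K and the use of scikit-learn's KNeighborsClassifier are arbitrary illustrative choices.

```python
# Keep only the last N patterns and vote among the K nearest to the one forming now.
# "Fitting" a KNN model is just storing the points, so there is no real training step.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def knn_on_recent(patterns, labels, current, N=500, K=15):
    """Vote among the K nearest of the last N patterns."""
    recent_X, recent_y = patterns[-N:], labels[-N:]
    model = KNeighborsClassifier(n_neighbors=K)
    model.fit(recent_X, recent_y)
    return model.predict(current.reshape(1, -1))[0]

# usage on synthetic data: 2000 patterns with 8 features each, binary outcomes
rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 8))
y = (X[:, 0] > 0).astype(int)            # toy labels standing in for trade outcomes
print(knn_on_recent(X, y, current=rng.normal(size=8)))
```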
 
Aleksey Nikolayev:
I also wanted to think out loud, on the topic of non-stationarity. It is clear how to use the K-nearest-neighbors method under those conditions: we take the last N patterns in time, choose from them the K closest to the pattern currently forming, and make a decision based on those. The simplicity comes from there being essentially no training. I wonder whether there are other ML algorithms that are equally easy to use in a similar way.

I have done a lot of research with this method, and with the method itself; I don't know why, but it is the closest and most intuitive one for me.

This method belongs to the "model-free prediction" family.

On the net it's known as "prediction by analogs from prehistory", the GMDH (MGUA) "analog complexing" method, etc...

At one time it was used for weather forecasting...

Essentially it's ordinary clustering, only more accurate... The only difference is that in ordinary clustering the cluster center (the prototype) is something averaged over the analogs, while in this method the cluster center is the current price (or whatever you are tracking), so analogs for the current moment can be found more precisely...

I even looked for multidimensional patterns, and even invented my own little method for searching patterns in past history, so I'm quite deep into this topic...
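A rough Python sketch of that kind of analog search, as described above: the last few bars play the role of the "cluster center", the K most similar historical windows are located by distance, and the move that followed each of them is averaged. The window length, K, forecast horizon and z-score normalization are assumptions for illustration, not part of anyone's actual method.

```python
# "Prediction by analogs from prehistory": find past windows closest to the
# current one and average what happened after them.
import numpy as np

def analog_forecast(prices: np.ndarray, win: int = 20, K: int = 10, horizon: int = 5):
    """Average the move that followed the K historical windows closest to the current one."""
    current = prices[-win:]
    # all complete historical windows that still have `horizon` bars of "future" after them
    ends = np.arange(win, len(prices) - horizon)
    windows = np.stack([prices[e - win:e] for e in ends])
    # normalize each window so we compare shapes rather than absolute price levels
    w_norm = (windows - windows.mean(axis=1, keepdims=True)) / (windows.std(axis=1, keepdims=True) + 1e-12)
    c_norm = (current - current.mean()) / (current.std() + 1e-12)
    dists = np.linalg.norm(w_norm - c_norm, axis=1)
    best_ends = ends[np.argsort(dists)[:K]]
    future_moves = [prices[e + horizon] - prices[e] for e in best_ends]
    return float(np.mean(future_moves))

# usage on a synthetic random-walk "price" series
rng = np.random.default_rng(2)
series = np.cumsum(rng.normal(size=3000))
print(analog_forecast(series))
```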

 
mytarmailS:

I have done a lot of research with this method, and with the method itself; I don't know why, but it is the closest and most intuitive one for me.

This method belongs to the "model-free prediction" family.

On the net it's known as "prediction by analogs from prehistory", the GMDH (MGUA) "analog complexing" method, etc...

At one time it was used for weather forecasting...

Essentially it's ordinary clustering, only more accurate... The only difference is that in ordinary clustering the cluster center (the prototype) is something averaged over the analogs, while in this method the cluster center is the current price (or whatever you are tracking), so analogs for the current moment can be found more precisely...

I even looked for multidimensional patterns, and even invented my own little method for searching patterns in past history, so I'm quite deep into this topic...

The method is intuitively obvious, so there's no getting around it. But I would like some variety. For example, some simple model retraining where one new example is added and the obsolete ones are thrown out.
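A minimal Python sketch of that kind of rolling retraining, under the assumption of a fixed-size window of recent examples: each new observation pushes out the oldest one and the model is refit on what remains. The logistic-regression model, the window size and the full refit on every update are illustrative choices only.

```python
# Sliding-window retraining: a fixed-size buffer of recent examples,
# refit after every new observation.
from collections import deque

import numpy as np
from sklearn.linear_model import LogisticRegression

class RollingModel:
    """Keep only the last `window` examples; each new one pushes out the oldest."""

    def __init__(self, window: int = 500):
        self.X = deque(maxlen=window)    # old examples fall out automatically
        self.y = deque(maxlen=window)
        self.model = LogisticRegression()

    def update(self, x_new, y_new):
        self.X.append(x_new)
        self.y.append(y_new)
        if len(set(self.y)) > 1:         # need at least two classes before fitting
            self.model.fit(np.array(self.X), np.array(self.y))

    def predict(self, x):
        return self.model.predict(np.array(x).reshape(1, -1))[0]

# usage on a synthetic stream of examples
rng = np.random.default_rng(3)
rm = RollingModel(window=200)
for _ in range(300):
    x = rng.normal(size=4)
    rm.update(x, int(x[0] > 0))
print(rm.predict(rng.normal(size=4)))
```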

 

Aleksey Nikolayev:

Some simple model retraining where one new example is added and the obsolete ones are thrown out.

Or throwing out obsolete examples when time turns out to be too significant a feature compared to the others.
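One way to read that idea (an interpretation, not a quote): add the time index as an extra feature, fit a model, and if time turns out to be far more important than the ordinary features, discard the older part of the sample. A hedged Python sketch under those assumptions; the random forest, the importance ratio and dropping exactly the older half are arbitrary choices.

```python
# Use the importance of a "time" feature as a signal that old examples are obsolete.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def trim_if_time_matters(X, y, t, ratio: float = 2.0):
    """Drop the older half of the sample when the time feature dominates the others."""
    Xt = np.column_stack([X, t])                       # time index added as the last feature
    forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(Xt, y)
    time_imp = forest.feature_importances_[-1]
    other_imp = forest.feature_importances_[:-1].mean()
    if time_imp > ratio * other_imp:                   # time dominates -> older examples look obsolete
        half = len(y) // 2
        return X[half:], y[half:], t[half:]
    return X, y, t

# usage: the feature/label relationship flips halfway through, so time becomes important
rng = np.random.default_rng(4)
X = rng.normal(size=(1000, 5))
t = np.arange(1000)
y = ((X[:, 0] > 0) ^ (t > 500)).astype(int)
X2, y2, t2 = trim_if_time_matters(X, y, t)
print(len(y2))
```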

 
Aleksey Nikolayev:

Or throwing out obsolete examples when time turns out to be too significant a feature compared to the others.

I don't understand the difference between your idea and constantly retraining an ML algorithm in a sliding window...


You take the last n patterns back from the current one, sorted by time, and make a prediction based on them; what is that supposed to accomplish?

You're just retraining in a sliding window, like with the ML algorithm above; what's the advantage?
