Machine learning in trading: theory, models, practice and algo-trading - page 3409

 
Maxim Dmitrievsky #:
So far the interesting part is about feature neutralisation... further on there's a tutorial on ensembles

I read about feature neutralisation at your link and at the link from that page.
It looks like feature importance - like setting an importance coefficient for each feature so that it influences the model more or less. I've seen it in sklearn, I think it is also in other packages.
It is interesting to analyse features to determine their importance. Although mostly linear correlation with the target is used for the estimate. Simple enough, but better than picking them out of thin air (I didn't know how to estimate them before, except with ready-made packages).
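From what I can tell from their example scripts, the neutralisation step itself is a projection rather than a weighting: subtract from the predictions the part that a linear combination of the features can explain, so that no single feature dominates the output. A rough sketch of that idea (my own reading, assuming numpy; not their exact code):

```python
import numpy as np

def neutralize(predictions: np.ndarray, features: np.ndarray,
               proportion: float = 1.0) -> np.ndarray:
    """Subtract the linearly explainable part of the predictions.

    predictions: shape (n,), model outputs for one era
    features:    shape (n, k), feature matrix for the same rows
    proportion:  how much of the linear component to remove (0..1)
    """
    # Add a bias column so the projection can absorb the mean too.
    exposures = np.hstack([features, np.ones((features.shape[0], 1))])
    # Least-squares fit of the predictions on the features,
    # then subtract the fitted (linear) component.
    coefs, *_ = np.linalg.lstsq(exposures, predictions, rcond=None)
    neutralized = predictions - proportion * (exposures @ coefs)
    # Rescale so the result keeps unit standard deviation.
    return neutralized / neutralized.std()
```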

Do you know what the epochs on the graphs there are? In neural networks an epoch is another training pass, but it's strange to plot them on graphs like that, so I think it's something else. It feels like they are epochs in time. Maybe steps of a walking forward? Or just days/weeks/months.

Again the same term is applied to different things.... epochs this time.
 
Maxim Dmitrievsky #:

https://numerai.fund/

something is traded there, rather shakily... but I'm more interested in what you can earn on the models.

Yep, they take it: "To stake NMR on your model you must first deposit NMR into your Numerai wallet at your account's unique deposit address." So you bet on your own model with their crypto, and based on the results they either give you dough or take yours away :)

more likely to be wobbly

 
Forester #:

I read about feature neutralisation at your link and at the link from that page.
It looks like feature importance - like you set an importance coefficient and each feature influences the model more or less. I've seen it in sklearn, I think it is also in other packages.
It is interesting to analyse features to determine their importance. Although mostly linear correlation with the target is used for the estimate. Simple enough, but better than picking them out of thin air (I didn't know how to estimate them before, except with ready-made packages).

Do you know what the epochs on the graphs there are? In neural networks an epoch is another training pass, but it's strange to plot them on graphs like that, so I think it's something else. It feels like they are epochs in time. Maybe steps of a walking forward? Or just days/weeks/months.

Again the same term is applied to different things.... eras this time.
They are eras there, not epochs. A new era every week (trades are opened once a week), but on many instruments at once. So the dataset consists of a bunch of something - maybe different cryptos or stocks.
 
Maxim Dmitrievsky #:
A video from them:

The point is that a neural network is better at removing noise, while profitable entries often sit in what are effectively outliers of the predictors. So tree-based ("wooden") models may be better at picking those out.

Maxim Dmitrievsky #:
Above, in the video, numerai say that there are no perfect methods.

It is hard to disagree.


Maxim Dmitrievsky #:
They call their approach feature neutralisation: they look at the correlation between features and labels, and at the std. In short, according to Sanych's method.

Do they have a target for regression? I haven't got into the code yet.

Maxim Dmitrievsky #:
Here you go - and at the same time you can brush up on Python :) and your datasets are about as huge as theirs.

If there is a binary target, I could test my method of predictor selection on their data. But as I understand it, they want the code together with the training stage, and merely specifying which predictors should be discarded is of no interest to them?

By the way, Python handles large datasets poorly - it couldn't even read the csv file with my sample out of the box until I forcibly told it to use the int8 data type; otherwise it lays the data out too wide in memory. MQL5 handles memory much better in this respect.
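A minimal sketch of the dtype trick I mean, assuming pandas and made-up file/column names - without an explicit dtype every numeric column is loaded as int64/float64, i.e. 8 bytes per value instead of 1:

```python
import pandas as pd

# Hypothetical column names, just to illustrate the idea.
feature_cols = [f"feature_{i}" for i in range(1000)]

df = pd.read_csv(
    "train.csv",                                   # example file name
    dtype={col: "int8" for col in feature_cols},   # 1 byte per value
    usecols=feature_cols + ["target"],             # skip columns you don't need
)
print(df.memory_usage(deep=True).sum() / 2**20, "MB")
```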

And how do I download the data through their API?
 
Aleksey Vyazmikin #:

The point is that a neural network is better at removing noise, while profitable entries often sit in what are effectively outliers of the predictors. So tree-based ("wooden") models may be better at picking those out.

It's hard to disagree.


Do they have targets for regression? I haven't looked at the code yet.

If there is a binary target, I could test my method of predictor selection on their data. But as I understand it, they want the code together with the training stage, and merely specifying which predictors should be discarded is of no interest to them?

By the way, Python handles large datasets poorly - it couldn't even read the csv file with my sample out of the box until I forcibly told it to use the int8 data type; otherwise it lays the data out too wide in memory. MQL5 handles memory much better in this respect.

And how do I download the data through their API?

Binary target, but split into several gradations, so effectively regression

They don't have anything optimised there; it takes a long time to compute. It's logical to open big data in NumPy, not in pandas.

The final model is uploaded to the site and that's it, everything is written on the site. I don't see the point, because you trade there with your own money. Well, you get a bonus for a good model and that's all. Yields are small.

 
Maxim Dmitrievsky #:

Binary target, but split into several gradations, so effectively regression

Can you post their sample for training and testing? So that I don't have to bother with their API and registration?

Maxim Dmitrievsky #:
It is logical to open big data in NumPy, not in pandas.

Yes, apparently that will be faster, but it is not clear what data type it will use, and if the csv has columns with date/time or text, there will probably be an error.

I now open csv and save in feather format - this thing opens data very quickly and supports different types. From now on I read only this format.
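Roughly like this (a sketch with example file names; pandas needs pyarrow installed for feather):

```python
import pandas as pd

# One-time conversion: read the big csv once, then keep it as feather.
df = pd.read_csv("train.csv")     # slow, done only once
df.to_feather("train.feather")    # column types are preserved

# From then on loading is fast and the types (int8, datetime, strings) survive.
df = pd.read_feather("train.feather")
```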

Maxim Dmitrievsky #:
I don't see the point, because you trade there with your own money. You get a bonus for a good model and that's all. Yields are small.

Well, as a benchmark, to understand how far behind the leaders you are...

By the way, have you seen the statistics on the turnover of leaders at different eras?

 

It's amazing how my train of thought coincides with this video


Part two.


 
Aleksey Vyazmikin #:

It's amazing how my train of thought coincides with this video


Part two.


Do you honestly think anyone would watch two hour-long videos without at least a brief digest of "what it's about"?

 
Aleksey Vyazmikin #:

Can you post their sample for training and testing? So that I don't have to bother with their API and registration?

Yes, apparently that will be faster, but it is not clear what data type it will use, and if the csv has columns with date/time or text, there will probably be an error.

I now open csv and save in feather format - this thing opens data very quickly and supports different types. From now on I read only this format.

Well, as a benchmark for understanding how far behind the leaders you are....

By the way, have you seen the statistics on the turnover of leaders on different eras?

It's all in the notebook, no SMS and no registration.
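If I am not mistaken, their numerapi package pulls the data without any clicking around the site; a rough sketch (the dataset file names change between versions, so take the exact name from list_datasets()):

```python
from numerapi import NumerAPI

napi = NumerAPI()  # no API key is needed just to download the data

# Shows which dataset files are currently published.
print(napi.list_datasets())

# "v4/train.parquet" is only a placeholder - use a name from the list above.
napi.download_dataset("v4/train.parquet", "numerai_train.parquet")
```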

 
Maxim Kuznetsov #:

Do you honestly think anyone would watch two hour-long videos without at least a brief digest of "what it's about"?

Whoever is curious will.

The video is about an attempt to extract additional information from the leaves after a gradient boosting model has been built, in order to improve the model in various ways. But in the middle of the second part my path and the author's diverged.
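For those who won't watch: the general direction (a sketch of the generic technique, not the author's exact method) is to map every sample to the leaf it falls into in the trained boosting model and then work with those leaf memberships separately - for example, as extra binary features:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.preprocessing import OneHotEncoder

# Toy data instead of a trading dataset.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

gbm = GradientBoostingClassifier(n_estimators=50, max_depth=3, random_state=0)
gbm.fit(X, y)

# apply() returns, for every sample, the index of the leaf it lands in
# for each tree: shape (n_samples, n_estimators, 1) for binary classification.
leaves = gbm.apply(X)[:, :, 0]

# Each (tree, leaf) pair can now be analysed on its own, e.g. one-hot encode
# the leaf memberships and use them as additional binary predictors.
leaf_features = OneHotEncoder(handle_unknown="ignore").fit_transform(leaves)
print(leaf_features.shape)
```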
