Machine learning in trading: theory, models, practice and algo-trading - page 156

 
And by the way, those cosmic long-term percentages are earned on 5-minute timeframes, with a correspondingly larger number of trades, so... it all adds up...
 
Alexey Burnakov:


2) Yeah, well... how is this Sharpe of 2-3 computed? How do the funds calculate it, or rather HOW do they determine that it is a realistic Sharpe estimate for live trading?


The thing is, on models Sharpe is a ratio of return to risk, and there are many variations: what to count as return (say, a simple percentage growth, or a regression fit of the equity curve), and what to count as risk (standard deviation, maximum drawdown, etc.). The differences are not fundamental, but if 2-3 were real, everyone would be a billionaire. In live trading, for several reasons, it comes out many times lower, even when a team of PhDs is doing it. A lot of it also comes down to capacity: if many models could trade $100K, or even up to $10M, the situation would be much more pleasant, but that wouldn't even pay back the investment and the salaries with employee bonuses.
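(A minimal sketch of the convention just described: annualized mean return over its standard deviation. The 252-day annualization and the synthetic return series are illustrative assumptions, not from the post.)

```python
import numpy as np

def sharpe_ratio(returns, periods_per_year=252, risk_free=0.0):
    # Annualized mean excess return divided by its standard deviation.
    excess = np.asarray(returns) - risk_free / periods_per_year
    return np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)

# Example: three years of synthetic daily returns, ~0.05% mean, 1% volatility.
rng = np.random.default_rng(0)
daily = rng.normal(0.0005, 0.01, size=252 * 3)
print(f"Sharpe: {sharpe_ratio(daily):.2f}")
```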

 
J.B:

The thing is, on models Sharpe is a ratio of return to risk, and there are many variations: what to count as return (say, a simple percentage growth, or a regression fit of the equity curve), and what to count as risk (standard deviation, maximum drawdown, etc.). The differences are not fundamental, but if 2-3 were real, everyone would be a billionaire. In live trading, for several reasons, it comes out many times lower, even when a team of PhDs is doing it. A lot of it also comes down to capacity: if many models could trade $100K, or even up to $10M, the situation would be much more pleasant, but that wouldn't even pay back the investment and the salaries with employee bonuses.

A PhD is not an indicator. They overfit just as well as MScs and BScs do. Hence the sharp drop in metrics on live trading.
 
Alexey Burnakov:

Well, OK.

Suppose I'm not classifying the increment as up/down, but building a regression model. Then R^2 or some other determination metric (e.g., a robust metric based on absolute deviations) will do.

Regarding mutual information: is that claim unfounded, or is there strong evidence that the metric works unreliably? I doubt it.

Update: I've done a lot of research on synthetic and real data using mutual information. If the dependence is stationary, the metric works well everywhere. If the dependence is on the verge of noise, the metric may show zero dependence. But in general I see no reason why it should perform worse in multivariate nonlinear systems than, say, F1. You can read about it here: https://habrahabr.ru/company/aligntechnology/blog/303750/
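(A minimal sketch of the kind of check described above, assuming scikit-learn's kNN-based MI estimator: a purely quadratic, i.e. nonlinear, dependence gives near-zero correlation but clearly positive mutual information.)

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, size=(5000, 1))
y = x[:, 0] ** 2 + rng.normal(0, 0.1, size=5000)  # nonlinear, non-monotonic

# Pearson correlation misses the dependence; mutual information does not.
print("corr:", np.corrcoef(x[:, 0], y)[0, 1])                    # ~0
print("MI:  ", mutual_info_regression(x, y, random_state=1)[0])  # clearly > 0
```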

And when I was classifying the sign of the price increment, I got roughly the following picture (for 5 currency pairs together, i.e. one model for all of them):


That is, the median accuracy over 50 held-out samples is in the neighborhood of 57% at best. For some currency pairs the median accuracy goes above 60%. And this is on time-series data alone.
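(The post doesn't say which model or split scheme produced those numbers; as a hedged illustration of the evaluation protocol itself, median accuracy over 50 held-out samples, something like this, with a stand-in classifier and synthetic data:)

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def median_holdout_accuracy(X, y, n_splits=50, test_size=0.3):
    # Median accuracy over repeated random held-out splits.
    scores = []
    for seed in range(n_splits):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, random_state=seed)
        clf = RandomForestClassifier(n_estimators=100, random_state=seed)
        scores.append(accuracy_score(y_te, clf.fit(X_tr, y_tr).predict(X_te)))
    return np.median(scores)

# Stand-in data; in the post this would be features built from the time series.
X = np.random.default_rng(0).normal(size=(2000, 8))
y = (X[:, 0] + X[:, 1] ** 2 > 1).astype(int)
print("median held-out accuracy:", median_holdout_accuracy(X, y))
```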

Sorry, no time to give strong evidence. Market data is not stationary and the dependencies are not linear. Simulate, for example, 10-dimensional fractal noise; in 2D it looks like this: red is one class, green is the other, only imagine it in 10D.

As you can see, these are not Gaussian dependencies: lots of "islands" and so on. Now try to measure, with mutual information or R^2, what happens when you add or remove one dimension, and how the classification quality falls. R^2 is an essentially linear measure; when the separating hypersurface has a complex topology with many islands, things get sad. Classical statistical criteria are not enough here, you can check it yourself. And what if there are 100 or 1000 dimensions of such a mess?
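(A sketch of the experiment described above; an XOR-style checkerboard stands in for the "islands", and all sizes are arbitrary. Per-feature mutual information reads near zero here because the class lives only in the joint pattern of two dimensions, while removing one of those dimensions drops a nonlinear classifier to chance:)

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import train_test_split

# "Islands": the class is a checkerboard over dims 0 and 1; dims 2..9 are noise.
rng = np.random.default_rng(2)
X = rng.uniform(-2, 2, size=(8000, 10))
y = ((np.floor(X[:, 0]) + np.floor(X[:, 1])) % 2).astype(int)

# Marginal criteria see nothing: each dimension alone carries no information.
print("per-feature MI:", mutual_info_classif(X, y, random_state=2).round(3))

def holdout_acc(cols):
    X_tr, X_te, y_tr, y_te = train_test_split(X[:, cols], y, random_state=0)
    clf = RandomForestClassifier(n_estimators=300, random_state=0)
    return clf.fit(X_tr, y_tr).score(X_te, y_te)

print("accuracy, all 10 dims:  ", holdout_acc(list(range(10))))           # well above 0.5
print("accuracy, dim 1 removed:", holdout_acc([0] + list(range(2, 10))))  # ~0.5
```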

 
J.B:

Sorry, no time to give strong evidence. Market data is not stationary and the dependencies are not linear. Simulate, for example, 10-dimensional fractal noise; in 2D it looks like this: red is one class, green is the other, only imagine it in 10D.

As you can see, these are not Gaussian dependencies: lots of "islands" and so on. Now try to measure, with mutual information or R^2, what happens when you add or remove one dimension, and how the classification quality falls. R^2 is an essentially linear measure; when the separating hypersurface has a complex topology with many islands, things get sad. Classical statistical criteria are not enough here, you can check it yourself. And what if there are 100 or 1000 dimensions of such a mess?

That's not evidence...

You're not understanding me. I'm saying I'm not classifying, I'm building a regression model. What does classification have to do with it... I'm not fitting any hyperplanes. I model the conditional median of the target and measure its quality by residual analysis. That's how it's always done.
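(The post doesn't name a library; one hedged way to model a conditional median and then inspect residuals is gradient boosting with the quantile loss at alpha = 0.5. All data and model choices below are illustrative:)

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(3)
X = rng.normal(size=(5000, 4))
y = np.tanh(X[:, 0]) * X[:, 1] + 0.5 * rng.standard_t(df=3, size=5000)  # heavy tails

# Quantile loss at alpha=0.5 targets the conditional median of y given X.
model = GradientBoostingRegressor(loss="quantile", alpha=0.5, random_state=3)
model.fit(X, y)

# Residual analysis: the median residual should sit near zero, and residuals
# should show no remaining dependence on the fitted values.
fit = model.predict(X)
resid = y - fit
print("median residual:", np.median(resid))
print("corr(residual, fit):", np.corrcoef(resid, fit)[0, 1])
```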

And if we're talking about classification, a normality requirement is unnecessary there, for example, when the probability of something is close to zero. Nonlinearity and multidimensionality are exactly the domain of mutual information. I don't think you're quite up to speed on this issue...

 
Dmitry:

10% is the deposit load.

If you have a deposit of $1,000 and load it by 10%, you open a trade with $100 of it.

Now, ATTENTION: depending on the leverage your broker/kitchen provides, that buys lots of different sizes: $10,000 (1:100), $5,000 (1:50), $20,000 (1:200).

P.S. yokarny babay...


Don't swear in a civil thread...

Let's do the math.

First example. I have $500. A micro lot costs $1,000. I open one trade of one micro lot (because buying a larger amount no longer fits within my risk limit) and thus use 1:2 leverage. Since the dealer gives me at most 1:100 leverage, opening the $1,000 position locks up $1,000 / 100 = $10 of margin, i.e. I load my deposit by 2%.

Second example. If I open 5 such trades with the same capital, I load the deposit by 10% and use 1:10 leverage (0.01 lot * $100,000 * 5 / $500).

That is, the maximum leverage on offer only determines what percentage of the deposit a given volume locks up as margin; it merely makes it possible to open the full volume. The actual leverage employed is at my discretion, and here it is at least 1:2 on my own funds.
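(A small sketch of the arithmetic in both examples; the function and variable names are mine, the numbers are from the post:)

```python
def position_metrics(deposit, exposure, max_leverage):
    # Margin the broker locks up, the resulting deposit load, and the
    # actual leverage employed on the trader's own funds.
    margin = exposure / max_leverage
    return margin / deposit, exposure / deposit  # (deposit load, actual leverage)

# First example: $500 deposit, one $1,000 micro lot, 1:100 maximum leverage.
print(position_metrics(500, 1000, 100))               # (0.02, 2.0) -> 2% load, 1:2

# Second example: five 0.01-lot trades on a $100,000 contract.
print(position_metrics(500, 0.01 * 100000 * 5, 100))  # (0.1, 10.0) -> 10% load, 1:10
```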

All clear now?

 
Alexey Burnakov:

Don't swear in a civil thread...

Let's do the math.

First example. I have $500. A micro lot costs $1,000. I open one trade of one micro lot (because buying a larger amount no longer fits within my risk limit) and thus use 1:2 leverage. Since the dealer gives me at most 1:100 leverage, opening the $1,000 position locks up $1,000 / 100 = $10 of margin, i.e. I load my deposit by 2%.

If I open 5 such trades with the same capital, I load the deposit by 10% and use 1:10 leverage (0.01 lot * $100,000 * 5 / $500).

All clear now?

You are using exactly the leverage the kitchen provides you. You don't vary the leverage (it's fixed, a constant); you vary the amount of capital you commit at that leverage.

Once again: what leverage does the kitchen give you for your account type? 1:100?

 
Dmitry:

You are using exactly the leverage the kitchen provides you. You don't vary the leverage (it's fixed, a constant); you vary the amount of capital you commit at that leverage.

Once again: what leverage does the kitchen give you for your account type? 1:100?


The maximum leverage, yes, is 1:100. But I don't use it. Once again.

 
Alexey Burnakov:

That's not evidence...

You're not understanding me. I'm saying I'm not classifying, I'm building a regression model. What does classification have to do with it... I'm not fitting any hyperplanes. I model the conditional median of the target and measure its quality by residual analysis. That's how it's always done.

And if we're talking about classification, a normality requirement is unnecessary there, for example, when the probability of something is close to zero. Nonlinearity and multidimensionality are exactly the domain of mutual information. I don't think you're quite up to speed on this issue...

How is that not classification? Take, for example, 1000 factors and a deep neural network with, say, 100 outputs giving the probabilities of up/down moves of a given instrument at different time horizons. Is that regression? Regression is when the price itself is predicted.
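(A hedged PyTorch sketch of the setup just described: 1000 input factors, 100 sigmoid outputs, each read as the probability of an upward move at one horizon. The layer widths are arbitrary, not from the post:)

```python
import torch
import torch.nn as nn

class MultiHorizonDirectionNet(nn.Module):
    # 1000 factors in, 100 probabilities out: P(up) at 100 time horizons.
    def __init__(self, n_factors=1000, n_horizons=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_factors, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, n_horizons),
        )

    def forward(self, x):
        return torch.sigmoid(self.net(x))

model = MultiHorizonDirectionNet()
probs = model(torch.randn(32, 1000))  # batch of 32 -> shape (32, 100)
print(probs.shape)
```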

You may use mutual information; we can instead simply go through the factors and calculate the percentage influence of each one on the final forecast, for a specific model that is even more elaborate than GoogLeNet in sophistication. The main thing is that over the next N seconds the price moves in the right direction with the stated probability.
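(The "percentage influence of each factor" reads like permutation importance; a sketch assuming scikit-learn, with the model and data as stand-ins:)

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
X = rng.normal(size=(4000, 10))
y = (X[:, 0] * X[:, 1] + 0.5 * X[:, 2] > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)

# Shuffle one factor at a time and measure the accuracy drop, then
# normalize the drops into a percentage influence per factor.
imp = permutation_importance(clf, X_te, y_te, n_repeats=10, random_state=0)
drop = np.clip(imp.importances_mean, 0, None)
share = 100 * drop / drop.sum()
for j, s in enumerate(share):
    print(f"factor {j}: {s:5.1f}% of total influence")
```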

 
Alexey Burnakov:

The maximum leverage, yes, is 1:100. But I don't use it. Once again.

Okay, if you don't understand elementary things, there's no point in arguing.

To make a long story short: to compare your percentage with a hedge fund's percentage, divide yours by about 10.
