Bayesian regression - Has anyone made an EA using this algorithm? - page 45

 
http://www.quantalgos.ru/?p=1898 maybe the author of this thread will benefit...
Predicting anything using Python | QuantAlgos
  • 2016.03.12
  • www.quantalgos.ru
A short article from http://www.talaikis.com/ on building a simple strategy that uses a naive Bayes classifier to create a mean-reversion process. All the code in the article is in Python. This is a fairly large area of research, but we will cover it very briefly. We will try to find the relationship between...
 
Ilnur Khasanov:
http://www.quantalgos.ru/?p=1898 maybe the author of this topic will benefit...
I did something similar with mean reversion, in a slightly different way. It works, but you can't put it on a real account. There are problems whose solutions are completely unclear. The main drawback of the method: you constantly have to prove that you are right and the market is wrong. :)
 

Pseudorandom number generator. (PRNG)

Using the polar coordinates method suggested above, I converted the MT4 PRNG into a PRNG that produces normally distributed random values.

To visually check the correctness of the code, I projected the results on the price chart.

This is what the basic PRNG shows after 1000 calls. The areas of the histogram rectangles are proportional to the number of generated random numbers that fall into that range of the vertical scale.


Now, converting those thousand values with the formulas of the method gives a perfectly adequate bell curve.
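For readers who want to reproduce the transformation, here is a minimal sketch of the polar (Marsaglia) method. It is an illustration, not the poster's MQL4 code: Python's `random.random` stands in for the MT4 generator, and the function names are made up for this example.

```python
import math
import random

def polar_normal_pair(uniform=random.random):
    """Turn uniform draws on [0, 1) into two independent N(0, 1) values."""
    while True:
        # Pick a point uniformly in the square [-1, 1) x [-1, 1).
        u = 2.0 * uniform() - 1.0
        v = 2.0 * uniform() - 1.0
        s = u * u + v * v
        # Keep only points strictly inside the unit circle (and not at 0).
        if 0.0 < s < 1.0:
            break
    factor = math.sqrt(-2.0 * math.log(s) / s)
    return u * factor, v * factor

def normal_samples(n, uniform=random.random):
    """Collect n normally distributed values from the uniform source."""
    samples = []
    while len(samples) < n:
        samples.extend(polar_normal_pair(uniform))
    return samples[:n]

if __name__ == "__main__":
    values = normal_samples(1000)
    mean = sum(values) / len(values)
    variance = sum((x - mean) ** 2 for x in values) / len(values)
    print(f"mean = {mean:.3f}, std = {math.sqrt(variance):.3f}")
```

With 1000 draws the sample mean should land near 0 and the standard deviation near 1, which is the "bell" seen on the chart.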

 
Yuri Evseenkov:

Using the polar coordinate method proposed above, converted the MT4 PRNG to a PRNG with a normally distributed random variable.

That is an old reinvented wheel, dating as far back as mql4.com.
 

On the attempt to apply Bayes' formula. Again.

Task. Using Bayes' theorem, determine the most likely value of a tick that has not yet arrived.

Given. Time series x,y.

y = ax + b is a line from the last tick into the future.

P(a,b|x,y) = P(x,y|a,b)*P(a)*P(b)/P(x,y)   (1)   Bayes' formula.

P(a,b|x,y) is the probability that the coefficients a and b correspond to the x and y coordinates of a future tick.

We need to find the a and b for which this probability (more correctly, probability measure) is maximal.
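Since P(x,y) does not depend on a and b, maximizing (1) comes down to maximizing its numerator. Assuming, as (1) already does, that the priors on a and b are independent, the task can be written as:

```latex
(\hat{a}, \hat{b})
  = \arg\max_{a,b} P(a,b \mid x,y)
  = \arg\max_{a,b} P(x,y \mid a,b)\, P(a)\, P(b)
```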

P(x,y|a,b): let's take the real histogram of the tick distribution by price levels as the likelihood function. The function is defined by a two-dimensional array (matrix) of price range and probability, where the probability is the share of ticks falling within that range relative to the total number of ticks.
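A minimal sketch of how such a price-range/probability table could be built. The function name, the use of integer prices in points, and the sample data are assumptions made for this illustration; they are not taken from the poster's code.

```python
import math
from collections import Counter

def price_level_histogram(prices, bin_width):
    """Return (range_low, range_high, probability) rows for tick prices.

    Prices are assumed to be integers in points to avoid float rounding
    at bin edges; probability is the share of ticks in each range.
    """
    if not prices:
        raise ValueError("need at least one tick")
    lowest = min(prices)
    counts = Counter(math.floor((p - lowest) / bin_width) for p in prices)
    total = len(prices)
    return [
        (lowest + k * bin_width, lowest + (k + 1) * bin_width, counts[k] / total)
        for k in sorted(counts)
    ]

if __name__ == "__main__":
    # Hypothetical ticks, in points.
    ticks = [12345, 12346, 12346, 12347, 12349, 12346]
    for low, high, prob in price_level_histogram(ticks, bin_width=2):
        print(f"[{low}, {high}): {prob:.3f}")
```

Only non-empty ranges are returned, so a candidate price that falls in an unseen range has no row; how to treat that case (zero probability or some smoothing) is left open here.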

P(b): the normal distribution of increments is taken as the prior probability of b. The PRNG with normally distributed values is used for this.

P(a): the coefficient a determines the slope of the straight line and the sign of the predicted increment. For now I am thinking of using the linear regression code I posted earlier, i.e. taking the probability of the coefficient a found there as one, and substituting into (1) a probability P(a) calculated from the difference between that a and the one calculated for the given y.

Perhaps you have some thoughts about how the sign of increments of each tick behaves?


 

There is definitely no need to put ticks in the formula. Anyone can generate those ticks on FORTS, which is done every day.

The problem lies not so much in the mathematical methods as in choosing adequate data to apply them to.

 
Why take artificial ticks at all? You can learn to predict them without higher mathematics. Ask MQ how.

Take real ticks, 10,000 pieces and look at the distribution. That would at least be practical.
 
Alexey Burnakov:
Why take artificial ticks at all? You can learn to predict them without higher mathematics. Ask MQ how.

Take real ticks, 10,000 pieces and look at the distribution. At least that would be practical.

So the likelihood function P(x,y|a,b) in (1) is the real distribution of real ticks (tick volumes). It is very rarely normal. And P(a) and P(b) are corrective probabilities whose distribution laws are assumed a priori.

What should I ask MQ? The principle of modelling ticks in the strategy tester? Yes, there must be some principle. Perhaps, knowing it, one could create tester "grails". But so far I cannot develop this in testing mode, since I have neither a tick history nor practice working with one. Everything will be in real time.

I am interested in these words of yours:

"I don't do regression on price values (or their transformations) at all in my experiments. I predict the sign, though you could say that this is also part of the price information.

My errors look like this:

     0     1
0  0.58  0.42
1  0.43  0.57

Or roughly as originally:

1 - correct, 0 - error: 1, 1, 1, 0, 0, 0, 1 , 1, 1, 0, 1

And the resulting probability distribution should be as different as possible from 0.5 / 0.5. If we obtain mutual independence of such outcomes, we arrive at the binomial distribution, and for it there are many, many formulas and statistical tests." End of quote.

So does the binomial distribution really rule in the case of predicting a sign? What is the mutual independence of the outcomes? Thank you.

 
Yuri Evseenkov:

So the likelihood function P(x,y|a,b) in (1) is the real distribution of real ticks (tick volumes). It is very rarely normal. And P(a) and P(b) are corrective probabilities whose distribution laws are assumed a priori.

What should I ask MQ? The principle of modelling ticks in the strategy tester? Yes, there must be some principle. Perhaps, knowing it, one could create tester "grails". But so far I cannot develop this in testing mode, since I have neither a tick history nor practice working with one. Everything will be in real time.

I am interested in these words of yours:

"I don't do regression on price values (or their transformations) at all in my experiments. I predict the sign, though you could say that this is also part of the price information.

My errors look like this:

     0     1
0  0.58  0.42
1  0.43  0.57

Or roughly as originally:

1 - correct, 0 - error: 1, 1, 1, 0, 0, 0, 1 , 1, 1, 0, 1

And the resulting probability distribution should be as different as possible from 0.5 / 0.5. If we obtain mutual independence of such outcomes, we arrive at the binomial distribution, and for it there are many, many formulas and statistical tests." End of quote.

Does the binomial distribution really rule in the case of predicting a sign? What is the mutual independence of the outcomes? Thank you.

Taking things in order. Yes, the ticks in the tester are generated by an algorithm. Testing on real ticks has not been released yet. There is an article on this site on how these ticks are generated. They are not real ticks at all.

About the binomial distribution. If you predict a binary variable, you get a 2x2 matrix showing the recognition accuracy. This is essentially the joint distribution of two binary variables: the target variable and the predicted variable.

If the sequence of realizations of your target variable is i.i.d. (independent and identically distributed), this opens up the possibility of applying many tests. The result of flipping a coin is exactly that: a Bernoulli process, in which the events are independent of each other. If this holds, the number of successes obeys a binomial distribution, which is approximately normal for a large number of trials.

I'm rambling, it's late. What I really like about binomial distributions is that you can apply the chi-square test to a square table; it shows how significantly your result differs from a random guess. You can do the same for multinomial (not necessarily square) tables. There are also many machine learning methods for binary variables.
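As a rough illustration of the chi-square test on a 2x2 table, here is a sketch using only the standard library. The counts are hypothetical: the thread gives only proportions, so they are scaled here to an assumed 100 cases per row. No continuity correction is applied.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). No continuity correction is applied.
    """
    (a, b), (c, d) = table
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    total = a + b + c + d
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    # With one degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

if __name__ == "__main__":
    # Hypothetical counts: quoted proportions times an assumed 100 per row.
    counts = [[58, 42], [43, 57]]
    statistic, p_value = chi_square_2x2(counts)
    print(f"chi2 = {statistic:.2f}, p = {p_value:.3f}")
```

The verdict depends heavily on the sample size: the same proportions on ten times as many cases give a far smaller p-value, and the test is only meaningful if the outcomes really are independent.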
 

Using ticks for prediction is dangerous in my opinion, and the model should be set up for each broker separately.

If we take ticks from the strategy tester, they will differ seriously from the real ones, because the ticks in the tester are generated by a template from the OHLC values of minute bars (https://www.mql5.com/en/articles/75). That is why no one ever tests scalpers, but immediately puts them on a real account and optimizes them along the way.

About real ticks: they can vary a lot from broker to broker. For example, in this thread https://www.mql5.com/en/forum/64228/page2#comment_1960403 (https://c.mql5.com/3/78/tbd.png ) there is a screenshot attached showing the distribution of tick increments over the same time interval at two different brokers. I do not remember the length of the interval, something from a day to a week. In general they coincide, but one broker has twice as many ticks without a price change. If you compare more than ten brokers, I think there may be huge differences, especially for the "surprise candlesticks".
As an alternative, all ticks without price changes may be removed. But then there is a nuance: an OnTick() event can be skipped in the EA, and the terminal will receive a new price with the previous one missed. I.e., not 1.23456 -> 1.23490 -> 1.23410, but simply 1.23456 -> 1.23410. And instead of two changes, your model will get one.
As a result, the time interval between two neighbouring ticks is undefined and there will be gaps in the data, which I think is bad.
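The filtering step itself is simple. A minimal sketch, with a hypothetical function name and made-up prices, of dropping ticks that repeat the previous price:

```python
def drop_unchanged_ticks(prices):
    """Keep only ticks whose price differs from the previously kept tick."""
    filtered = []
    for price in prices:
        if not filtered or price != filtered[-1]:
            filtered.append(price)
    return filtered

if __name__ == "__main__":
    # Hypothetical tick stream with repeated prices.
    stream = [1.23456, 1.23456, 1.23490, 1.23490, 1.23410]
    print(drop_unchanged_ticks(stream))
```

This handles repeated prices only. It cannot restore a tick the EA never saw, so the missed 1.23490 in the example above stays missing whatever filter is applied.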
It is still worth trying. You need MT4 and the Tickstory Lite program (there is a free version) to load real ticks into the tester (they are taken from the broker Dukascopy). Only use an MT4 terminal with a build below 950, otherwise the free version of Tickstory will produce test data with zero spread.

I tried something with ticks, such as computing an average and buying or selling when the current price deviates strongly from it. Whatever profit there was, the spread ate all of it, so I moved to larger timeframes.
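A sketch of what such a rule might look like. The window length, the threshold, and the use of a z-score to define "strongly deviates" are all assumptions for this illustration, and the spread that ate the profit is not modelled.

```python
from statistics import mean, pstdev

def mean_reversion_signal(prices, window=20, threshold=2.0):
    """Return +1 (buy), -1 (sell) or 0 (no trade) for the latest price.

    The latest price is compared with the average of the preceding
    `window` prices, in units of their standard deviation.
    """
    if len(prices) < window + 1:
        return 0
    history = prices[-window - 1:-1]
    average = mean(history)
    spread_of_history = pstdev(history)
    if spread_of_history == 0:
        return 0
    z_score = (prices[-1] - average) / spread_of_history
    if z_score > threshold:
        return -1  # price far above the average: bet on a move back down
    if z_score < -threshold:
        return 1   # price far below the average: bet on a move back up
    return 0

if __name__ == "__main__":
    # Hypothetical prices oscillating around 100.5, then a jump.
    base = [100, 101] * 10
    print(mean_reversion_signal(base + [110]))
    print(mean_reversion_signal(base + [90]))
```

On tick data the expected move back to the average is often smaller than the spread, which matches the experience described above.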

The Algorithm of Ticks' Generation within the Strategy Tester of the MetaTrader 5 Terminal
  • 2010.06.02
  • MetaQuotes Software Corp.
  • www.mql5.com
MetaTrader 5 allows us to simulate automatic trading, within an embedded strategy tester, by using Expert Advisors and the MQL5 language. This type of simulation is called testing of Expert Advisors, and can be implemented using multithreaded optimization, as well as simultaneously on a number of instruments. In order to provide a thorough testing, a generation of ticks based on the available minute history, needs to be performed. This article provides a detailed description of the algorithm, by which the ticks are generated for the historical testing in the MetaTrader 5 client terminal.