Machine learning in trading: theory, models, practice and algo-trading - page 194

 
Well. I started the week before the market opened, actively testing version 14. I would like to say the following: the longer I train, the more inputs get involved in the TS, and the more predictors. At most I had up to 8-9 inputs. However, the generalization ability in that case is usually not high, and such TSs work only with a stretch; in other words, they barely reach the mark of 3. But predictors with 4-6 inputs work satisfactorily. I increased the number of entries from 50 to 150. It has been training for three hours now, but I think this time there will also be some inputs. So let's see...
 
Again I noticed the following. My data set has 12 predictors and then their lags, lag1 and lag2. Previously the selected inputs were mostly at the beginning of the set, i.e. few lags were picked, at most lag1 and only rarely lag2. Now, on the contrary, the initial data is practically not used at all, while lag1 and, most regrettably, lag2 started to appear more often. The point is that before, generalization relied mainly on the initial columns; now it relies mainly on the final ones... So draw your own conclusions...
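For reference, a minimal R sketch of how a predictor set with lag1/lag2 copies of each column might be built (the column names, the number of rows, and the helper function are only illustrative assumptions, not the actual data set discussed above):

# build lag1 and lag2 copies of every predictor column (illustrative helper)
make_lags <- function(df, lags = 1:2) {
  out <- df
  for (k in lags) {
    lagged <- as.data.frame(lapply(df, function(x) c(rep(NA, k), head(x, -k))))
    colnames(lagged) <- paste0(colnames(df), "_lag", k)
    out <- cbind(out, lagged)
  }
  na.omit(out)   # drop the first rows that have no full lag history
}

set.seed(1)
predictors <- as.data.frame(matrix(rnorm(100 * 12), ncol = 12,
                                   dimnames = list(NULL, paste0("pred", 1:12))))
full_set <- make_lags(predictors)   # 12 original columns plus their lag1 and lag2
dim(full_set)                       # 98 rows, 36 columns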
 
Mihail Marchukajtes:
Again I noticed the following. My data set has 12 predictors and then their lags, lag1 and lag2. Previously the selected inputs were mostly at the beginning of the set, i.e. few lags were picked, at most lag1 and only rarely lag2. Now, on the contrary, the initial data is practically not used at all, while lag1 and, most regrettably, lag2 started to appear more often. The point is that before, generalization relied mainly on the initial columns; now it relies mainly on the final ones... So draw your own conclusions...

So you need to roll back to previous versions.

My flight is normal. Maybe because there are no lags in the sample?

 
Dr.Trader:

It looks good in general, I wonder what will happen in the end.

About the committee - I posted some examples earlier, but there are models that use regression with rounding for classification, and there it is not so straightforward. I tried two different ways of combining votes:

1) Round everything to classes first, then take the class that gets the most votes.
I.e., given a 4-bar forecast from three models
c(0.1, 0.5, 0.4, 0.4) c(0.6, 0.5, 0.7, 0.1) c(0.1, 0.2, 0.5, 0.7) I would round them to classes
c(0, 1, 0, 0) c(1, 1, 1, 0) c(0, 0, 1, 1), and the final prediction vector by the number of votes would be c(0, 1, 1, 0).

2) Another option is to average the results right away and only then round to classes:
the result would be c((0.1+0.6+0.1)/3, (0.5+0.5+0.2)/3, (0.4+0.7+0.5)/3, (0.4+0.1+0.7)/3),
i.e. c(0.2666667, 0.4000000, 0.5333333, 0.4000000), which rounds to
c(0, 0, 1, 0)

You can see that the result differs depending on at which step you round. I don't know which is more standard, but I think the second way works better on new data.
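To make the two ways of combining votes concrete, here is a small R sketch that reproduces the numbers above (the 0.5 cut-off is the rounding rule assumed here):

# three model outputs for a 4-bar forecast (the vectors from the example above)
preds <- rbind(c(0.1, 0.5, 0.4, 0.4),
               c(0.6, 0.5, 0.7, 0.1),
               c(0.1, 0.2, 0.5, 0.7))

# 1) round each model to classes first, then take the majority vote
classes    <- ifelse(preds >= 0.5, 1, 0)
vote_first <- ifelse(colSums(classes) > nrow(classes) / 2, 1, 0)
vote_first                     # 0 1 1 0

# 2) average the raw outputs first, round only once at the end
avg        <- colMeans(preds)  # 0.2666667 0.4000000 0.5333333 0.4000000
round_last <- ifelse(avg >= 0.5, 1, 0)
round_last                     # 0 0 1 0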

In the tsDyn package there is the SETAR function.

It turns out that the threshold value (there can be two thresholds, like in RSI) is variable. It gives amazing results.

Also, let's not forget calibration algorithms in classification. The point is that the class prediction is not really a nominal value; the algorithm calculates the probability of the class, which is a real number. This probability is then split, for example in half, and you get two classes. But if the probabilities are 0.49 and 0.51, are those really two classes? What about 0.48 and 0.52? Is that a division into classes? This is where SETAR would split into two classes, with Reshetov's "on the fence" cases in between.
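A rough R illustration of that point: a single 0.5 cut versus two thresholds with an undecided band in between (the 0.45/0.55 values are arbitrary, chosen only to show the idea; this is not the setar() fitting itself):

# classifier probability outputs
p <- c(0.10, 0.48, 0.49, 0.51, 0.52, 0.90)

# a) single cut at 0.5: 0.49 and 0.51 fall into different classes
ifelse(p > 0.5, 1, 0)                         # 0 0 0 1 1 1

# b) two thresholds with an "on the fence" zone in between (NA = undecided, no trade)
lo <- 0.45; hi <- 0.55                        # arbitrary illustrative thresholds
ifelse(p >= hi, 1, ifelse(p <= lo, 0, NA))    # 0 NA NA NA NA 1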

 
Dr.Trader:

It looks good in general, I wonder what will happen in the end.

About the committee - I posted some examples earlier, but there are models that use regression with rounding for classification, and there it is not so straightforward. I tried two different ways of combining votes:

1) Round everything to classes first, then take the class that gets the most votes.
I.e., given a 4-bar forecast from three models
c(0.1, 0.5, 0.4, 0.4) c(0.6, 0.5, 0.7, 0.1) c(0.1, 0.2, 0.5, 0.7) I would round them to classes
c(0, 1, 0, 0) c(1, 1, 1, 0) c(0, 0, 1, 1), and the final prediction vector by the number of votes would be c(0, 1, 1, 0).

2) Another option is to average the results right away and only then round to classes:
the result would be c((0.1+0.6+0.1)/3, (0.5+0.5+0.2)/3, (0.4+0.7+0.5)/3, (0.4+0.1+0.7)/3),
i.e. c(0.2666667, 0.4000000, 0.5333333, 0.4000000), which rounds to
c(0, 0, 1, 0)

You can see that the result differs depending on at which step you round. I don't know which of these is more standard, but it seems to me the second way works better on new data.

This is the gbpusd pair, which means the model is about to be tested by Brexit. I haven't even processed last year's data yet... It could turn out to be a drain...

Depending on the result of the final test I will set the tone of the article. It is always a bit of a surprise to see the model work, and the norm to see it drain.

I will assemble the committee as follows (a rough sketch follows after the steps):

On the training data I build n numeric prediction vectors, one per model (regression of the price increment).

I average the response on the selected models.

I count quantiles 0.05 and 0.95.

At validation I repeat steps 1 and 2.

I select only those examples where the average is outside the quantiles.

I multiply the response by the prediction sign and subtract the spread.

On the obtained vector I build m subsamples with random inclusion at the rate of 1-4 deals per day depending on the forecast horizon.

The committee has already shown a threefold increase in MO (expected payoff) compared to single models, because the models are diverse...
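A rough R sketch of these steps on simulated data; all names, sizes, the spread value, and the sign convention are my assumptions rather than the author's actual code, and the final subsampling step is omitted:

set.seed(42)
n_train <- 1000; n_val <- 500; n_models <- 10

# simulated regression predictions of the price increment (rows = examples, cols = models)
preds_train   <- matrix(rnorm(n_train * n_models, sd = 0.001), ncol = n_models)
preds_val     <- matrix(rnorm(n_val   * n_models, sd = 0.001), ncol = n_models)
increment_val <- rnorm(n_val, sd = 0.001)    # realised increments on validation

# 1-2) average the response across the selected models
avg_train <- rowMeans(preds_train)
avg_val   <- rowMeans(preds_val)

# 3) 0.05 and 0.95 quantiles of the averaged training response
q <- quantile(avg_train, probs = c(0.05, 0.95))

# 4-5) keep only validation examples where the average falls outside the quantiles
sel <- which(avg_val < q[1] | avg_val > q[2])

# 6) multiply the realised increment by the predicted sign and subtract the spread
spread <- 0.0002                             # illustrative value
trade_result <- increment_val[sel] * sign(avg_val[sel]) - spread

mean(trade_result)                           # expected payoff (MO) per filtered trade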

 
How to extract certain groups of rows from the data by a condition
  • ru.stackoverflow.com
We need to find rows that repeat at least 10 times across the whole sample, and within each of the found groups of identical repeated rows the number of "1"s in target.label must exceed 70% relative to the "0"s; here are the found identical rows, with more ones than zeros...
 

I'll answer it here, then.

# a couple of rows from that table (I won't copy it all as text); the first row is then repeated twice more
dat <- data.frame(cluster1=c(24,2,13,23,6), cluster2=c(5,15,13,28,12), cluster3=c(18,12,16,22,20), cluster4=c(21,7,29,10,25), cluster5=c(16,22,24,4,11), target.label=c(1,1,0,1,0))
dat <- rbind(dat, dat[1,], dat[1,])
# the target of the last row is changed to 0 for the experiment
dat[7, "target.label"] <- 0

library(sqldf)
# sqldf does not allow dots in column names
colnames(dat)[6] <- "target"

dat1 <- sqldf( "select cluster1, cluster2, cluster3, cluster4, cluster5, avg(target) as target_avg, count(target) as target_count from dat group by cluster1, cluster2, cluster3, cluster4, cluster5" )
dat1
dat1[ dat1$target_count>=10 & dat1$target_avg>0.63 , ]
dat1[ dat1$target_count>=10 & ( dat1$target_avg<0.37 | dat1$target_avg>0.63 ), ]   # in case either "0" or "1" occurs more than 70% of the time
 
SanSanych Fomenko:

In the tsDyn package there is the SETAR function.

Does SETAR refer specifically to committee calibration, or is that a separate topic about building financial models?

I flipped through the package's manual and didn't see what I need... The situation is this: I have a training table with 10000 examples and 100 models that were trained on those examples. To test the models I can use them to predict on the same input data, which gives me 100 vectors, each with 10000 predictions. Can SETAR be used to somehow combine all these 100 vectors into one?
And then, for a forecast on new data, there would again be 100 forecasts that I would have to merge into one (not 100 vectors this time, just 100 single forecasts). Can SETAR do that too, using the committee parameters obtained from the training data?

 
Dr.Trader:

Does SETAR refer specifically to committee calibration, or is that a separate topic about building financial models?

I flipped through the package's manual and didn't see what I need... The situation is this: I have a training table with 10000 examples and 100 models that were trained on those examples. To test the models I can use them to predict on the same input data, which gives me 100 vectors, each with 10000 predictions. Can SETAR be used to somehow combine all these 100 vectors into one?
And then, for a forecast on new data, there would again be 100 forecasts that I would have to merge into one (not 100 vectors this time, just 100 single forecasts). Can SETAR do that too, using the committee parameters obtained from the training data?

As I understand it, it has nothing to do with committees.
 
Yury Reshetov:

So you need to roll back to previous versions.

My flight is normal. Maybe because there are no lags in the sample?

Well, yes, I added the lags because in previous versions they increased the generalization ability; now, with the improved preselection algorithm, it is not needed, so I am trying to train without them. We'll see. I will write about today's results later...