Machine learning in trading: theory, models, practice and algo-trading - page 1874

 
Maxim Dmitrievsky:

there's a comma separator

first hundred

second hundred.

I don't see anything abnormal.


I took a slice like this, from 5:00 to 6:00.


 
mytarmailS:

first hundred

second hundred

I don't see anything abnormal.


I took a slice like this, from 5:00 to 6:00.


But look at the curves for two hours at once.

I'm getting a skew between the hours.

Or are there 24 values in each?

 
Maxim Dmitrievsky:

But look at the curves for two hours at once.

I'm getting a skew between the hours.

Or are there 24 values in each?

That's right! 24 values in each...

The 12 five-minute bars of hour 5 and the 12 five-minute bars of hour 6 together make one row of 24 values.

 
mytarmailS:

That's right! 24 values in each...

The 12 five-minute bars of hour 5 and the 12 five-minute bars of hour 6 together make one row of 24 values.

Why in hundreds? Plot it all at once; I think there's a skew somewhere at the end.

 
Maxim Dmitrievsky:

Why in hundreds? Plot it all at once; I think there's a skew somewhere at the end.

Because there are too many lines, you can't see anything...

here's the whole thing.

-----------------------

You messed up when you created the dataset for the clusters.

 
mytarmailS:

because there are too many lines, you can't see anything...

Here it is all at once.

-----------------------

You messed something up when you created the dataset for the clusters.

Got it... thanks.

P.S. The number of hours in the dataset differs:

hour 5: 139

hour 6: 140

Somewhere an hour is missing, or there is an extra one.

I don't know why yours comes out straight; probably the package itself does something.

Solution: reindex the dataframe on a 5-minute grid, which puts NaN in place of the missing bars, then fillna with the nearest values. Finally %)

 
Why NaN? The price has not changed - use the Close price of the previous bar.
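The suggestion above (fill a missing bar with the Close of the previous bar) can be sketched in plain Python. This is only an illustration under assumptions: the function name `fill_missing_bars`, the sample timestamps and the prices are made up, and the thread itself does not show how the data is stored.

```python
from datetime import datetime, timedelta

def fill_missing_bars(bars, step=timedelta(minutes=5)):
    """Return (timestamp, close) pairs on a regular 5-minute grid.

    `bars` is a time-sorted list of (datetime, close) pairs. A missing bar
    gets the close of the last bar seen before it (forward fill).
    """
    by_time = dict(bars)
    filled = []
    t, last_close = bars[0]
    end = bars[-1][0]
    while t <= end:
        # Use the real close if the bar exists, otherwise carry the last one.
        last_close = by_time.get(t, last_close)
        filled.append((t, last_close))
        t += step
    return filled

# Hypothetical sample: the 05:10 bar is missing.
bars = [
    (datetime(2020, 6, 1, 5, 0), 1.1010),
    (datetime(2020, 6, 1, 5, 5), 1.1012),
    (datetime(2020, 6, 1, 5, 15), 1.1009),
]
filled = fill_missing_bars(bars)
```

After this every hour has exactly 12 bars, so windows taken by position line up again. Note that the sketch also fills overnight and weekend gaps, which may or may not be what you want. In pandas the same idea is usually written as a reindex onto a regular date range followed by a forward fill.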
 
Maxim Dmitrievsky:

Got it... thanks

My counts for hours 5 and 6 also differ, but everything seems to work fine.

The vector hrs holds the hour of each bar:

hrs
   [1]  9  9  9  9  9  9  9  9  9  9  9  9 10 10 10 10 10 10 10 10 10 10 10 10 11 11 11 11 11 11 11
  [32] 11 11 11 11 11 12 12 12 12 12 12 12 12 12 12 12 12 13 13 13 13 13 13 13 13 13 13 13 13 14 14
  [63] 14 14 14 14 14 14 14 14 14 14 15 15 15 15 15 15 15 15 15 15 15 15 16 16 16 16 16 16 16 16 16
  [94] 16 16 16 17 17 17 17 17 17 17 17 17 17 17 17 18 18 18 18 18 18 18 18 18 18 18 18 19 19 19 19
 [125] 19 19 19 19 19 19 19 19 20 20 20 20 20 20 20 20 20 20 20 20 21 21 21 21 21 21 21 21 21 21 21
 [156] 21 22 22 22 22 22 22 22 22 22 22 22 22 23 23 23 23 23 23 23 23 23 23 23 23  0  0  0  0  0  0
 [187]  0  0  0  0  0  0  1  1  1  1  1  1  1  1  1  1  1  1  2  2  2  2  2  2  2  2  2  2  2  2  3
 [218]  3  3  3  3  3  3  3  3  3  3  3  4  4  4  4  4  4  4  4  4  4  4  4  5  5  5  5  5  5  5  5
 [249]  5  5  5  5  6  6  6  6  6  6  6  6  6  6  6  6  7  7  7  7  7  7  7  7  7  7  7  7  8  8  8
 [280]  8  8  8  8  8  8  8  8  8  9  9  9  9  9  9  9  9  9  9  9  9 10 10 10 10 10 10 10 10 10 10
 [311] 10 10 11 11 11 11 11 11 11 11 11 11 11 11 12 12 12 12 12 12 12 12 12 12 12 12 13 13 13 13 13


Code: if hour 5 has just started, take the index at that point plus the next 23, i.e. 24 five-minute bars covering two full hours, and print the result.

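for (i in 2:length(x$close)) {
  # hour 5 has just started: the previous bar is hour 4, the current one is hour 5
  if (hrs[i] == 5 && hrs[i - 1] == 4) {
    # this bar and the next 23: 24 five-minute bars = 2 hours
    ii <- i:(i + 23)
    print(hrs[ii])
  }
}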

we get

[1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6
......
.....
....

Everything works correctly; I don't know what's wrong on your side.

 
mytarmailS:

My counts for hours 5 and 6 also differ, but everything seems to work fine.

The vector hrs holds the hour of each bar:


Code: if hour 5 has just started, take the index at that point plus the next 23, i.e. 24 five-minute bars covering two full hours, and print the result.

we get

Everything is working properly, I don't know what's wrong with it.

Why do you divide two by the length of the dataframe in the loop? ) Unfortunately, I don't understand R's scribbles.

If the number of hours differs (and it can, because of gaps), then taking rows in the loop leads to shifts. For example, an hour from one day gets paired with an hour from another day. Or some five-minute bars are missing, so the window spills a few bars over into the neighbouring hour.
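A small sketch of that shift, in plain Python and on made-up data (four hours of 5-minute bars with the 05:30 bar deliberately removed). It compares a window taken by position, as in the R loop, with a window taken by timestamp; all variable names are illustrative.

```python
from datetime import datetime, timedelta

step = timedelta(minutes=5)
start = datetime(2020, 6, 1, 4, 0)

# Hypothetical data: bars from 04:00 to 07:55, with the 05:30 bar missing.
times = [start + k * step for k in range(48)]
times.remove(datetime(2020, 6, 1, 5, 30))
hrs = [t.hour for t in times]

# Positional window, as in the loop: first bar of hour 5 plus the next 23 bars.
i = next(k for k in range(1, len(hrs)) if hrs[k] == 5 and hrs[k - 1] == 4)
positional = hrs[i:i + 24]

# Time-based window: only bars whose timestamp lies in [05:00, 07:00).
t0 = times[i]
by_time = [t.hour for t in times if t0 <= t < t0 + timedelta(hours=2)]

print(positional)  # ends with a bar from hour 7
print(by_time)     # only hours 5 and 6, but one bar short
```

With one bar missing, the positional window still returns 24 values, but the last one belongs to hour 7, and every bar after the gap sits one column to the left. The time-based window stays inside the two hours and exposes the gap as a shorter row instead.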
 
Maxim Dmitrievsky:

Why do you divide two by the length of the dataframe in the loop? ) Unfortunately, I don't understand R's scribbles.

I'm not dividing two by anything there.

For example:

2:10 means "take the integers from 2 to 10", i.e. 2, 3, 4, 5, 6, 7, 8, 9, 10
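For readers coming from Python, a rough equivalent of the colon operator is sketched below. The value of `n` is a made-up stand-in for length(x$close).

```python
# R's 2:10 includes both ends; Python's range() excludes the stop value.
r_style = list(range(2, 10 + 1))

# The loop header for(i in 2:length(x$close)) visits positions 2..n (1-based),
# starting at 2 so that hrs[i-1] exists. With 0-based indexing that is 1..n-1.
n = 5  # hypothetical number of bars
zero_based = list(range(1, n))

print(r_style)
print(zero_based)
```

So the colon in the loop header only builds the sequence of indices to iterate over; no division is involved.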
