Machine learning in trading: theory, models, practice and algo-trading - page 2807

 
mytarmailS #:

.

Your script consumes almost 9 gigabytes of RAM on my sample, but it seems to work - the files are saved. I don't even know where all that memory goes, given that the sample itself takes a little over a gigabyte.

 
mytarmailS #:

.

I also found out that the table headers (column names) are saved in quotes - how do I switch this off?

 

What does this code even do? To make it faster, you should convert all columns to the same data type (float32; float16 is not needed, it would be slower) and calculate the coRRelation through fast arrays,

if we are talking about really fixing the kaRma.
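For illustration only (not part of the original post), a minimal sketch of what "fast arrays" could mean here, assuming the sample is already loaded into an all-numeric data frame dt; the 0.9 threshold is taken from the discussion further down, everything else is a hypothetical name:

## convert the data frame to a single-type matrix and use one cor() call
m <- as.matrix(dt)                     # assumes all columns are numeric
storage.mode(m) <- "double"            # one storage type for the whole matrix

cm <- cor(m)                           # full correlation matrix in compiled code
cm[upper.tri(cm, diag = TRUE)] <- NA   # keep each pair only once

## column pairs with |correlation| above 0.9
idx <- which(abs(cm) > 0.9, arr.ind = TRUE)
pairs <- data.frame(col1 = colnames(m)[idx[, 1]],
                    col2 = colnames(m)[idx[, 2]],
                    cor  = cm[idx])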

 
Aleksey Vyazmikin #:

Your script consumes almost 9 gigabytes of RAM on my sample, but it seems to work - the files are saved. I don't even know where all that memory goes, given that the sample itself takes a little over a gigabyte.

So what?

R is bad, I guess.)

Aleksey Vyazmikin #:

I also found a problem - the table headers (column names) are saved in quotes - how do I switch this off?

What did you do to solve the problem?

 
mytarmailS #:

So what?

R is bad, I guess.)

What did you do to solve the problem?

Bad/good is too categorical a judgement.

It's obvious that either the package code is not memory-efficient (though it may well be fast), or the script copies the whole table/sample many times - see the sketch after this post for one way to check that.

And what did I do? I found the problem and reported it to a professional, hoping for help.
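A quick way to test the "many copies" hypothesis (my illustration, not from the original posts) is base R's tracemem(), which prints a message every time an object is duplicated; the data frame here is just a stand-in:

## base R copies a data.frame on modification (copy-on-modify)
df <- data.frame(a = runif(1e6), b = runif(1e6))
tracemem(df)             # report every duplication of df
df$a <- df$a * 2         # prints a tracemem[...] line: the table was copied

## data.table updates columns by reference, without copying
library(data.table)
DT <- as.data.table(df)
tracemem(DT)
DT[, a := a * 2]         # no tracemem output: modified in place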

 
Maxim Dmitrievsky #:

What does this code even do? To make it faster, you should convert all columns to the same data type (float32; float16 is not needed, it would be slower) and calculate the coRRelation through fast arrays,

if we are talking about really fixing the kaRma.

As far as I understand, R has no concept of different data types (int, float, etc.) at all. And even if it did, that would reduce memory usage but would hardly affect the speed. On video cards - yes, there would be a gain.

 
Aleksey Vyazmikin #:

As far as I understand, R has no concept of different data types (int, float, etc.) at all. And even if it did, that would reduce memory usage but would hardly affect the speed. On video cards - yes, there would be a gain.

Everything is there, and it affects the speed dramatically: dataframes are the slowest beasts, with the biggest overhead.

It's not about video cards, it's about understanding that nobody in their right mind computes such things through dataframes.
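For reference (my illustration, not part of the original post), base R does distinguish storage types, and both memory use and matrix-based computation reflect it:

x_int <- 1:1e6                  # integer vector, 4 bytes per element
x_dbl <- as.numeric(x_int)      # double vector, 8 bytes per element
typeof(x_int); typeof(x_dbl)    # "integer"  "double"
object.size(x_int)              # about 4 MB
object.size(x_dbl)              # about 8 MB

## the same correlation screening on a plain double matrix instead of a
## data.frame avoids the per-column coercion and indexing overhead
m <- matrix(runif(1e6), ncol = 100)
system.time(cor(m))             # a single call into compiled code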

 
Aleksey Vyazmikin #:

A hint: is it really necessary to use vectors of 100,000 observations to see the correlation between them?

I am looking for highly correlated vectors, i.e. with correlation greater than 0.9.
 
You're welcome
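As an aside (my sketch, not from the thread): if the goal is only to flag pairs with correlation above 0.9, a rough pre-filter on a random subsample is usually enough, with the survivors re-checked on the full data. The names, the subsample size and the looser 0.85 cut-off below are assumptions:

set.seed(1)
sub <- dt[sample(nrow(dt), 1e4), ]        # 10 000 random rows instead of 100 000
cm_sub <- cor(as.matrix(sub))

## candidate pairs from the subsample (looser 0.85 cut-off to be safe),
## to be re-checked on the full sample before dropping columns
cand <- which(abs(cm_sub) > 0.85 & upper.tri(cm_sub), arr.ind = TRUE)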
 
Aleksey Vyazmikin #:

Your script has been running for more than a day and has not yet created a single file with the screening results. I don't know - maybe it's time to stop it?

It depends on the hardware and the sample size. If your processor is multi-core, parallelise the execution. Below is a parallel variant.

##---- parallel --------------------------
## dt, cor.test.range, get.findCor() and patch come from the earlier (serial) script
library("doFuture")
registerDoFuture()
plan(multisession)   # one worker session per available core
require(foreach)

## screen the columns for every correlation threshold in parallel
bench::bench_time(
  foreach(i = 1:length(cor.test.range)) %dopar% {
    get.findCor(dt, cor.coef = cor.test.range[i])
  } -> res
)
#  process     real
# 140.62 ms    2.95 m

## write each result to its own CSV file
bench::bench_time(
  for (i in 1:length(cor.test.range)) {
    paste0("train1_", cor.test.range[i] * 10, ".csv") %>%
      paste0(patch, .) %>% fwrite(res[[i]], .)
  }
)
#  process    real
#   156 ms    157 ms

Roughly four times faster than the serial version. Hardware and software:

sessionInfo()
#  AMD FX-8370 Eight-Core Processor
#  R version 4.1.3 (2022-03-10)
#  Platform: x86_64-w64-mingw32/x64 (64-bit)
#  Running under: Windows 10 x64 (build 19044)
#
#  Matrix products: default
#
#  locale:
#     [1] LC_COLLATE=Russian_Russia.1251  LC_CTYPE=Russian_Russia.1251    LC_MONETARY=Russian_Russia.1251
# [4] LC_NUMERIC=C                    LC_TIME=Russian_Russia.1251
#
#  attached base packages:
#     [1] stats     graphics  grDevices utils     datasets  methods   base
#
#  other attached packages:
#     [1] doFuture_0.12.2 future_1.28.0   foreach_1.5.2   fstcore_0.9.12  tidyft_0.4.5
#
#  loaded via a namespace (and not attached):
#     [1] Rcpp_1.0.9        codetools_0.2-18  listenv_0.8.0     digest_0.6.30     parallelly_1.32.1 magrittr_2.0.3
# [7] bench_1.1.2       stringi_1.7.8     data.table_1.14.4 fst_0.9.8         iterators_1.0.14  tools_4.1.3
# [13] stringr_1.4.1     import_1.3.0.9003 parallel_4.1.3    compiler_4.1.3    globals_0.16.1

Good luck
