Cloud sync errors - page 2

 
What is the difference between 10 minutes and 20 minutes? None. 10 minutes is extremely long for a single call. You want to test a badly written EA? No problem. But don't push your problems onto the public Cloud servers. Other users want to use the Cloud servers too. You can use local agents and remote agents without any restrictions. You are welcome.
 
cowil:

Hi Stringo,

Firstly, thanks for the info.

However, I'm interested in MetaQuotes reasoning for this. If a large amount of "Every Tick" data is used (say for instance, 2003.1.1 -> 2013.1.1) and the Expert being optimised is reasonably complicated, it'll often take longer than 10 minutes for a single optimisation iteration to occur. Is there any specific reason that MetaQuotes chose a period of 10 minutes as a timeout? Also, is there any way for the cloud user to increase this timeout or has this been "hard wired" by MetaQuotes?

stringo:
An endless loop is detected on Cloud agents only, if one of the calls (OnInit, OnDeinit, OnTick, OnTimer etc.) runs for more than 10 minutes.
It's not a single optimization, it's a single call to a function.
 
angevoyageur:
It's not a single optimization, it's a single call to a function.

Ah - my mistake - I somehow had it in my head that we were talking about a single optimisation iteration, rather than a single call to an event handler (even though Stringo had specifically mentioned a single event handler call). A single call to an event handler that lasts longer than 10 minutes would indeed be ridiculous. My humblest apologies - must have had a brain fade - time to rest the brain.  :)

Mmmmm - so there must be something odd happening within my Expert that's causing OnTick() to occasionally take longer than 10 minutes to complete a call. Time to start digging... 

Anyway, thanks again for your help guys! 

 
angevoyageur:
It's not a single optimization, it's a single call to a function.
Exactly. A single call cannot run longer than 10 minutes on any of the Cloud agents.
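
One way to check whether a single handler call gets anywhere near that limit is to time each call. The following is only a minimal sketch under stated assumptions: `InpSlowCallMs`, `g_worstCallUs` and `RunStrategy()` are hypothetical names that do not come from this thread, and it assumes `GetMicrosecondCount()` reflects real elapsed time inside the tester. Since `Print()` output is not produced during optimisation, it is meant for a single test run on a local agent.

```mql5
// Hypothetical threshold: report any OnTick call slower than this (milliseconds).
input uint InpSlowCallMs = 1000;

// Slowest OnTick call seen so far, in microseconds.
ulong g_worstCallUs = 0;

// Placeholder for the Expert's real per-tick logic (hypothetical name).
void RunStrategy()
  {
  }

void OnTick()
  {
   // Measure how long this single call takes.
   ulong startUs = GetMicrosecondCount();
   RunStrategy();
   ulong elapsedUs = GetMicrosecondCount() - startUs;

   // Remember the worst case for the summary printed in OnDeinit().
   if(elapsedUs > g_worstCallUs)
      g_worstCallUs = elapsedUs;

   // Log calls that exceed the threshold, with the simulated time of the tick.
   if(elapsedUs > (ulong)InpSlowCallMs * 1000)
      PrintFormat("Slow OnTick: %I64u us at %s",
                  elapsedUs,
                  TimeToString(TimeCurrent(), TIME_DATE | TIME_SECONDS));
  }

void OnDeinit(const int reason)
  {
   PrintFormat("Slowest OnTick call: %I64u us", g_worstCallUs);
  }
```

The same pattern could be wrapped around the body of OnInit() if indicator handle creation is the suspected slow step.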
 

Hi,

Still digging but struggling to find anything. The fact that my Expert optimises flawlessly on my local Agents (i.e. none of the progress percentages of my Agents pause or stop at any time which I would have thought would be the case if there was some sort of endless loop in my OnTick() function that lasted at least 10 minutes or more) really makes it difficult. 

One thing I am curious about - what does the PR number at the end of the error message indicate (i.e. ".... expert rejected by MQL5 Cloud Network in 600 sec (PR116)")? Can anyone shed any light on this?

Thanks in advance for your help with this. 

 
cowil:

...

PR is the performance rating of an agent, calculated according to a special unified method. The higher an agent's PR, the faster it completes its task and, as a result, the higher its rental price per unit of time.
See here for more information.
Questions Concerning Payment for Participation in the MQL5 Cloud Network
  • cloud.mql5.com
Questions concerning payment for participation in the MQL5 Cloud Network - distributed computing network
 
angevoyageur:
See here for more information.
Ah - that certainly explains things. Thanks for your help!
 

Hi,

Well, after spending many hours over the weekend examining my code, I couldn't find anywhere within my Expert's code where the possibility of an endless loop could arise. And in the process, I also became more and more convinced that if my Expert had had endless loop issues, I should have seen this become apparent when using my local agents to optimise my Expert. As I mentioned above, my local agents don't pause at any stage during an optimisation of my Expert - let alone for a period of ten minutes or more.

After becoming convinced that the issues didn't arise from my Expert, I started to examine the other alternatives. The only logical alternatives that I could see were that there were issues with either the agents themselves (i.e. bugs) or problems with the boxes they were running on. It looks like the second of these alternatives is the culprit.

From what I can work out, people are running cloud agents on all sorts of boxes. I'm also presuming that these cloud agents are all Windows based. The reason I mention this is that my own personal experience of consumer versions of Windows is that they are notoriously unstable when thrashed for any length of time, tending to slow down or even seize up when saddled with any serious processing demands.

The Optimisations I've been attempting to perform concern a reasonably complex Expert being run on 6-7 years of "Every Tick" data - i.e. requiring reasonable processing and memory demands. I suspected that the agents in the cloud that were taking on this task were insufficiently spec'ed - especially considering they would be Windows boxes.

So I put the following line of code in my OnInit() event handler:

    // Check optimisation agent stats...
    if (MQL5InfoInteger(MQL5_OPTIMIZATION) && TerminalInfoInteger(TERMINAL_MEMORY_PHYSICAL) < 32000)
        return(INIT_AGENT_NOT_SUITABLE);

The reason I used TERMINAL_MEMORY_PHYSICAL is that the other memory options, TERMINAL_MEMORY_TOTAL and TERMINAL_MEMORY_AVAILABLE, aren't much use, as they only provide you with the total user-mode virtual address space of the host's processor (i.e. 4GB for a 32-bit processor or 8TB for a 64-bit processor). I can't imagine any 64-bit machines out there with 8TB of memory - at least, not yet. :) TERMINAL_CPU_CORES was another one I considered, but I decided in the end to just go with testing for memory, as I would assume that any box with a decent amount of memory would be decently spec'ed in all the other important areas.

And guess what - no more problems! All my optimisations are now running fine. :) 
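
For anyone who wants to experiment with the threshold rather than hard-code it, the check above could be sketched with configurable limits. This is only an illustration, not the code actually used in this thread: the input names `InpMinAgentMemoryMb` and `InpMinAgentCores` and their defaults are hypothetical, and the core-count test is simply the TERMINAL_CPU_CORES alternative mentioned above. TERMINAL_MEMORY_PHYSICAL is reported in MB.

```mql5
// Hypothetical input names and defaults; tune them to the Expert's real needs.
input int InpMinAgentMemoryMb = 32000; // minimum physical memory, in MB
input int InpMinAgentCores    = 0;     // minimum CPU cores; 0 disables this check

int OnInit()
  {
   // Apply the filter only during optimisation, as in the original check.
   if(MQL5InfoInteger(MQL5_OPTIMIZATION))
     {
      long memoryMb = TerminalInfoInteger(TERMINAL_MEMORY_PHYSICAL);
      long cores    = TerminalInfoInteger(TERMINAL_CPU_CORES);

      // Reject agents that fall below either limit.
      if(memoryMb < InpMinAgentMemoryMb || cores < InpMinAgentCores)
         return(INIT_AGENT_NOT_SUITABLE);
     }

   return(INIT_SUCCEEDED);
  }
```

Lowering `InpMinAgentMemoryMb` step by step would show how strict the filter needs to be before the timeouts come back.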

 


 
cowil:

...

 


This sounds like a great idea and I'm thankful for that hint.

However, 3 things about that:

1) As I mentioned above, I also have the problem of the "endless" loop, but since I understood from this thread that "endless loop" is just the best guess for "one event took longer than ten minutes" I accept that it might be my code. I use quite complex indicators, and since (at least I think so) they calculate their whole history when their handle is being created, this might (on slow computers) take more than ten minutes.

2) However! Usually my cloud crashed after 10-15 minutes. But last night, it worked perfectly for 8 hours. Not a single crash, although I didn't change the code at all. Weird!

3) And most important, because it relates to your approach: when you reject an agent based on its memory, the agent (and with it the whole cloud) doesn't crash, I get that. But I don't think that a more powerful machine will try the same parameter set again, so you basically lose optimization datapoints, am I correct? Would you say this is the price we'll have to pay?


I'll be curious to see if my agents still work once I'm back from work...

 
cowil:
...
How many agents are available when you reject all those with less than 32 GB of RAM?