Machine learning in trading: theory, models, practice and algo-trading - page 221

 
Andrey Dik:
Perhaps the most sensible and objective opinion of R that I have seen lately. Formulaic thinking is what long-time R fans suffer from, at least the ones on this forum: they all think alike, and if you ignore the nicknames you could mix them up, so similar is their reasoning.

No discussion of R seems to go by without you, although it is not at all clear WHAT you are doing here.

But since you are here, let me use you as an example.

You have developed a genetic algorithm; you know how to, perhaps better than anyone else in the world.

I don't know how to write genetic algorithms. Not because I am incapable of it, but because I don't need to.

Let me explain.

In a real trading system I use the rfe function (recursive feature elimination, i.e. backward selection of predictors, from caret). It does not give very good results. Should I use your genetic algorithm? A hundred times no, because the same caret has the gafs function (predictor selection by genetic algorithm). Besides that, caret also has another function, safs (predictor selection by simulated annealing), which is much more effective in MY PROBLEM THAN THE GENETIC ALGORITHM. I have one problem in my trading systems and three tools for it, and the effort of swapping one tool for another is a few letters. Moreover, and perhaps most importantly: ALL THREE FUNCTIONS ARE INTEGRATED WITH THE OTHER FUNCTIONS I NEED.

Programming a genetic algorithm is simply not a problem I have. Yet when solving my problems I can easily use various algorithms, the genetic one included. The real problem is predictor selection. It is computationally intensive, and besides the R functions listed above there are many tools for it that are NOT written in R.
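To make the "one problem, three interchangeable tools" point concrete, here is a minimal sketch in Python rather than R. It is not caret's rfe, gafs or safs: the three search routines are simplified stand-ins (plain greedy backward elimination, a toy genetic algorithm, a toy simulated annealing), and `select_predictors`, `METHODS`, `USEFUL` and `toy_score` are hypothetical names invented for the illustration. In real use the score would be something like a cross-validated model metric.

```python
import math
import random

def backward_elimination(n_features, score):
    """Greedy backward selection: drop one predictor at a time while the score improves."""
    current = frozenset(range(n_features))
    while current:
        # Best subset reachable by removing exactly one predictor.
        candidate = max((current - {f} for f in current), key=score)
        if score(candidate) <= score(current):
            break
        current = candidate
    return current

def genetic_search(n_features, score, pop_size=20, generations=40, p_mut=0.1, seed=0):
    """Toy genetic algorithm over predictor subsets."""
    rng = random.Random(seed)
    def random_subset():
        return frozenset(f for f in range(n_features) if rng.random() < 0.5)
    population = [frozenset(range(n_features))]
    population += [random_subset() for _ in range(pop_size - 1)]
    best = max(population, key=score)
    for _ in range(generations):
        # Truncation selection: the better half become parents.
        parents = sorted(population, key=score, reverse=True)[: pop_size // 2]
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover, then independent bit-flip mutation.
            child = {f for f in range(n_features) if f in (a if rng.random() < 0.5 else b)}
            child ^= {f for f in range(n_features) if rng.random() < p_mut}
            children.append(frozenset(child))
        population = children
        best = max(population + [best], key=score)
    return best

def simulated_annealing(n_features, score, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Toy simulated annealing: toggle one predictor per step, geometric cooling."""
    rng = random.Random(seed)
    current = frozenset(range(n_features))
    best = current
    for i in range(steps):
        temp = t_start * (t_end / t_start) ** (i / steps)
        flipped = current ^ {rng.randrange(n_features)}
        delta = score(flipped) - score(current)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current = flipped
        if score(current) > score(best):
            best = current
    return best

# Swapping the search strategy is a change of one short name.
METHODS = {"rfe": backward_elimination, "ga": genetic_search, "sa": simulated_annealing}

def select_predictors(method, n_features, score):
    return METHODS[method](n_features, score)

USEFUL = {0, 2, 5}  # hypothetical "truly informative" predictors

def toy_score(subset):
    """Assumed score: reward informative predictors, penalise subset size."""
    return len(subset & USEFUL) - 0.1 * len(subset)

for name in METHODS:
    chosen = select_predictors(name, 8, toy_score)
    print(name, sorted(chosen), round(toy_score(chosen), 2))
```

All three routines share the same signature and the same scoring function, which is what makes the swap cheap. The toy score is far simpler than real predictor selection, so the relative merits of the three methods on it say nothing about their merits on market data.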

 
J.B:

It depends on what kind of research. If it is a student's research to pass a term paper, then yes, it is much easier to take a function off the shelf, run it, and get a standard report with pretty graphs thrown in; no question about it.

If it is research by a quant engineer at a financial firm that takes money from the market rather than living on commissions or market-adjacent technology, the situation is different. As a rule, the right firms build their own trading infrastructure over about 5 years, with 95% of the necessary tools conveniently wrapped, and using them is not much harder than using R...

It's all clear and I agree with you, but you're talking about production. There are no quants here, and no people with infrastructure that took 5 years to build... Before that infrastructure existed there was initial research, and that is where most people here are: looking for ideas. Yet you keep talking about production...

as I see it, the scheme is

1) searching for a working idea

2) finding a working idea

3) the deepest research of the idea

4) making trade and other infrastructure for this idea (production)

Now you are contrasting the first step with the fourth, and that's nonsense, you know what I mean?

So for the first step, when you need to go through a bunch of ideas quickly, R is just the thing; for the fourth step it's C++, naturally.

 

SanSanych Fomenko:

I don't know how to write genetic algorithms. Not because I am incapable of it, but because I don't need to.

I once worked at a game-dev company for a year, and I remember it because at the preliminary interview there I was asked to write a bubble sort or something faster, without looking it up on the Internet of course. I wrote it and thought the task was childish, but then I was told that 70% of people can't do it. I think this is right: there is a set of basic algorithms that a specialist in a given field should know like 2+2=4, because it greatly increases, at a minimum, the ability to use them effectively, not to mention to modify them and create more effective analogues. A quant should be able to write an MLP, Bayes, kNN, a forest, etc. without peeking. That's the 2+2=4 for a quant.
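For reference, the interview task mentioned above fits in a dozen lines. This is a plain textbook bubble sort with the usual early-exit check, written in Python as an illustration; it is not the code from the interview.

```python
def bubble_sort(items):
    """Return a new sorted list using bubble sort (O(n^2) comparisons in the worst case)."""
    data = list(items)
    for end in range(len(data) - 1, 0, -1):
        swapped = False
        # After each pass the largest remaining element has bubbled up to index `end`.
        for i in range(end):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        if not swapped:  # no swaps means the list is already sorted
            break
    return data

print(bubble_sort([5, 2, 9, 1]))  # [1, 2, 5, 9]
```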

SanSanych Fomenko:

I use the rfe function (backward selection of predictors, from caret). It does not give very good results. There is the gafs function...

Well, yes, that is exactly how consumers talk: they have "tens of thousands of functions" that they don't understand and simply try out. Unfortunately, everything handed to you to try has already been tried, and there is "no fish" there. Especially in algotrading, all this wealth of functions is illusory. Algotrading by its nature cannot be mass-market; here you need to be able, and to love, to invent a new-generation bicycle. Otherwise the "fancy car" will only take you to the dump or the graveyard, and you won't be the one at the wheel)))

 
mytarmailS:

Yes, it's all clear and I agree with you, but you're talking about production. There are no quants here, and no people with infrastructure that took 5 years to build... Before that infrastructure existed there was initial research, and that is where most people here are: looking for ideas. Yet you keep talking about production...

as I see it, the scheme is

1) searching for a working idea

2) finding a working idea

3) the deepest research of the idea

4) making trade and other infrastructure for this idea (production)

Now you are contrasting the first step with the fourth, and that's nonsense, you know what I mean?

So for the first step, when you need to go through a bunch of ideas quickly, R is just the thing; for the fourth step it's C++, naturally.

Wrong scheme, because in algotrading many processes are intertwined and interdependent. You can't look for an idea in isolation from the data and from a deep understanding of the algorithms that process it. For example, "looking for ideas" on minute candlesticks of a currency pair is one thing; on the order log of the Russian market it is another; on order data from dozens of the world's leading exchanges plus providers of macroeconomic data it is a third, and so on. The same goes for processing: the idea and the toolkit are closely interconnected. An incremental approach makes sense, but not a "vacuum" one, where the idea is sought as a spherical horse and then implemented in C++. For example, what do you think of the idea in "Modeling high-frequency limit order book dynamics with support vector machines"? Try it and then tell me. I can only say that in the article it is also 2+2=4, while in reality everything is orders of magnitude more complicated.

 
J.B:

I once worked at a game-dev company for a year, and I remember it because at the preliminary interview there I was asked to write a bubble sort or something faster, without looking it up on the Internet of course. I wrote it and thought the task was childish, but then I was told that 70% of people can't do it. I think this is right: there is a set of basic algorithms that a specialist in a given field should know like 2+2=4, because it greatly increases, at a minimum, the ability to use them effectively, not to mention to modify them and create more effective analogues. A quant should be able to write an MLP, Bayes, kNN, a forest, etc. without peeking. That's the 2+2=4 for a quant.

I don't want to play detective, but it's strange that a pro like you, who even works at a quant fund, didn't know what cross-correlation is as recently as this year: https://www.mql5.com/ru/forum/71816

Time series lead/lag indicator (www.mql5.com)
 
J.B:

Wrong scheme, because in algotrading many processes are intertwined and interdependent. You can't look for an idea in isolation from the data and from a deep understanding of the algorithms that process it. For example, "looking for ideas" on minute candlesticks of a currency pair is one thing; on the order log of the Russian market it is another; on order data from dozens of the world's leading exchanges plus providers of macroeconomic data it is a third, and so on. The same goes for processing: the idea and the toolkit are closely interconnected. An incremental approach makes sense, but not a "vacuum" one, where the idea is sought as a spherical horse and then implemented in C++. For example, what do you think of the idea in "Modeling high-frequency limit order book dynamics with support vector machines"? Try it and then tell me. I can only say that in the article it is also 2+2=4, while in reality everything is orders of magnitude more complicated.

I agree about HFT, but I don't understand why you always cite it as an example. You know that such data is not available to anyone here; none of us has the money to buy order logs from the West and fast channels.

Why talk about it and make it an example?

 
mytarmailS:

I don't want to play detective, but it's strange that a pro like you, who works at a quant fund, didn't know what cross-correlation is as recently as this year: https://www.mql5.com/ru/forum/71816

Come on... I was solving quite a complex problem back then, sitting at home sick, and it was not convenient to ask colleagues. I had a month before starting work. I eventually figured it out myself, and some time later the respected Combinator named exactly the same solution. If you had faced such a problem you probably would not have solved it, because you need to know how cross-correlation between series is implemented at a low level and then guess how to use it. R has no ready-made "lagging series" function, and most importantly, you simply would not have posed such a problem))) So you are not a quant in a hedge fund.
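As a point of reference for what "cross-correlation at a low level" can mean, here is a minimal sketch that estimates the lead or lag between two equally long series by scanning Pearson correlation over a range of shifts. It is an illustration only, not the indicator from the linked thread; `pearson`, `cross_correlation` and `best_lag` are hypothetical helper names, and the test data is synthetic.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def cross_correlation(x, y, max_lag):
    """Map each lag k to the correlation of x[t] with y[t + k]."""
    result = {}
    for k in range(-max_lag, max_lag + 1):
        if k >= 0:
            a, b = x[: len(x) - k], y[k:]
        else:
            a, b = x[-k:], y[: len(y) + k]
        result[k] = pearson(a, b)
    return result

def best_lag(x, y, max_lag):
    """Lag with the largest absolute correlation; positive means y lags behind x."""
    cc = cross_correlation(x, y, max_lag)
    return max(cc, key=lambda k: abs(cc[k]))

# Synthetic check: y is x delayed by 3 steps (the first 3 values are padding).
rng = random.Random(42)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [0.0, 0.0, 0.0] + x[:-3]
print(best_lag(x, y, max_lag=10))
```

On real price series the raw levels are strongly autocorrelated, so a scan like this is usually run on returns or other differenced data; that preprocessing step is an assumption on my part and is not discussed in the thread.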

 

SanSanych Fomenko:


When promoting the R language as a means of solving trading tasks effectively, you constantly mention the ease of using its tools and the wide range of its capabilities. No one argues with that. However, your statements clearly show your ignorance of this language's principles.

Of course, you know all its function names, and you know which of them are needed to solve your tasks, but you're obviously completely ignorant of the mechanisms hidden behind these names. You do not know how they work. Moreover, you don't want to know it and dissuade others from doing it.

You are a user of R. You like this tool because it is easy to use, not because of its efficiency in solving concrete tasks, which you know nothing about and do not aspire to know.

Understand that a developer differs from the user precisely in his desire to understand the mechanism and achieve maximum efficiency.

This goal can be achieved only with absolute knowledge of the subject. The ease of creating something is not relevant here.

In fact, you suggest that developers forget about their work and become mere users: that they enjoy the ease of use of the tools you offer and completely forget about exploring how those tools work.

With this approach you can forget about efficiency, and developers cannot afford to forget about it.

If you begin to understand how the mechanisms hidden inside R work, you may find that they are not as perfect as they seem to an enthusiastic dilettante.

With this post I am trying to articulate the point of view opposed to yours, though I may be wrong in speaking for others. Still, that is how I see the problem.

 
mytarmailS:

I agree about HFT, but I don't understand why you always cite it as an example. You know that such data is not available to anyone here; none of us has the money to buy order logs from the West and fast channels.

Why talk about it and give it as an example?

Don't generalize. You can start with the order log (OL) of the Russian market. And HFT is the only thing that actually works: HFT and insider information.
 
J.B:

Come on... I was solving quite a complex problem back then, sitting at home sick, and it was not convenient to ask colleagues. I had a month before starting work. I eventually figured it out myself, and some time later the respected Combinator named exactly the same solution. If you had faced such a problem you probably would not have solved it, because you need to know how cross-correlation between series is implemented at a low level and then guess how to use it. R has no ready-made "lagging series" function, and most importantly, you simply would not have posed such a problem))) So you are not a quant in a hedge fund.

When I first heard about cross-correlation I had the same idea as you about comparing series (so you're not alone and not unique in this), and that was long before your thread, but I never got around to it...

J.B:
Don't generalize. You can start with the order log (OL) of the Russian market. And HFT is the only thing that actually works: HFT and insider information.

You may be right... but it seems to me that the available OLs are recorded at faster speeds than Plaza, so you test on one thing and the reality turns out different. I may be wrong, but that's the impression I got.
