Machine learning in trading: theory, practice, trading and more

Alexey Burnakov
2979

Good afternoon, everyone,

I know there are machine learning and statistics enthusiasts on this forum. I suggest we use this thread to discuss this interesting field, share what we know and learn from one another (no trolling, please).

For beginners and others alike, there is a good theoretical resource in Russian: https://www.machinelearning.ru/

A small literature review on methods for the selection of informative features: https://habrahabr.ru/post/264915/

I propose problem number one. I will post its solution later. SanSanych has already seen it, so please don't give away the answer.

Introduction: To build a trading algorithm, you need to know which factors will serve as the basis for predicting the price, the trend, or the direction in which to open a trade. Selecting such factors is a hard and endlessly complex task.

Attached is an archive with an artificial csv dataset I made.

The data contains 20 variables prefixed with input_, and one rightmost variable output.

The output variable depends on some subset of the input variables (the subset may contain from 1 to 20 inputs).

Problem: Using any machine learning method, select the input variables that determine the state of the output variable on the existing data.

Post the solution here in the form: input_2, input_19, input_5 (an example). You can also describe the dependency you found between the inputs and the output variable.
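For anyone unsure where to start, here is a minimal sketch of one possible screening approach: rank each input by its estimated mutual information with the output. It is not the intended solution to the problem. The dataset below is a hypothetical stand-in for the attached CSV (discrete inputs, with the output depending only on input_3 and input_7, both chosen arbitrarily), and the helper names `mutual_information`, `make_dataset` and `rank_inputs` are made up for illustration.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def make_dataset(n_rows=2000, n_inputs=20, seed=42):
    """Hypothetical stand-in for the attached CSV (an assumption, not the real data).

    Every input takes values 0, 1 or 2; the output depends only on
    input_3 and input_7, the other 18 inputs are pure noise.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        row = {f"input_{i}": rng.randint(0, 2) for i in range(1, n_inputs + 1)}
        row["output"] = int(row["input_3"] + row["input_7"] >= 3)
        rows.append(row)
    return rows


def rank_inputs(rows):
    """Sort the inputs by mutual information with the output, highest first."""
    ys = [r["output"] for r in rows]
    names = [k for k in rows[0] if k.startswith("input_")]
    scores = {k: mutual_information([r[k] for r in rows], ys) for k in names}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


rows = make_dataset()
ranking = rank_inputs(rows)
for name, score in ranking[:5]:
    print(f"{name}: {score:.4f} bits")
```

A one-variable-at-a-time score like this cannot see dependencies that only show up when several inputs are combined (an XOR of two inputs, for example), and continuous inputs would have to be binned first, so treat it as a first pass only.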

Well done to whoever manages it. In return, I will post the full solution and an explanation.

Alexey

Alexey Burnakov
2979

Deus Ex Machina.

These are the words that open the pages of many years' worth of philosophical treatises.

So, no one wants to do any machine learning?

[Deleted]  
Alexey Burnakov:

Deus Ex Machina.

These are the words that open the pages of many years' worth of philosophical treatises.

So, no one wants to do any machine learning?

Every trade carries risk and other conditions. Machine learning uses old data, that is, it operates on something that no longer exists.
Alexey Burnakov
2979
yerlan Imangeldinov:
Every trade carries risk and other conditions. Machine learning uses old data, that is, it operates on something that no longer exists.

More precisely, it operates on what existed before.

And it looks for a stable dependency in that data.

That is exactly what we are looking for.

[Deleted]  
Alexey Burnakov:

More precisely, it operates on what existed before.

And it looks for a stable dependency in that data.

That is exactly what we are looking for.

That is the weakness: the market itself keeps learning (Soros's reflexivity), so it is better not to rely on the old data.
Dmitry Fedoseev
67374
yerlan Imangeldinov:
Every trade carries risk and other conditions. Machine learning uses old data, that is, it operates on something that no longer exists.
And do you have new data? So you don't even look at the chart, since it shows old data? Is that right?
Alexey Burnakov
2979
Dmitry Fedoseev:
And do you have new data? So you don't even look at the chart, since it shows old data? Is that right?
You took the words right out of my mouth.
Alexey Burnakov
2979

Anyway, here goes. To liven up the thread a bit, I promise to transfer 5 credits to whoever solves the given problem correctly.

All you need to do is post the set of informative inputs.

The community gave me those credits for activity on the forum. I will return them to the system, and in exchange we get an interesting discussion.

Alexey

Vladimir Perervenko
5099

The stated topic, machine learning, is important, complex, and huge. Judging by the first post, you want to start with one of the important preparatory steps, "Evaluation and choice of predictors". What do you want to solve or demonstrate with this task? A new method, a technique, or something else?

The content and the title of the thread do not match.

State the goal clearly and perhaps people will take an interest.

Few people have free time to solve problems with unclear goals.

Good luck.

СанСаныч Фоменко
7131
I have no idea what to do with it:
Every trade carries risk and other conditions. Machine learning uses old data, i.e. it operates on something that no longer exists.

We always learn from the past.

We have been looking at charts for centuries. Now we see "three soldiers", now we see "head and shoulders". We have seen so many of these figures, we believe in them, and we trade on them...

But what if the task is posed as follows:

1. How do we find such figures automatically, not for all charts but for one particular currency pair, and only the figures that occurred recently, not three centuries ago in Japanese rice trading?

2. What is the initial data on which we automatically search for such figures (patterns)?

To answer the first question, let us consider an algorithm called "random forest". The algorithm takes 5, 10, 100, 200 ... input variables. For each bar it takes the full set of variable values at that point in time and searches for a combination of input variables that, on the historical data, corresponds to a definite result, for example a BUY order. It finds another set of combinations for the other order, SELL. A separate tree corresponds to each such set. Experience shows that the algorithm finds 200-300 trees for an input set of 18000 bars (about 3 years). This is the set of patterns, near-analogues of "head and shoulders" and whole companies of soldiers.

The problem with this algorithm is that such trees can pick up specifics that will not recur in the future. On this forum that is called "superfitting"; in machine learning it is called "overfitting". It is known that any large set of input variables can be divided into two parts: variables related to the output variable, and variables unrelated to it, i.e. noise. So Burnakov is trying to weed out the ones that are irrelevant to the output.
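The effect of noise variables can be seen in a toy experiment. The sketch below is not a random forest: it is a deliberately naive lookup-table "pattern memoriser" that stores the majority order for every combination of variable values it has seen. The bars are synthetic and hypothetical (one binary signal that agrees with the BUY/SELL label 80% of the time, plus eight pure-noise binary variables), and all function names are made up for illustration.

```python
import random
from collections import Counter, defaultdict


def make_bars(n, rng, n_noise=8):
    """Hypothetical bars: column 0 is an informative binary signal,
    columns 1..n_noise are pure noise. The label agrees with the
    signal 80% of the time."""
    bars = []
    for _ in range(n):
        signal = rng.randint(0, 1)
        noise = tuple(rng.randint(0, 1) for _ in range(n_noise))
        follows_signal = rng.random() < 0.8
        buy = (signal == 1) if follows_signal else (signal == 0)
        bars.append(((signal,) + noise, "BUY" if buy else "SELL"))
    return bars


def fit_lookup(bars, columns):
    """Memorise the majority label for each observed combination of columns."""
    votes = defaultdict(Counter)
    for features, label in bars:
        key = tuple(features[c] for c in columns)
        votes[key][label] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}


def accuracy(model, bars, columns, default="BUY"):
    """Share of bars predicted correctly; unseen patterns fall back to default."""
    hits = 0
    for features, label in bars:
        key = tuple(features[c] for c in columns)
        hits += model.get(key, default) == label
    return hits / len(bars)


rng = random.Random(0)
train, test = make_bars(300, rng), make_bars(3000, rng)

results = {}
for name, cols in (("all variables", list(range(9))), ("signal only", [0])):
    model = fit_lookup(train, cols)
    results[name] = (accuracy(model, train, cols), accuracy(model, test, cols))
    print(f"{name}: train={results[name][0]:.2f} test={results[name][1]:.2f}")
```

With all nine variables the memoriser should fit the training bars almost perfectly and do much worse on new bars, because most of its "patterns" are accidents of the noise columns. With the signal column alone, training and test accuracy should stay close to each other.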

PS.

When building a trend-following trading system (BUY, SELL), variables of any kind belong to the noise!

Alexey Burnakov
2979
Vladimir Perervenko:

Judging by the first post, you want to start with one of the important preparatory steps, "Evaluation and choice of predictors". What do you want to solve or demonstrate with this task? A new method, a technique, or something else?

The content and the title of the thread do not match.

State the goal clearly and perhaps people will take an interest.

Few people have free time to solve problems with unclear goals.


Ok.

If someone solves it, or at least comes close to the right solution (that is, if the thread stays alive), then I will:

post the correct solution, i.e. the algorithm that generated the dataset;

explain why a number of other "predictor evaluation and selection" algorithms failed;

post my own method, which solves such problems robustly and sensitively: I will give the theory and post the code in R.

The aim is to enrich our mutual "understanding" of machine learning tasks.
