Selecting the indicators to use is one of the most important and difficult aspects of building a successful strategy.

Not only are there thousands of different indicators, but most
indicators have numerous settings which amounts to virtually limitless
indicator combinations. Clearly testing every combination is not
possible, so many traders are left on their own selecting somewhat
random inputs to build their strategy.

Luckily this process, known as “feature selection”, is a very
well-researched topic in the machine learning world. In this post we’ll
use a popular technique known as a “wrapper method” that uses a support
vector machine (SVM) and a “hill climbing” search to find a robust
subset of indicators to use in your strategy.

Feature Selection with Machine Learning: Wrapper Method and Hill Climbing

Selecting from the near limitless possible combinations of
indicators to use in your strategy can be very daunting. However, this
is a problem that machine learning experts and data scientists have been
grappling with for a long time and have come up with a wide range of
tools and techniques to help you out.

One popular technique is a wrapper method. A wrapper method uses a
machine-learning algorithm to evaluate each subset of indicators. The
main advantage to this method, as opposed to evaluating each indicator
on its own, is we can find the relationships and dependencies between
indicators. For example, a 14-period RSI or a 30-period CCI alone may
not be great predictors, but combined could provide some valuable
insight.

The next question is how do you decide which subsets of indicators
to evaluate? If you are only looking at a few indicators then trying
every combination might be possible, but even looking through 15
different indicators for 3 to use in your strategy leads to over 450
different possible combinations! Other techniques either rank the
indicators based on a certain performance metric, like correlation, or
start with the full or empty set and add/subtract indicators one by one.
These type of searches are known as being “greedy”, meaning that they
can only search in one direction, e.g. going from strongest correlation
to weakest, but may miss some important relationships between
indicators.

One method that attempts to get around this is the known as a “hill
climbing” method. A hill climbing method starts with a random subset of
indicators and then evaluates all of the “neighboring” combinations,
selecting the best performing combination. While this does allow us to
find interesting relationships between indicators, there is a chance
that we get stuck in what’s called a “local maximum,” which means that
it may just be the best combination of neighbors but not necessarily the
best overall combination. To get around this, we can run the search
multiple times to see what patterns emerge in the top neighboring
subsets.

Even when using these techniques, we still need to define the
feature space to search from. This is where your domain experience comes
in. Now you may not know the best period to use, but you most likely
have a set of “favorite” indicators that you know and make sense to you
conceptually. You can give this process a huge head start by selecting
meaningful, relevant indicators that you understand.

In the name of simplicity, we’ll start with just 5 different and
commonly used base indicators: SMA, EMA, RSI, CCI and MACD.

We can use different periods and calculations based off these indicators to create features.

For example, here are 20 different "features":

We can see how even looking at only 5 indicators leads to a huge number of possible inputs to your strategy!

Let's get started building our model. We'll explore the usefulness
of the above 20 features in building a profitable strategy.

First, we need to install the necessary libraries:

install.packages(“FSelector”) # the libraryfor the hill climbing search
library(FSelector)
install.packages(“e1071”) # the libraryfor the support vector machine
library(e1071)
install.packages(“ggplot2”) # the libraryfor our plots
library(ggplot2)

Then split the data into our training and test sets. We’ll use ⅔ of
the data to build the model and reserve ⅓ to test out of sample.

We’ll then need to build the function that evaluates each subset
using the SVM. We’ll break our training set into another training and
test set to evaluate these models on data that was not used in the
building process. This is another check to make sure we are not
overfitting.

Interesting, so we can see that the distance between the price and a
5-period simple moving average (“Price_Minus_SMA_5”) appeared in 9 of
the 10 subsets, followed by the distance between the price in a 5-period
exponential moving average (“Price_Minus_EMA_5”) and the distance
between the price in a 50-period exponential moving average
(“Price_Minus_EMA_50”) in 8 of the 10 subsets. Since we are looking to
predict only the next bar, it makes sense that the shorter periods would
perform best but it is interesting that a simple moving average was
selected in more subsets than the exponential moving average.

Now let’s see how many indicators are in each subset:

summary(Replication_Test)

So the 5th and 8th iterations of our search each had top subsets
with only 6 indicators but while I tend to favor subsets with smaller
number of indicators, the 4th iteration was the smallest subset that
included our top feature, the “Price_Minus_SMA_5”, so let’s use that one
as our final model:

Replication_Test[ [ 4 ] ]

We’ll build the final model with the entire Training Set and test it over our yet-to-be-touched Validation set:

SVM_Final<-svm(Predicted_Class~PRICE_Minus_SMA_5+PRICE_Minus_SMA_30+PRICE_Minus_SMA_50+PRICE_Minus_EMA_30+CCI_5+CCI_30+MACD_Signal,data=Training, kernel="radial",cost=1,gamma=1/20)
SVM_Predictions_Final<-predict(SVM_Final,Validation,type="class") # Predict over the Validation set
Accuracy_Final<- Accuracy<-sum(SVM_Predictions_Final == Validation$Predicted_Class) /nrow(Validation)

Very nice, so over our out-of-sample validation set, we were able
to get 53% accuracy! While this does not seem like much, the point of
this process was to be able to find a robust subset of indicators. Now
we have a baseline of 53% accuracy that we can then use to further build
and improve our strategy.

With TRAIDE, we can then
use an ensemble of machine-learning algorithms to find the most
profitable entry signals based on these indicators.

## Selecting the indicators to use is one of the most important and difficult aspects of building a successful strategy.

Not only are there thousands of different indicators, but most indicators have numerous settings which amounts to virtually limitless indicator combinations. Clearly testing every combination is not possible, so many traders are left on their own selecting somewhat random inputs to build their strategy.

Luckily this process, known as “feature selection”, is a very well-researched topic in the machine learning world. In this post we’ll use a popular technique known as a “wrapper method” that uses a support vector machine (SVM) and a “hill climbing” search to find a robust subset of indicators to use in your strategy.

## Feature Selection with Machine Learning: Wrapper Method and Hill Climbing

Selecting from the near limitless possible combinations of indicators to use in your strategy can be very daunting. However, this is a problem that machine learning experts and data scientists have been grappling with for a long time and have come up with a wide range of tools and techniques to help you out.

One popular technique is a wrapper method. A wrapper method uses a machine-learning algorithm to evaluate each subset of indicators. The main advantage to this method, as opposed to evaluating each indicator on its own, is we can find the relationships and dependencies between indicators. For example, a 14-period RSI or a 30-period CCI alone may not be great predictors, but combined could provide some valuable insight.

The next question is how do you decide which subsets of indicators to evaluate? If you are only looking at a few indicators then trying every combination might be possible, but even looking through 15 different indicators for 3 to use in your strategy leads to over 450 different possible combinations! Other techniques either rank the indicators based on a certain performance metric, like correlation, or start with the full or empty set and add/subtract indicators one by one. These type of searches are known as being “greedy”, meaning that they can only search in one direction, e.g. going from strongest correlation to weakest, but may miss some important relationships between indicators.

One method that attempts to get around this is the known as a “hill climbing” method. A hill climbing method starts with a random subset of indicators and then evaluates all of the “neighboring” combinations, selecting the best performing combination. While this does allow us to find interesting relationships between indicators, there is a chance that we get stuck in what’s called a “local maximum,” which means that it may just be the best combination of neighbors but not necessarily the best overall combination. To get around this, we can run the search multiple times to see what patterns emerge in the top neighboring subsets.

For this article, we’ll use a support vector machine (which we previously used to build a strategy for the RSI) and repeat the hill climbing search 10 times to find the best subset.

## Selecting Indicators

Even when using these techniques, we still need to define the feature space to search from. This is where your domain experience comes in. Now you may not know the best period to use, but you most likely have a set of “favorite” indicators that you know and make sense to you conceptually. You can give this process a huge head start by selecting meaningful, relevant indicators that you understand.

In the name of simplicity, we’ll start with just 5 different and commonly used base indicators: SMA, EMA, RSI, CCI and MACD.

We can use different periods and calculations based off these indicators to create features.

For example, here are 20 different "features":

We can see how even looking at only 5 indicators leads to a huge number of possible inputs to your strategy!

Let's get started building our model. We'll explore the usefulness of the above 20 features in building a profitable strategy.

## Building Our Model

Now that we have our features we can

build our model in R(When working with an SVM, it is important to normalize the data: You can download the original data and the normalized data we use for inputs to the model here).First, we need to install the necessary libraries:

Then split the data into our training and test sets. We’ll use ⅔ of the data to build the model and reserve ⅓ to test out of sample.

We’ll then need to build the function that evaluates each subset using the SVM. We’ll break our training set into another training and test set to evaluate these models on data that was not used in the building process. This is another check to make sure we are not overfitting.

Now we can run the hill climbing search once to see what it finds:

But what we really care about is running it 10 times to help stabilize the results (this will take a couple minutes to run):

Let’s dive into the results and take a look at which indicators tended to be included in the best subsets:

Interesting, so we can see that the distance between the price and a 5-period simple moving average (“Price_Minus_SMA_5”) appeared in 9 of the 10 subsets, followed by the distance between the price in a 5-period exponential moving average (“Price_Minus_EMA_5”) and the distance between the price in a 50-period exponential moving average (“Price_Minus_EMA_50”) in 8 of the 10 subsets. Since we are looking to predict only the next bar, it makes sense that the shorter periods would perform best but it is interesting that a simple moving average was selected in more subsets than the exponential moving average.

Now let’s see how many indicators are in each subset:

So the 5th and 8th iterations of our search each had top subsets with only 6 indicators but while I tend to favor subsets with smaller number of indicators, the 4th iteration was the smallest subset that included our top feature, the “Price_Minus_SMA_5”, so let’s use that one as our final model:

We’ll build the final model with the entire Training Set and test it over our yet-to-be-touched Validation set:

Very nice, so over our out-of-sample validation set, we were able to get 53% accuracy! While this does not seem like much, the point of this process was to be able to find a robust subset of indicators. Now we have a baseline of 53% accuracy that we can then use to further build and improve our strategy.

With TRAIDE, we can then use an ensemble of machine-learning algorithms to find the most profitable entry signals based on these indicators.

Now it’s your turn to use machine learning to build your next strategy!