Declarative transformation of JSON drives unified processing of cryptocurrency exchanges' APIs

7 October 2018, 16:15
Stanislav Korotky
Today we can easily use many web services, including (but not limited to) the APIs of numerous cryptocurrency exchanges. Technically, they rely on widely adopted standards and work with semantically similar objects (currencies, tickers, orders, trades, quotes, etc.), but the data structures vary significantly between services. In particular, almost all modern APIs use the JSON format, yet every service packs its data in its own non-standard way, with a different hierarchy and different fields. When dealing with these APIs, it would be great to process them in a unified way. For cryptocurrency exchanges, the ccxt library attempts to do exactly that. Its loose port to MQL5, CCXTMT, introduces a special declarative JSON-based language to transform JSON into JSON ;-). Let us review, step by step, what the language can do.

JSON transformation


First of all, it should be noted that JSON is processed in one single pass as a stream (no look back, no look forward).

All transformations are written as key-value pairs, where the left-hand side (key) denotes target JSON (i.e. result), and the right-hand side (value) specifies some location(s) in the source JSON.

The simplest transformation is renaming a field. For example, one may need to take a 'timestamp' from the source JSON and save it under a different name, 'datetime', in the resulting JSON. This is handled by the rule:

{
  ...,
  'datetime' : 'timestamp',
  ...
}

Both the left and right names can be complex, including paths (let's call them "JSON selectors" by analogy with CSS selectors or XPath expressions):

{
  ...,
  'thisobject.datetime' : 'thatstruct.part.timestamp',
  ...
}
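To make the effect of such a path-based rule concrete, here is a minimal Python sketch of an equivalent copy. It is only an illustration, not CCXTMT code: the helpers get_path and set_path are hypothetical names, and it assumes that missing intermediate objects are created in the result and that a missing source field yields null.

```python
from typing import Any, Dict

def get_path(source: Dict[str, Any], selector: str) -> Any:
    """Follow a dotted selector such as 'thatstruct.part.timestamp'."""
    node: Any = source
    for key in selector.split("."):
        if not isinstance(node, dict) or key not in node:
            return None  # assumption: a missing field evaluates to null
        node = node[key]
    return node

def set_path(target: Dict[str, Any], selector: str, value: Any) -> None:
    """Store a value under a dotted selector, creating objects on the way."""
    *parents, leaf = selector.split(".")
    node = target
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value

source = {"thatstruct": {"part": {"timestamp": 1538928900000}}}
result: Dict[str, Any] = {}

# Equivalent of the rule 'thisobject.datetime' : 'thatstruct.part.timestamp'
set_path(result, "thisobject.datetime", get_path(source, "thatstruct.part.timestamp"))
print(result)
```

The result holds the same value under the new path, nested inside 'thisobject'.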

During the transfer some additional processing may take place. In fact, the right-hand side is an expression supporting all common operations and many embedded functions. For example, the following rule will adjust the time with a difference between the client and the server, and then convert the result to a datetime string:

{
  ...,
  'datetime' : 'ISO8601(timestamp + #.options.timeDifference)',
  ...
}

Here '#.options.timeDifference' is a reference to a specific property in the CCXTMT JSON object that holds the rules (# means the root). Of course, the property must be set first, either specified in the JSON file itself or filled from data received from the server. If it is not, the value is just null (0).
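As a rough Python sketch of what this rule computes: the code below assumes that the timestamp is a Unix time in milliseconds, that timeDifference is also in milliseconds and is simply added, and that the output uses a UTC string with a 'Z' suffix. None of these details is confirmed above; the function names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Any, Dict

def iso8601(ms: int) -> str:
    """Format a Unix time in milliseconds as an ISO 8601 UTC string."""
    seconds, millis = divmod(ms, 1000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%S") + f".{millis:03d}Z"

def datetime_rule(source: Dict[str, Any], rules: Dict[str, Any]) -> str:
    """Mimic 'ISO8601(timestamp + #.options.timeDifference)'."""
    # An unset option behaves like null, which is treated as 0 here.
    diff = rules.get("options", {}).get("timeDifference") or 0
    return iso8601(source["timestamp"] + diff)

source = {"timestamp": 1538928900000}
adjusted = datetime_rule(source, {"options": {"timeDifference": 1500}})
unadjusted = datetime_rule(source, {})
print(adjusted)    # 2018-10-07T16:15:01.500Z
print(unadjusted)  # 2018-10-07T16:15:00.000Z
```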

Some fields in the source JSON can be arrays. To convert them into an object it's possible to write something like this:

{
  ...,
  'first' : 'array[0]',
  'second' : 'array[1]',
  'third' : 'array[2]',
  ...
}

The reverse transformation, from an object to an array, is written just as you would expect:

{
  ...,
  'array[0]' : 'first',
  'array[1]' : 'second',
  'array[2]' : 'third',
  ...
}
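The two index-based mappings above can be pictured with plain Python containers. This is just an illustration of the data movement, with made-up sample values; it says nothing about how CCXTMT evaluates the rules.

```python
names = ["first", "second", "third"]

# 'first' : 'array[0]', 'second' : 'array[1]', 'third' : 'array[2]'
source = {"array": [10, 20, 30]}
as_object = {name: source["array"][index] for index, name in enumerate(names)}

# 'array[0]' : 'first', 'array[1]' : 'second', 'array[2]' : 'third'
as_array = {"array": [as_object[name] for name in names]}

print(as_object)  # {'first': 10, 'second': 20, 'third': 30}
print(as_array)   # {'array': [10, 20, 30]}
```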

To transfer arrays as a whole we can skip the indices:

{
  ...,
  'buys[]' : 'asks[]',
  'sells[]' : 'bids[]',
  ...
}

This will copy arrays 'asks' and 'bids' into arrays 'buys' and 'sells' respectively, preserving the order of elements.

If necessary, you can apply specific indexing:

{
  ...,
  'buys[5]' : 'asks[0]',
  'buys[4]' : 'asks[2]',
  'buys[3]' : 'asks[4]',
  ...
}

If we need to copy only several elements from an array, then do it like this:

{
  ...,
  'OHLCV[]' : '#0 < 6 ? array[]',
  ...
}

Here we see several new syntactic elements. #0 is the iterator variable used internally to traverse the array. The condition in front of '?' is true while the iterator is less than 6, that is, for the first 6 elements (indices 0 to 5). While it is true, the expression evaluates to the part after the '?', that is, the array elements. When the condition is false, the expression evaluates to null and has no effect. BTW, we can add an alternative branch that provides something instead of null:

{
  ...,
  'OHLCV[]' : '#0 < 6 ? array[] : ToDouble(array[])',
  ...
}
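In Python terms, the two conditional rules behave roughly like the functions below. The sketch assumes a zero-based iterator and that a null result simply adds nothing to the target array; the sample candle values are invented.

```python
from typing import Any, List

def take_while_below(array: List[Any], limit: int) -> List[Any]:
    """Mimic '#0 < limit ? array[]': keep elements whose index is below limit."""
    return [value for index, value in enumerate(array) if index < limit]

def copy_or_convert(array: List[Any], limit: int) -> List[Any]:
    """Mimic '#0 < limit ? array[] : ToDouble(array[])'."""
    return [value if index < limit else float(value)
            for index, value in enumerate(array)]

candle = [1538928900000, "6600.1", "6610.0", "6590.5", "6605.2", "12.5",
          1538928959999, "82500.0"]

head = take_while_below(candle, 6)
full = copy_or_convert(candle, 6)
print(head)
print(full)
```

The first call drops everything from index 6 onward, while the second keeps all elements and converts the tail to floating-point numbers.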

Of course, arrays can be nested:

{
  ...,
  {'[][]' : 'Typename([][]) == \"string\" ? StringToDouble([][]) : [][]'},
  ...
}

Here we have not a JSON object but a JSON array of arrays. We copy it to a new array, converting the element types where appropriate. BTW, in such cases we have two iterators and can use them in expressions as #0 (first index) and #1 (second index). If you have an array of arrays of arrays, then there will be #2 as well, and so on.
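A Python equivalent of this nested rule could look as follows. It is a sketch of the described behaviour only: strings are converted to numbers, and all other elements are copied unchanged.

```python
from typing import Any, List

def convert_strings(matrix: List[List[Any]]) -> List[List[Any]]:
    """Copy an array of arrays, turning string cells into floats."""
    return [[float(cell) if isinstance(cell, str) else cell for cell in row]
            for row in matrix]

rows = [[1538928900000, "6600.1", "12.5"],
        [1538928960000, "6605.2", "3.25"]]
converted = convert_strings(rows)
print(converted)
```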

The notation '[][]' actually implies '[#0][#1]'. So it is possible to transpose the matrix by mentioning the indices in reversed order:

{
  ...,
  {'[#1][#0]' : '[][]'},
  ...
}
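The transposition can be sketched in Python in a streaming style, writing each incoming cell to the swapped position as it arrives. The way gaps are padded with None while the target grows is an assumption of this sketch, not something stated about CCXTMT.

```python
from typing import Any, List

def transpose(matrix: List[List[Any]]) -> List[List[Any]]:
    """Mimic '[#1][#0]' : '[][]' by storing cell (i, j) at position (j, i)."""
    result: List[List[Any]] = []
    for i, row in enumerate(matrix):       # i plays the role of #0
        for j, cell in enumerate(row):     # j plays the role of #1
            while len(result) <= j:
                result.append([])
            while len(result[j]) <= i:
                result[j].append(None)     # placeholder until the value arrives
            result[j][i] = cell
    return result

transposed = transpose([[1, 2, 3], [4, 5, 6]])
print(transposed)  # [[1, 4], [2, 5], [3, 6]]
```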

There is a special use case of arrays. If you have an array as the source and a scalar as the target, the algorithm will accumulate (sum up) values of array elements (or results of evaluation of an expression with arrays):

{
  ...,
  {'cost' : 'price[] * lot[]'},
  ...
}
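For the 'cost' rule above, the accumulation amounts to a sum of element-wise products. A small Python sketch with invented numbers:

```python
source = {"price": [100.0, 101.5, 99.0], "lot": [2.0, 1.0, 4.0]}

# 'cost' : 'price[] * lot[]' with a scalar target: the products are summed up.
cost = 0.0
for price, lot in zip(source["price"], source["lot"]):
    cost += price * lot

print(cost)  # 697.5
```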

So far we have considered only a single rule at a time. To combine rules into a set, we can write them in an object or an array. The object may look like this:

{
  ...,
  'response':
  {
    '[].id' : 'symbols[].symbol',
    '[].symbol' : 'CommonCurrencyCode(symbols[].baseAsset) + \"/\" + CommonCurrencyCode(symbols[].quoteAsset)',
    '[].base' : 'CommonCurrencyCode(symbols[].baseAsset)',
    '[].quote' : 'CommonCurrencyCode(symbols[].quoteAsset)',
    '[].baseId' : 'symbols[].baseAsset',
    '[].quoteId' : 'symbols[].quoteAsset',

    '[].precision.base' : 'symbols[].baseAssetPrecision',
    '[].precision.quote' : 'symbols[].quotePrecision',
    '[].precision.amount' : 'symbols[].baseAssetPrecision',
    '[].precision.price' : 'symbols[].quotePrecision',
    '[].active' : 'symbols[].status == \"TRADING\"',
    '[].lot' : '-1 * Log10(symbols[].baseAssetPrecision)',
    ...
  },
  ...
}

Here a single rule is an element of the object, the name of the element is the target, and the value is the source.
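To show what such a rule set produces, here is a hand-written Python function that performs the same mapping for a subset of the rules. It is only a sketch: common_currency_code stands in for CommonCurrencyCode with a hypothetical alias table, and the input sample is invented.

```python
from typing import Any, Dict, List

ALIASES = {"XBT": "BTC"}  # hypothetical alias table, for illustration only

def common_currency_code(code: str) -> str:
    """Stand-in for CommonCurrencyCode: map an exchange code to a common one."""
    return ALIASES.get(code, code)

def to_markets(source: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Build one unified market record per element of 'symbols'."""
    markets = []
    for item in source.get("symbols", []):
        base = common_currency_code(item["baseAsset"])
        quote = common_currency_code(item["quoteAsset"])
        markets.append({
            "id": item["symbol"],
            "symbol": base + "/" + quote,
            "base": base,
            "quote": quote,
            "baseId": item["baseAsset"],
            "quoteId": item["quoteAsset"],
            "precision": {
                "base": item["baseAssetPrecision"],
                "quote": item["quotePrecision"],
            },
            "active": item["status"] == "TRADING",
        })
    return markets

sample = {"symbols": [{"symbol": "ETHBTC", "baseAsset": "ETH", "quoteAsset": "BTC",
                       "baseAssetPrecision": 8, "quotePrecision": 8,
                       "status": "TRADING"}]}
markets = to_markets(sample)
print(markets[0]["symbol"], markets[0]["active"])
```

The declarative rules express the same thing without any loops: each '[].' on the left is filled from the matching 'symbols[].' on the right.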

If an array is used to combine rules, it looks like this:

{
  ...,
  'response':
  [
    {'[].timestamp' : 'ToInteger([].T || [].time)'},
    {'[].datetime' : 'ISO8601(ToInteger([].T || [].time))'},
    {'[].symbol' : '([].T || [].time) ? GetMarketSymbol(symbol)'},
    {'[].id' : 'ToString([].a || [].id)'}, 
    {'[].order' : 'ToString([].orderId)'},
    {'[].type' : 'null'}, 
    {'[].takerOrMaker' : 'Typename([].isMaker) == \"bool\" ? ([].isMaker ? \"maker\" : \"taker\")'}, 
    {'[].side' : 'Typename([].isBuyer) == \"bool\" ? ([].isBuyer ? \"buy\" : \"sell\")'}, 
    {'[].side' : 'Typename([].m) == \"bool\" ? ([].m ? \"sell\" : \"buy\")'},
    {'[].price' : 'ToDouble([].p || [].price)'},
    {'[].amount' : 'ToDouble([].q || [].qty)'}, 
    {'[].cost' : 'ToDouble([].p || [].price) * ToDouble([].q || [].qty)'},
    {'[].fee.cost' : 'ToDouble([].commission)'}, 
    {'[].fee.currency' : 'CommonCurrencyCode([].commissionAsset)'},
  ],
  ...
}

Here a single rule is a separate object, which is an element of the array. There are 2 differences between these 2 methods.

In the object, the order of execution of the rules is not defined, so it can be (and most likely will be) different from the order in which the rules are written. This is because objects are internally stored in a hash map (at least in CCXTMT). Also, objects cannot contain properties with the same name (key). In other words, if multiple fields have the same name, only one of them is actually stored in the object and all the other instances are discarded.

In the array the order of execution of the rules is preserved and guaranteed. Also the array can contain objects, which in turn contain properties with the same name. Just compare:

{
  ...,
  'response':
  {
    '[][]' : '[][0]',
    '[][]' : 'StringToDouble([][1])',
    '[][]' : 'StringToDouble([][2])',
    '[][]' : 'StringToDouble([][3])',
    '[][]' : 'StringToDouble([][4])',
    '[][]' : 'StringToDouble([][5])',
  },
  ...
}

This is incorrect: all six rules share the same key '[][]', so only one of them will be stored and processed. And this is correct:

{
  ...,
  'response':
  [
    {'[][]' : '[][0]'},
    {'[][]' : 'StringToDouble([][1])'},
    {'[][]' : 'StringToDouble([][2])'},
    {'[][]' : 'StringToDouble([][3])'},
    {'[][]' : 'StringToDouble([][4])'},
    {'[][]' : 'StringToDouble([][5])'},
  ],
  ...
}
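The same effect of duplicate keys is easy to reproduce with an ordinary JSON parser. The Python sketch below uses the standard json module and valid double-quoted JSON. Python happens to keep the last duplicate; which instance CCXTMT keeps is not stated here, so treat that detail as specific to this demo.

```python
import json

object_text = ('{"[][]": "[][0]", '
               '"[][]": "StringToDouble([][1])", '
               '"[][]": "StringToDouble([][2])"}')
array_text = ('[{"[][]": "[][0]"}, '
              '{"[][]": "StringToDouble([][1])"}, '
              '{"[][]": "StringToDouble([][2])"}]')

as_object = json.loads(object_text)  # duplicates collapse into one entry
as_pairs = json.loads(object_text, object_pairs_hook=list)  # raw pairs as parsed
as_array = json.loads(array_text)    # every rule survives, in order

print(len(as_object), len(as_pairs), len(as_array))  # 1 3 3
```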

BTW, the order of the rules in the '[][]' example is actually unimportant, because they will be triggered by the incoming data flow anyway. The order may be important, for example, if you find some value in the stream and store it in the internal options to use in subsequent rules, or if you initialize a field with a default value and then update it when an actual (but optional) value occurs in the input.
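The default-then-update pattern can be sketched in Python as an ordered list of rules applied to one record. This is a simplification: it ignores streaming, the "unknown" default is a hypothetical value, and it assumes that a rule evaluating to null leaves the target untouched.

```python
from typing import Any, Callable, Dict, List, Tuple

Rule = Tuple[str, Callable[[Dict[str, Any]], Any]]

def side_from_is_buyer(trade: Dict[str, Any]) -> Any:
    """Mimic 'Typename(isBuyer) == "bool" ? (isBuyer ? "buy" : "sell")'."""
    flag = trade.get("isBuyer")
    if isinstance(flag, bool):
        return "buy" if flag else "sell"
    return None  # no alternative branch: the rule yields null

rules: List[Rule] = [
    ("side", lambda trade: "unknown"),  # hypothetical default, applied first
    ("side", side_from_is_buyer),       # overrides the default when data exists
]

def apply_rules(rules: List[Rule], trade: Dict[str, Any]) -> Dict[str, Any]:
    result: Dict[str, Any] = {}
    for target, expression in rules:    # order is preserved, as in a JSON array
        value = expression(trade)
        if value is not None:           # null has no effect on the target
            result[target] = value
    return result

with_flag = apply_rules(rules, {"isBuyer": False})
without_flag = apply_rules(rules, {})
print(with_flag, without_flag)
```

If the two rules were swapped, the default would overwrite the real value, which is why such a pair belongs in an array rather than an object.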

Finally, let's look at one more interesting trick which may be helpful. Sometimes you need to place data under output keys also taken from the input. For this purpose one can duplicate JSON selectors from the right-hand side in the left-hand side between square brackets: 

  'response':
  {
    '[balances[].asset].free': 'balances[].asset ? balances[].free',
    '[balances[].asset].used': 'balances[].asset ? balances[].locked',
    '[balances[].asset].total': 'balances[].asset && balances[].free && balances[].locked ? balances[].free + balances[].locked',
    'free[balances[].asset]': 'balances[].asset ? balances[].free',
    'used[balances[].asset]': 'balances[].asset ? balances[].locked',
    'total[balances[].asset]': 'balances[].asset && balances[].free && balances[].locked ? balances[].free + balances[].locked',
  }

This way the resulting JSON will have properties defined by values from the source JSON, in this case by the various currency names mentioned in the input array 'balances[]'.
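Here is a Python sketch of the structure these balance rules build. It assumes numeric 'free' and 'locked' values and treats "present" as "not null", which may differ from how the expression language evaluates '&&'; the sample data is invented.

```python
from typing import Any, Dict

def to_balance(source: Dict[str, Any]) -> Dict[str, Any]:
    """Key the output by asset names found in the input 'balances' array."""
    result: Dict[str, Any] = {"free": {}, "used": {}, "total": {}}
    for entry in source.get("balances", []):
        asset = entry.get("asset")
        if not asset:
            continue  # 'balances[].asset ? ...' yields null without an asset
        free, locked = entry.get("free"), entry.get("locked")
        per_asset = result.setdefault(asset, {})
        if free is not None:
            per_asset["free"] = free            # '[balances[].asset].free'
            result["free"][asset] = free        # 'free[balances[].asset]'
        if locked is not None:
            per_asset["used"] = locked
            result["used"][asset] = locked
        if free is not None and locked is not None:
            per_asset["total"] = free + locked
            result["total"][asset] = free + locked
    return result

sample = {"balances": [{"asset": "BTC", "free": 0.5, "locked": 0.25},
                       {"asset": "ETH", "free": 2.0, "locked": 0.0}]}
balance = to_balance(sample)
print(balance["BTC"], balance["total"])
```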

More examples of JSON transformations can be found in the setup JSON files used by the CCXTMT library for unified processing of cryptocurrency exchanges' API data.

Where and how these JSON transformation rules are used is covered in the next blog post.

