
Ideas For Extending Length of Timeseries History


Insufficient Length of Timeseries Data
When developing models (machine learning, statistical, linear, etc.), most practitioners prefer to use as much data history as possible, so that the model can be calibrated across a variety of trading environments and market regimes and is more robust as a result. With historical cryptocurrency data, the length of history available per exchange can be a challenge, because not all exchanges came into existence at the same time. Binance, for example, was not on the scene until 2017, so it is impossible to get BTC/USDT history from Binance before it existed! Other exchanges have been around longer (Gemini since 2014), so a user may wonder how to extend their Binance data using the Gemini dataset. Since this is a common problem, we will outline how to extend the Binance timeseries using Gemini's data, and how to bring the Gemini volumes into sync with Binance volumes.

Blending the Data
At first glance, the solution seems very easy: start with the Binance data, then add the older Gemini data in front of it. Very easy indeed. But most models will want to use the Volume field as a feature / variable, and the volumes on Binance and Gemini are drastically different. These differences reflect the different sizes of the user bases trading on each exchange. How might we resolve this? Or, said another way, "How can we convert Gemini volumes into Binance volumes?"

    1) Create a new column for the ratio between Binance volumes and Gemini volumes, covering the period where both exchanges have data
    2) For each time interval in that period, divide the Binance volume by the Gemini volume to arrive at a ratio. These ratios change daily (as volumes and participation change daily), so you will want to decide on a window of time of a month or longer
    3) Take a simple average of all the calculated ratios over the window period to come up with one ratio
    4) Multiply each Gemini volume by that ratio to obtain the proxied (Binance-equivalent) volume
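
The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each series is held as a plain dictionary mapping a date to that day's volume, the function names `volume_ratio` and `scale_volumes` are hypothetical, and the numbers are made up rather than real exchange data.

```python
from datetime import date
from statistics import mean


def volume_ratio(binance_vol, gemini_vol, start, end):
    """Average Binance/Gemini volume ratio over [start, end], inclusive.

    Both inputs map a date to that day's traded volume (assumed layout).
    Only dates present in both series with non-zero Gemini volume are used.
    """
    ratios = [
        binance_vol[d] / gemini_vol[d]           # step 2: per-interval ratio
        for d in sorted(binance_vol)
        if start <= d <= end and d in gemini_vol and gemini_vol[d] > 0
    ]
    if not ratios:
        raise ValueError("no overlapping observations in the window")
    return mean(ratios)                          # step 3: simple average


def scale_volumes(gemini_vol, ratio):
    """Step 4: convert Gemini volumes into Binance-equivalent volumes."""
    return {d: v * ratio for d, v in gemini_vol.items()}


# Made-up numbers for illustration only.
gemini = {
    date(2017, 8, 16): 100.0,
    date(2017, 8, 17): 200.0,
    date(2017, 8, 18): 50.0,
}
binance = {
    date(2017, 8, 17): 1000.0,
    date(2017, 8, 18): 150.0,
}

ratio = volume_ratio(binance, gemini, date(2017, 8, 17), date(2017, 8, 18))
proxied = scale_volumes(gemini, ratio)
print(ratio)                        # average of 5.0 and 3.0
print(proxied[date(2017, 8, 16)])   # Gemini volume scaled by the ratio
```

With real data the window would normally span a month or more rather than two days; the short window here just keeps the arithmetic easy to follow.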

    Congratulations, you've now extended your Bitcoin timeseries data history by 3+ years (and by a great many time intervals if you are using minute data!). Please note one consideration that is not accounted for here when blending the data: the ratio of volumes between exchanges can shift over time as more (or fewer) users trade on a particular exchange. For this reason, you may want to place your ratio window around the point in time where the two distinct data sets are joined together.
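
One way to anchor the ratio window at the join point is to use the first few intervals for which Binance data exists. The sketch below assumes daily data held in dictionaries keyed by date; the function name `blend_volumes` and the `window_days` parameter are hypothetical, and the sample values are invented for the example.

```python
from datetime import date, timedelta
from statistics import mean


def blend_volumes(binance_vol, gemini_vol, window_days=30):
    """Splice scaled Gemini volumes in front of Binance volumes.

    The ratio is averaged over the first `window_days` days of Binance
    history, i.e. the period right at the join between the two data sets.
    """
    join = min(binance_vol)                       # first date with Binance data
    window_end = join + timedelta(days=window_days - 1)
    ratios = [
        binance_vol[d] / gemini_vol[d]
        for d in binance_vol
        if d <= window_end and d in gemini_vol and gemini_vol[d] > 0
    ]
    if not ratios:
        raise ValueError("the two series do not overlap inside the window")
    ratio = mean(ratios)

    # Scaled Gemini volumes before the join, actual Binance volumes after it.
    blended = {d: v * ratio for d, v in gemini_vol.items() if d < join}
    blended.update(binance_vol)
    return dict(sorted(blended.items())), ratio


# Invented values for illustration only.
d = [date(2020, 1, n) for n in range(1, 6)]
gemini = {d[0]: 10.0, d[1]: 20.0, d[2]: 30.0, d[3]: 40.0}
binance = {d[2]: 60.0, d[3]: 120.0, d[4]: 500.0}

blended, ratio = blend_volumes(binance, gemini, window_days=2)
print(ratio)                    # average of 60/30 and 120/40
print(list(blended.values()))   # scaled Gemini values, then Binance values
```

Keeping the window close to the join means the scaled Gemini volumes line up with Binance's earliest volumes, at the cost of ignoring how the ratio behaved further back in time.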




    Notice: Information contained herein is not and should not be construed as an offer, solicitation, or recommendation to buy or sell securities. The information has been obtained from sources we believe to be reliable; however, no guarantee is made or implied with respect to its accuracy, timeliness, or completeness. The author does not own any cryptocurrency discussed. The information and content are subject to change without notice. CryptoDataDownload and its affiliates do not provide investment, tax, legal or accounting advice.

    This material has been prepared for informational purposes only and is the opinion of the author, and is not intended to provide, and should not be relied on for, investment, tax, legal, accounting advice. You should consult your own investment, tax, legal and accounting advisors before engaging in any transaction. All content published by CryptoDataDownload is not an endorsement whatsoever. CryptoDataDownload was not compensated to submit this article. Please also visit our Privacy policy; disclaimer; and terms and conditions page for further information.

    THE PERFORMANCE OF TRADING SYSTEMS IS BASED ON THE USE OF COMPUTERIZED SYSTEM LOGIC. IT IS HYPOTHETICAL. PLEASE NOTE THE FOLLOWING DISCLAIMER. CFTC RULE 4.41: HYPOTHETICAL OR SIMULATED PERFORMANCE RESULTS HAVE CERTAIN LIMITATIONS. UNLIKE AN ACTUAL PERFORMANCE RECORD, SIMULATED RESULTS DO NOT REPRESENT ACTUAL TRADING. ALSO, SINCE THE TRADES HAVE NOT BEEN EXECUTED, THE RESULTS MAY HAVE UNDER-OR-OVER COMPENSATED FOR THE IMPACT, IF ANY, OF CERTAIN MARKET FACTORS, SUCH AS LACK OF LIQUIDITY. SIMULATED TRADING PROGRAMS IN GENERAL ARE ALSO SUBJECT TO THE FACT THAT THEY ARE DESIGNED WITH THE BENEFIT OF HINDSIGHT. NO REPRESENTATION IS BEING MADE THAT ANY ACCOUNT WILL OR IS LIKELY TO ACHIEVE PROFIT OR LOSSES SIMILAR TO THOSE SHOWN. U.S. GOVERNMENT REQUIRED DISCLAIMER: COMMODITY FUTURES TRADING COMMISSION. FUTURES AND OPTIONS TRADING HAS LARGE POTENTIAL REWARDS, BUT ALSO LARGE POTENTIAL RISK. YOU MUST BE AWARE OF THE RISKS AND BE WILLING TO ACCEPT THEM IN ORDER TO INVEST IN THE FUTURES AND OPTIONS MARKETS. DON’T TRADE WITH MONEY YOU CAN’T AFFORD TO LOSE. THIS IS NEITHER A SOLICITATION NOR AN OFFER TO BUY/SELL FUTURES OR OPTIONS. NO REPRESENTATION IS BEING MADE THAT ANY ACCOUNT WILL OR IS LIKELY TO ACHIEVE PROFITS OR LOSSES SIMILAR TO THOSE DISCUSSED ON THIS WEBSITE. THE PAST PERFORMANCE OF ANY TRADING SYSTEM OR METHODOLOGY IS NOT NECESSARILY INDICATIVE OF FUTURE RESULTS.
