Chris Conlan

Financial Data Scientist


Moving from Single-Asset to Multi-Asset Algorithmic Trading

April 12, 2020 By Chris Conlan

In my latest book, Algorithmic Trading with Python (2020), readers work through the process of developing a trading strategy, simulator, and optimizer against a portfolio of 100 assets. Each asset has 10 years of end-of-day data, creating about 2,500 data points per asset, totaling 250,000 data points.
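To make the scale concrete, here is a minimal sketch of what a data set of that shape looks like in memory. The asset names and random-walk prices below are placeholders, not the book's actual data:

```python
import numpy as np
import pandas as pd

# Dimensions from the book's setup: 100 assets, ~10 years of
# end-of-day data (~2,500 trading days per asset).
n_assets, n_days = 100, 2500
dates = pd.bdate_range("2010-01-01", periods=n_days)
rng = np.random.default_rng(0)

# Simulated close prices as a wide DataFrame: one column per asset,
# one row per trading day.
closes = pd.DataFrame(
    100 * np.exp(np.cumsum(rng.normal(0, 0.01, (n_days, n_assets)), axis=0)),
    index=dates,
    columns=[f"ASSET_{i:03d}" for i in range(n_assets)],
)

print(closes.shape)  # (2500, 100)
print(closes.size)   # 250000 total data points
```

A wide date-by-asset layout like this keeps the entire research universe in a single, easily shareable table.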

A lot of similar work in this field focuses on hyper-analyzing very dense data on single assets. For example, tick-by-tick data on S&P 500 futures can exceed 100,000 data points per day. I chose to structure the book this way for a few reasons. These reasons not only inform the structure of the book, but also offer certain advantages to traders interested in structuring their own strategies similarly.

Effects on Trading Frequency

Retail traders often encounter trading frequency restrictions in the form of government regulation or broker-level rules. These rules typically prevent traders from making more than three trades per week on a specific asset. Thus, for a trader with a small account, it is not worthwhile to study high-frequency data on a single asset: any discoveries or trading strategies developed against that data will not translate into a workable investment plan. On the other hand, if your strategy spans 100, 500, or 1,000 assets, it is unlikely to trade frequently against any single asset. Multi-asset strategies seem to operate in line with government and brokerage expectations of low-risk investing without limiting profit opportunities.

Effects on Sample Size

The multi-asset setup provides additional benefits during the research phase. Many financial machine learning strategies use event-based labels that can overlap in time. A high incidence of overlapping labels can create significant data leakage problems in the modeling step. In my experience, this effect can be mitigated substantially by multi-asset strategies that generate signals less frequently on a per-asset basis but more frequently on a per-strategy basis. The resulting machine learning data set built from a multi-asset strategy will contain a higher proportion of unique, non-overlapping samples than one built from a high-frequency single-asset strategy.
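To illustrate the overlap problem, here is a small sketch that measures what fraction of event-based labels overlap at least one other label. The interval spacings are hypothetical, not taken from the book, but they show how denser per-asset signals produce more overlap:

```python
import numpy as np

def label_overlap_fraction(starts, ends):
    """Fraction of label intervals that overlap at least one other
    interval. starts/ends are parallel arrays of event start and end
    indices (half-open intervals)."""
    starts, ends = np.asarray(starts), np.asarray(ends)
    n = len(starts)
    overlapping = 0
    for i in range(n):
        # Interval j intersects interval i when j starts before i ends
        # and j ends after i starts.
        mask = (starts < ends[i]) & (ends > starts[i])
        mask[i] = False  # do not count the interval against itself
        if mask.any():
            overlapping += 1
    return overlapping / n

# Dense single-asset signals: events fire every 2 bars but span 5 bars.
dense = label_overlap_fraction(range(0, 20, 2), range(5, 25, 2))
# Sparse per-asset signals: events fire every 10 bars, still span 5 bars.
sparse = label_overlap_fraction(range(0, 100, 10), range(5, 105, 10))
print(dense, sparse)  # 1.0 0.0
```

Every label in the dense case shares bars with a neighbor, while the sparse case has no overlap at all, which is the per-asset behavior a multi-asset strategy tends toward.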

Collaboration and Reproducibility

I mentioned that the data set in the aforementioned book contains about 250,000 rows. Compressed, it is about 11MB in total. I chose this format for the book because I considered it important that the data be freely hosted and shared on a GitHub repo. Since the data fits comfortably in a repo, I figured it was the perfect size to serve as a small-scale benchmark for reproducible finance research.

Most financial literature is plagued by reproducibility issues because of the size and proprietary nature of the data involved. Other fields of study have large-scale open-source data sets against which researchers test and compare their results. Finance has no such thing, because the data under consideration is so valuable. I hope that the data sets published with Algorithmic Trading with Python will get people thinking about what a good benchmark data set looks like in this field.

Effects on Diversification

One interesting and unintended effect of transitioning to a multi-asset strategy is exposure to market-wide correlation: stocks tend to move in lockstep during major macroeconomic events like the 2008 recession or the 2020 COVID-19 crash, and they tend to rise simultaneously during bull markets. When your multi-asset strategy uses a basket of highly correlated U.S. equities, you have to be sure to compare your performance to the market as a whole. Some periods in history, like 2010-2020, were exceptionally long bull markets. As a result, beating the buy-and-hold base case on simulated data from the 2010s is fairly difficult.
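As a rough sketch of that comparison, the check amounts to computing the strategy's compound annual growth rate against a buy-and-hold benchmark. The returns below are simulated, not real market data:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
dates = pd.bdate_range("2010-01-01", "2019-12-31")

# Simulated daily returns for a strategy and a broad-market benchmark.
benchmark = pd.Series(rng.normal(0.0005, 0.01, len(dates)), index=dates)
strategy = pd.Series(rng.normal(0.0006, 0.01, len(dates)), index=dates)

def cagr(daily_returns):
    """Compound annual growth rate from a series of daily returns,
    assuming ~252 trading days per year."""
    total = (1 + daily_returns).prod()
    years = len(daily_returns) / 252
    return total ** (1 / years) - 1

# The strategy only "beats the base case" if its CAGR exceeds
# buy-and-hold over the same period.
print(f"benchmark CAGR: {cagr(benchmark):.2%}")
print(f"strategy CAGR:  {cagr(strategy):.2%}")
print(f"excess CAGR:    {cagr(strategy) - cagr(benchmark):.2%}")
```

During a decade-long bull run, the benchmark's own CAGR is high, so even a profitable-looking simulation can show near-zero or negative excess return.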

Conclusion

Getting a trading strategy off the ground with high-frequency data is a daunting undertaking. Retail traders should start their algo-trading efforts by looking at end-of-day data and multi-asset strategies. Otherwise, it will be pretty easy to get discouraged and give up.

Filed Under: Automated Trading
