Contributed so kindly by Joe Wojniak
It may seem obvious, but financial research requires data, and a lot of it. If financial research isn’t your day job, that data can be surprisingly difficult to come by. Here are some suggestions for acquiring data to use in your financial research project.
Data Sources
Every project has to start somewhere. In development, you need a data set that is representative of the production data. This can be a historical or a static data set. In development, you keep a local copy and run your tests on it. In production, you update the data and re-run your analysis on it regularly. Here are some great places to get this data.
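Before getting to the sources, here is a minimal sketch of the local-copy workflow just described. The names `load_or_fetch`, `cache_path`, `fetch`, and `refresh` are hypothetical and only for illustration; `fetch` stands in for whatever download code you end up using with one of the sources in this article.

```python
from pathlib import Path
from typing import Callable


def load_or_fetch(cache_path: Path, fetch: Callable[[], str], refresh: bool = False) -> str:
    """Return the data set as text, reusing the local copy unless a refresh is requested."""
    if cache_path.exists() and not refresh:
        # Development: work from the static local copy.
        return cache_path.read_text(encoding="utf-8")
    # Production (or first run): pull fresh data and overwrite the local copy.
    text = fetch()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(text, encoding="utf-8")
    return text
```

In development you would call it with the default `refresh=False` so your tests always see the same data; a scheduled production job could pass `refresh=True` before re-running the analysis.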
Simulated End-of-day and Alternative Data
Financial research tends to produce one-off efforts. Rarely does a researcher ask that their research be reproduced (and hence validated). Chris Conlan has done something incredible here. Chris has provided Python trading algorithms and the data to reproduce the results. In a world where it seems that the zero-sum game rules, Chris has provided the methods and the data so that you can improve your trading chops. Check out the repo and accompanying book: https://github.com/chrisconlan/algorithmic-trading-with-python.
Stooq for Multiple Asset Classes
See this site https://stooq.com/.
Stooq is a great website because it provides data for free. It’s a little too good to believe, but get your free data while it’s available for:
- Indices
- Currencies
- Stocks
- Bonds
- Historical data – want to know what the Nasdaq was back in 1938? Stooq has it! Just make sure you have plenty of bandwidth for this download. Daily, hourly, and 5-minute data are available for all major exchanges. See this link: https://stooq.com/db/h/
Yahoo Finance Still Exists
The Yahoo Finance API isn’t as easy to access as it used to be. But there is a workaround. The QuantStats GitHub repo written by Ran Aroussi provides a Python library that connects to the Yahoo Finance API. The QuantStats library also provides a ready-to-go tearsheet with plots.
See the repo: https://github.com/ranaroussi/quantstats.
Google Finance
This is another popular API that is harder to access now. The Google Finance API is now only available through Google Sheets. Type the following into a cell: =GOOGLEFINANCE({SYMBOL}). Replace {SYMBOL} with a ticker symbol to get the current price of a market-traded asset.
Stooq Pandas Datareader Interface
Surprised? Stooq has an API via a Pandas datareader interface that you can use to access daily, weekly, monthly, quarterly and yearly data. See https://pandas-datareader.readthedocs.io/en/latest/readers/stooq.html.
Other Pandas Datareaders
Need to refresh intraday? Check out the Pandas datareader list. The data sources here can be accessed programmatically using the Pandas datareader matching the data source listed. There is example code given here: https://pandas-datareader.readthedocs.io/en/latest/remote_data.html.
Here are some Pandas datareader interfaces.
- Tiingo – (as described on the Pandas datareader website): Tiingo is a trading platform that provides a data API with historical end of day prices on equities, mutual funds, and ETFs. Registration required for a free API key. Free accounts are rate limited and can access a limited number of symbols. Accounts for individual use are $10/month or $99/year. https://api.tiingo.com/about/pricing.
- IEX – The Investors Exchange: $9/month for an individual account. https://iexcloud.io/pricing/.
- Alpha Vantage – the free API tops out pretty quickly (5 API requests per minute, 500 requests per day). The free API is not suitable for development when you’re debugging code. A subscription is required to increase the number of requests. Intraday data is available. Alpha Vantage also provides calculated technical indicators (SMA, EMA, etc.). 30 API requests per minute costs $29.99/month. https://www.alphavantage.co/premium/.
- EconDB.com – the database of economic indicators. These are general economic indicators, think GDP, etc. https://www.econdb.com/docs.
- Enigma – sector data and research company information. Basic attribute information is free, after you sign up for an account. https://console.enigma.com/register?d=1.
- Quandl – delivers financial, economic, and alternative data. Used by top hedge funds, asset managers, and investment banks. Free sample End of Day data available: https://www.quandl.com/data/EOD-End-of-Day-US-Stock-Prices
- St. Louis Fed (FRED) – data for economic research: https://fred.stlouisfed.org/docs/api/fred/.
- Kenneth French’s data library – asset pricing models, white papers, and data: https://papers.ssrn.com/sol3/cf_dev/AbsByAuth.cfm?per_id=1455.
- World Bank – global development data: https://data.worldbank.org/.
- OECD – Organisation for Economic Co-operation and Development – key short-term economic indicators: https://stats.oecd.org/Index.aspx?DataSetCode=KEI.
- Eurostat – European statistics: https://ec.europa.eu/eurostat/.
- Thrift Savings Plan – TSP mutual fund index prices: https://www.tsp.gov/fund-performance/share-price-history/.
- Nasdaq Trader symbol definitions – just what it says, but still useful: ftp://ftp.nasdaqtrader.com/SymbolDirectory/nasdaqtraded.txt.
- Stooq – have I bragged on Stooq enough? https://stooq.com/.
- MOEX – Moscow Exchange historical data: https://www.moex.com/a2864.
- Naver Finance – Korean historical stock market data: https://finance.naver.com/.
- Coinbase.com has data endpoints where the cryptocurrency prices don’t need an API key. Example code in curl, Ruby, Python, and Node is also given to access the API endpoints: https://developers.coinbase.com/api/v2#data-endpoints.
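Several of the free tiers above are rate limited (Alpha Vantage’s free API, for example, allows 5 requests per minute). One way to stay under such a limit is to space your requests evenly. The following is only a rough sketch under that assumption: the `Throttle` class and its parameters are hypothetical names, not part of any of the libraries listed, and it does not track the daily cap.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Space calls at least 60 / calls_per_minute seconds apart."""

    def __init__(self, calls_per_minute: int,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = 60.0 / calls_per_minute
        self._clock = clock   # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last_call: Optional[float] = None  # time of the previous call, if any

    def wait(self) -> None:
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

You would create one `Throttle(calls_per_minute=5)` and call `wait()` right before each API request. Even spacing is more conservative than a per-minute quota strictly requires, but it is simple and hard to get wrong.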
The Ever Changing World of Data
The world of financial data is constantly changing. Quantopian provided an online community where quants could practice and improve their skills. Then, in November 2020, Quantopian shut down, and many of the people behind it are now working at Robinhood! Pyfolio and Zipline continue to be available on GitHub, but they don’t seem to be actively supported anymore. So, why do I mention this? The data sources listed above may or may not work, depending on when you’re reading this (I am writing this in December 2020, so this is my effort at future-proofing this article!).
Code Sample for Stooq
By installing the pandas and pandas_datareader packages with Python’s pip, you can easily get data from Stooq using the following code snippet.
```python
# requires ...
# pip install pandas pandas-datareader
import pandas as pd
import pandas_datareader.data as web

# Download daily price data for Apple from Stooq into a DataFrame
df = web.DataReader('AAPL', 'stooq')
```
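Once you have a price DataFrame like the one returned by the Stooq reader, a common next step is to sort it oldest-first and compute daily returns. The sketch below assumes the frame has a date index and a `Close` column; `add_daily_returns` and the small `sample` frame are made-up illustrations, not real market data, so the example runs without a network connection.

```python
import pandas as pd


def add_daily_returns(prices: pd.DataFrame) -> pd.DataFrame:
    """Return a copy sorted oldest-first with a 'Return' column of daily percent changes."""
    out = prices.sort_index()  # ascending dates, whatever order the data arrived in
    out["Return"] = out["Close"].pct_change()
    return out


# Made-up two-day sample, deliberately listed newest-first
sample = pd.DataFrame(
    {"Close": [110.0, 100.0]},
    index=pd.to_datetime(["2020-12-02", "2020-12-01"]),
)
result = add_daily_returns(sample)
```

The first row of `Return` is always missing (NaN) because there is no previous day to compare against.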