Chris Conlan

Financial Data Scientist

  • About
  • Blog
    • Business Management
    • Programming with Python
    • Programming with R
    • Automated Trading
    • 3D Technology and Virtual Reality
  • Books
    • The Financial Data Playbook
    • Fast Python
    • Algorithmic Trading with Python
    • The Blender Python API
    • Automated Trading with R
  • Snippets

Download Historical Stock Data with R and Python

July 24, 2017 By Chris Conlan 35 Comments

Update #2: Track this Issue Here

Updates to this post are more about which API’s are still supported than how to access them with R, Python, or any other language. Follow the hilarious change history of EOD stock data API’s at my other post: https://chrisconlan.com/download-daily-data-every-sp-500-stock-r/.

Update: Using Quandl’s API

Because everything I write about breaks, the Google Finance API stopped taking requests at this URL. It only returns a year’s worth of daily data as of the time of writing.

We will move on to Quandl. They have a stable key-driven API that doesn’t seem to be going anywhere. Python code for Quandl coming soon. R code below.

Users will need to visit Quandl’s website and sign up for an API key to access the data.

Python Code

Users will need install the Quandl library from pip to use the script with: pip install quandl.

import quandl
import datetime

quandl.ApiConfig.api_key = 'your_api_key'

def quandl_stocks(symbol, start_date=(2000, 1, 1), end_date=None):
	"""
	symbol is a string representing a stock symbol, e.g. 'AAPL'

	start_date and end_date are tuples of integers representing the year, month,
	and day

	end_date defaults to the current date when None
	"""

	query_list = ['WIKI' + '/' + symbol + '.' + str(k) for k in range(1, 13)]

	start_date = datetime.date(*start_date)

	if end_date:
		end_date = datetime.date(*end_date)
	else:
		end_date = datetime.date.today()

	return quandl.get(query_list, 
			returns='pandas', 
			start_date=start_date,
			end_date=end_date,
			collapse='daily',
			order='asc'
			)


if __name__ == '__main__':

	apple_data = quandl_stocks('AAPL')
	print(apple_data)

R Code

Users will need to download the Quandl package from CRAN to run this using: install.packages(‘Quandl’).

Credit to GitHub user johnatasjmo for this solution:

# Quandl package must be installed
library(Quandl)

# Get your API key from quandl.com
quandl_api = "MYAPIKEY"

# Add the key to the Quandl keychain
Quandl.api_key(quandl_api)

quandl_get <-
function(sym, start_date = "2017-01-01") {
    require(devtools)
    require(Quandl)
    # create a vector with all lines
    tryCatch(Quandl(c(
        paste0("WIKI/", sym, ".8"),  #  Adj. Open
        paste0("WIKI/", sym, ".9"),  # Adj. High
        paste0("WIKI/", sym, ".10"), # Adj. Low
        paste0("WIKI/", sym, ".11"), # Adj. Close
        paste0("WIKI/", sym, ".12")), # Adj. Volume
        start_date = start_date,
        type = "zoo"
        ))
}

Original Post

In one of my most popular posts, Download Price History for Every S&P 500 Stock, other traders and I despaired over the death of the Yahoo! Finance API. After batting around a lot of potential replacements, I was still left searching for a good free source of data to use for education and retail trading. I found the answer by searching through the R package quantmod, which was successfully downloading data from Google despite this message on developers.google.com/finance/.

It was Hidden!

As it turns out, quantmod was using a hidden Google Finance API that was quite easy to reverse engineer. In this post, we will build functions for accessing that API in both R and Python. For readers of my book, Automated Trading with R, this will serve as a replacement for the often-referenced yahoo() function, but not as a perfect replacement.

R Code

# Make sure data.table is installed
if(!'data.table' %in% installed.packages()[,1]) install.packages('data.table')

# Function to fetch google stock data
google_stocks <- function(sym, current = TRUE, sy = 2005, sm = 1, sd = 1, ey, em, ed)
{
  # sy, sm, sd, ey, em, ed correspond to
  # start year, start month, start day, end year, end month, and end day

  # If TRUE, use the date as the enddate
  if(current){
    system_time <- as.character(Sys.time())
    ey <- as.numeric(substr(system_time, start = 1, stop = 4))
    em <- as.numeric(substr(system_time, start = 6, stop = 7))
    ed <- as.numeric(substr(system_time, start = 9, stop = 10))
  }
  
  require(data.table)
  
  # Fetch data from google
  google_out = tryCatch(
    suppressWarnings(
      fread(paste0("http://www.google.com/finance/historical",
                   "?q=", sym,
                   "&startdate=", paste(sm, sd, sy, sep = "+"),
                   "&enddate=", paste(em, ed, ey, sep = "+"),
                   "&output=csv"), sep = ",")),
    error = function(e) NULL
  )
  
  # If successful, rename first column
  if(!is.null(google_out)){
    names(google_out)[1] = "Date"
  }
  
  return(google_out)
}

# Test it out
apple_data = google_stocks('AAPL')

Giving this a quick plot, we can see it is working.

RStudio screenshot of results

Python Code

See the below Python code that accomplishes the same thing using the pandas, io, requests, and time modules. Many people will not have requests or pandas installed by default, so check your package managers if need be.

import pandas as pd
import io
import requests
import time

def google_stocks(symbol, startdate = (1, 1, 2005), enddate = None):

	startdate = str(startdate[0]) + '+' + str(startdate[1]) + '+' + str(startdate[2])

	if not enddate:
		enddate = time.strftime("%m+%d+%Y")
	else:
		enddate = str(enddate[0]) + '+' + str(enddate[1]) + '+' + str(enddate[2])

	stock_url = "http://www.google.com/finance/historical?q=" + symbol + \
				"&startdate=" + startdate + "&enddate=" + enddate + "&output=csv"

	raw_response = requests.get(stock_url).content

	stock_data = pd.read_csv(io.StringIO(raw_response.decode('utf-8')))

	return stock_data


if __name__ == '__main__':
	apple_data = google_stocks('AAPL')
	print(apple_data)

	apple_truncated = google_stocks('AAPL', enddate = (1, 1, 2006))
	print(apple_truncated)

Running this code, we can see the function works correctly with the dates and fetches the data quickly. Popping it into matplotlib look good, too.

Matplotlib screenshot of results

Notes on Data Structure

Unlike the Yahoo! Finance API, this will not return the adjusted close as a separate column. In Automated Trading with R, we go to great lengths to use the adjusted close to obtain adjusted open, adjusted high, and adjusted low. The Google Finance API used here returns all of that information for you. So, there is no column named “adjusted close”. All of the data is adjusted in advance.

Advanced users may be disappointed by this, because there is some interesting information regarding splits and dividends to be gained from the disparity between raw and adjusted closing prices. Regardless, most users should be able to use the data returned by this API to accomplish their simulation goals.

Filed Under: Automated Trading, Programming with Python, Programming with R

Comments

  1. Rafael Rodrigues says

    August 8, 2017 at 10:49 pm

    It is useless without Yahoo’s Adjusted Close, which means adjusts to dividends, stock splits, etc.

    Reply
    • Chris Conlan says

      August 8, 2017 at 10:53 pm

      Thanks for bringing this up. I should have noted that this data is adjusted by default.

      If you wanted to infer dividends and splits from the Adj Close vs. Close disparity, then you are out of luck here. For all other purposes, it is helpful that the data is adjusted in advance.

      Reply
  2. Phillip says

    August 10, 2017 at 12:21 am

    Nice work man.

    I didn’t know that book I bought was written you. Haven’t read it yet, but now I am looking forward to it!

    Thank you!

    Reply
    • Chris Conlan says

      August 10, 2017 at 7:41 pm

      Sounds like I’m doing a good job finding my audience! Thanks for checking out the book.

      Reply
      • projectincharge says

        February 28, 2019 at 7:40 pm

        Very interesting example for a new developer who is trying to do Mutual fund analysis using R.

        Reply
  3. Giuseppe says

    August 12, 2017 at 7:29 am

    Maybe you should look at the Tiingo API. They provide raw and adjusted data for splits and dividends

    @RafaelRodrigues

    Reply
  4. HalfLama says

    August 16, 2017 at 8:03 am

    Do you have any idea why the tickers on the Copenhagen exchange are not valid through the API? For example, you can get the historical prices for CPH:WDH and the URI formatting of the ticker becomes CPH%3AWDH. Either way it dosnt work with the current URI, which seems strange to me, because the Frankfurt tickers e.g. ETR:BMW works fine this way.

    Reply
    • HalfLama says

      August 16, 2017 at 2:45 pm

      Oh, i see the CPH tickers are missing the download button, so there is probably some limitation for goole on those, and hence they dont allow historical downloads on those.

      Reply
    • Chris Conlan says

      August 16, 2017 at 4:23 pm

      You may be right about the CPH tickers. There are no docs, so we are more or less writing them in this comment thread.

      Quick note – %3A is the URL encoding for a colon. So the conversion from “:” to “%3A” is expected behavior.

      Reply
  5. Giannini says

    September 14, 2017 at 1:48 pm

    Seems it does not work anymore ???

    Reply
    • Chris Conlan says

      September 14, 2017 at 10:46 pm

      I am seeing that it works… but ignores all arguments except the stock ticker. It only downloads the most recent year of data. I’ll investigate and update the post to reflect this change if it doesn’t have a workaround.

      Reply
  6. Diego says

    September 14, 2017 at 9:41 pm

    For some reason the python code is only going back one year . |Has anything changed with the API?
    Thanks for the code !

    Reply
    • Chris Conlan says

      September 14, 2017 at 10:46 pm

      Thanks for letting me know. It seems that this is affecting the whole API. I’ll take a look at how other packages are dealing with this. It may be a permanent change.

      Reply
  7. alfred aita says

    September 18, 2017 at 12:56 pm

    The following is now working, however you must download and install
    “pandas_datareader”
    using conda or pip in the usual way( python 2.7 or 3.6)

    import datetime
    from pandas_data_reader import data
    symbol = ‘MSFT’
    start = datetime.datetime(2008, 1, 5) # as example
    end = datetime.datetime(2008, 9, 17)

    #Unfortunately the google version of the following only returns 1 year:
    stock_data = data.get_data_yahoo(symbol = symbol, start , end)

    Reply
  8. Victor says

    October 13, 2017 at 2:12 pm

    http://finance.google.com/finance/historical?q=AAPL&startdate=Oct+18+2012&enddate=Oct+13+2017&output=csv

    Works

    Reply
    • rich dev says

      May 11, 2020 at 11:55 pm

      When I use the Google endpoint API

      http://finance.google.com/finance/historical?q=AAPL&startdate=Oct+18+2012&enddate=Oct+13+2017&output=csv

      The result from Google is now:

      “We’re sorry…
      … but your computer or network may be sending automated queries. To protect our users, we can’t process your request right now.

      See Google Help for more information.”

      Clicking on the Help link in this first webpage gives me another error message that says

      “Unusual traffic from your computer network”
      If devices on your network seem to be sending automated traffic to Google, you might see “Our systems have detected unusual traffic from your computer network.”

      What Google considers automated traffic
      Sending searches from a robot, computer program, automated service, or search scraper
      Using software that sends searches to Google to see how a website or webpage ranks on Google”

      The API no longer works using this simple http call.

      Reply
  9. Andrea says

    October 15, 2017 at 6:37 pm

    Hi i’m writing my master thesis related to automated trading and i found your book really usefull but i can’t go on with my work. In order to download stock prices i’m using the following code
    # Get quantmod
    if (!require(“quantmod”)) {
    install.packages(“quantmod”)
    library(quantmod)
    }

    end<- as.Date(Sys.time())
    start<-as.Date("2011-01-01")
    getSymbols("AAPL" , src = "google", from = start, to = end)

    I hope this may help someone.

    ps
    @Chris Conlan do you know why i'm getting this error?
    "Error in quantile.default(coredata(x), …) : 'probs' outside [0,1]"

    Reply
    • Chris Conlan says

      October 19, 2017 at 8:13 pm

      Hi Andrea,

      See these issues in QuantMod:
      https://github.com/joshuaulrich/quantmod/issues/156
      https://github.com/joshuaulrich/quantmod/issues/188

      I am afraid I can’t resolve that bug without knowing what other code you are running. Feel free to elaborate.

      Best,
      Chris

      Reply
  10. Yu says

    December 1, 2017 at 8:56 am

    Hi,

    For hk market stocks, what code need to provide in ‘q’ for retrieve the data?

    I tried HKG:941 or 941, in the parameters, both not work.

    Reply
  11. Bruce says

    December 9, 2017 at 9:27 am

    It work. Thanks for provide the url~~

    Reply
  12. Ines says

    December 10, 2017 at 9:19 am

    Hello Chris,

    Thank you for your post. I am trying your python code locally on my computer. I am getting an encoding error at the level of:

    stock_data = pd.read_csv(io.StringIO(raw_response.decode(‘utf-8’)))

    The error is: UnicodeDecodeError: ‘utf-8’ codec can’t decode byte 0xfc in position 18140: invalid start byte

    I googled for the error but without any advancement on why I am getting it. Did you get this error before and if yes, how can we fix it?

    Thank you

    Reply
    • Chris Conlan says

      December 10, 2017 at 8:22 pm

      Hi Ines,

      The Google Finance API works less and less everyday. It seems like now it isn’t evening processing any requests. I recommend you switch to the Quandl API. Python code has just been posted for it.

      Best,
      Chris

      Reply
      • Ines says

        December 11, 2017 at 8:54 am

        Hi Chris! Wonderful! Thank you for your quick response and for the updates using Quandl API 🙂 I will definitely have a look at it.
        Nice day to you and Kind regards,
        Ines.

        Reply
  13. Ryan says

    January 3, 2018 at 8:52 pm

    Hi Chris,

    Great script! Two things:

    – I had to alter the R script to get the AAPL stock because it looks like the URL changed(?):

    fread(paste0(“http://finance.google.com/finance/historical”,

    – Is there any way to get Canadian stocks (or to change the exchange)? It seems this only wants to pull from NASDAQ? I tried with just the symbol and adding the exchange and symbol and it doesn’t want to work:

    Aurora_data = google_stocks(‘TSE:WEED’) etc…

    I realize this has nothing to do with your code, but just curious if you know how to get this from Google. Thanks!

    R.

    Reply
  14. Amy says

    March 28, 2018 at 4:07 pm

    Hello Chris,

    Thank you so much for sharing.
    By running R code for the Quandl, the error message occurs, would you kindly suggest the solution for this problem?

    Thank you so much!

    Warning messages:
    1: In merge.data.frame(data, merge_data, by = 1, all = TRUE) :
    column names ‘WIKI.DIS – ERROR.x’, ‘WIKI.DIS – ERROR.y’ are duplicated in the result
    2: In merge.data.frame(data, merge_data, by = 1, all = TRUE) :
    column names ‘WIKI.DIS – ERROR.x’, ‘WIKI.DIS – ERROR.y’ are duplicated in the result

    Reply
    • ACE says

      April 29, 2018 at 3:46 am

      I get the same error, except in my case I get another error first:
      (1) Error in rval[i, , drop = drop., …] : subscript out of bounds
      In addition: Warning messages:
      and
      (2) the routine trips over already with symbole “ACE”.

      I am looking at the rval call now.

      Reply
    • ACE says

      April 29, 2018 at 5:36 am

      After I removed these 20 symbols I got a clean run:
      ACE
      GAS
      ARG
      BXLT
      BRCM
      CVC
      CAM
      CPGX
      EMC
      GMCR
      MHFI
      POM
      PCL
      PCP
      SNDK
      HOT
      TE
      TWC
      TYC

      Reply
  15. Alejandro says

    May 6, 2018 at 3:01 am

    I am getting the following response:

    Sorry… body { font-family: verdana, arial, sans-serif; background-color: #fff; color: #000; }GoogleSorry…We’re sorry…… but your computer or network may be sending automated queries. To protect our users, we can’t process your request right now.See Google Help for more information.Google Home

    Reply
    • André Salerno, MSc says

      July 11, 2018 at 5:07 pm

      Hi Alejandro! I’m in the same situation……Have you already solved this issue? Best

      Reply
    • Chris Conlan says

      July 11, 2018 at 6:31 pm

      The Google Finance API no longer works. Since they re-purposed the URL for customer-facing (as opposed to developer facing) use, they are complaining that you are sending automated queries. They detect this from the fact that your user-agent doesn’t appear to be from a common web browser.

      Reply
      • carly says

        August 3, 2018 at 5:12 pm

        Does anyone know what is happening with Google data downloads from finance.google.com/finance. It appears to no longer work as of August 2,2018

        Reply
  16. tomerboyarski says

    July 16, 2018 at 2:57 pm

    thank you very much! 🙂
    I’m totally new to this.
    while using your code, I’m only getting data till March 27th, 2018.
    any idea why this might be happening?

    Reply
    • Chris Conlan says

      July 16, 2018 at 3:25 pm

      If you’re using the Quandl code, that date corresponds to when Quandl deprecated the API. It was being provided for free by a third party to Quandl. They seem like they will be keeping the existing data online for a while, though.

      Reply
  17. Alex Read says

    September 13, 2018 at 4:44 pm

    Another good API option that’s been working for me is https://intrinio.com. They have a bunch of historical data you can access through API, CSV, and Excel.

    Reply
  18. Bomberr working says

    July 30, 2019 at 11:39 am

    Hi Chris, i’m reading your book trying to make work a connection with an API (listing 2-2 to 2-8)

    Is it necessary to do all the yahoo() code (changing the API option) and the rest of coding?
    I’m using riingo package, does that package have all set?

    I’m very frustaded doing this, i bought your book and i cannot advance due to this connection problem.

    Thank you

    Reply

Leave a Reply Cancel reply

For Traders

Algorithmic Trading with Python by Chris Conlan

Available for purchase at Amazon.com.

Algorithmic Trading

Pulling All Sorts of Financial Data in Python [Updated for 2021]

Calculating Triple Barrier Labels from Advances in Financial Machine Learning

Calculating Financial Performance Metrics in Pandas

Topics

  • 3D Technology and Virtual Reality (8)
  • Automated Trading (9)
  • Business Management (9)
  • Chris Conlan Blog (5)
  • Computer Vision (2)
  • Programming with Python (16)
  • Programming with R (6)
  • Snippets (8)
  • Email
  • LinkedIn
  • RSS
  • YouTube

Copyright © 2022 · Enterprise Pro Theme On Log in