【Question title】: quantmod getFinancials() not pulling financials
【Posted】: 2017-08-19 09:02:35
【Description】:

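I want to download fundamental data for publicly listed companies. Using the quantmod package, I tried to pull the data with getFinancials(). It works for some companies, but the results vary (I have read and understood the disclaimer about free data), and I would like to confirm that I am pulling the data correctly.

For JPMorgan: the Yahoo Finance site does show financial information, but the call below seems to use "google" as the src rather than "yahoo", because very little financial information comes back.

Google - https://www.google.com/finance?q=NYSE%3AJPM&fstype=ii&ei=9kh-WejLE5e_etbzmpgP

Yahoo - https://finance.yahoo.com/quote/JPM/financials?p=JPM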

library(quantmod)
JPM <- getFinancials("JPM", src = "yahoo", auto.assign = FALSE)
viewFin(JPM, type = "IS", period = "A")

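Is there a correct way to specify src? Also, is there a way to use getFinancials() so that it switches source (Google vs. Yahoo) when a line item (e.g. revenue) contains NA?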

【Comments】:

    Tags: r quantmod


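    【Solution 1】:

    The top of the help page for getFinancials says (emphasis added),

    Download Income Statement, Balance Sheet, and Cash Flow Statements from Google Finance.

    There is currently no way to specify Yahoo Finance as the source. Doing so would require someone to write a method that scrapes and parses the HTML from Yahoo Finance, because the data cannot be downloaded to a file the way price data can.

    【Discussion】:

      【Solution 2】:

      I think Yahoo recently changed its API. Download the file from the link titled "Get Excel Spreadsheet to Download Bulk Historical Stock Data from Google Finance":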

      http://investexcel.net/multiple-stock-quote-downloader-for-excel/

      This works in Excel, and you can easily load the result into R.

      You can also try something like this.

      # assumes codes are known beforehand
      codes <- c("MSFT","SBUX","S","AAPL","ADT")
      urls <- paste0("https://www.google.com/finance/historical?q=",codes,"&output=csv")
      paths <- paste0(codes,".csv")
      # TRUE for files not yet present in the working directory
      missing <- !(paths %in% dir("."))
      missing

      # simple error handling in case file doesn't exist
      downloadFile <- function(url, path, ...) {
        # remove file if exists already
        if(file.exists(path)) file.remove(path)
        # download file
        tryCatch(
          download.file(url, path, ...), error = function(c) {
            # remove file if error
            if(file.exists(path)) file.remove(path)
            # create error message
            c$message <- paste(basename(path),"failed")
            message(c$message)
          }
        )
      }
      # wrapper of mapply
      Map(downloadFile, urls[missing], paths[missing])
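
The pattern above (wrap a call that may fail in `tryCatch`, then apply it pairwise with `Map`) can be tried without network access. This is a minimal sketch, assuming a hypothetical `fetch_one()` that stands in for `download.file`. The names `fetch_one` and `safe_fetch` and the sample codes are illustrative and not part of the original answer.

```r
# Hypothetical stand-in for download.file(): fails for non-alphabetic codes.
fetch_one <- function(code, path) {
  if (!grepl("^[A-Z]+$", code)) stop("unknown code: ", code)
  writeLines(paste("data for", code), path)
  invisible(0L)
}

# Wrap the call so one failure does not abort the whole Map() run.
safe_fetch <- function(code, path) {
  tryCatch({
    fetch_one(code, path)
    TRUE
  }, error = function(e) {
    # remove a partially written file, then report and carry on
    if (file.exists(path)) file.remove(path)
    message(code, " failed: ", conditionMessage(e))
    FALSE
  })
}

codes <- c("MSFT", "1234")
paths <- file.path(tempdir(), paste0(codes, ".csv"))

# Map() calls safe_fetch(codes[i], paths[i]) for each i and returns a list.
ok <- unlist(Map(safe_fetch, codes, paths))
```

Returning `TRUE`/`FALSE` from the wrapper makes it easy to see afterwards which codes need a retry.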
      

      Or this.

      ## downloads historic prices for all constituents of SP500
      library(zoo)
      library(tseries)                        
      
      ## read in list of constituents, with ticker symbol in first column
      ## (the code below takes symbols from column 1)
      
      ## CREATE A FILE TO READ DATA FROM!!!
      spComp <- read.csv("C:/Users/Excel/Desktop/stocks.csv" ) 
      
      ## specify time period
      dateStart <- "2013-01-01"               
      dateEnd <- "2015-05-08"
      
      ## extract symbols and number of iterations
      symbols <- spComp[, 1]
      nAss <- length(symbols)
      
      ## download data on first stock as zoo object
      z <- get.hist.quote(instrument = symbols[1], start = dateStart,
                          end = dateEnd, quote = "AdjClose",
                          retclass = "zoo", quiet = TRUE)
      
      ## use ticker symbol as column name 
      dimnames(z)[[2]] <- as.character(symbols[1])
      
      ## download remaining assets in for loop
      ## (seq_len(nAss)[-1] is empty when there is only one symbol)
      for (i in seq_len(nAss)[-1]) {
         ## display progress by showing the current iteration step
         cat("Downloading ", i, " out of ", nAss, "\n")

         result <- try(get.hist.quote(instrument = symbols[i],
                                      start = dateStart,
                                      end = dateEnd, quote = "AdjClose",
                                      retclass = "zoo", quiet = TRUE))
         ## skip symbols that failed to download
         if (inherits(result, "try-error")) next

         dimnames(result)[[2]] <- as.character(symbols[i])

         ## merge with already downloaded data to get assets on same dates
         z <- merge(z, result)
      }
      
      ## save data
      #  CREATE A FILE TO WRITE DATA TO!!!
      write.zoo(z, file = "C:/Users/Excel/Desktop/all_sp500_price_data.csv", index.name = "time")
      

      Here is another option you can consider.

      Method #1:
      ---
      layout: post
      title: "2014-11-20-Download-Stock-Data-1"
      description: ""
      category: R
      tags: [knitr,lubridate,stringr,plyr,dplyr]
      ---
      {% include JB/setup %}
      
      This article illustrates how to download stock price data files from Google, save them to a local drive and merge them into a single data frame. The script is slightly modified from one that downloads RStudio package download log data. The original source can be found [here](https://github.com/hadley/cran-logs-dplyr/blob/master/1-download.r).  
      
      First of all, the following packages are used.
      
      
      {% highlight r %}
      library(knitr)
      library(lubridate)
      library(stringr)
      library(plyr)
      library(dplyr)
      {% endhighlight %}
      
      The script begins with creating a folder to save data files.
      
      
      {% highlight r %}
      # create data folder
      dataDir <- paste0("data","_","2014-11-20-Download-Stock-Data-1")
      if(file.exists(dataDir)) { 
            unlink(dataDir, recursive = TRUE)
            dir.create(dataDir)
      } else {
            dir.create(dataDir)
      }
      {% endhighlight %}
      
      After creating URLs and file paths, the files are downloaded using the `Map` function, which is a wrapper of `mapply`. Note that, so the script does not stop on an error (e.g. when a file doesn't exist), `download.file` is wrapped in another function that includes an error handler (`tryCatch`). 
      
      
      {% highlight r %}
      # assumes codes are known beforehand
      codes <- c("MSFT", "TCHC") # codes <- c("MSFT", "1234") for testing
      urls <- paste0("http://www.google.com/finance/historical?q=NASDAQ:",
                     codes,"&output=csv")
      paths <- paste0(dataDir,"/",codes,".csv") # back slash on windows (\\)
      
      # simple error handling in case file doesn't exists
      downloadFile <- function(url, path, ...) {
            # remove file if exists already
            if(file.exists(path)) file.remove(path)
            # download file
            tryCatch(            
                  download.file(url, path, ...), error = function(c) {
                        # remove file if error
                        if(file.exists(path)) file.remove(path)
                        # create error message (basename gives the file name, e.g. "MSFT.csv")
                        c$message <- paste(basename(path),"failed")
                        message(c$message)
                  }
            )
      }
      # wrapper of mapply
      Map(downloadFile, urls, paths)
      {% endhighlight %}
      
      
      Finally the files are read back using `llply` and combined using `rbind_all` (deprecated in favour of `bind_rows` in later dplyr releases). Note that, as the merged data holds records for multiple stocks, a `Code` column is created.
      
      
      
      {% highlight r %}
      # read all csv files and merge
      files <- dir(dataDir, full.name = TRUE)
      dataList <- llply(files, function(file){
            data <- read.csv(file, stringsAsFactors = FALSE)
            # get code from file path
            pattern <- "/[A-Z][A-Z][A-Z][A-Z]"
            code <- substr(str_extract(file, pattern), 2, nchar(str_extract(file, pattern)))
            # first column's name is funny
            names(data) <- c("Date","Open","High","Low","Close","Volume")
            data$Date <- dmy(data$Date)
            data$Open <- as.numeric(data$Open)
            data$High <- as.numeric(data$High)
            data$Low <- as.numeric(data$Low)
            data$Close <- as.numeric(data$Close)
            data$Volume <- as.integer(data$Volume)
            data$Code <- code
            data
      }, .progress = "text")
      
      # bind_rows() replaces rbind_all(), which is deprecated in later dplyr releases
      data <- bind_rows(dataList)
      {% endhighlight %}
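
The fixed four-letter pattern above only matches codes such as MSFT. As a sketch of a more general alternative (an assumption, not part of the original script), the code can be taken from the file name itself with base R's `basename` and `tools::file_path_sans_ext`. The paths below are made up for illustration.

```r
# Hypothetical file paths; only the file name matters here.
files <- c("data_dir/MSFT.csv", "data_dir/S.csv", "data_dir/BRK.B.csv")

# basename() drops the directory, file_path_sans_ext() drops the ".csv" suffix.
codes <- tools::file_path_sans_ext(basename(files))
print(codes)
```

This keeps working for codes of any length, including ones that contain a dot.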
      
      Some of the values are shown below.
      
      
      |Date       |  Open|  High|   Low| Close|   Volume|Code |
      |:----------|-----:|-----:|-----:|-----:|--------:|:----|
      |2014-11-26 | 47.49| 47.99| 47.28| 47.75| 27164877|MSFT |
      |2014-11-25 | 47.66| 47.97| 47.45| 47.47| 28007993|MSFT |
      |2014-11-24 | 47.99| 48.00| 47.39| 47.59| 35434245|MSFT |
      |2014-11-21 | 49.02| 49.05| 47.57| 47.98| 42884795|MSFT |
      |2014-11-20 | 48.00| 48.70| 47.87| 48.70| 21510587|MSFT |
      |2014-11-19 | 48.66| 48.75| 47.93| 48.22| 26177450|MSFT |
      
      This approach is less efficient than reading the files directly without saving them to a local drive. It may be useful, however, if the files are large and the API server drops the connection abruptly.
      
      I hope this article is useful and I'm going to write an article to show the second way.
      
      Method #2:
      ---
      layout: post
      title: "2014-11-20-Download-Stock-Data-2"
      description: ""
      category: R
      tags: [knitr,lubridate,stringr,plyr,dplyr]
      ---
      {% include JB/setup %}
      
      In an [earlier article](http://jaehyeon-kim.github.io/r/2014/11/20/Download-Stock-Data-1/), I showed a way to download stock price data files from Google, save them to a local drive and merge them into a single data frame. If the files are not large, however, that approach is not efficient, so in this article the files are downloaded and merged in memory.
      
      The following packages are used.
      
      
      {% highlight r %}
      library(knitr)
      library(lubridate)
      library(stringr)
      library(plyr)
      library(dplyr)
      {% endhighlight %}
      
      Taking URLs as file locations, the files are read directly using `llply` and combined using `rbind_all` (deprecated in favour of `bind_rows` in later dplyr releases). As the merged data holds records for multiple stocks, a `Code` column is created. Note that, when an error occurs, the function returns a dummy data frame so that the loop does not break; the rows of the dummy data frame(s) are filtered out at the end.
      
      
      {% highlight r %}
      # assumes codes are known beforehand
      codes <- c("MSFT", "TCHC") # codes <- c("MSFT", "1234") for testing
      files <- paste0("http://www.google.com/finance/historical?q=NASDAQ:",
                      codes,"&output=csv")
      
      dataList <- llply(files, function(file, ...) {
            # get code from file url
            pattern <- "Q:[0-9a-zA-Z][0-9a-zA-Z][0-9a-zA-Z][0-9a-zA-Z]"
            code <- substr(str_extract(file, pattern), 3, nchar(str_extract(file, pattern)))
      
            # read data directly from a URL with only simple error handling
            # for further error handling: http://adv-r.had.co.nz/Exceptions-Debugging.html
            tryCatch({
                  data <- read.csv(file, stringsAsFactors = FALSE)
                  # first column's name is funny
                  names(data) <- c("Date","Open","High","Low","Close","Volume")
                  data$Date <- dmy(data$Date)
                  data$Open <- as.numeric(data$Open)
                  data$High <- as.numeric(data$High)
                  data$Low <- as.numeric(data$Low)
                  data$Close <- as.numeric(data$Close)
                  data$Volume <- as.integer(data$Volume)
                  data$Code <- code
                  data               
            },
            error = function(c) {
                  c$message <- paste(code,"failed")
                  message(c$message)
                  # return a dummy data frame
                  data <- data.frame(Date=dmy(format(Sys.Date(),"%d%m%Y")), Open=0, High=0,
                                     Low=0, Close=0, Volume=0, Code="NA")
                  data
            })
      })
      
      # dummy data frame values are filtered out
      # bind_rows() replaces the deprecated rbind_all()
      data <- filter(bind_rows(dataList), Code != "NA")
      {% endhighlight %}
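
The dummy-row approach above works, but it puts an invented record into the data until the final filter. A possible alternative, sketched here in base R under the assumption of a hypothetical `read_one()` reader (not part of the original script), is to return `NULL` on error and drop the `NULL`s before combining.

```r
# Hypothetical reader: returns a small data frame, or fails for code "1234".
read_one <- function(code) {
  if (code == "1234") stop("no data")
  data.frame(Date = as.Date("2014-11-26"), Close = 47.75, Code = code,
             stringsAsFactors = FALSE)
}

codes <- c("MSFT", "1234")

dataList <- lapply(codes, function(code) {
  tryCatch(read_one(code), error = function(e) {
    message(code, " failed")
    NULL  # signal failure without inventing a row
  })
})

# Drop the failures, then stack what is left.
dataList <- Filter(Negate(is.null), dataList)
data <- do.call(rbind, dataList)
```

With this variant no sentinel value such as `"NA"` is needed in the `Code` column.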
      
      Some of the values are shown below.
      
      
      |Date       |  Open|  High|   Low| Close|   Volume|Code |
      |:----------|-----:|-----:|-----:|-----:|--------:|:----|
      |2014-11-26 | 47.49| 47.99| 47.28| 47.75| 27164877|MSFT |
      |2014-11-25 | 47.66| 47.97| 47.45| 47.47| 28007993|MSFT |
      |2014-11-24 | 47.99| 48.00| 47.39| 47.59| 35434245|MSFT |
      |2014-11-21 | 49.02| 49.05| 47.57| 47.98| 42884795|MSFT |
      |2014-11-20 | 48.00| 48.70| 47.87| 48.70| 21510587|MSFT |
      |2014-11-19 | 48.66| 48.75| 47.93| 48.22| 26177450|MSFT |
      
      It took a bit longer to complete the script as I had to teach myself how to handle errors in R. And this is why I started to write articles in this blog.
      
      I hope this article is useful.
      
      
      Summarize Stock returns From Multiple Files:
      ---
      layout: post
      title: "2014-11-27-Summarise-Stock-Returns-from-Multiple-Files"
      description: ""
      category: R
      tags: [knitr,lubridate,stringr,reshape2,plyr,dplyr]
      ---
      {% include JB/setup %}
      
      This is a slight extension of the previous two articles ( [2014-11-20-Download-Stock-Data-1](http://jaehyeon-kim.github.io/r/2014/11/20/Download-Stock-Data-1/), [2014-11-20-Download-Stock-Data-2](http://jaehyeon-kim.github.io/r/2014/11/20/Download-Stock-Data-2/) ) and it aims to produce gross returns, standard deviation and correlation of multiple shares.
      
      The following packages are used.
      
      
      {% highlight r %}
      library(knitr)
      library(lubridate)
      library(stringr)
      library(reshape2)
      library(plyr)
      library(dplyr)
      {% endhighlight %}
      
      The script begins with creating a data folder in the format of *data_YYYY-MM-DD*.
      
      
      {% highlight r %}
      # create data folder
      dataDir <- paste0("data","_",format(Sys.Date(),"%Y-%m-%d"))
      if(file.exists(dataDir)) {
        unlink(dataDir, recursive = TRUE)
        dir.create(dataDir)
      } else {
        dir.create(dataDir)
      }
      {% endhighlight %}
      
      Given company codes, URLs and file paths are created. Then data files are downloaded by `Map`, which is a wrapper of `mapply`. Note that R's `download.file` function is wrapped by `downloadFile` so that the function does not break when an error occurs.
      
      
      {% highlight r %}
      # assumes codes are known beforehand
      codes <- c("MSFT", "TCHC")
      urls <- paste0("http://www.google.com/finance/historical?q=NASDAQ:",
                     codes,"&output=csv")
      paths <- paste0(dataDir,"/",codes,".csv") # backward slash on windows (\)
      
      # simple error handling in case file doesn't exists
      downloadFile <- function(url, path, ...) {
        # remove file if exists already
        if(file.exists(path)) file.remove(path)
        # download file
        tryCatch(
          download.file(url, path, ...), error = function(c) {
            # remove file if error
            if(file.exists(path)) file.remove(path)
            # create error message (basename gives the file name, e.g. "MSFT.csv")
            c$message <- paste(basename(path),"failed")
            message(c$message)
          }
        )
      }
      # wrapper of mapply
      Map(downloadFile, urls, paths)
      {% endhighlight %}
      
      Once the files are downloaded, they are read back and combined using `rbind_all` (deprecated in favour of `bind_rows` in later dplyr releases). Some more details about this step are listed below.
      
      * only Date, Close and Code columns are taken
      * codes are extracted from file paths by matching a regular expression
      * data is arranged by date as the raw files are sorted in a descending order
      * error is handled by returning a dummy data frame where its code value is NA.
      * individual data files are merged in a long format
          * 'NA' is filtered out
      
      
      {% highlight r %}
      # read all csv files and merge
      files <- dir(dataDir, full.name = TRUE)
      dataList <- llply(files, function(file){
        # get code from file path
        pattern <- "/[A-Z][A-Z][A-Z][A-Z]"
        code <- substr(str_extract(file, pattern), 2, nchar(str_extract(file, pattern)))
        tryCatch({
          data <- read.csv(file, stringsAsFactors = FALSE)
          # first column's name is funny
          names(data) <- c("Date","Open","High","Low","Close","Volume")
          data$Date <- dmy(data$Date)
          data$Close <- as.numeric(data$Close)
          data$Code <- code
          # optional
          data$Open <- as.numeric(data$Open)
          data$High <- as.numeric(data$High)
          data$Low <- as.numeric(data$Low)
          data$Volume <- as.integer(data$Volume)
          # select only 'Date', 'Close' and 'Code'
          # raw data should be arranged in an ascending order
          arrange(subset(data, select = c(Date, Close, Code)), Date)
        },
        error = function(c){
          c$message <- paste(code,"failed")
          message(c$message)
          # return a dummy data frame not to break function
          data <- data.frame(Date=dmy(format(Sys.Date(),"%d%m%Y")), Close=0, Code="NA")
          data
        })
      }, .progress = "text")
      
      # data is combined to create a long format
      # dummy data frame values are filtered out
      # bind_rows() replaces the deprecated rbind_all()
      data <- filter(bind_rows(dataList), Code != "NA")
      {% endhighlight %}
      
      Some values of this long format data are shown below.
      
      
      |Date       | Close|Code |
      |:----------|-----:|:----|
      |2013-11-29 | 38.13|MSFT |
      |2013-12-02 | 38.45|MSFT |
      |2013-12-03 | 38.31|MSFT |
      |2013-12-04 | 38.94|MSFT |
      |2013-12-05 | 38.00|MSFT |
      |2013-12-06 | 38.36|MSFT |
      
      The data is converted into a wide format where the row and column variables are Date and Code respectively (`Date ~ Code`) while the value variable is Close (`value.var="Close"`). Some values of the wide format data are shown below.
      
      
      {% highlight r %}
      # data is converted into a wide format
      data <- dcast(data, Date ~ Code, value.var="Close")
      kable(head(data))
      {% endhighlight %}
      
      
      
      |Date       |  MSFT|  TCHC|
      |:----------|-----:|-----:|
      |2013-11-29 | 38.13| 13.52|
      |2013-12-02 | 38.45| 13.81|
      |2013-12-03 | 38.31| 13.48|
      |2013-12-04 | 38.94| 13.71|
      |2013-12-05 | 38.00| 13.55|
      |2013-12-06 | 38.36| 13.95|
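
For readers without reshape2, the same long-to-wide step can be approximated with base R's `reshape`. This is a sketch on a four-row sample built from the values in the tables above, and the column renaming at the end is an illustrative choice rather than part of the original script.

```r
# Long-format sample built from the tables above (first two dates only).
long <- data.frame(
  Date  = as.Date(c("2013-11-29", "2013-12-02", "2013-11-29", "2013-12-02")),
  Close = c(38.13, 38.45, 13.52, 13.81),
  Code  = c("MSFT", "MSFT", "TCHC", "TCHC"),
  stringsAsFactors = FALSE
)

# One row per Date, one Close column per Code.
wide <- reshape(long, idvar = "Date", timevar = "Code", direction = "wide")

# reshape() names the columns Close.MSFT, Close.TCHC; strip the prefix.
names(wide) <- sub("^Close\\.", "", names(wide))
print(wide)
```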
      
      The remaining steps are just differencing close price values after taking log and applying `sum`, `sd`, and `cor`.
      
      
      {% highlight r %}
      # select except for Date column
      data <- select(data, -Date)
      
      # apply log difference column wise
      dailyRet <- apply(log(data), 2, diff, lag=1)
      
      # obtain cumulative log return, standard deviation and correlation
      returns <- apply(dailyRet, 2, sum, na.rm = TRUE)
      std <- apply(dailyRet, 2, sd, na.rm = TRUE)
      correlation <- cor(dailyRet)
      
      returns
      {% endhighlight %}
      
      
      
      {% highlight text %}
      ##      MSFT      TCHC 
      ## 0.2249777 0.6293973
      {% endhighlight %}
      
      
      
      {% highlight r %}
      std
      {% endhighlight %}
      
      
      
      {% highlight text %}
      ##       MSFT       TCHC 
      ## 0.01167381 0.03203031
      {% endhighlight %}
      
      
      
      {% highlight r %}
      correlation
      {% endhighlight %}
      
      
      
      {% highlight text %}
      ##           MSFT      TCHC
      ## MSFT 1.0000000 0.1481043
      ## TCHC 0.1481043 1.0000000
      {% endhighlight %}
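
Because log differences telescope, the column sums computed above equal the log of the last close over the first. The following is a small check on the six sample closes from the wide-format table. The full data set spans a longer period, so these numbers will differ from the output above.

```r
# Six sample closes from the wide-format table.
prices <- data.frame(
  MSFT = c(38.13, 38.45, 38.31, 38.94, 38.00, 38.36),
  TCHC = c(13.52, 13.81, 13.48, 13.71, 13.55, 13.95)
)

# Daily log returns: log(P_t) - log(P_{t-1}) for each column.
dailyRet <- apply(log(prices), 2, diff, lag = 1)

# Summing the daily log returns gives log(P_last / P_first).
total <- colSums(dailyRet)
check <- log(unlist(prices[nrow(prices), ]) / unlist(prices[1, ]))
print(all.equal(total, check))
```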
      
      Finally the data folder is deleted.
      
      
      {% highlight r %}
      # delete data folder
      if(file.exists(dataDir)) { unlink(dataDir, recursive = TRUE) }
      {% endhighlight %}
      

      【Discussion】:
