【Question title】: R: How to append loop (for) results into a data frame?
【Posted】: 2016-07-24 08:26:18
【Question】:

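I am trying to build a data frame of Brazilian addresses by calling a web service and looking up ZIP codes. I can already receive a single result and store it in a data frame, but when I search for several ZIP codes (for example, stored in a vector), my data frame keeps only the last element. Can anyone help me?

Please see the code below: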

###############
library(httr)
library(RCurl)
library(XML)
library(dplyr)
###############

# ZIPs I want to search for:
vectorzip <- c("71938360", "70673052", "71020510")
j <- length(vectorzip)

# loop:
for(i in 1:j) {

# Save the URL of the xml file in a variable:
xml.url <- getURL(paste("http://cep.republicavirtual.com.br/web_cep.php?cep=",vectorzip[i], sep = ""), encoding = "ISO-8859-1")
xml.url

# Use the xmlTreeParse-function to parse xml file directly from the web:
xmlfile <- xmlTreeParse(xml.url)
xmlfile
# the xml file is now saved as an object you can easily work with in R:
class(xmlfile)

# Use the xmlRoot-function to access the top node:
xmltop = xmlRoot(xmlfile)

# have a look at the XML-code of the first subnodes:
print(xmltop)

# To extract the XML-values from the document, use xmlSApply:
zips <- xmlSApply(xmlfile, function(x) xmlSApply(x, xmlValue))
zips
# Finally, get the data in a data-frame and have a look at the first rows and columns:
zips <- NULL
zips <- rbind(zips_df, data.frame(t(zips),row.names=NULL))

View(zips_df)}

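【Comments】:

  • What is zips?
  • Growing an object with rbind is usually not a good idea. A better approach is to define an empty data.frame of a fixed size (which allocates the necessary memory up front) and then fill in the rows.

Tags: r loops web service append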
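
The preallocation approach suggested in the comment above might look roughly like the sketch below. It does not call the web service: `lookup_zip` is a hypothetical stand-in that returns fixed values, and the column names `zip` and `uf` are assumptions chosen for illustration.

```r
# Hypothetical stand-in for the web-service call: returns one record.
lookup_zip <- function(zip) {
  list(zip = zip, uf = "DF")
}

vectorzip <- c("71938360", "70673052", "71020510")
n <- length(vectorzip)

# Preallocate a data frame with one row per ZIP code.
zips_df <- data.frame(
  zip = character(n),
  uf  = character(n),
  stringsAsFactors = FALSE
)

# Fill each row in place instead of growing the object with rbind.
for (i in seq_len(n)) {
  res <- lookup_zip(vectorzip[i])
  zips_df$zip[i] <- res$zip
  zips_df$uf[i]  <- res$uf
}

print(zips_df)
```

This only works when the number of rows and the column types are known before the loop starts, which is the case here because there is one row per ZIP code.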


【Solution 1】:

You want to:

a) define zips_df,
b) define it outside the loop, and
c) not set zips_df to NULL inside the loop :)

###############
library(httr)
library(RCurl)
library(XML)
library(dplyr)
###############

# ZIPs I want to search for:
vectorzip <- c("71938360", "70673052", "71020510")
j <- length(vectorzip)
zips_df <- data.frame()

i<-1
# loop:
for(i in 1:j) {

  # Save the URL of the xml file in a variable:
  xml.url <- getURL(paste("http://cep.republicavirtual.com.br/web_cep.php?cep=",vectorzip[i], sep = ""), encoding = "ISO-8859-1")
  xml.url

  # Use the xmlTreeParse-function to parse xml file directly from the web:
  xmlfile <- xmlTreeParse(xml.url)
  xmlfile
  # the xml file is now saved as an object you can easily work with in R:
  class(xmlfile)

  # Use the xmlRoot-function to access the top node:
  xmltop = xmlRoot(xmlfile)

  # have a look at the XML-code of the first subnodes:
  print(xmltop)

  # To extract the XML-values from the document, use xmlSApply:
  zips <- xmlSApply(xmlfile, function(x) xmlSApply(x, xmlValue))
  zips
  # Finally, get the data in a data-frame and have a look at the first rows and columns:

  zips_df <- rbind(zips_df, data.frame(t(zips),row.names=NULL))
}

  View(zips_df)

This gives you:

> zips_df
  resultado.text     resultado_txt.text uf.text cidade.text         bairro.text tipo_logradouro.text  logradouro.text
1              1 sucesso - cep completo      DF  Taguatinga Sul (Ãguas Claras)                  Rua               09
2              1 sucesso - cep completo      DF    Cruzeiro      Setor Sudoeste               Quadra      300 Bloco O
3              1 sucesso - cep completo      DF      Guará            Guará I               Quadra QI 11 Conjunto U

【Comments】:

  • Thank you very much, Serban!
【Solution 2】:

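Please try to provide a minimal working example. Your example contains many lines of code that have nothing to do with your actual problem. Had you tried to strip out that unnecessary code, you would probably have noticed that zips <- NULL wipes the row of ZIP information just before it is saved. Second, you reference a zips_df object that is never created anywhere in your code.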
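
A minimal example of the kind described above could look like the following sketch. It needs no network access and uses made-up accumulator names (`acc_bad`, `acc_ok`) that do not appear in the question; it only illustrates why resetting the accumulator inside the loop leaves a single row.

```r
vectorzip <- c("71938360", "70673052", "71020510")

# Buggy pattern: the accumulator is reset on every iteration,
# so only the row from the last iteration survives.
for (zip in vectorzip) {
  acc_bad <- NULL
  acc_bad <- rbind(acc_bad, data.frame(zip = zip, stringsAsFactors = FALSE))
}

# Fixed pattern: create the accumulator once, before the loop.
acc_ok <- data.frame()
for (zip in vectorzip) {
  acc_ok <- rbind(acc_ok, data.frame(zip = zip, stringsAsFactors = FALSE))
}

nrow(acc_bad)  # 1
nrow(acc_ok)   # 3
```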

To answer your question:

  • Before the loop starts, add a line that creates zips_df as an empty data frame object:

    vectorzip <- c("71938360", "70673052", "71020510")
    j <- length(vectorzip)
    zips_df <- data.frame()
    
  • Remove the line that wipes the zips object (zips <- NULL)

  • Change the line that grows the zips_df data.frame so that the accumulated data is saved to the data.frame object rather than to the temporary "zips" variable:

    zips_df <- rbind(zips_df, data.frame(t(zips),row.names=NULL))
    

I would also suggest removing the "View" line and inspecting the data.frame with print instead:

print(zips_df)
resultado.text     resultado_txt.text uf.text cidade.text              bairro.text tipo_logradouro.text  logradouro.text
1              1 sucesso - cep completo      DF  Taguatinga Sul (Ã\u0081guas Claras)                  Rua               09
2              1 sucesso - cep completo      DF    Cruzeiro           Setor Sudoeste               Quadra      300 Bloco O
3              1 sucesso - cep completo      DF      Guará                 Guará I               Quadra QI 11 Conjunto U
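
As an alternative to growing zips_df with rbind on every iteration, the per-ZIP results could be collected in a list and bound once at the end. The sketch below is not part of the original answer; `fetch_zip` is a hypothetical placeholder for the download-and-parse step, and it returns fixed values so the example runs offline.

```r
# Hypothetical stand-in for fetching and parsing one ZIP code.
fetch_zip <- function(zip) {
  data.frame(zip = zip, uf = "DF", stringsAsFactors = FALSE)
}

vectorzip <- c("71938360", "70673052", "71020510")

# lapply returns a list holding one single-row data frame per ZIP code.
rows <- lapply(vectorzip, fetch_zip)

# Bind all rows together in a single call.
zips_df <- do.call(rbind, rows)
print(zips_df)
```

This keeps the loop body free of any accumulator, so there is nothing to reset by accident.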

【Comments】:

  • Thank you very much, Andre. I appreciate the recommendations and the answer!