【Posted】: 2015-05-28 17:44:48
【Question】:
I want to POST XML from R. The Python code is:
import urllib2
url = 'http://www.rcsb.org/pdb/rest/search'
queryText = """
<?xml version="1.0" encoding="UTF-8"?>
<orgPdbQuery>
<version>B0907</version>
<queryType>org.pdb.query.simple.ExpTypeQuery</queryType>
<description>Experimental Method Search : Experimental Method=SOLID-STATE NMR</description>
<mvStructure.expMethod.value>SOLID-STATE NMR</mvStructure.expMethod.value>
</orgPdbQuery>
"""
print "query:\n", queryText
print "querying PDB...\n"
req = urllib2.Request(url, data=queryText)
f = urllib2.urlopen(req)
result = f.read()
if result:
    print "Found number of PDB entries:", result.count('\n')
else:
    print "Failed to retrieve results"
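For readers on Python 3, the script above might be ported roughly as follows. This is only a sketch: the helper names `build_request`, `count_entries`, and `run_query` are made up for illustration, the endpoint is the one from the post and may no longer be served, and nothing is sent until `run_query()` is called.

```python
import urllib.request

URL = "http://www.rcsb.org/pdb/rest/search"  # endpoint copied from the post

QUERY_TEXT = """<?xml version="1.0" encoding="UTF-8"?>
<orgPdbQuery>
<version>B0907</version>
<queryType>org.pdb.query.simple.ExpTypeQuery</queryType>
<description>Experimental Method Search : Experimental Method=SOLID-STATE NMR</description>
<mvStructure.expMethod.value>SOLID-STATE NMR</mvStructure.expMethod.value>
</orgPdbQuery>
"""


def build_request(url=URL, query=QUERY_TEXT):
    """Build (but do not send) the POST request. Hypothetical helper."""
    # urllib.request expects the body as bytes, unlike urllib2 in Python 2.
    return urllib.request.Request(url, data=query.encode("utf-8"))


def count_entries(result):
    """Count entries the same way the original script does: by newlines."""
    return result.count("\n")


def run_query(timeout=30):
    """Send the request and return the decoded body (needs network access)."""
    with urllib.request.urlopen(build_request(), timeout=timeout) as f:
        return f.read().decode("utf-8")
```

Calling `count_entries(run_query())` would then mirror the "Found number of PDB entries" line of the original script.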
Now I want to do the same thing in R. How can I do that?
I have tried several times.
library(RCurl)
library(httr)
library(XML)
url1 <- 'http://www.rcsb.org/pdb/rest/search'
xml_text <- '<?xml version="1.0" encoding="UTF-8"?>
<orgPdbQuery>
<version>B0907</version>
<queryType>org.pdb.query.simple.ExpTypeQuery</queryType>
<description>Experimental Method Search : Experimental Method=SOLID-STATE NMR</description>
<mvStructure.expMethod.value>SOLID-STATE NMR</mvStructure.expMethod.value>
</orgPdbQuery>'
# first try ----
xml_txt <- xmlTreeParse(xml_text,useInternalNodes=T)
postForm(url1, "xml"=saveXML(xml_txt), style="post")
#failed
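"Problem creating Query from XML: Content is not allowed in prolog.\nxml=\n\n B0907\n org.pdb.query.simple.ExpTypeQuery\n Experimental Method Search : Experimental Method=SOLID-STATE NMR\n SOLID-STATE NMR\n\n\n" attr(,"Content-Type") charset "text/plain" "ISO-8859-1"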
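One plausible reading of this error: `postForm` sends the XML as the value of a form field named `xml`, so the body the server tries to parse starts with `xml=` instead of the XML declaration. The Python 3 sketch below only illustrates that difference between a form-encoded body and a raw body; the XML is shortened for the example and nothing is sent over the network.

```python
from urllib.parse import urlencode

# Shortened stand-in for the query document in the post.
xml_text = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<orgPdbQuery>\n"
    "<version>B0907</version>\n"
    "</orgPdbQuery>"
)

# Form-style POST: the XML becomes the value of a field named "xml",
# so the body begins with the field name and an equals sign.
form_body = urlencode({"xml": xml_text})

# Raw POST: the XML document itself is the entire body.
raw_body = xml_text

print(form_body[:20])
print(raw_body[:20])
```

An XML parser that receives the form-style body sees characters before the `<?xml ...?>` prolog, which would match the "Content is not allowed in prolog" message.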
# second try ----
xml_out <- 'tmp.xml'
saveXML(xml_txt, xml_out)
result <- POST(url1, body = list(x = upload_file(xml_out)), encode = 'multipart', )
content(result)
# failed
It returns the website's HTML code.
# 3rd try ----
httpPOST(url1, content = xml_txt)
# failed
"!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">http://www.w3.org/1999/xhtml\">Error
Error
Error: This page cannot be displayed. Contact support for more information. The incident ID is: N/A.
# 4th try ----
h = basicTextGatherer()
result <- curlPerform(url = url1,
httpheader=c(Accept="text/xml", Accept="multipart/*",
'Content-Type' = "text/xml; charset=utf-8"),
postfields=xml_text,
writefunction = h$update,
verbose = TRUE
)
result
h$value
# failed
result
OK
0
h$value()
[1] ""
I have solved the problem.
library(RCurl)
url1 <- 'http://www.rcsb.org/pdb/rest/search'
xml_text <- '<?xml version="1.0" encoding="UTF-8"?>
<orgPdbQuery>
<version>B0907</version>
<queryType>org.pdb.query.simple.ExpTypeQuery</queryType>
<description>Experimental Method Search : Experimental Method=SOLID-STATE NMR</description>
<mvStructure.expMethod.value>SOLID-STATE NMR</mvStructure.expMethod.value>
</orgPdbQuery>'
h <- basicTextGatherer()
httpheader <- c(Accept = "*/*",
                "Content-Type" = "application/x-www-form-urlencoded")
result <- curlPerform(url = url1,
                      httpheader = httpheader,
                      postfields = xml_text,
                      writefunction = h$update,
                      verbose = TRUE)
result
h$value()
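Compared with the fourth attempt, the only change is the headers: the raw XML is still the request body, but it is labelled `application/x-www-form-urlencoded` rather than `text/xml`. That is also the Content-Type Python's `urllib2` adds by default when `data` is given without one, which may be why the original Python script worked. A Python 3 sketch of the same request with the headers spelled out (the helper name `build_xml_post` is made up for illustration, and the request is only constructed, not sent):

```python
import urllib.request

URL = "http://www.rcsb.org/pdb/rest/search"  # endpoint copied from the post


def build_xml_post(xml_text, url=URL):
    """Build a POST whose body is the raw XML, using the headers from the fix."""
    headers = {
        "Accept": "*/*",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return urllib.request.Request(
        url,
        data=xml_text.encode("utf-8"),  # body is the XML itself, not a form field
        headers=headers,
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it; whether the server still accepts it is not something this sketch can confirm.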
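【Comments】:
-
Is this a valid URL?
-
I have changed it to the real code.
-
Kang, you should take a look at the httr and XML package vignettes; they will give you a good start. Once you have a more refined question (beyond "do it for me in R"), we can help you. Voting to close until there is a more specific question.
-
Thanks. I tried the POST, postForm, and curlPerform functions, and I read through stackoverflow.com/questions/15240408/…, stackoverflow.com/questions/24050773/…, and stackoverflow.com/questions/26706159/post-xml-form-using-httr, but could not solve the problem.
-
@Brandon, thanks for your help. I solved the problem by changing the httpheader in the RCurl::curlPerform call: httpheader=c(Accept="*/*", "Content-Type"="application/x-www-form-urlencoded")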