【Question Title】: Can't access string methods in dbplyr
【Posted】: 2019-09-03 14:16:33
【Question】:

I'm trying to use the str_detect, str_replace, and str_replace_all methods in dbplyr with Oracle as the backend database, but I can't seem to access them.

Here is the error:

db_tbl %>% mutate(COMMENTS_NEW = str_detect(COMMENTS,"[^[:alnum:]///' ]", "")) %>% show_query()
Error: str_detect() is not available in this SQL variant

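I have reinstalled all the packages, but it still doesn't work. However, as far as I can see, these methods were implemented in dbplyr 1.2.0 (see here).

Trying grepl instead translates to: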

db_tbl %>% mutate(COMMENTS_NEW = grepl(COMMENTS,pattern = '[^[:alnum:]]')) %>% show_query()
<SQL>
Named arguments ignored for SQL grepl
SELECT grepl("COMMENTS", '[^[:alnum:]]' AS "pattern") AS "COMMENTS_NEW"
FROM ("schema".table) 

This also returns an error when the query is run. Here is the traceback:

20. stop(structure(list(message = "<SQL> 'SELECT * FROM (SELECT \"COMMENTS\", \"TYPE_28\", grepl(\"COMMENTS\", '[^[:alnum:]]' AS \"pattern\") AS \"COMMENTS_NEW\"\nFROM (\"schema\".table) ) \"zzz3\" WHERE ROWNUM <= 6.0'\n nanodbc/nanodbc.cpp:1587: HY000: [Oracle][ODBC][Ora]ORA-00907: missing right parenthesis\n ", call = NULL, cppstack = NULL), class = c("odbc::odbc_error", "C++Error", "error", "condition")))
19. new_result(connection@ptr, statement)
18. OdbcResult(connection = conn, statement = statement)
17. dbSendQuery(con, sql)
16. dbSendQuery(con, sql)
15. db_collect.DBIConnection(x$src$con, sql, n = n, warn_incomplete = warn_incomplete)
14. db_collect(x$src$con, sql, n = n, warn_incomplete = warn_incomplete)
13. collect.tbl_sql(x, n = n)
12. collect(x, n = n)
11. as.data.frame(collect(x, n = n))
10. as.data.frame.tbl_sql(head(x, n + 1))
9. as.data.frame(head(x, n + 1))
8. trunc_mat(x, n = n, width = width, n_extra = n_extra)
7. format.tbl(x, ..., n = n, width = width, n_extra = n_extra)
6. format(x, ..., n = n, width = width, n_extra = n_extra)
5. paste0(..., "\n")
4. cat(paste0(..., "\n"), sep = "")
3. cat_line(format(x, ..., n = n, width = width, n_extra = n_extra))
2. print.tbl_sql(x)
1. function (x, ...) UseMethod("print")(x)

Here is my session info:

R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252    LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C                            LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] dbplot_0.3.2           pool_0.1.4.2           dbplyr_1.4.2           DBI_1.0.0              odbc_1.1.6             data.table_1.11.8     
 [7] qdap_2.3.0             RColorBrewer_1.1-2     qdapTools_1.3.3        qdapRegex_0.7.2        qdapDictionaries_1.0.7 textclean_0.9.3       
[13] drlib_0.1.0            lubridate_1.7.4        ggrepel_0.8.0          fpp2_2.3               expsmooth_2.3          fma_2.3               
[19] forecast_8.5           recipes_0.1.5          textSummary_0.1.0      scales_1.0.0           janitor_1.1.1          forcats_0.3.0         
[25] stringr_1.4.0          dplyr_0.8.1            purrr_0.2.5            readr_1.2.1            tidyr_0.8.2            tibble_2.1.1          
[31] ggplot2_3.2.0          tidyverse_1.2.1       

loaded via a namespace (and not attached):
  [1] openNLPdata_1.5.3-4 colorspace_1.4-1    class_7.3-14        rprojroot_1.3-2     fs_1.2.6            base64enc_0.1-3    
  [7] rstudioapi_0.8      remotes_2.0.2       bit64_0.9-7         prodlim_2018.04.18  fansi_0.4.0         xml2_1.2.0         
 [13] splines_3.5.0       knitr_1.20          pkgload_1.0.2       jsonlite_1.6        venneuler_1.1-0     rJava_0.9-10       
 [19] broom_0.5.1         compiler_3.5.0      httr_1.3.1          backports_1.1.2     assertthat_0.2.1    Matrix_1.2-14      
 [25] lazyeval_0.2.2      cli_1.1.0           later_0.8.0         prettyunits_1.0.2   tools_3.5.0         igraph_1.2.2       
 [31] NLP_0.2-0           gtable_0.3.0        glue_1.3.1          reshape2_1.4.3      Rcpp_1.0.1          slam_0.1-43        
 [37] cellranger_1.1.0    fracdiff_1.4-2      urca_1.3-0          gdata_2.18.0        nlme_3.1-137        lmtest_0.9-36      
 [43] timeDate_3043.102   gower_0.1.2         gender_0.5.2        ps_1.2.1            xlsxjars_0.6.1      testthat_2.0.1     
 [49] rvest_0.3.2         devtools_2.0.1      gtools_3.8.1        XML_3.98-1.16       xlsx_0.6.1          MASS_7.3-49        
 [55] zoo_1.8-5           ipred_0.9-8         hms_0.4.2           parallel_3.5.0      yaml_2.2.0          quantmod_0.4-14    
 [61] curl_3.3            memoise_1.1.0       gridExtra_2.3       rpart_4.1-13        stringi_1.4.3       desc_1.2.0         
 [67] tseries_0.10-46     plotrix_3.7-4       TTR_0.23-4          pkgbuild_1.0.2      openNLP_0.2-6       lava_1.6.4         
 [73] chron_2.3-53        rlang_0.4.0         pkgconfig_2.0.2     bitops_1.0-6        lattice_0.20-35     processx_3.2.0     
 [79] bit_1.1-14          tidyselect_0.2.5    plyr_1.8.4          magrittr_1.5        R6_2.4.0            generics_0.0.2     
 [85] pillar_1.3.1        haven_2.0.0         withr_2.1.2         xts_0.11-2          survival_2.41-3     RCurl_1.95-4.11    
 [91] nnet_7.3-12         modelr_0.1.2        crayon_1.3.4        utf8_1.1.4          wordcloud_2.6       usethis_1.4.0      
 [97] grid_3.5.0          readxl_1.1.0        callr_3.0.0         blob_1.1.1          reports_0.1.4       digest_0.6.18      
[103] tm_0.7-5            munsell_0.5.0       sessioninfo_1.1.1   quadprog_1.5-5     

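【Comments】:

  • You could try grepl
  • You seem to be passing 3 arguments to str_detect. According to the documentation I have, it takes 2 strings and 1 optional boolean. Could that explain it? I just tested, and str_detect works fine on my setup.
  • @BenoîtFayolle Are you using Oracle as the backend?
  • I have removed the third argument... the error says: Error: str_detect() is not available in this SQL variant. However, I can use complaints_tbl %>% mutate(ADVISOR_COMMENTS_NEW = str_to_upper(ADVISOR_COMMENTS)) %>% select(ADVISOR_COMMENTS_NEW) without any problem.
  • @Shery No, I tried it on a Redshift cluster. Have you tried str_detect on a backend other than Oracle? I just tried it on MySQL and it works for me there too. I don't have an Oracle database to try it on.

Tags: r dbplyr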


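【Solution 1】:

This isn't really an answer, but rather a simple workaround.

The problem is that dbplyr cannot create a proper SQL clause (SQL has no functions named str_detect or grepl), so it throws in the towel (and raises an error).

In both expressions you get an error because dbplyr cannot translate either stringr::str_detect() or base::grepl() into a valid SQL expression for this backend. One way to get almost what you want is to collect() before you mutate(). As written, both of these fail, whether you call show_query() or collect() at the end: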

db_tbl %>% 
  mutate(COMMENTS_NEW = str_detect(COMMENTS,"[^[:alnum:]///' ]", "")) %>% 
  show_query()
db_tbl %>% 
  mutate(COMMENTS_NEW = str_detect(COMMENTS,"[^[:alnum:]///' ]", "")) %>% 
  collect()
db_tbl %>% 
  mutate(COMMENTS_NEW = grepl(COMMENTS,pattern = '[^[:alnum:]]')) %>% 
  show_query()
db_tbl %>% 
  mutate(COMMENTS_NEW = grepl(COMMENTS,pattern = '[^[:alnum:]]')) %>% 
  collect()

However, if you put collect() first...

db_tbl %>% 
  collect() %>%
  mutate(COMMENTS_NEW = str_detect(COMMENTS,"[^[:alnum:]///' ]"))  # stray third argument "" dropped; str_detect() takes string and pattern
db_tbl %>% 
  collect() %>%
  mutate(COMMENTS_NEW = grepl(COMMENTS,pattern = '[^[:alnum:]]'))

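...your remote table becomes a local table, on which you can apply str_detect() in peace.

As a side note, show_query() no longer makes sense at that point: after collect() the data is a local data frame, so no SQL is generated.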

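【Comments】:

  • That isn't really a workaround; it just pulls all the data out of the database and uses the regular functions. I think the only possible workaround is to manually add something to the query, so this isn't really a workaround either.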
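
The last comment suggests adding something to the query by hand. A minimal sketch of that idea, assuming Oracle's own REGEXP_REPLACE function and REGEXP_LIKE condition fit the use case: dbplyr passes calls it does not recognize through to the generated SQL unchanged, so they can be written directly inside mutate() and filter(). Here db_tbl_sim is a hypothetical stand-in for db_tbl, built with lazy_frame() and simulate_oracle() so the SQL can be inspected without a connection. This has not been run against a real Oracle database.

```r
library(dplyr)
library(dbplyr)

# Hypothetical stand-in for the question's `db_tbl`: a simulated Oracle
# table, so the generated SQL can be inspected without a live connection.
db_tbl_sim <- lazy_frame(COMMENTS = "a/b'c d!", con = simulate_oracle())

# REGEXP_REPLACE is not an R function dbplyr knows about, so the call is
# emitted into the SQL as written and evaluated by the database.
cleaned <- db_tbl_sim %>%
  mutate(COMMENTS_NEW = REGEXP_REPLACE(COMMENTS, "[^[:alnum:]///' ]", ""))

# REGEXP_LIKE is a condition in Oracle rather than a value-returning
# function, so it is used in filter(), which becomes the WHERE clause.
flagged <- db_tbl_sim %>%
  filter(REGEXP_LIKE(COMMENTS, "[^[:alnum:]]"))

# Render both queries to plain strings and print them for inspection.
cleaned_sql <- as.character(sql_render(cleaned))
flagged_sql <- as.character(sql_render(flagged))
cat(cleaned_sql, "\n\n", flagged_sql, "\n")
```

With a real connection, the same mutate() and filter() calls would be applied to db_tbl, keeping the work in the database instead of collecting every row first.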