【问题标题】:Conditional changing a value of a variable in dplyr. Can't get `replace()` to work有条件地更改 dplyr 中变量的值。无法让 `replace()` 工作
【发布时间】:2021-12-30 06:47:03
【问题描述】:

巴西股票“PFRM3.SA”的雅虎财经历史价格存在一些错误。我试图使用dplyr 有条件地更改一个值。我设法使用case_when() 做到了这一点,但我无法使用replace() 使其工作,我认为这应该是一个更有效的解决方案。

library(tidyverse) 
library(tidyquant)

price <- tq_get("PFRM3.SA", from = "2020-01-01", get = "stock.prices") #Data

#visual inspection

price %>% 
  ggplot(aes(x = date, y = close)) +
  geom_line() 

#the decimal digit was misplaced a few times

#this works fine

price %>%  
  mutate(close = case_when(close > 100 ~ close/100, TRUE ~ close)) 

#However I don't understand why this do not work

price %>%  
  mutate(close = replace(close, close > 100, close/100))

最后一行代码产生这个错误:

Warning messages:
1: Problem with `mutate()` input `close`.
ℹ number of items to replace is not a multiple of replacement length
ℹ Input `close` is `replace(close, close > 100, close/100)`. 
2: In x[list] <- values :
  number of items to replace is not a multiple of replacement length

我不明白为什么replace() 在不满足条件时不返回原始关闭值。另外,我不知道如何使它工作。谢谢

【问题讨论】:

    标签: r dplyr replace


    【解决方案1】:
    • R 语法中的replace() 函数非常简单,易于实现。
    • 它包括向量、索引向量和替换值,如下所示。

    replace(x, list, values)

    x = 有一些值的向量 -> close

    list = 这可以是一个索引向量 -> close &gt; 100

    值 = 替换值 -> close[close&gt;100]/100

    • 说明:

    您正在有条件地索引close 列,因此在replaceValues 参数(= 替换值)中,您必须有条件地索引替换值;这里close/100close[close&gt;100]/100

    price %>%  
      mutate(close = replace(close, close > 100, close[close>100]/100)) %>% 
      data.frame()
    
        symbol       date   open   high    low close  volume   adjusted
    1   PFRM3.SA 2020-01-02   6.59   6.67   6.49  6.60  549100   6.263065
    2   PFRM3.SA 2020-01-03   6.50   6.99   6.40  6.96 1202000   6.604687
    3   PFRM3.SA 2020-01-06   7.07   7.71   7.07  7.69 2692600   7.297420
    4   PFRM3.SA 2020-01-07   7.42   7.50   7.05  7.05 2325900   6.690092
    5   PFRM3.SA 2020-01-08   7.01   7.28   6.80  7.13 1588900   6.766008
    6   PFRM3.SA 2020-01-09   7.21   7.22   6.98  7.12  638800   6.756518
    7   PFRM3.SA 2020-01-10   7.12   7.23   6.96  6.96  658000   6.604687
    8   PFRM3.SA 2020-01-13   7.01   7.17   6.96  7.10  586700   6.737539
    9   PFRM3.SA 2020-01-14   7.12   7.15   6.95  7.00  663700   6.642644
    10  PFRM3.SA 2020-01-15   7.02   7.39   7.02  7.27 1173000   6.898861
    11  PFRM3.SA 2020-01-16   7.35   7.45   7.09  7.16  816300   6.794476
    12  PFRM3.SA 2020-01-17   7.24   7.29   7.07  7.17  669900   6.803966
    13  PFRM3.SA 2020-01-20   7.20   7.29   7.10  7.23  548300   6.860903
    14  PFRM3.SA 2020-01-21   7.23   7.40   7.13  7.17  722500   6.803966
    15  PFRM3.SA 2020-01-22   7.23   7.29   6.88  7.09  697400   6.728050
    16  PFRM3.SA 2020-01-23   7.09   7.15   6.95  7.04  670500   6.680603
    17  PFRM3.SA 2020-01-24   7.07   7.29   7.02  7.15  552000   6.784987
    18  PFRM3.SA 2020-01-27   7.08   7.08   6.70  6.95  681300   6.595197
    19  PFRM3.SA 2020-01-28   6.94   7.06   6.61  6.90  703200   6.547750
    20  PFRM3.SA 2020-01-29   6.92   7.03   6.72  6.72  406900   6.376938
    21  PFRM3.SA 2020-01-30   6.60   6.65   6.40  6.64  786600   6.301023
    22  PFRM3.SA 2020-01-31   6.51   6.83   6.49  6.49  379200   6.158680
    23  PFRM3.SA 2020-02-03   6.45   6.69   6.37  6.66  491100   6.320002
    24  PFRM3.SA 2020-02-04   6.71   6.90   6.68  6.90  388800   6.547750
    25  PFRM3.SA 2020-02-05   6.85   6.94   6.64  6.81  499100   6.462344
    26  PFRM3.SA 2020-02-06   6.89   6.89   6.72  6.83  302400   6.481323
    27  PFRM3.SA 2020-02-07   6.66   6.82   6.57  6.60  280100   6.263065
    28  PFRM3.SA 2020-02-10   6.60   6.60   6.23  6.30  537700   5.978380
    29  PFRM3.SA 2020-02-11   6.35   6.52   6.26  6.50  455100   6.168170
    30  PFRM3.SA 2020-02-12   6.51   6.57   6.37  6.45  361200   6.120722
    31  PFRM3.SA 2020-02-13   6.35   6.38   6.18  6.30  338800   5.978380
    32  PFRM3.SA 2020-02-14   6.29   6.35   6.12  6.26  366100   5.940423
    33  PFRM3.SA 2020-02-17   6.32   6.44   6.20  6.20  582900   5.883485
    34  PFRM3.SA 2020-02-18   6.20   6.50   6.15  6.48  341200   6.149191
    35  PFRM3.SA 2020-02-19   6.50   6.83   6.40  6.67  722800   6.329491
    36  PFRM3.SA 2020-02-20   6.70   6.78   6.62  6.66  206200   6.320002
    37  PFRM3.SA 2020-02-21   6.61   6.62   6.41  6.52  261500   6.187149
    38  PFRM3.SA 2020-02-26   6.24   6.24   5.89  5.99  514500   5.684206
    39  PFRM3.SA 2020-02-27   5.85   6.04   5.61  5.75  654500   5.456458
    40  PFRM3.SA 2020-02-28   5.70   5.76   5.51  5.76  538900   5.465948
    41  PFRM3.SA 2020-03-02   5.82   6.10   5.71  6.05  294000   5.741143
    42  PFRM3.SA 2020-03-03   6.16   6.25   5.92  6.05  415000   5.741143
    43  PFRM3.SA 2020-03-04   6.09   6.16   5.90  6.05  465100   5.741143
    44  PFRM3.SA 2020-03-05   6.05   6.06   5.60  5.75  415300   5.456458
    45  PFRM3.SA 2020-03-06   5.46   5.60   5.29  5.60  609400   5.314116
    46  PFRM3.SA 2020-03-09   4.89   5.16   4.63  4.75  920400   4.507509
    47  PFRM3.SA 2020-03-10   5.12   5.32   4.77  5.32 1060000   5.048410
    48  PFRM3.SA 2020-03-11   5.28   5.41   4.69  5.21  870500   4.944026
    49  PFRM3.SA 2020-03-12   4.50   4.52   4.02  4.02  515900   3.814776
    50  PFRM3.SA 2020-03-13   4.50   4.68   4.10  4.49  706000   4.260782
    51  PFRM3.SA 2020-03-16   4.01   4.65   3.90  4.35  564900   4.127929
    52  PFRM3.SA 2020-03-17   4.50   4.52   3.96  4.24  672800   4.023544
    53  PFRM3.SA 2020-03-18   3.80   3.95   3.29  3.50  870200   3.321322
    54  PFRM3.SA 2020-03-19   3.20   3.58   2.89  3.58 1378800   3.397238
    55  PFRM3.SA 2020-03-20   3.67   3.96   3.22  3.22 1539200   3.055617
    56  PFRM3.SA 2020-03-23   3.23   3.34   2.87  3.10 1183900   2.941742
    57  PFRM3.SA 2020-03-24   3.22   3.42   3.13  3.13 1064000   2.970211
    58  PFRM3.SA 2020-03-25   3.21   3.66   3.15  3.52  890600   3.340301
    59  PFRM3.SA 2020-03-26   3.59   4.30   3.50  4.18 1004100   3.966608
    60  PFRM3.SA 2020-03-27   3.90   4.33   3.90  4.25  572100   4.033034
    61  PFRM3.SA 2020-03-30   4.26   4.68   4.17  4.27  771200   4.052013
    62  PFRM3.SA 2020-03-31   4.20   4.38   4.07  4.30  371300   4.080482
    63  PFRM3.SA 2020-04-01   4.11   4.25   3.98  4.11  411100   3.900181
    64  PFRM3.SA 2020-04-02 415.00 425.00 402.00  4.18  282600 396.660767
    65  PFRM3.SA 2020-04-03 410.00 420.00 383.00  4.00  351100 379.579681
    66  PFRM3.SA 2020-04-06 421.00 423.00 395.00  3.99  639500 378.630737
    67  PFRM3.SA 2020-04-07 418.00 434.00 397.00  4.05  455800 384.324432
    68  PFRM3.SA 2020-04-08 410.00 430.00 400.00  4.28  443600 406.150269
    69  PFRM3.SA 2020-04-09 431.00 447.00 424.00  4.29  417900 407.099213
    70  PFRM3.SA 2020-04-13 429.00 445.00 423.00  4.39  287500 416.588715
    71  PFRM3.SA 2020-04-14 450.00 455.00 442.00  4.43  273200 420.384491
    72  PFRM3.SA 2020-04-15 432.00 452.00 421.00  4.44  278800 421.333466
    73  PFRM3.SA 2020-04-16   4.50   4.53   4.26  4.34  530800   4.118440
    74  PFRM3.SA 2020-04-17   4.54   4.78   4.44  4.64  908400   4.403124
    75  PFRM3.SA 2020-04-20   4.46   4.76   4.43  4.60  447200   4.365166
    76  PFRM3.SA 2020-04-22   4.60   5.04   4.57  4.89  788600   4.640362
    77  PFRM3.SA 2020-04-23   4.97   5.17   4.63  4.77  576500   4.526488
    78  PFRM3.SA 2020-04-24   4.62   4.72   4.22  4.49  703200   4.260782
    79  PFRM3.SA 2020-04-27   4.61   4.65   4.35  4.63  296400   4.393635
    80  PFRM3.SA 2020-04-28   4.65   4.82   4.41  4.58  583300   4.346187
    81  PFRM3.SA 2020-04-29   4.62   4.75   4.50  4.74  391800   4.498019
    82  PFRM3.SA 2020-04-30   4.75   4.79   4.47  4.59  288300   4.355677
    83  PFRM3.SA 2020-05-04   4.41   4.63   4.33  4.63  288400   4.393635
    84  PFRM3.SA 2020-05-05   4.63   4.76   4.54  4.70  389700   4.460061
    85  PFRM3.SA 2020-05-06   4.69   5.26   4.57  5.22 1389800   4.953515
    86  PFRM3.SA 2020-05-07   5.35   5.37   4.61  4.82 1877800   4.573936
    87  PFRM3.SA 2020-05-08   4.93   4.98   4.50  4.72 1040300   4.479040
    88  PFRM3.SA 2020-05-11   4.70   4.74   4.40  4.49  532600   4.260782
    89  PFRM3.SA 2020-05-12   4.50   4.58   4.22  4.36  585100   4.137419
    90  PFRM3.SA 2020-05-13   4.40   4.40   4.13  4.19  456100   3.976097
    91  PFRM3.SA 2020-05-14   4.13   4.27   3.97  4.27  504600   4.052013
    92  PFRM3.SA 2020-05-15   4.24   4.36   4.13  4.19  255500   3.976097
    93  PFRM3.SA 2020-05-18   4.30   4.32   4.12  4.19  351000   3.976097
    94  PFRM3.SA 2020-05-19   4.19   4.30   4.14  4.18  335200   3.966608
    95  PFRM3.SA 2020-05-20   4.21   4.25   4.10  4.20  347500   3.985587
    96  PFRM3.SA 2020-05-21   4.16   4.23   4.10  4.20  234300   3.985587
    97  PFRM3.SA 2020-05-22   4.20   4.39   4.11  4.39  368500   4.165887
    98  PFRM3.SA 2020-05-25   4.42   4.66   4.42  4.64  680200   4.403124
    99  PFRM3.SA 2020-05-26   4.74   4.84   4.49  4.60  631300   4.365166
    100 PFRM3.SA 2020-05-27   4.60   4.74   4.55  4.66  422400   4.422103
    101 PFRM3.SA 2020-05-28   4.66   4.78   4.57  4.60  451200   4.365166
    102 PFRM3.SA 2020-05-29   4.65   4.65   4.44  4.56  387400   4.327209
    103 PFRM3.SA 2020-06-01   4.50   4.93   4.46  4.74  962300   4.498019
    104 PFRM3.SA 2020-06-02   4.77   5.05   4.77  5.00 1063000   4.744746
    105 PFRM3.SA 2020-06-03   5.05   5.31   4.86  4.86  924300   4.611893
    106 PFRM3.SA 2020-06-04   4.85   5.17   4.71  4.86  768600   4.611893
    107 PFRM3.SA 2020-06-05   4.94   5.12   4.85  4.85  661700   4.602404
    108 PFRM3.SA 2020-06-08   4.95   5.86   4.95  5.79 2498400   5.494416
    109 PFRM3.SA 2020-06-09   5.64   5.93   5.50  5.66 1288500   5.371052
    110 PFRM3.SA 2020-06-10   5.76   5.81   5.35  5.35 1140600   5.076878
    111 PFRM3.SA 2020-06-12   5.25   5.36   4.92  5.27  894600   5.000962
    112 PFRM3.SA 2020-06-15   5.06   5.30   5.00  5.22  618700   4.953515
    113 PFRM3.SA 2020-06-16   5.31   5.55   5.25  5.41  656000   5.133815
    114 PFRM3.SA 2020-06-17   5.41   5.61   5.37  5.59  597600   5.304626
    115 PFRM3.SA 2020-06-18   5.61   5.74   5.48  5.65  440600   5.361563
    116 PFRM3.SA 2020-06-19   5.71   5.74   5.47  5.50  558500   5.219221
    117 PFRM3.SA 2020-06-22   5.52   5.75   5.50  5.72  554600   5.427989
    118 PFRM3.SA 2020-06-23   5.75   6.25   5.74  6.00 1652300   5.693695
    119 PFRM3.SA 2020-06-24   6.00   6.15   5.77  5.99  830800   5.684206
    120 PFRM3.SA 2020-06-25   5.99   6.04   5.79  5.94  827100   5.636758
    121 PFRM3.SA 2020-06-26   5.99   6.03   5.59  5.59  767000   5.304626
    122 PFRM3.SA 2020-06-29   5.69   5.74   5.33  5.58  739600   5.295136
    123 PFRM3.SA 2020-06-30   5.60   5.70   5.45  5.65  457900   5.361563
    124 PFRM3.SA 2020-07-01   5.70   5.75   5.49  5.53  634200   5.247689
    125 PFRM3.SA 2020-07-02   5.61   5.64   5.41  5.42  448700   5.143305
     [ reached 'max' / getOption("max.print") -- omitted 369 rows ]
    

    基准测试:

    【讨论】:

    • 感谢您对基准的详细解释和出色的视觉效果。
    【解决方案2】:

    您只需要在replace 之前添加rowwise。本质上,在您的replace 语句中,close/100 为每一行返回一个新的计算值。实际上,您只想替换大于 100 的值。因此,使用 rowwise 时,它只会按行计算,而不是为 values 参数计算整个 cost 列。

    library(tidyverse)
    
    price %>%  
      rowwise %>% 
      mutate(close = replace(close, close > 100, close/100))
    

    输出

    # A tibble: 493 × 8
    # Rowwise: 
       symbol   date        open  high   low close  volume adjusted
       <chr>    <date>     <dbl> <dbl> <dbl> <dbl>   <dbl>    <dbl>
     1 PFRM3.SA 2020-01-02  6.59  6.67  6.49  6.6   549100     6.26
     2 PFRM3.SA 2020-01-03  6.5   6.99  6.4   6.96 1202000     6.60
     3 PFRM3.SA 2020-01-06  7.07  7.71  7.07  7.69 2692600     7.30
     4 PFRM3.SA 2020-01-07  7.42  7.5   7.05  7.05 2325900     6.69
     5 PFRM3.SA 2020-01-08  7.01  7.28  6.8   7.13 1588900     6.77
     6 PFRM3.SA 2020-01-09  7.21  7.22  6.98  7.12  638800     6.76
     7 PFRM3.SA 2020-01-10  7.12  7.23  6.96  6.96  658000     6.60
     8 PFRM3.SA 2020-01-13  7.01  7.17  6.96  7.1   586700     6.74
     9 PFRM3.SA 2020-01-14  7.12  7.15  6.95  7     663700     6.64
    10 PFRM3.SA 2020-01-15  7.02  7.39  7.02  7.27 1173000     6.90
    # … with 483 more rows
    

    另一个选项(由@DonaldSeinen 提供)是使用transform,我已将其添加到下面的基准测试中。

    transform(price, close = ifelse(close > 100, close / 100, close))
    

    基准测试

    所以,看起来transform(由@DonaldSeinen 提供)是迄今为止最快的。 @TarJae 提供了一个很好的视觉效果!

    library(microbenchmark)
    
    microbenchmark(
      replace = price %>%
        rowwise %>%
        mutate(close = replace(close, close > 100, close / 100)),
      case_when = price %>%
        mutate(close = case_when(close > 100 ~ close / 100, TRUE ~ close)),
      times = 1000L,
      control = list(order = 'block')
    )
    
    # Unit: microseconds
    #        expr      min        lq      mean   median       uq      max neval
    #     replace 4149.300 4997.8325 5785.2692 5469.816 5993.926 28077.15  1000
    #   case_when 1410.959 1514.8980 1906.9804 1662.030 2138.092 11863.49  1000
    # DonaldSeinen  247.421  271.7735  410.3955  315.499  472.813 11546.28  1000
    #      TarJae 1536.112 1755.8475 2254.4749 2086.729 2560.107 12802.07  1000
    

    【讨论】:

    • 如果速度是一个问题 - transform(price, close = ifelse(close &gt; 100, close / 100, close)) 可能更快
    • @DonaldSeinen Seinen。你说的对。我已经在我的答案中添加了情节!
    • @Andrew Gillreath-Brown。 OP 想知道为什么replace 在这种情况下不起作用price %&gt;% mutate(close = replace(close, close &gt; 100, close/100)) OP 已经有了price %&gt;% mutate(close = case_when(close &gt; 100 ~ close/100, TRUE ~ close)) 的有效答案??? :-)
    • @DonaldSeinen 太好了,谢谢!我将它添加到基准中;虽然我更喜欢 @TarJae 视觉!
    • 谢谢大家的解释。有趣的讨论和基准测试结果。添加rowwise() 解决了我的原始代码,这就是我想要做的,但仅索引replace values 更快。我猜是本地索引。
    猜你喜欢
    • 2022-12-03
    • 2021-07-03
    • 1970-01-01
    • 2011-12-25
    • 2017-04-28
    • 2019-09-11
    • 2020-02-18
    • 2017-09-25
    • 1970-01-01
    相关资源
    最近更新 更多