R+dplyr: conditionally swap the elements of two columns

【Question title】: R+dplyr: conditionally swap the elements of two columns
【Posted】: 2021-11-26 09:52:23
【Question】:

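Consider the data frame df at the end of this post. I simply want to swap the elements of columns x and y wherever x > y.

The data frame may contain other columns that I do not want to touch.

In a sense, I want to sort columns x and y row by row.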

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union



df<-tibble(x=1:10, y=10:1, extra=LETTERS[1:10])
   

df
#> # A tibble: 10 × 3
#>        x     y extra
#>    <int> <int> <chr>
#>  1     1    10 A    
#>  2     2     9 B    
#>  3     3     8 C    
#>  4     4     7 D    
#>  5     5     6 E    
#>  6     6     5 F    
#>  7     7     4 G    
#>  8     8     3 H    
#>  9     9     2 I    
#> 10    10     1 J

Created on 2021-10-06 by the reprex package (v2.0.1)

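【Comments】:

I may be completely wrong, but if the goal is to "swap the elements of columns x and y wherever x > y", then this seems to do the job: df %>% mutate(x1 = ifelse(x > y, y, x))
Sure. One could always do something along those lines, do the same for another variable y2, then drop x and y and rename x2 ---> x and y2 ---> y, but is there a more elegant way?
Why is the rename needed? Using @ChrisRuehlemann's approach: df %>% mutate(x = ifelse(x > y, y, x))?
Because I also need to change the values of y.
@larry77 I see. Please see the solution below, which might work.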
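
As a sketch of the idea discussed in these comments: because both columns have to change, one option is to compute the row-wise minimum and maximum with the vectorised base functions pmin and pmax in a single mutate call. The helper column name lo is made up for this illustration, and the sketch assumes x and y contain no missing values that need special treatment (pmin and pmax return NA for such rows by default).

```r
library(dplyr)

df <- tibble(x = 1:10, y = 10:1, extra = LETTERS[1:10])

# mutate() evaluates its arguments in order, so the smaller value is stored
# in a helper column (hypothetical name `lo`) before y is overwritten.
swapped <- df %>%
  mutate(
    lo = pmin(x, y),  # row-wise smaller value
    y  = pmax(x, y),  # row-wise larger value (x is still the original here)
    x  = lo
  ) %>%
  select(-lo)         # drop the helper column again

swapped
```

Since pmin and pmax work on whole columns, no rowwise() grouping is needed, and the other columns keep their position.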

Tags: r sorting dplyr rowwise

【Solution 1】:

This looks like sorting to me:

library(tidyverse)
df <- tibble(x=1:10, y=10:1, extra=LETTERS[1:10])
df
#> # A tibble: 10 x 3
#>        x     y extra
#>    <int> <int> <chr>
#>  1     1    10 A    
#>  2     2     9 B    
#>  3     3     8 C    
#>  4     4     7 D    
#>  5     5     6 E    
#>  6     6     5 F    
#>  7     7     4 G    
#>  8     8     3 H    
#>  9     9     2 I    
#> 10    10     1 J

extra_cols <- df %>% colnames() %>% setdiff(c("x", "y"))
extra_cols
#> [1] "extra"

df %>%
  mutate(row = row_number()) %>%
  pivot_longer(-c(row, extra_cols)) %>%
  group_by_at(c("row", extra_cols)) %>%
  transmute(
    value = value %>% sort(),
    name = c("x", "y"),
  ) %>%
  pivot_wider() %>%
  ungroup() %>%
  select(-row)
#> Note: Using an external vector in selections is ambiguous.
#> ℹ Use `all_of(extra_cols)` instead of `extra_cols` to silence this message.
#> ℹ See <https://tidyselect.r-lib.org/reference/faq-external-vector.html>.
#> This message is displayed once per session.
#> # A tibble: 10 x 3
#>    extra     x     y
#>    <chr> <int> <int>
#>  1 A         1    10
#>  2 B         2     9
#>  3 C         3     8
#>  4 D         4     7
#>  5 E         5     6
#>  6 F         5     6
#>  7 G         4     7
#>  8 H         3     8
#>  9 I         2     9
#> 10 J         1    10

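Created on 2021-10-06 by the reprex package (v2.0.1)

【Comments】:

Sorry, I have improved my example. Your solution does not work with the revised reprex.
@larry77 I have revised my example.

【Solution 2】:

Try using apply over the rows (margin 1), transposing the result with t, and then converting it to a tibble with as_tibble.

Finally, rename the columns: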

> df <- as_tibble(t(apply(df[c("x", "y")], 1, sort)))  # sort only x and y within each row
> names(df) <- c('x', 'y')
> df
# A tibble: 10 x 2
       x     y
   <int> <int>
 1     1    10
 2     2     9
 3     3     8
 4     4     7
 5     5     6
 6     5     6
 7     4     7
 8     3     8
 9     2     9
10     1    10
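
The apply-based idea above can also be written so that the untouched columns stay in place. The following is only a sketch under a few assumptions: it uses a plain data.frame instead of a tibble to avoid extra dependencies, the variable names cols and sorted are illustrative, and the columns being sorted are assumed to share one type and contain no missing values.

```r
df <- data.frame(x = 1:10, y = 10:1, extra = LETTERS[1:10])

cols <- c("x", "y")

# apply() returns one column per input row, so transpose to get rows back.
sorted <- t(apply(as.matrix(df[cols]), 1, sort))

# Write the sorted values back column by column; other columns are untouched.
for (j in seq_along(cols)) {
  df[[cols[j]]] <- unname(sorted[, j])
}

df
```

Restricting as.matrix to the numeric columns matters here: including a character column such as extra would coerce the whole matrix to character before sorting.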

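【Comments】:

【Solution 3】:

Thanks, everyone!

I wrote a small function that does what I need and generalizes to the case of several variables. See the reprex: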

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

set.seed(1234)

set_colnames <- `colnames<-`

df<-tibble(x=1:10, y=10:1, z=rnorm(10), extra=LETTERS[1:10]) %>%
    rowwise() 

df
#> # A tibble: 10 × 4
#> # Rowwise: 
#>        x     y      z extra
#>    <int> <int>  <dbl> <chr>
#>  1     1    10 -1.21  A    
#>  2     2     9  0.277 B    
#>  3     3     8  1.08  C    
#>  4     4     7 -2.35  D    
#>  5     5     6  0.429 E    
#>  6     6     5  0.506 F    
#>  7     7     4 -0.575 G    
#>  8     8     3 -0.547 H    
#>  9     9     2 -0.564 I    
#> 10    10     1 -0.890 J


sort_rows <- function(df, col_names, dec=F){

    temp <- df %>%
        select(all_of(col_names))

    extra_names <- setdiff(colnames(df), col_names)

    temp2 <- df %>%
        select(all_of(extra_names))
    

    res <- t(apply(temp, 1, sort, decreasing=dec)) %>%
        as_tibble %>%
        set_colnames(col_names) %>%
        bind_cols(temp2)

    return(res)
    


}



col_names <- c("x", "y", "z")

df_s <- df %>%
    sort_rows(col_names, dec=F)
#> Warning: The `x` argument of `as_tibble.matrix()` must have unique column names if `.name_repair` is omitted as of tibble 2.0.0.
#> Using compatibility `.name_repair`.


df_s
#> # A tibble: 10 × 4
#>         x     y     z extra
#>     <dbl> <dbl> <dbl> <chr>
#>  1 -1.21      1    10 A    
#>  2  0.277     2     9 B    
#>  3  1.08      3     8 C    
#>  4 -2.35      4     7 D    
#>  5  0.429     5     6 E    
#>  6  0.506     5     6 F    
#>  7 -0.575     4     7 G    
#>  8 -0.547     3     8 H    
#>  9 -0.564     2     9 I    
#> 10 -0.890     1    10 J

Created on 2021-10-06 by the reprex package (v2.0.1)
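
A possible variant of the function above, offered only as a sketch: naming the matrix columns before calling as_tibble should avoid the `.name_repair` warning, and a final select can restore the original column order. The function name sort_rows2 and the variable orig_order are made up for this illustration; as in the original, apply coerces the selected columns to a common type, so integer columns come back as doubles when mixed with a double column.

```r
library(dplyr)

sort_rows2 <- function(df, col_names, dec = FALSE) {
  orig_order <- colnames(df)

  # Sort each row of the selected columns; apply() works on a matrix copy.
  sorted <- t(apply(as.matrix(df[col_names]), 1, sort, decreasing = dec))

  # Name the matrix columns up front so as_tibble() receives unique names.
  colnames(sorted) <- col_names

  df %>%
    ungroup() %>%                     # also turns a rowwise input into a plain tibble
    select(-all_of(col_names)) %>%    # keep only the untouched columns
    bind_cols(as_tibble(sorted)) %>%  # attach the sorted columns
    select(all_of(orig_order))        # restore the original column order
}

set.seed(1234)
df <- tibble(x = 1:10, y = 10:1, z = rnorm(10), extra = LETTERS[1:10])

res <- sort_rows2(df, c("x", "y", "z"))
res
```

After the call, each row satisfies x <= y <= z, while extra is unchanged and stays in the last position.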

【Comments】:

【Solution 4】:

A base R solution:

Use which(df$x > df$y) to find the row numbers that need to change, then use rev to swap the values in those rows:

df[which(df$x > df$y), c("x", "y")] <- rev(df[which(df$x > df$y), c("x", "y")])
df
#        x     y extra
#    <int> <int> <chr>
#  1     1    10 A    
#  2     2     9 B    
#  3     3     8 C    
#  4     4     7 D    
#  5     5     6 E    
#  6     5     6 F    
#  7     4     7 G    
#  8     3     8 H    
#  9     2     9 I    
# 10     1    10 J
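
The same base R idea can be sketched without rev by reading the two columns in reversed order and assigning them back positionally. The small data frame below is invented for illustration and includes a missing value to show that which() skips rows where the comparison is NA; the variable name swap is likewise illustrative.

```r
df <- data.frame(
  x = c(1L, 7L, NA, 4L),
  y = c(10L, 4L, 2L, 4L),
  extra = c("A", "B", "C", "D")
)

# which() drops rows where x > y evaluates to NA, so those rows are left as-is.
swap <- which(df$x > df$y)

# Columns on the right are taken in the order y, x and assigned to x, y.
df[swap, c("x", "y")] <- df[swap, c("y", "x")]

df
```

Only the second row is swapped here; rows with ties or missing values keep their original values.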

【Comments】: