Duplicating rows of a dataframe based on another vector in R [duplicate]: answers

【Question title】: Duplicating rows of a dataframe based on another vector in R [duplicate]
【Posted】: 2018-08-31 13:18:46
【Question description】:

Suppose I have the following data frame:

set.seed(1)
df <- data.frame("x" = 1:5, "y" = rnorm(5))

  x          y
1 1 -0.6264538
2 2  0.1836433
3 3 -0.8356286
4 4  1.5952808
5 5  0.3295078

I want to duplicate each row the number of times indicated in x, like this:

   x          y
1  1 -0.6264538
2  2  0.1836433
3  2  0.1836433
4  3 -0.8356286
5  3 -0.8356286
6  3 -0.8356286
7  4  1.5952808
8  4  1.5952808
9  4  1.5952808
10 4  1.5952808
11 5  0.3295078
12 5  0.3295078
13 5  0.3295078
14 5  0.3295078
15 5  0.3295078

How can I do this? I would prefer a tidyverse solution, but I am open to any other suggestions.

【Question discussion】:

Tags: r

【Solution 1】:

We can use rep to repeat the row indices of the data frame, with the times argument giving the number of times each row is repeated.

df[rep(1:nrow(df), times = df$x), ]
    x          y
1   1 -0.6264538
2   2  0.1836433
2.1 2  0.1836433
3   3 -0.8356286
3.1 3 -0.8356286
3.2 3 -0.8356286
4   4  1.5952808
4.1 4  1.5952808
4.2 4  1.5952808
4.3 4  1.5952808
5   5  0.3295078
5.1 5  0.3295078
5.2 5  0.3295078
5.3 5  0.3295078
5.4 5  0.3295078
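
To make the indexing step explicit, here is a minimal sketch that builds the index vector separately before subsetting. The names idx and out are only illustrative, and resetting the row names is an optional extra step (not part of the answer above) for anyone who wants the 1..15 numbering shown in the question.

```r
set.seed(1)
df <- data.frame(x = 1:5, y = rnorm(5))

# Row index i is repeated df$x[i] times:
# 1 2 2 3 3 3 4 4 4 4 5 5 5 5 5
idx <- rep(seq_len(nrow(df)), times = df$x)

# Subsetting with repeated indices duplicates the rows
out <- df[idx, ]

# Optional: drop the 2.1, 3.1, ... labels so rows are numbered 1..15
rownames(out) <- NULL
```

seq_len(nrow(df)) is used here instead of 1:nrow(df) because it yields an empty index for a zero-row data frame, whereas 1:0 would count down to c(1, 0).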

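【Discussion】:

I clearly need to go out for a walk, since I couldn't see how simple the problem was. Thanks.
This is actually very similar to a question I asked years ago, "How to repeat a data frame?", and I felt the same way when I got the answer.
Another option is expandRows(df, 'x', drop = FALSE) from the splitstackshape package.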

【Solution 2】:

Using dplyr:

dplyr::slice(df, rep(1:n(), x))                # as per Sir Gregor's recommendation

or, explicitly:

dplyr::slice(df,rep(1:nrow(df), df$x))
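
As a self-contained sketch of the dplyr approach, the snippet below attaches dplyr first so that n() is certain to be found, and uses seq_len() in place of 1: as a small defensive change. The variable name out is illustrative only.

```r
library(dplyr)

set.seed(1)
df <- data.frame(x = 1:5, y = rnorm(5))

# Inside slice(), n() is the number of rows and x refers to the column df$x,
# so row i is selected x[i] times
out <- slice(df, rep(seq_len(n()), x))

nrow(out)  # 15 rows in total (1 + 2 + 3 + 4 + 5)
```

Unlike the base subsetting in Solution 1, slice() returns rows numbered consecutively rather than labelled 2.1, 3.1 and so on.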

【Discussion】:

Oh, but if you're going to use dplyr, use the cool n() instead of lame old nrow(). It saves three characters of typing! ;)

【Solution 3】:

with(df,df[rep(1:nrow(df),x),])
    x          y
1   1 -0.6264538
2   2  0.1836433
2.1 2  0.1836433
3   3 -0.8356286
3.1 3 -0.8356286
3.2 3 -0.8356286
4   4  1.5952808
4.1 4  1.5952808
4.2 4  1.5952808
4.3 4  1.5952808
5   5  0.3295078
5.1 5  0.3295078
5.2 5  0.3295078
5.3 5  0.3295078
5.4 5  0.3295078

【Discussion】:

【Solution 4】:

df[ rep(seq_len(nrow(df)), df$x), ]

    x           y
1   1 -1.31142059
2   2 -0.09652492
2.1 2 -0.09652492
3   3  2.36971991
3.1 3  2.36971991
3.2 3  2.36971991
4   4  0.89062648
4.1 4  0.89062648
4.2 4  0.89062648
4.3 4  0.89062648
5   5 -0.25218316
5.1 5 -0.25218316
5.2 5 -0.25218316
5.3 5 -0.25218316
5.4 5 -0.25218316

Looks like several of us got there at the same time...

【Discussion】:

【Solution 5】:

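I recently discovered that uncount() also works. Note that it lives in tidyr, not dplyr, and that it drops the count column unless .remove = FALSE is given:

tidyr::uncount(df, x, .remove = FALSE)         # keep x so the result matches the desired output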
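
A short sketch of how the .remove argument changes the result, assuming the tidyr package is installed (uncount() is exported by tidyr). The names dropped and kept are illustrative only.

```r
library(tidyr)

set.seed(1)
df <- data.frame(x = 1:5, y = rnorm(5))

# Default (.remove = TRUE): the count column x is dropped after expanding
dropped <- uncount(df, x)

# .remove = FALSE keeps x, which matches the output asked for in the question
kept <- uncount(df, x, .remove = FALSE)

names(dropped)  # only "y"
names(kept)     # "x" and "y"
```

Both results have 15 rows; they differ only in whether the x column survives.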

【Discussion】: