【Posted】: 2021-12-02 02:15:02
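【Question】:
I am struggling to work out how to calculate, for each group, the difference between the first and the last value when ordered by date. Here is a toy example: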
test1 = data.frame(my_groups = c("A", "A", "A", "B", "B", "B", "C", "C", "C", "A", "A", "A"),
                   measure = c(10, 20, 5, 64, 2, 62, 2, 5, 1, 6, 7, 105),
                   #distance = c(),
                   time = as.Date(c("20-09-2020", "25-09-2020", "19-09-2020", "20-05-2020", "20-05-2020", "20-06-2021",
                                    "11-01-2021", "13-01-2021", "13-01-2021", "15-01-2021", "15-01-2021", "19-01-2021"),
                                  format = "%d-%m-%Y"))
# test1 %>% arrange(my_groups, time)
#    my_groups measure       time
# 1          A       5 2020-09-19
# 2          A      10 2020-09-20
# 3          A      20 2020-09-25
# 4          A       6 2021-01-15
# 5          A       7 2021-01-15
# 6          A     105 2021-01-19
# 7          B      64 2020-05-20
# 8          B       2 2020-05-20
# 9          B      62 2021-06-20
# 10         C       2 2021-01-11
# 11         C       5 2021-01-13
# 12         C       1 2021-01-13
# desired result
#   my_groups diff
# 1         A  100 (105 - 5)
# 2         B    2 (64 - 62)
# 3         C    1 (1 - 2)
The expressions in parentheses in the desired result are only there to show where diff comes from.
Any hints on how I could do this?
【Comments】:
-
Shouldn't A be 105 - 5 = 100?
-
And C: 1 - 2 = -1?
-
Also, how would you handle the ties in C?
-
For ties, I would take the minimum measure. To do that, I thought about arranging the data frame first.
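
One possible way to turn the tie rule from the last comment into code is sketched below with dplyr. It is only a sketch under stated assumptions: the data are rebuilt from the arranged table printed in the question, "first" and "last" are taken to mean the earliest and latest date within each group, and the minimum measure is used whenever several rows share that date. The column names first_val and last_val are introduced here for illustration only. Under these assumptions the result is 100 for A, 60 for B and -1 for C, so B and C do not match the desired result listed in the question (C agrees with the value suggested in the comments).

```r
library(dplyr)

# Values copied from the arranged table printed in the question.
test1 <- data.frame(
  my_groups = rep(c("A", "B", "C"), times = c(6, 3, 3)),
  measure   = c(5, 10, 20, 6, 7, 105, 64, 2, 62, 2, 5, 1),
  time      = as.Date(c("2020-09-19", "2020-09-20", "2020-09-25",
                        "2021-01-15", "2021-01-15", "2021-01-19",
                        "2020-05-20", "2020-05-20", "2021-06-20",
                        "2021-01-11", "2021-01-13", "2021-01-13"))
)

result <- test1 %>%
  group_by(my_groups) %>%
  summarise(
    # Assumption: ties on the earliest/latest date are resolved
    # by taking the smallest measure recorded on that date.
    first_val = min(measure[time == min(time)]),
    last_val  = min(measure[time == max(time)]),
    diff      = last_val - first_val
  )

print(result)
```

Filtering on min(time) and max(time) inside summarise() avoids relying on row order, so no prior arrange() is needed; sorting by measure and then calling first() and last() would pick the largest tied value on the last date rather than the smallest.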