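[Posted]: 2021-07-06 08:27:24
[Question]:
I'm trying to use pivot_longer() to turn a wide table into a long one, but I can't quite figure it out.
Here is the head of the data frame I'm working with: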
head(stack)
  unique.pair               Area.IN Area.NEAR ALLEVEN.IN ALLEVEN.NEAR TREERICH.IN TREERICH.NEAR HEMIAB.IN HEMIAB.NEAR
1 AGFO 1_AGFO 5                 100       100  0.7309552    0.3724176           2             1      1.00           0
2 AGFO 27_AGFO 24               100       100  0.8990520    0.6306221           1             0      1.00           0
3 AGFO 6_AGFO 23                100       100  0.7956735    0.7022392           1             1      1.00           0
4 ALFL LAMR.7_ALFL LAMR.103     100       400  0.4425270    0.6838157           4             6      0.50           0
5 APCO 10_APCO 2                400       400  0.5730378    0.5453876          18            19      0.55           0
6 APCO 4_APCO 9                 400       400  0.6349441    0.7078960          22            23      0.55           0
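Basically, each row is a unique pair of two IDs together with their measurements for several metrics (.IN and .NEAR). I need to reshape this so that each unique pair has two rows, with the metrics split between them. I've had some success doing this for ALLEVEN.IN and ALLEVEN.NEAR, for example, and I also need the Area metrics: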
master.long <- master.JH %>%
  select(unique.pair, ALLEVEN.IN, ALLEVEN.NEAR, HEMIAB.IN, HEMIAB.NEAR, Area.IN, Area.NEAR) %>%
  pivot_longer(cols = c(ALLEVEN.IN, ALLEVEN.NEAR), names_to = "HEMI", values_to = "ALLEVEN") %>%
  pivot_longer(cols = c(Area.IN, Area.NEAR), names_to = "Area", values_to = "Area_sampled") %>%
  separate(HEMI, into = c(NA, "HEMI")) %>%
  separate(Area, into = c(NA, "AREA")) %>%
  mutate(HEMI.status = case_when(HEMI == "IN" & AREA == "IN" ~ "HEMI",
                                 HEMI == "NEAR" & AREA == "NEAR" ~ "NO.HEMI"))
The output is:
# A tibble: 6 x 8
  unique.pair     HEMIAB.IN HEMIAB.NEAR HEMI  ALLEVEN AREA  Area_sampled HEMI.status
  <chr>               <dbl>       <dbl> <chr>   <dbl> <chr>        <dbl> <chr>
1 AGFO 6_AGFO 23          1           0 IN      0.796 IN             100 HEMI
2 AGFO 6_AGFO 23          1           0 IN      0.796 NEAR           100 NA
3 AGFO 6_AGFO 23          1           0 NEAR    0.702 IN             100 NA
4 AGFO 6_AGFO 23          1           0 NEAR    0.702 NEAR           100 NO.HEMI
5 AGFO 27_AGFO 24         1           0 IN      0.899 IN             100 HEMI
6 AGFO 27_AGFO 24         1           0 IN      0.899 NEAR           100 NA
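Two questions:
1.) I understand why HEMI.status has NAs, but I'm not sure how to tell the code to drop just those rows. I could easily do that afterwards, but I'd like to know whether it can be done within pivot_longer().
2.) Is there a way to do this for all the columns with a single pivot_longer() call? That is, can I fold TREERICH.IN and TREERICH.NEAR in as well, using the same HEMI column? I tried, but when I set names_to = "HEMI" for TREERICH (see below) I get an obvious error: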
master.long <- master.JH %>%
  select(unique.pair, ALLEVEN.IN, ALLEVEN.NEAR, HEMIAB.IN, HEMIAB.NEAR, Area.IN, Area.NEAR) %>%
  pivot_longer(cols = c(ALLEVEN.IN, ALLEVEN.NEAR), names_to = "HEMI", values_to = "ALLEVEN") %>%
  pivot_longer(cols = c(TREERICH.IN, TREERICH.NEAR), names_to = "HEMI", values_to = "TREERICH")
  pivot_longer(cols = c(Area.IN, Area.NEAR), names_to = "Area", values_to = "Area_sampled") %>%
  separate(HEMI, into = c(NA, "HEMI")) %>%
  separate(Area, into = c(NA, "AREA")) %>%
  mutate(HEMI.status = case_when(HEMI == "IN" & AREA == "IN" ~ "HEMI",
                                 HEMI == "NEAR" & AREA == "NEAR" ~ "NO.HEMI"))
I hope I've explained this well enough. Thanks for your help!
[Comments]:
- Does this answer your question? Gather multiple sets of columns
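For what it's worth, both questions might be handled by a single pivot_longer() call that uses the special ".value" name in names_to. This is only a sketch, not a tested answer for the full data: it assumes every metric column is named METRIC.IN or METRIC.NEAR with exactly one dot, and the `toy` data frame below is a hypothetical stand-in built from the first two rows of head(stack). Because all the pairs are pivoted together, the IN/NEAR cross product never appears, so there should be no NA rows to drop.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical toy data: the first two rows of head(stack) from the question.
toy <- tribble(
  ~unique.pair,      ~Area.IN, ~Area.NEAR, ~ALLEVEN.IN, ~ALLEVEN.NEAR,
  ~TREERICH.IN, ~TREERICH.NEAR, ~HEMIAB.IN, ~HEMIAB.NEAR,
  "AGFO 1_AGFO 5",   100, 100, 0.7309552, 0.3724176, 2, 1, 1.00, 0,
  "AGFO 27_AGFO 24", 100, 100, 0.8990520, 0.6306221, 1, 0, 1.00, 0
)

toy.long <- toy %>%
  pivot_longer(
    cols      = -unique.pair,
    # ".value" keeps the part before the dot as the output column name;
    # the part after the dot (IN / NEAR) goes into the HEMI column.
    names_to  = c(".value", "HEMI"),
    names_sep = "\\."
  ) %>%
  # Same labels as the case_when() in the question, one per row.
  mutate(HEMI.status = if_else(HEMI == "IN", "HEMI", "NO.HEMI"))

print(toy.long)
```

Each unique.pair should end up with two rows (IN and NEAR) and one column per metric (Area, ALLEVEN, TREERICH, HEMIAB). To keep some columns wide, as HEMIAB.IN and HEMIAB.NEAR are in the original attempt, one option is to narrow `cols`, for example to `matches("^(Area|ALLEVEN|TREERICH)\\.")`.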