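[Posted]: 2023-12-14 18:31:01
[Question]:
I have a large data frame with an index column that assigns the same value to every row belonging to a particular entity. I want to run calculations keyed on this index column and add two new columns: one holding the number of days between each row's date and the first date recorded for that index value, and one holding a logical test of whether the value in another column matches the first value in that column for the same index value. I have been using dplyr and have the following script: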
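library(dplyr)

test <- InsiderList3 %>%
  group_by(`Insider CIK`) %>%
  # days between the first transaction date in each group and each row's date
  mutate(rf.diff = first(`Transaction Date`) - `Transaction Date`) %>%
  # TRUE when the row's issuer matches the first issuer in its group
  mutate(IssuerCheck = first(Issuer) == Issuer)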
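The column labeled "Insider CIK" is the index. The information in all other columns belongs to that index value until the next index value appears, and then the pattern repeats. There is also a date column and a column identifying the company.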
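To make the intended result concrete, here is a minimal sketch on a toy data frame. The toy values and the name `toy` are hypothetical and only stand in for InsiderList3; the function calls are written with an explicit `dplyr::` prefix so that it is unambiguous which package's `group_by`, `mutate`, and `first` are being used.

```r
library(dplyr)

# Toy stand-in for InsiderList3; all values are hypothetical.
toy <- data.frame(
  `Insider CIK` = c("A", "A", "A", "B", "B"),
  `Transaction Date` = as.Date(c("2020-01-10", "2020-01-05", "2020-01-01",
                                 "2020-03-10", "2020-03-03")),
  Issuer = c("X CORP", "Y CORP", "X CORP", "Z CORP", "Z CORP"),
  check.names = FALSE  # keep the spaces in the column names
)

result <- toy %>%
  dplyr::group_by(`Insider CIK`) %>%
  dplyr::mutate(
    # days between the first date in the group and each row's date
    rf.diff = as.numeric(dplyr::first(`Transaction Date`) - `Transaction Date`),
    # TRUE where the row's issuer equals the first issuer in the group
    IssuerCheck = dplyr::first(Issuer) == Issuer
  ) %>%
  dplyr::ungroup()
```

In this sketch the second group starts again from zero days, and the issuer check is evaluated against the first issuer of each group rather than the first issuer in the table.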
Sample input (first 75 rows of the three relevant columns):
dput(head(InsiderList3[c('Insider CIK', 'Transaction Date', 'Issuer')], 75))
structure(list(`Insider CIK` = c("0001008134", "0001008134",
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134",
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134",
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134",
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134",
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134",
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134",
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134",
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134",
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134",
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134",
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134",
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134",
"0001008134", "0001008134", "0001008134", "0001008134", "0001008134",
"0001008134", "0001008134", "0001008134", "0001009891", "0001012859",
"0001012859", "0001012859", "0001012859"), `Transaction Date` = structure(c(18358,
18358, 18101, 18065, 18065, 18039, 17729, 17700, 17674, 17674,
17345, 17345, 17326, 17014, 17014, 17014, 17014, 17014, 17014,
17001, 16964, 16964, 16598, 16590, 16582, 16582, 16409, 16288,
16288, 16245, 16245, 16217, 16161, 16072, 16052, 15967, 15880,
15869, 15771, 15710, 15710, 15687, 15603, 15523, 15354, 15354,
15030, 14979, 14840, 14049, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 18358, 18358,
18358, 18261), class = "Date"), Issuer = c("TRANSATLANTIC PETROLEUM LTD.",
"TRANSATLANTIC PETROLEUM LTD.", "TRANSATLANTIC PETROLEUM LTD.",
"TRANSATLANTIC PETROLEUM LTD.", "TRANSATLANTIC PETROLEUM LTD.",
"SANDRIDGE ENERGY INC", "SANDRIDGE ENERGY INC", "TRANSATLANTIC PETROLEUM LTD.",
"TRANSATLANTIC PETROLEUM LTD.", "TRANSATLANTIC PETROLEUM LTD.",
"TRANSATLANTIC PETROLEUM LTD.", "TRANSATLANTIC PETROLEUM LTD.",
"TRANSATLANTIC PETROLEUM LTD.", "Seventy Seven Energy Inc.",
"Seventy Seven Energy Inc.", "Seventy Seven Energy Inc.", "Seventy Seven Energy Inc.",
"Seventy Seven Energy Inc.", "Seventy Seven Energy Inc.", "TRANSATLANTIC PETROLEUM LTD.",
"TRANSATLANTIC PETROLEUM LTD.", "TRANSATLANTIC PETROLEUM LTD.",
"TRANSATLANTIC PETROLEUM LTD.", "Seventy Seven Energy Inc.",
"TRANSATLANTIC PETROLEUM LTD.", "TRANSATLANTIC PETROLEUM LTD.",
"Seventy Seven Energy Inc.", "Seventy Seven Energy Inc.", "Seventy Seven Energy Inc.",
"TRANSATLANTIC PETROLEUM LTD.", "TRANSATLANTIC PETROLEUM LTD.",
"TRANSATLANTIC PETROLEUM LTD.", "CHESAPEAKE ENERGY CORP", "CHESAPEAKE ENERGY CORP",
"CHESAPEAKE ENERGY CORP", "CHESAPEAKE ENERGY CORP", "TRANSATLANTIC PETROLEUM LTD.",
"CHESAPEAKE ENERGY CORP", "CHESAPEAKE ENERGY CORP", "TRANSATLANTIC PETROLEUM LTD.",
"TRANSATLANTIC PETROLEUM LTD.", "CHESAPEAKE ENERGY CORP", "CHESAPEAKE ENERGY CORP",
"CHESAPEAKE ENERGY CORP", "TRANSATLANTIC PETROLEUM LTD.", "TRANSATLANTIC PETROLEUM LTD.",
"TRANSATLANTIC PETROLEUM LTD.", "TRANSATLANTIC PETROLEUM LTD.",
"TRANSATLANTIC PETROLEUM LTD.", "QUEST RESOURCE CORP", "QUEST RESOURCE CORP",
"CHESAPEAKE ENERGY CORP", "CHESAPEAKE ENERGY CORP", "CHESAPEAKE ENERGY CORP",
"CHESAPEAKE ENERGY CORP", "CHESAPEAKE ENERGY CORP", "TRANSATLANTIC PETROLEUM LTD.",
"CHESAPEAKE ENERGY CORP", "Seventy Seven Energy Inc.", "CHESAPEAKE OILFIELD OPERATING LLC",
"TRANSATLANTIC PETROLEUM LTD.", "QUEST RESOURCE CORP", "CHESAPEAKE ENERGY CORP",
"CHESAPEAKE ENERGY CORP", "CVR ENERGY INC", "CHESAPEAKE ENERGY CORP",
"SANDRIDGE ENERGY INC", "TRANSATLANTIC PETROLEUM LTD.", "Seventy Seven Energy Inc.",
"CHESAPEAKE ENERGY CORP", NA, "NATIONAL HEALTHCARE CORP", "NATIONAL HEALTHCARE CORP",
"NATIONAL HEALTHCARE CORP", "NATIONAL HEALTHCARE CORP")), row.names = c(NA,
75L), class = "data.frame")
Thanks for your help.
[Comments]:
-
first(Issuer) = Issuer needs ==.
-
Could you post head(InsiderList3[c('Insider CIK', 'Transaction Date', 'Issuer')], 20)? That is only 3 columns and 20 rows.
-
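I have made the change, but unfortunately it still does not work. Each date difference should be computed relative to the first date for that index value, not the first date in the whole table, and that is not happening (I get some negative values). The logical test is likewise being evaluated against the first value in the table rather than the first value for each index value.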
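The symptoms described here are what one would see if the grouping were being ignored. As a way to check that, here is a minimal base R sketch that does not depend on dplyr at all. It computes the day difference twice on a hypothetical toy data frame: once against the first date in the whole table, which reproduces the negative values, and once against the first date within each "Insider CIK" using `ave()`. The toy values and variable names are assumptions for illustration only.

```r
# Toy stand-in for InsiderList3; all values are hypothetical.
toy <- data.frame(
  `Insider CIK` = c("A", "A", "A", "B", "B"),
  `Transaction Date` = as.Date(c("2020-01-10", "2020-01-05", "2020-01-01",
                                 "2020-03-10", "2020-03-03")),
  check.names = FALSE  # keep the spaces in the column names
)

cik  <- toy[["Insider CIK"]]
days <- as.numeric(toy[["Transaction Date"]])  # dates as day counts

# Grouping ignored: every row is compared with the first date in the table.
diff_overall <- days[1] - days

# Grouping respected: ave() picks the first date within each CIK.
diff_by_group <- ave(days, cik, FUN = function(x) x[1]) - days
```

If the per-group version gives the expected numbers on the real data while the dplyr pipeline does not, one possible cause (an assumption, not something confirmed in this thread) is that another attached package provides its own `mutate` that masks dplyr's and does not honor groups. Writing `dplyr::mutate` explicitly would rule that out.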
-
Thanks for your help.