[Posted]: 2016-10-21 17:06:44
[Question]:
Data:
DB <- data.frame(orderID = c(1,2,3,4,4,5,6,6,7,8),
orderDate = c("1.1.12","1.1.12","1.1.12","13.1.12","13.1.12","12.1.12","10.1.12","10.1.12","21.1.12","24.1.12"),
itemID = c(2,3,2,5,12,4,2,3,1,5),
customerID = c(1, 2, 3, 1, 1, 3, 2, 2, 1, 1),
itemPrice = c(9.99, 14.99, 9.99, 19.99, 29.99, 4.99, 9.99, 14.99, 49.99, 19.99),
orderItemStatus = c("sold", "sold", "sold", "refunded", "sold", "refunded", "sold", "refunded", "sold", "refunded"))
Expected result:
DB <- data.frame(orderID = c(1,2,3,4,6,7),
orderDate = c("1.1.12","1.1.12","1.1.12","13.1.12","10.1.12","21.1.12"),
itemID = c(2,3,2,12,2,1),
customerID = c(1, 2, 3, 1, 2, 1),
itemPrice = c(9.99, 14.99, 9.99, 29.99, 9.99, 49.99),
orderItemStatus = c("sold", "sold", "sold", "sold", "sold", "sold"))
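For context:
orderID is consecutive. Products ordered by the same customerID on the same day get the same orderID. When the same customer orders products on another day, he/she gets a new orderID.
I want to remove every row where orderItemStatus = "refunded". How can I do that? (I think this is simple, and I found "Removing specific rows from a dataframe", but I don't understand how it works, so please help me :()
-> The original data has about 500k rows, so please suggest a solution that is cheap in terms of performance...
Thanks a lot for your support!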
[Comments]:
- Try
DB <- DB[DB$orderItemStatus != "refunded", ]
- That worked! Thanks!
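Since the question asks how this kind of row removal works, here is a minimal sketch of the logical indexing behind the one-liner, using a cut-down sample modeled on the question's data. The names keep, kept and kept_na_safe are illustrative only, and the NA handling is an assumption about what real data might contain, not something the question states. Both steps are vectorized base R, so they should stay cheap at around 500k rows.

```r
# Small sample modeled on the question's data (only the columns needed here).
DB <- data.frame(
  orderID = c(1, 2, 3, 4, 4, 5),
  itemID = c(2, 3, 2, 5, 12, 4),
  orderItemStatus = c("sold", "sold", "sold", "refunded", "sold", "refunded"),
  stringsAsFactors = FALSE
)

# Step 1: the comparison yields one TRUE/FALSE value per row.
keep <- DB$orderItemStatus != "refunded"

# Step 2: a logical vector in the row position keeps only the TRUE rows;
# leaving the column position (after the comma) empty keeps all columns.
kept <- DB[keep, ]

# Assumption: if the status column can contain NA, the comparison gives NA
# for those rows and `[` would return them as all-NA rows. Wrapping the
# condition in which() drops the NA positions instead.
kept_na_safe <- DB[which(DB$orderItemStatus != "refunded"), ]
```

On this sample, order 4 keeps its sold item (itemID 12) and loses only the refunded one, which matches the expected result in the question.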
Tags: r