You can do this by "tabulating" each customer:
table(year2014$UniSA_Customer_No)
You can also cross-tabulate two variables, for example:
table(year2014$UniSA_Customer_No, year2014$Sale_Date)
In this case, however, I would recommend removing duplicates first (see this answer for details).
#select data from the year 2014
year2014 <- year2014[grep("^2014-", year2014$Sale_Date),]
#extract only columns to define duplicates
cust_date <- year2014[, c("UniSA_Customer_No", "Sale_Date")]
#detect duplicates
dup_rows <- duplicated(cust_date)
#subset to unique rows
year2014unique <- year2014[!dup_rows,]
#tabulate without duplicates (customers counted once per day)
table(year2014unique$UniSA_Customer_No, year2014unique$Sale_Date)
A simple example:
> unique(c(1, 2, 3, 1))
[1] 1 2 3
> table(c(1, 2, 3, 1))
1 2 3
2 1 1
No external packages are needed for any of this; table(), unique() and duplicated() are all part of base R.
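To see the effect of deduplicating before tabulating, here is a minimal sketch on made-up data. The column names UniSA_Customer_No and Sale_Date come from the question; the data frame `sales` and all of its values are invented for illustration, and Sale_Date is assumed to be stored as a "YYYY-MM-DD" character string.

```r
# Hypothetical toy data: column names follow the question, values are invented
sales <- data.frame(
  UniSA_Customer_No = c(101, 101, 102, 101, 103, 102),
  Sale_Date = c("2014-01-05", "2014-01-05", "2014-01-05",
                "2014-02-10", "2013-12-31", "2014-02-10"),
  stringsAsFactors = FALSE
)

# keep only rows from 2014 (assumes dates are "YYYY-MM-DD" strings)
year2014 <- sales[grep("^2014-", sales$Sale_Date), ]

# flag rows that repeat an earlier customer/date pair
dup_rows <- duplicated(year2014[, c("UniSA_Customer_No", "Sale_Date")])
year2014unique <- year2014[!dup_rows, ]

# rows per customer before deduplication
raw_counts <- table(year2014$UniSA_Customer_No)

# distinct purchase days per customer after deduplication
day_counts <- table(year2014unique$UniSA_Customer_No)

print(raw_counts)
print(day_counts)
```

With this toy data, customer 101 appears three times in raw_counts but only twice in day_counts, because the two rows for 2014-01-05 collapse into one.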