【Posted】: 2018-04-14 18:55:28
【Problem description】:
I have some transaction data that looks like this.
SHOP_ID, DATE, DAY, IN_TIME, OUT_TIME
shop007, 2017/5/20, mon, 05:03:38, 05:05:33
shop0010, 2017/4/13, sat, 08:53:42, 08:53:45
shop005, 2017/10/25, wed, 03:02:42, 03:04:15
shop001, 2017/10/5, sun, 19:09:37, 19:11:35
shop008, 2017/1/19, sat, 14:33:01, 14:35:00
shop004, 2017/3/13, sun, 02:16:06, 02:17:59
shop0010, 2016/7/4, thu, 10:25:54, 10:25:59
shop008, 2016/11/6, sat, 22:52:21, 22:53:49
shop004, 2016/11/13, tue, 08:30:51, 08:32:04
shop007, 2016/10/2, wed, 19:28:29, 19:29:48
shop006, 2017/9/25, mon, 01:11:19, 01:12:12
shop003, 2017/1/14, mon, 00:43:33, 00:43:53
shop009, 2017/7/7, fri, 16:35:52, 16:36:54
shop008, 2017/4/26, tue, 06:31:23, 06:33:10
shop007, 2016/3/19, fri, 04:46:34, 04:48:04
shop001, 2016/11/4, mon, 11:16:55, 11:18:22
shop001, 2017/8/31, sat, 07:07:25, 07:09:16
shop005, 2017/3/16, mon, 17:17:00, 17:18:47
shop001, 2017/4/23, fri, 04:35:37, 04:37:24
shop003, 2016/9/18, thu, 08:53:55, 08:55:35
shop001, 2016/1/12, sun, 10:25:43, 10:26:09
shop009, 2017/4/9, mon, 17:44:45, 17:45:54
shop004, 2017/7/1, mon, 01:23:14, 01:24:37
shop002, 2017/12/28, thu, 18:00:34, 18:00:50
shop009, 2016/4/6, tue, 00:48:25, 00:49:50
shop009, 2016/4/10, sat, 14:21:41, 14:22:19
shop001, 2016/5/16, wed, 15:07:17, 15:09:14
shop005, 2016/10/6, wed, 23:09:58, 23:10:07
shop009, 2016/5/6, tue, 09:39:47, 09:39:55
shop002, 2017/6/16, sat, 19:35:08, 19:35:53
shop005, 2017/5/26, wed, 10:08:24, 10:09:31
shop003, 2016/8/7, fri, 06:52:28, 06:52:54
shop006, 2017/5/5, thu, 17:28:06, 17:28:50
shop001, 2016/1/7, wed, 10:39:07, 10:39:24
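I want to build a time series model that uses the collected data to predict the number of customers for the current week, day, and hour.
The model I have in mind is ncustomers ~ time, where ncustomers is the total number of customers and time can be the week, the day, or the hour.
I don't know whether a linear regression model can be used here, because the independent variable is categorical and the dependent variable is continuous.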
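The ncustomers target is not a column in the raw data; it has to be derived by counting rows per time bucket. A minimal sketch of that step in Python, assuming each row is one customer visit and that IN_TIME marks when the visit is counted. The function names (count_customers, by_hour, by_date, by_week) are hypothetical and chosen only for illustration.

```python
from collections import Counter
from datetime import datetime

# A few rows copied from the sample data (SHOP_ID, DATE, DAY, IN_TIME, OUT_TIME).
ROWS = [
    ("shop007", "2017/5/20", "mon", "05:03:38", "05:05:33"),
    ("shop001", "2016/1/12", "sun", "10:25:43", "10:26:09"),
    ("shop001", "2016/1/7", "wed", "10:39:07", "10:39:24"),
    ("shop0010", "2016/7/4", "thu", "10:25:54", "10:25:59"),
]


def count_customers(rows, bucket):
    """Count rows per time bucket (assumption: one row = one customer)."""
    counts = Counter()
    for _shop, date, _day, in_time, _out_time in rows:
        ts = datetime.strptime(f"{date} {in_time}", "%Y/%m/%d %H:%M:%S")
        counts[bucket(ts)] += 1
    return counts


def by_hour(ts):
    """Hour of day, 0 to 23."""
    return ts.hour


def by_date(ts):
    """Calendar date."""
    return ts.date()


def by_week(ts):
    """(ISO year, ISO week number) pair."""
    return tuple(ts.isocalendar()[:2])


hourly = count_customers(ROWS, by_hour)
print(hourly[10])  # 3 of the 4 sample rows start in the 10:00 hour
```

The resulting counts per bucket are what would play the role of ncustomers, with the bucket key playing the role of time.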
【Discussion】:
-
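You can map the categorical variable to a set of binary variables, each of which simply indicates "X belongs to Y" (= 1) or "X does not belong to Y" (= 0). These binary variables can then be supplied as independent variables to many different models.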
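The binary-variable mapping described in this comment can be sketched in Python as follows. The function names (one_hot, fit_dummy_regression) are hypothetical, and the hourly counts are invented purely for illustration; they are not taken from the question's data. The fit assumes no intercept term, so every category keeps its own indicator column.

```python
def one_hot(values):
    """Map each categorical value to a row of 0/1 indicators, one per level."""
    levels = sorted(set(values))
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows


def fit_dummy_regression(rows, y):
    """Least-squares fit of y on the indicator columns, without an intercept.

    The indicator columns never overlap, so X'X is diagonal and each
    coefficient reduces to (sum of y in the category) / (rows in the category).
    """
    n_cols = len(rows[0])
    xty = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(n_cols)]
    xtx = [sum(r[j] for r in rows) for j in range(n_cols)]
    return [a / b for a, b in zip(xty, xtx)]


# Hypothetical hour-of-day labels and customer counts (illustration only).
hours = ["05", "10", "10", "19"]
ncustomers = [2, 7, 9, 4]

levels, X = one_hot(hours)
coefs = fit_dummy_regression(X, ncustomers)
print(dict(zip(levels, coefs)))  # {'05': 2.0, '10': 8.0, '19': 4.0}
```

With only indicator columns, the fitted coefficient for each category is the mean of the response in that category, so a linear regression on a categorical predictor is well defined. In R, passing a factor to lm() applies an equivalent dummy coding automatically, although by default it keeps an intercept and drops one level as the baseline.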
Tags: r machine-learning