What is the difference between a categorical column and a dense column? (Answers)

【Question title】: What is the difference between a Categorical Column and a Dense Column?
【Posted】: 2018-07-19 18:00:58
【Question】:

In TensorFlow, there are 9 different feature columns, divided into three groups: categorical, dense, and hybrid.

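From reading the guide, I understand that categorical columns are used to represent discrete input data with a numerical value. It gives an example of a categorical column called a categorical identity column: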

ID   Represented using one-hot encoding
0    [1, 0, 0, 0]
1    [0, 1, 0, 0]
2    [0, 0, 1, 0]
3    [0, 0, 0, 1]

But there is also a dense column called an indicator column, which "wraps" (?) a categorical column to produce something that looks almost identical:

Category (from category column)   Represented as...
0                                 [1, 0, 0, 0]
1                                 [0, 1, 0, 0]
2                                 [0, 0, 1, 0]
3                                 [0, 0, 0, 1]

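So both "categorical" and "dense" columns seem able to represent discrete data, and both can use one-hot encoding, so that cannot be what distinguishes them.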

My question is: in principle, what is the difference between a "categorical column" and a "dense column"?

【Comments】:

Tags: tensorflow machine-learning one-hot-encoding

【Solution 1】:

I ran into this same question shortly before finding an answer on DataScience StackExchange, where the original answer is posted.

If I understand correctly, the answer is simple: a categorical column does encode the data as one-hot, whereas an indicator column encodes it as multi-hot.
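To make the one-hot versus multi-hot distinction concrete, here is a minimal sketch in plain Python. It does not call the TensorFlow API; the helper names `one_hot` and `multi_hot` and the bucket count of 4 are assumptions chosen to match the tables in the question.

```python
def one_hot(category_id, num_buckets):
    """Return a vector with a single 1 at position category_id."""
    if not 0 <= category_id < num_buckets:
        raise ValueError("category_id out of range")
    vec = [0] * num_buckets
    vec[category_id] = 1
    return vec


def multi_hot(category_ids, num_buckets):
    """Return a vector with a 1 at every position listed in category_ids."""
    vec = [0] * num_buckets
    for category_id in category_ids:
        if not 0 <= category_id < num_buckets:
            raise ValueError("category_id out of range")
        vec[category_id] = 1
    return vec


# One example that belongs to exactly one category (hypothetical ID 2).
print(one_hot(2, 4))         # [0, 0, 1, 0]

# One example that belongs to two categories at once (hypothetical IDs 0 and 2).
print(multi_hot([0, 2], 4))  # [1, 0, 1, 0]
```

When each example carries only one category, the two encodings coincide, which would explain why the two tables in the question look identical.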

【Discussion】: