CNN on tfidf as input

【Question title】: CNN on tfidf as input
【Posted】: 2020-10-10 00:50:24
【Question】:

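I am working on fake news detection using a CNN, and I am new to coding CNNs in keras and tensorflow. I need help creating a CNN that takes statements as input, each encoded as a vector of length 100, and outputs 0 or 1 depending on whether the statement is predicted to be false or true.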

X_train.shape
# 10229, 100

X_train = np.expand_dims(X_train, axis=2)
X_train.shape
# 10229,100,1

# actual cnn model here
import tensorflow as tf
from tensorflow.keras import layers

# Conv1D + global max pooling

from keras.layers import Conv1D, MaxPooling1D, Embedding, Dropout, Flatten, Dense
from keras.layers import Input
text_len=100
from keras.models import Model

inp = Input(batch_shape=(None, text_len, 1))
conv2 = Conv1D(filters=128, kernel_size=5, activation='relu')(inp)
drop21 = Dropout(0.5)(conv2)
conv22 = Conv1D(filters=64, kernel_size=5, activation='relu')(drop21)
drop22 = Dropout(0.5)(conv22)
pool2 = MaxPooling1D(pool_size=2)(drop22)
flat2 = Flatten()(pool2)
out = Dense(1, activation='softmax')(flat2)

model = Model(inp, out)
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
model.fit(X_train, Y_train)

I would be really grateful if somebody could give me working code for this with a little explanation.

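【Comments on the question】:

Why do you need Conv1D? If you are using tfidf, a simple dense network is enough.
I assume that by dense network you mean an ANN, i.e. plain dense layers. But I am trying all kinds of CNNs, as a sort of research; I am not worried about whether it serves any useful purpose, I just want to see the results and compare them on a few factors. I only wrote tfidf because my 100-dimensional encoding of each statement contains floats, unlike one-hot encoding; it is actually derived from tfidf and some other similar computations that produce a 100-dimensional vector per statement. I am very new to this, and any help would be appreciated.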

Tags: machine-learning keras deep-learning nlp conv-neural-network

【Answer 1】:

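In this dummy example I use Conv1D with 2D features. Conv1D accepts input sequences in 3D format (n_samples, time_steps, features), so 2D features have to be reshaped to 3D. The usual choice is to keep the features as they are and simply add a time dimension (expand_dims on axis 1), since there is no reason to assume a positional or temporal pattern in tfidf/one-hot features.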
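
As a rough illustration of the two layouts, the sketch below compares `expand_dims` on axis 2 (as in the question) with axis 1 (as in this answer) on a small made-up array. The sample count of 8 and the helper `conv1d_output_length` are assumptions introduced only for this illustration; the helper applies the standard output-length rule for a stride-1, undilated 1D convolution and is not a Keras function.

```python
import numpy as np

# Small stand-in for the (10229, 100) feature matrix from the question.
X = np.random.uniform(0, 1, (8, 100))

# Layout used in the question: 100 time steps, 1 feature per step.
X_steps = np.expand_dims(X, axis=2)
# Layout used in this answer: 1 time step, 100 features.
X_feats = np.expand_dims(X, axis=1)

print(X_steps.shape)  # (8, 100, 1)
print(X_feats.shape)  # (8, 1, 100)


def conv1d_output_length(time_steps, kernel_size, padding):
    """Hypothetical helper: output length of a stride-1, undilated 1D convolution."""
    if padding == "same":
        return time_steps
    return time_steps - kernel_size + 1  # "valid" padding


same_len = conv1d_output_length(1, 5, "same")
valid_len = conv1d_output_length(1, 5, "valid")
print(same_len, valid_len)  # 1 -3
```

With a single time step, a kernel of size 5 only fits when `padding='same'` is used; the negative length under "valid" padding means the layer cannot be built for that input.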

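When you build the network you start with a 3D tensor and have to get down to 2D before the output layer. There are many ways to go from 3D to 2D; the simplest is flattening, and with only one time step a pooling layer is useless. If you use softmax as the last activation, remember to give the final dense layer as many units as you have classes.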

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv1D, Dropout, Flatten, Dense
from tensorflow.keras.models import Model

## define variable
n_sample = 10229
text_len = 100

## create dummy data
X_train = np.random.uniform(0,1, (n_sample,text_len))
y_train = np.random.randint(0,2, n_sample)

## expand train dimension: pass from 2d to 3d
X_train = np.expand_dims(X_train, axis=1)
print(X_train.shape, y_train.shape)

## create model
inp = Input(shape=(1,text_len))
conv2 = Conv1D(filters=128, kernel_size=5, activation='relu', padding='same')(inp)
drop21 = Dropout(0.5)(conv2)
conv22 = Conv1D(filters=64, kernel_size=5, activation='relu', padding='same')(drop21)
drop22 = Dropout(0.5)(conv22)
pool2 = Flatten()(drop22) # flattening is one option to pass from 3d to 2d
out = Dense(2, activation='softmax')(pool2) # with softmax, the output dim must equal the number of classes

model = Model(inp, out)
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
model.fit(X_train, y_train, epochs=5)
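
The point about matching the final dense layer to the number of classes can be checked numerically. The following is a minimal NumPy sketch, not Keras code: the `softmax` function and the logit values are made up for illustration. It shows that softmax over a single unit, as in `Dense(1, activation='softmax')`, always returns 1.0 regardless of the input, while softmax over two units gives a usable class distribution.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


# Made-up logits for 3 samples; the values are arbitrary.
one_unit = np.array([[-4.0], [0.3], [7.5]])                   # like Dense(1)
two_units = np.array([[2.0, -1.0], [0.0, 0.0], [-3.0, 1.0]])  # like Dense(2)

p_one = softmax(one_unit)
p_two = softmax(two_units)

print(p_one.ravel())          # every entry is 1.0
print(p_two.argmax(axis=-1))  # predicted class index per sample
```

A single output unit is more commonly paired with a sigmoid activation and a `binary_crossentropy` loss; that would be an alternative to the two-unit softmax used above, not what this answer does.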

【Comments】: