Matrix factorization based recommendation using TensorFlow

【Question title】: Matrix factorization based recommendation using TensorFlow
【Posted】: 2018-04-20 10:07:06
【Question】:

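I am new to TensorFlow and am exploring recommender systems built with it. I have gone through several sample projects on GitHub, and most of them look essentially like the following: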

https://github.com/songgc/TF-recomm/blob/master/svd_train_val.py

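But the question is: how do I pick the top recommendations for user U1 in the code above?

If there is any sample code or approach, please share. Thanks.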

【Question comments】:

Tags: python tensorflow deep-learning recommendation-engine matrix-factorization

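【Solution 1】:

This is a little tricky! Basically, when svd returns, the session is closed and the tensors lose their values (you still keep the graph). There are a few options: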

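1. Save the model to a file and restore it later;
2. Don't put the session inside a with tf.Session() as sess: .... block; return the session instead;
3. Do the per-user processing inside the with ... block.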
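The first option (save now, restore later) can be sketched without any TensorFlow at all. The snippet below is only an illustration of the idea, not the TF-recomm code: the arrays user_factors and item_factors are hypothetical stand-ins for the trained embeddings, and NumPy's savez/load stands in for a real checkpoint. In TensorFlow 1.x the usual tool for this would be tf.train.Saver, which saves and restores the variables of a session.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-ins for trained parameters; in the real model these
# values would be read out of the session before it closes.
rng = np.random.default_rng(0)
user_factors = rng.normal(size=(5, 3))   # 5 users, 3 latent dimensions
item_factors = rng.normal(size=(7, 3))   # 7 items, 3 latent dimensions

# Training side: persist the parameters to a file.
path = os.path.join(tempfile.mkdtemp(), "svd_params.npz")
np.savez(path, user_factors=user_factors, item_factors=item_factors)

# Prediction side: reload the parameters later; no session is needed.
with np.load(path) as params:
    restored_users = params["user_factors"]
    restored_items = params["item_factors"]

# Score every item for user 1 with a plain dot product (bias terms omitted).
scores = restored_items @ restored_users[1]
```

The point of the split is that the prediction side only depends on the saved file, so it can run in a different process from training.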

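The worst option is option 3: you should keep training your model separate from using it. The best approach is to save the model and weights somewhere and then restore the session. However, that still leaves the question of how to use the session object once it has been restored. To demonstrate that part, I will solve this problem using option 3, assuming you know how to restore a session.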

def svd(train, test):
    samples_per_batch = len(train) // BATCH_SIZE

    iter_train = dataio.ShuffleIterator([train["user"],
                                     train["item"],
                                     train["rate"]],
                                    batch_size=BATCH_SIZE)

    iter_test = dataio.OneEpochIterator([test["user"],
                                     test["item"],
                                     test["rate"]],
                                    batch_size=-1)

    user_batch = tf.placeholder(tf.int32, shape=[None], name="id_user")
    item_batch = tf.placeholder(tf.int32, shape=[None], name="id_item")
    rate_batch = tf.placeholder(tf.float32, shape=[None])

    infer, regularizer = ops.inference_svd(user_batch, item_batch, user_num=USER_NUM, item_num=ITEM_NUM, dim=DIM,
                                       device=DEVICE)
    global_step = tf.contrib.framework.get_or_create_global_step()
    _, train_op = ops.optimization(infer, regularizer, rate_batch, learning_rate=0.001, reg=0.05, device=DEVICE)

    init_op = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init_op)
        summary_writer = tf.summary.FileWriter(logdir="/tmp/svd/log", graph=sess.graph)
        print("{} {} {} {}".format("epoch", "train_error", "val_error", "elapsed_time"))
        errors = deque(maxlen=samples_per_batch)
        start = time.time()
        for i in range(EPOCH_MAX * samples_per_batch):
            users, items, rates = next(iter_train)
            _, pred_batch = sess.run([train_op, infer], feed_dict={user_batch: users, item_batch: items, rate_batch: rates})
            pred_batch = clip(pred_batch)
            errors.append(np.power(pred_batch - rates, 2))
            if i % samples_per_batch == 0:
                train_err = np.sqrt(np.mean(errors))
                test_err2 = np.array([])
                for users, items, rates in iter_test:
                    pred_batch = sess.run(infer, feed_dict={user_batch: users,item_batch: items})
                    pred_batch = clip(pred_batch)
                    test_err2 = np.append(test_err2, np.power(pred_batch - rates, 2))
                end = time.time()
                test_err = np.sqrt(np.mean(test_err2))
                print("{:3d} {:f} {:f} {:f}(s)".format(i // samples_per_batch, train_err, test_err, end - start))
                train_err_summary = make_scalar_summary("training_error", train_err)
                test_err_summary = make_scalar_summary("test_error", test_err)
                summary_writer.add_summary(train_err_summary, i)
                summary_writer.add_summary(test_err_summary, i)
                start = end

        # Predict user #1's rating for every item in the set
        userNumber = 1
        user_prediction = sess.run(infer, feed_dict={user_batch: np.array([userNumber]), item_batch: np.arange(ITEM_NUM)})
        # The index number is the same as the item number. argsort orders from
        # lowest (least recommended) to largest (most recommended)
        index_rating_order = np.argsort(user_prediction)

        print("Top ten recommended items for user {} are".format(userNumber))
        print(index_rating_order[-10:][::-1])  # take the last ten, then reverse the list

        # If you want to include the score:
        items_to_choose = index_rating_order[-10:][::-1]
        for item, score in zip(items_to_choose, user_prediction[items_to_choose]):
            print("{}:  {}".format(item, score))

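The only changes I made start at the first comment line. Again, best practice would be to train in this function but make the actual predictions separately.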
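The ranking step at the end of the function can also be pulled out into a small helper that works on any array of predicted scores, which makes it reusable once prediction is separated from training. This is a minimal sketch, not part of TF-recomm: the function name top_n_items and its seen parameter (items the user has already rated, which you usually do not want to recommend again) are assumptions introduced here for illustration.

```python
import numpy as np


def top_n_items(scores, n=10, seen=()):
    """Return (item, score) pairs for the n highest-scoring unseen items."""
    scores = np.asarray(scores, dtype=float).copy()
    seen = list(seen)
    if seen:
        scores[seen] = -np.inf  # never recommend items already rated
    # Do not return more items than there are unseen candidates.
    n = min(n, scores.size - len(set(seen)))
    order = np.argsort(scores)[::-1][:n]  # highest score first
    return [(int(i), float(scores[i])) for i in order]


# Toy scores for five items; item 3 has already been rated by the user.
example_scores = [3.1, 4.7, 2.0, 4.9, 3.8]
recs = top_n_items(example_scores, n=2, seen=[3])
```

With the code above, passing user_prediction as scores would give the same ordering as index_rating_order[-10:][::-1] when seen is left empty.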

【Comments】:

I made a GitHub repository that shows how to build a predictor. It is based on TF-recomm: github.com/kiwidamien/TensorFlowRec
Thank you very much for your answer. It helped.