【Posted】: 2017-07-11 22:54:24
【Question】:
The Word2Vec object in gensim has a null_word parameter that is not explained in the documentation.
class gensim.models.word2vec.Word2Vec(sentences=None, size=100, alpha=0.025, window=5, min_count=5, max_vocab_size=None, sample=0.001, seed=1, workers=3, min_alpha=0.0001, sg=0, hs=0, negative=5, cbow_mean=1, hashfxn=<built-in function hash>, iter=5, null_word=0, trim_rule=None, sorted_vocab=1, batch_words=10000)
What is the null_word parameter used for?
Looking at the code at https://github.com/RaRe-Technologies/gensim/blob/develop/gensim/models/word2vec.py#L680, it says:
    if self.null_word:
        # create null pseudo-word for padding when using concatenative L1 (run-of-words)
        # this word is only ever input – never predicted – so count, huffman-point, etc doesn't matter
        word, v = '\0', Vocab(count=1, sample_int=0)
        v.index = len(self.wv.vocab)
        self.wv.index2word.append(word)
        self.wv.vocab[word] = v
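To make the effect of that branch concrete, here is a minimal standalone sketch of the same bookkeeping using plain Python stand-ins. The `Vocab` dataclass and the `add_null_word` helper are hypothetical names made up for illustration, not gensim's own classes or API; the sketch only mirrors the quoted lines and says nothing about how the null word is used during training.

```python
from dataclasses import dataclass


@dataclass
class Vocab:
    """Hypothetical stand-in for a vocabulary entry (not gensim's class)."""
    count: int
    sample_int: int
    index: int = -1


def add_null_word(vocab, index2word):
    """Append the '\0' pseudo-word as the last vocabulary entry.

    Mirrors the quoted branch: the entry gets the next free index,
    is appended to the index-to-word list, and is registered in the
    word-to-entry mapping.
    """
    word, v = '\0', Vocab(count=1, sample_int=0)
    v.index = len(vocab)
    index2word.append(word)
    vocab[word] = v
    return v


# Toy vocabulary with two real words (made-up counts).
index2word = ["the", "cat"]
vocab = {w: Vocab(count=10, sample_int=2**32, index=i)
         for i, w in enumerate(index2word)}

null_entry = add_null_word(vocab, index2word)
```

After the call, the pseudo-word sits at the end of `index2word` and its `index` equals the previous vocabulary size, so it occupies one extra slot alongside the real words.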
What is "concatenative L1"?
【Discussion】:
Tags: python null deep-learning gensim word2vec