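【Question Title】: Updating Lucene index on the fly
【Posted】: 2016-08-10 03:29:48
【Question】:

I am an absolute beginner with Lucene and have run into a problem updating the index.

At the moment I can rebuild the whole index every day, but then the index is only current as of the build time. How can I update the index incrementally, i.e. append to it, so that it stays up to date? There is already code that tries to update the index, but it only updates the segments files and not the other files.

Every time an entry is added from my website, the RefreshFromDatabase method runs and tries to add the latest data to the index. In the search index folder, however, only two files are updated, segments.gen and segments_t; all the other files (.fdt .fdx .fnm .frq .nrm .prx .tii .tis .del .cfs) are not updated. [Screenshot: index folder]

Code: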

using (ISiteScope scope = _scopeFactory.GetSiteScope(site)){
        scope.Get<ISearchIndexUpdater>().RefreshFromDatabase(primaryId, secondaryId);
        scope.Commit();
}

public void RefreshFromDatabase(long primaryId, int? secondaryId){
    Process process = _processRepo.GetById(primaryId);
    IList<Decision> allDecisions = _decisionRepo.GetByProcess(process);
    IList<Link> allLinks = _linkRepo.GetActiveByProcess(process);
    Decision current = allDecisions.OrderByDescending(x => x.DTG).FirstOrDefault();
    _luceneRepository.Add(process, allDecisions, allLinks);
}

public void Add(Process process, IList<Decision> decisions, IList<Link> links){
    if (null == decisions)
        decisions = new List<Decision>();

    using (LuceneWriter writer = BeginWriter(false)) {
        Add(writer.Writer,
            new SearchIndexProcess {
                // properties
            },
            decisions.Select(x => new SearchIndexDecision {
                // params
            }).ToArray(),
            (links ?? new List<Link>()).Select(x => new SearchIndexLink {
                // properties
            }).ToArray()
        );
        writer.Commit();
    }
}

And the LuceneWriter class:

public class LuceneWriter : IDisposable
       {
              Directory _directory;
              Analyzer _analyzer;
              IndexWriter _indexWriter;

              bool _commit;
              bool _optimise;

              /// <summary>
              /// Constructor for LuceneWriter.
              /// </summary>
              /// <param name="fileSystem">An IFileSystem.</param>
              /// <param name="luceneDir">The directory that contains the Lucene index. Need not exist.</param>
              public LuceneWriter(IFileSystem fileSystem, string luceneDir)
                     : this(fileSystem, luceneDir, false)
              {
              }

              /// <summary>
              /// Constructor for LuceneWriter.
              /// </summary>
              /// <param name="fileSystem">An IFileSystem.</param>
              /// <param name="luceneDir">The directory that contains the Lucene index. Need not exist.</param>
              /// <param name="optimiseWhenDone">Optimise the index on Dispose(). This is an expensive operation.</param>
              public LuceneWriter(IFileSystem fileSystem, string luceneDir, bool optimiseWhenDone)
              {
                     Init(fileSystem, luceneDir, optimiseWhenDone);
              }

              // Init is a separate virtual method so that it can be overridden when mocking.
              /// <summary>
              /// Initialise the LuceneWriter.
              /// </summary>
              /// <param name="fileSystem">An IFileSystem.</param>
              /// <param name="luceneDir">The directory containing the Lucene index.</param>
              /// <param name="optimiseWhenDone">Whether or not to optimise the Lucene index upon Dispose().</param>
              protected virtual void Init(IFileSystem fileSystem, string luceneDir, bool optimiseWhenDone)
              {
                     _optimise = optimiseWhenDone;

                     bool exists = true;
                     if (!fileSystem.DirectoryExists(luceneDir)) {
                           fileSystem.CreateDirectory(luceneDir);
                           exists = false;
                     }

                     _directory = FSDirectory.Open(new DirectoryInfo(luceneDir));
                     _analyzer = new StandardAnalyzer(Version.LUCENE_30);
                     _indexWriter = new IndexWriter(_directory, _analyzer, !exists, IndexWriter.MaxFieldLength.UNLIMITED);
              }

              /// <summary>
              /// Flags writer to commit and optimise. Does not commit until Dispose() is called.
              /// </summary>
              public void Commit()
              {
                     _commit = true;
              }

              /// <summary>
              /// The IndexWriter.
              /// </summary>
              public IndexWriter Writer { get { return _indexWriter; } }

              /// <summary>
              /// Performs application-defined tasks associated with freeing, releasing, or resetting unmanaged resources.
              /// </summary>
              /// <filterpriority>2</filterpriority>
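              public void Dispose()
              {
                     if (null != _indexWriter) {
                           if (_commit) {
                                  if (_optimise)
                                         _indexWriter.Optimize(true);
                                  _indexWriter.Commit();
                           } else {
                                  // Disposing an IndexWriter commits pending changes,
                                  // so discard them explicitly when Commit() was not requested.
                                  _indexWriter.Rollback();
                           }
                           // Dispose() closes the writer; a separate Close() call is redundant.
                           _indexWriter.Dispose();
                     }
                     if (null != _analyzer)
                           _analyzer.Dispose();
                     // Dispose() also closes the directory.
                     if (null != _directory)
                           _directory.Dispose();
              }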
       }

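【Discussion】:

    Tags: lucene


    【Solution 1】:

    Lucene has no in-place document update; an update is really a delete followed by an add. When you update a document, the old copy is marked as deleted in its existing segment, and a new segment is created that contains the new version of the document.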
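
The delete-then-add behaviour described above can be sketched with `IndexWriter.UpdateDocument`. This is a minimal sketch, assuming Lucene.Net 3.0.x (to match `Version.LUCENE_30` in the question). The `id` and `title` field names, the `UpdateSketch` class, and the in-memory `RAMDirectory` are illustrative assumptions, not part of the asker's code.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;

public static class UpdateSketch
{
    // Builds a document keyed by a hypothetical "id" field.
    static Document Build(string id, string title)
    {
        var doc = new Document();
        // NOT_ANALYZED keeps the id as one exact term, so it can be matched for deletion.
        doc.Add(new Field("id", id, Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.Add(new Field("title", title, Field.Store.YES, Field.Index.ANALYZED));
        return doc;
    }

    // Indexes one document, replaces it, and returns the number of live documents.
    public static int LiveDocsAfterUpdate()
    {
        using (var dir = new RAMDirectory()) // in-memory index, only to keep the sketch self-contained
        using (var analyzer = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30))
        using (var writer = new IndexWriter(dir, analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED))
        {
            writer.AddDocument(Build("42", "first version"));
            writer.Commit();

            // Marks every document whose "id" term is "42" as deleted, then adds the new one.
            writer.UpdateDocument(new Term("id", "42"), Build("42", "second version"));
            writer.Commit();

            using (var reader = IndexReader.Open(dir, true))
            {
                // NumDocs() counts live documents only; deleted ones are excluded.
                return reader.NumDocs();
            }
        }
    }

    public static void Main()
    {
        Console.WriteLine("live docs: " + LiveDocsAfterUpdate());
    }
}
```

If the code in the question calls `AddDocument` every time a record is refreshed, repeated refreshes of the same record would probably leave several copies in the index; keying each document on a unique term and calling `UpdateDocument` is one way to avoid that.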

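    【Comments】:

    • Yes, I am in fact adding documents, but only the two segments files (segments.gen and segments_t) and the .cfs file are modified; the rest are not.
    • Yes, the rest will not be modified until a merge happens.
    • Forgive my ignorance, but could you tell me how to do that? Or is an explicit merge unnecessary, and does it still behave as if it were "merged"? My situation is that I rebuilt the index on 2/8/2016, so none of the .prx, .frq, etc. files have been modified since then, and a method writes to the index every time a record is added to the database, so the .cfs file keeps being modified. My question is: is the index really being updated?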
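
One way to check the last question is to query the index with a fresh searcher after the commit, rather than looking at file timestamps. Existing segment files are write-once, so unchanged timestamps on the old .prx or .frq files do not by themselves mean the index is stale. The following is a minimal sketch, again assuming Lucene.Net 3.0.x; the `id` field, the `VerifySketch` class, and the `RAMDirectory` are hypothetical stand-ins for the real index.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;

public static class VerifySketch
{
    // Builds a document with a single hypothetical "id" field.
    static Document Build(string id)
    {
        var doc = new Document();
        doc.Add(new Field("id", id, Field.Store.YES, Field.Index.NOT_ANALYZED));
        return doc;
    }

    // Opens a fresh read-only searcher and counts documents whose "id" equals the given value.
    static int CountHits(Directory dir, string id)
    {
        using (var searcher = new IndexSearcher(dir, true))
        {
            return searcher.Search(new TermQuery(new Term("id", id)), 1).TotalHits;
        }
    }

    // Returns the hit counts for id "2" before and after the commit that adds it.
    public static int[] HitsBeforeAndAfter()
    {
        using (var dir = new RAMDirectory())
        using (var analyzer = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30))
        using (var writer = new IndexWriter(dir, analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED))
        {
            writer.AddDocument(Build("1"));
            writer.Commit();

            writer.AddDocument(Build("2"));
            // Not committed yet: a searcher opened on the directory sees only the last commit.
            int before = CountHits(dir, "2");

            writer.Commit();
            // A searcher opened after the commit sees the new document.
            int after = CountHits(dir, "2");

            return new[] { before, after };
        }
    }

    public static void Main()
    {
        int[] hits = HitsBeforeAndAfter();
        Console.WriteLine("before commit: " + hits[0] + ", after commit: " + hits[1]);
    }
}
```

A searcher or reader that was opened before the commit keeps seeing the older view, so the search side of the application has to reopen its reader to pick up new documents. Merging does not normally need to be triggered by hand: the writer's merge policy combines small segments as they accumulate, and `Optimize()` only forces that work to happen at once.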