【Question Title】: Hibernate Search manual indexing
【Posted】: 2018-05-24 00:46:53
【Question】:

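I'm new to Hibernate Search. I'm trying to integrate it to search addresses, using Hibernate Search 5.5.6.Final. My address table has more than 15 million records. I used manual indexing to build the Lucene index for the existing address table. Indexing completed, but when I browse the index with Luke it contains fewer than 70,000 documents. Does that look right? Shouldn't the number of documents be much higher, on the order of the number of records? Is there a way to make sure the indexer goes through all records? Please help...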

Here is my entity:

@Entity
@Table (name = "ADDRESSES_LOOKUP")
@AnalyzerDef(name = "customanalyzer",
        tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
        filters = {
                @TokenFilterDef(factory = LowerCaseFilterFactory.class),
                @TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = {
                        @Parameter(name = "language", value = "English")
                })
        })
@Indexed
public class Address {

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    @Column (name = "ADDRESS_ID")
    private String id;

    @Column (name = "BUILDING_NAME")
    @Field(index = Index.YES, analyze = Analyze.YES, store = Store.YES)
    @Analyzer(definition = "customanalyzer")
    private String buildingName;

    @Column (name = "FLAT_NUMBER")
    @Field(index = Index.YES, analyze = Analyze.YES, store = Store.YES)
    private String flatNumber;

    @Column (name = "FLAT_TYPE")
    @Field(index = Index.YES, analyze = Analyze.YES, store = Store.YES)
    private String flatType;

    @Column (name = "LEVEL_NUMBER")
    @Field(index = Index.YES, analyze = Analyze.YES, store = Store.YES)
    private String levelNumber;

    @Column (name = "LEVEL_TYPE")
    @Field(index = Index.YES, analyze = Analyze.YES, store = Store.YES)
    private String levelType;

    @Column (name = "NUMBER_FIRST")
    @Field(index = Index.YES, analyze = Analyze.YES, store = Store.YES)
    private String numberFirst;

    @Column (name = "NUMBER_LAST")
    @Field(index = Index.YES, analyze = Analyze.YES, store = Store.YES)
    private String numberLast;

    @Column (name = "STREET_NAME")
    @Field(index = Index.YES, analyze = Analyze.YES, store = Store.YES)
    private String streetName;

    @Column (name = "STREET_TYPE_CODE")
    @Field(index = Index.YES, analyze = Analyze.YES, store = Store.YES)
    private String streetType;

    @Column (name = "LOCALITY_NAME")
    @Field(index = Index.YES, analyze = Analyze.YES, store = Store.YES)
    private String locality;

    @Column (name = "STATE_ABBREVIATION")
    @Field(index = Index.YES, analyze = Analyze.YES, store = Store.YES)
    private String state;

    @Column (name = "POSTCODE")
    @Field(index = Index.YES, analyze = Analyze.YES, store = Store.YES)
    private String postcode;

    @Column (name = "ADDRESS")
    @Field(index = Index.YES, analyze = Analyze.YES, store = Store.YES)
    @Analyzer(definition = "customanalyzer")
    private String address;

    // remaining members omitted in the original post
}

Here is the indexing code:

public void initializeHibernateSearch() {
    logger.info("Start initialising hibernate search index.");
    try {
        FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(entityManager);
        fullTextEntityManager
                .createIndexer()
                .typesToIndexInParallel( 3 )
                .batchSizeToLoadObjects( 50 )
                .cacheMode( CacheMode.IGNORE )
                .threadsToLoadObjects( 30 )
                .idFetchSize( 150 )
                .transactionTimeout( 1800 )
                .startAndWait();

    } catch (InterruptedException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
    logger.info("HIBERNATE SEARCH INDEX INITIALISED.");
}

【Comments】:

  • The code looks correct. No exceptions?
  • No exceptions. I'm now trying to run Lucene indexing independently of Hibernate Search to see how it goes.

Tags: java hibernate lucene hibernate-search


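【Solution 1】:

A good starting point is to use a progress monitor (SimpleIndexingProgressMonitor, or a custom implementation of your own) and step through some of its methods in a debugger. addToTotalCount, for example, should tell you how many addresses the indexer intends to index. There is also a printStatusMessage method that gives you some idea of the progress.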

SimpleIndexingProgressMonitor progressMonitor = new SimpleIndexingProgressMonitor();
fullTextSession
                .createIndexer(Address.class)
                .typesToIndexInParallel(1)
                .batchSizeToLoadObjects(50)
                .cacheMode(CacheMode.IGNORE)
                .threadsToLoadObjects(30)
                .idFetchSize(150)
                .progressMonitor(progressMonitor)
                .startAndWait();
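
If stepping through SimpleIndexingProgressMonitor is inconvenient, a custom monitor that only counts may be easier to read. The following is a minimal sketch, not the library's own implementation: the class name CountingProgressMonitor and the summary() helper are hypothetical, and the callback names are assumed to mirror org.hibernate.search.batchindexing.MassIndexerProgressMonitor in Hibernate Search 5.x. It is written without the Hibernate Search dependency so it compiles on its own; in a real project you would add `implements MassIndexerProgressMonitor` and pass an instance to `.progressMonitor(...)`.

```java
import java.util.concurrent.atomic.LongAdder;

/**
 * Hypothetical counting monitor. Method names are assumed to match
 * MassIndexerProgressMonitor (Hibernate Search 5.x); add
 * "implements MassIndexerProgressMonitor" when compiling against the library.
 */
final class CountingProgressMonitor {
    // LongAdder because the mass indexer calls the monitor from several threads.
    private final LongAdder totalCount = new LongAdder();
    private final LongAdder entitiesLoaded = new LongAdder();
    private final LongAdder documentsBuilt = new LongAdder();
    private final LongAdder documentsAdded = new LongAdder();

    // Number of entities the indexer plans to process.
    public void addToTotalCount(long count) { totalCount.add(count); }

    // Entities actually loaded from the database.
    public void entitiesLoaded(int size) { entitiesLoaded.add(size); }

    // Lucene documents built from loaded entities.
    public void documentsBuilt(int number) { documentsBuilt.add(number); }

    // Lucene documents handed to the index writer.
    public void documentsAdded(long increment) { documentsAdded.add(increment); }

    // Called once when the mass indexer finishes.
    public void indexingCompleted() { System.out.println(summary()); }

    public String summary() {
        return "planned=" + totalCount.sum()
                + " loaded=" + entitiesLoaded.sum()
                + " built=" + documentsBuilt.sum()
                + " added=" + documentsAdded.sum();
    }
}
```

If `planned` is already close to 70,000, the gap appears before any document is written, which points at the ID-loading query rather than at Lucene. If `planned` is around 15 million but `added` is far lower, the loss happens later in the pipeline.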

Are there other columns in that table? I wonder whether only 70,000 rows actually have data in these indexed columns.
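
One way to test that guess is to count the rows directly in the database and compare the result with the document count Luke shows. The sketch below uses plain JDBC; the table and column names are taken from the entity in the question, while the class name AddressRowCounts and its methods are hypothetical, and obtaining the Connection is left to your own setup.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper: builds and runs count queries for the ADDRESSES_LOOKUP table. */
final class AddressRowCounts {
    static final String TABLE = "ADDRESSES_LOOKUP";

    // Columns mapped to @Field properties in the Address entity.
    static final List<String> INDEXED_COLUMNS = Arrays.asList(
            "BUILDING_NAME", "FLAT_NUMBER", "FLAT_TYPE", "LEVEL_NUMBER",
            "LEVEL_TYPE", "NUMBER_FIRST", "NUMBER_LAST", "STREET_NAME",
            "STREET_TYPE_CODE", "LOCALITY_NAME", "STATE_ABBREVIATION",
            "POSTCODE", "ADDRESS");

    // Counts every row in the table.
    static String totalRowsSql() {
        return "SELECT COUNT(*) FROM " + TABLE;
    }

    // Counts rows where at least one indexed column holds a value.
    static String rowsWithIndexedDataSql() {
        String anyNonNull = INDEXED_COLUMNS.stream()
                .map(column -> column + " IS NOT NULL")
                .collect(Collectors.joining(" OR "));
        return "SELECT COUNT(*) FROM " + TABLE + " WHERE " + anyNonNull;
    }

    // Runs a single-value count query and returns the result.
    static long count(Connection connection, String sql) throws SQLException {
        try (Statement statement = connection.createStatement();
             ResultSet resultSet = statement.executeQuery(sql)) {
            resultSet.next();
            return resultSet.getLong(1);
        }
    }
}
```

Calling `AddressRowCounts.count(connection, AddressRowCounts.rowsWithIndexedDataSql())` and comparing it with the total row count shows whether empty columns could explain the gap. If both numbers are near 15 million, the cause lies elsewhere.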
