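【Posted】: 2014-07-10 12:30:38
【Question】:
I want to migrate the examples from the book Lucene in Action, 2nd Edition, which is based on Lucene 3.0, to the current version of Lucene. This is the code that needs to be migrated: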
public void testUpdate() throws IOException {
assertEquals(1, getHitCount("city", "Amsterdam"));
IndexWriter writer = getWriter();
Document doc = new Document();
doc.add(new Field("id", "1", Field.Store.YES, Field.Index.NOT_ANALYZED));
doc.add(new Field("country", "Netherlands", Field.Store.YES, Field.Index.NO));
doc.add(new Field("contents", "Den Haag has a lot of museums", Field.Store.NO, Field.Index.ANALYZED));
doc.add(new Field("city", "Den Haag", Field.Store.YES, Field.Index.ANALYZED));
writer.updateDocument(new Term("id", "1"), doc);
writer.close();
assertEquals(0, getHitCount("city", "Amsterdam"));
assertEquals(1, getHitCount("city", "Den Haag"));
}
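I am trying to do the migration according to the Lucene Migration Guide, using the equivalents of the former Field constructors to create the Document object. The code looks like this: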
@Test
public void testUpdate() throws IOException
{
assertEquals(1, getHitCount("city", "Amsterdam"));
IndexWriter writer = getWriter();
Document doc = new Document();
FieldType ft = new FieldType(StringField.TYPE_STORED);
ft.setOmitNorms(false);
doc.add(new Field("id", "1", ft));
doc.add(new StoredField("country", "Netherlands"));
doc.add(new TextField("contents", "Den Haag has a lot of museums", Store.NO));
doc.add(new Field("city", "Den Haag", TextField.TYPE_STORED));
writer.updateDocument(new Term("id", "1"), doc);
writer.close();
assertEquals(0, getHitCount("city", "Amsterdam"));
assertEquals(1, getHitCount("city", "Den Haag"));
}
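The second assertion after the update fails because it does not find the string "Den Haag" (although searching for just "Den" or just "Haag" works). If I use a StringField object instead, the test passes, because the "city" field is then not analyzed (i.e. not tokenized) and is therefore kept intact. But clearly the example does not intend this field to be treated like, say, an ID. I have read that the combination Field.Store.YES / Field.Index.ANALYZED is meant for small text content such as intro text, abstracts, or titles, so shouldn't it also match a multi-word string like "Den Haag", or am I wrong? Can anyone clarify?
The author builds the search from a Term object: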
protected int getHitCount(String fieldName, String searchString) throws IOException {
DirectoryReader dr = DirectoryReader.open(directory);
IndexSearcher searcher = new IndexSearcher(dr);
Term t = new Term(fieldName, searchString);
Query query = new TermQuery(t);
int hitCount = TestUtil.hitCount(searcher, query);
return hitCount;
}
The TestUtil class contains just one line of code:
public static int hitCount(IndexSearcher searcher, Query query) {
return searcher.search(query, 1).totalHits;
}
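To make the observed behaviour easier to reason about, here is a minimal sketch of exact term lookup against a tokenized and a non-tokenized field. It is a toy model, not Lucene code: the class TermLookupSketch and its methods are hypothetical, and it assumes an analyzer that only splits on whitespace and preserves case (which would be consistent with "Den" and "Haag" matching on their own).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these names exist in Lucene.
final class TermLookupSketch {
    // term -> ids of the documents that contain exactly that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private int nextDocId = 0;

    // Models an analyzed field (assumed whitespace analyzer): one term per token.
    int addAnalyzed(String value) {
        int id = nextDocId++;
        for (String token : value.trim().split("\\s+")) {
            postings.computeIfAbsent(token, k -> new HashSet<>()).add(id);
        }
        return id;
    }

    // Models a non-analyzed field: the whole value is indexed as a single term.
    int addNotAnalyzed(String value) {
        int id = nextDocId++;
        postings.computeIfAbsent(value, k -> new HashSet<>()).add(id);
        return id;
    }

    // Models a term query: exact lookup, the search string is not analyzed.
    int hitCount(String term) {
        Set<Integer> docs = postings.get(term);
        return docs == null ? 0 : docs.size();
    }

    public static void main(String[] args) {
        TermLookupSketch analyzed = new TermLookupSketch();
        analyzed.addAnalyzed("Den Haag");
        System.out.println(analyzed.hitCount("Den Haag")); // 0: no single term "Den Haag"
        System.out.println(analyzed.hitCount("Den"));      // 1
        System.out.println(analyzed.hitCount("Haag"));     // 1

        TermLookupSketch notAnalyzed = new TermLookupSketch();
        notAnalyzed.addNotAnalyzed("Den Haag");
        System.out.println(notAnalyzed.hitCount("Den Haag")); // 1: whole value is one term
    }
}
```

In this model the search string is compared to the indexed terms as-is, so a two-word string can only match when the field value was indexed as one term.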
【Discussion】:
-
How are you searching (I mean the
getHitCount() method)? Are you using raw Terms or just parsing a query? -
@mindas The author builds the search from a Term object:
protected int getHitCount(String fieldName, String searchString) throws IOException { DirectoryReader dr = DirectoryReader.open(directory); IndexSearcher searcher = new IndexSearcher(dr); Term t = new Term(fieldName, searchString); Query query = new TermQuery(t); int hitCount = TestUtil.hitCount(searcher, query); return hitCount; }. The TestUtil class contains just one line of code: public static int hitCount(IndexSearcher searcher, Query query) { return searcher.search(query, 1).totalHits; }.