【Posted】: 2018-09-24 09:19:28
【Problem description】:
A wildcarded cts:element-value-query does not behave as expected.
Document insert query:
xdmp:document-insert('/sample/2.xml', <data>the living Theater</data>)
cts query:
cts:search(
doc(),
cts:element-value-query(xs:QName('data'), 'theater* *', ('wildcarded', 'case-insensitive', 'unstemmed', 'punctuation-sensitive', 'whitespace-sensitive')),
'unfiltered'
)
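The cts query above returns the /sample/2.xml document. As I understand it, this query should not return that document; it should return only documents whose value starts with the text theater.
The problem seems to occur with the following text pattern.
Text in the document: @@@ word @@@text
Search term: @@@t* *
@ can be any character.
I can also reproduce the problem with the following data.
Text in the document: mark the marklogic
Search text: markl* *
The wildcard-related indexes are set to true.
I have pasted the database configuration below in case it helps to find the problem.
Database configuration: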
<package-database xmlns="http://marklogic.com/manage/package/databases">
<config>
<name>publishers</name>
<package-database-properties>
<enabled>true</enabled>
<retired-forest-count>0</retired-forest-count>
<language>en</language>
<stemmed-searches>advanced</stemmed-searches>
<word-searches>true</word-searches>
<word-positions>true</word-positions>
<fast-phrase-searches>true</fast-phrase-searches>
<fast-reverse-searches>false</fast-reverse-searches>
<triple-index>true</triple-index>
<triple-positions>true</triple-positions>
<fast-case-sensitive-searches>true</fast-case-sensitive-searches>
<fast-diacritic-sensitive-searches>true</fast-diacritic-sensitive-searches>
<fast-element-word-searches>true</fast-element-word-searches>
<element-word-positions>true</element-word-positions>
<fast-element-phrase-searches>true</fast-element-phrase-searches>
<element-value-positions>true</element-value-positions>
<attribute-value-positions>true</attribute-value-positions>
<field-value-searches>true</field-value-searches>
<field-value-positions>true</field-value-positions>
<three-character-searches>true</three-character-searches>
<three-character-word-positions>true</three-character-word-positions>
<fast-element-character-searches>true</fast-element-character-searches>
<trailing-wildcard-searches>true</trailing-wildcard-searches>
<trailing-wildcard-word-positions>true</trailing-wildcard-word-positions>
<fast-element-trailing-wildcard-searches>true</fast-element-trailing-wildcard-searches>
<word-lexicons>
<word-lexicon>http://marklogic.com/collation/codepoint</word-lexicon>
</word-lexicons>
<two-character-searches>false</two-character-searches>
<one-character-searches>false</one-character-searches>
<uri-lexicon>true</uri-lexicon>
<collection-lexicon>true</collection-lexicon>
<reindexer-enable>true</reindexer-enable>
<reindexer-throttle>5</reindexer-throttle>
<reindexer-timestamp>0</reindexer-timestamp>
<directory-creation>manual</directory-creation>
<maintain-last-modified>false</maintain-last-modified>
<maintain-directory-last-modified>false</maintain-directory-last-modified>
<inherit-permissions>false</inherit-permissions>
<inherit-collections>false</inherit-collections>
<inherit-quality>false</inherit-quality>
<in-memory-limit>174080</in-memory-limit>
<in-memory-list-size>341</in-memory-list-size>
<in-memory-tree-size>85</in-memory-tree-size>
<in-memory-range-index-size>11</in-memory-range-index-size>
<in-memory-reverse-index-size>11</in-memory-reverse-index-size>
<in-memory-triple-index-size>44</in-memory-triple-index-size>
<large-size-threshold>1024</large-size-threshold>
<locking>fast</locking>
<journaling>fast</journaling>
<journal-size>682</journal-size>
<journal-count>2</journal-count>
<preallocate-journals>false</preallocate-journals>
<preload-mapped-data>false</preload-mapped-data>
<preload-replica-mapped-data>false</preload-replica-mapped-data>
<range-index-optimize>facet-time</range-index-optimize>
<positions-list-max-size>256</positions-list-max-size>
<format-compatibility>automatic</format-compatibility>
<index-detection>automatic</index-detection>
<expunge-locks>none</expunge-locks>
<tf-normalization>scaled-log</tf-normalization>
<merge-priority>lower</merge-priority>
<merge-max-size>32768</merge-max-size>
<merge-min-size>1024</merge-min-size>
<merge-min-ratio>2</merge-min-ratio>
<merge-timestamp>0</merge-timestamp>
<retain-until-backup>false</retain-until-backup>
<assignment-policy-name>bucket</assignment-policy-name>
</package-database-properties>
</config>
</package-database>
【Discussion】:
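-
Do you get the correct results when you run the search with the 'filtered' option?
-
@MadsHansen Yes, with filtering I get the correct result, but I can't use the filtered option because it is slow.
-
Try enabling element word positions. You need them to resolve multi-token values accurately without filtering.
-
@grtjn element word positions is set to true. Still the same problem. I also tried adding a word lexicon with the http://marklogic.com/collation/codepoint collation, but with no luck.
-
Please help; I run into this problem often and cannot work out what I am doing wrong. It looks like the value query is behaving like a word query.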
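For reference, the filtered versus unfiltered comparison discussed in the comments can be sketched roughly as follows. This is only an illustration, assuming the /sample/2.xml document from the question has already been inserted; the variable name $query and the use of xdmp:node-uri to print URIs are choices made for this sketch and are not part of the original post.

```xquery
xquery version "1.0-ml";

(: Same value query as in the question, bound to a variable so both
   searches use exactly the same constraint. $query is a name chosen
   for this sketch. :)
let $query :=
  cts:element-value-query(
    xs:QName('data'),
    'theater* *',
    ('wildcarded', 'case-insensitive', 'unstemmed',
     'punctuation-sensitive', 'whitespace-sensitive'))
return (
  (: Resolved from the indexes only; candidates are not re-checked,
     so false positives are possible. :)
  cts:search(doc(), $query, 'unfiltered') ! xdmp:node-uri(.),
  '----',
  (: Same index resolution, but each candidate document is then
     checked against the query before being returned. :)
  cts:search(doc(), $query, 'filtered') ! xdmp:node-uri(.)
)
```

If the URIs listed before the separator differ from those after it, the extra ones are candidates that index resolution alone could not rule out, which matches the behaviour described in the thread.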
Tags: marklogic