【Question Title】: Solr search in a "content" field does not work
【Posted】: 2016-05-01 08:21:20
【Question】:

I uploaded and extracted a document in Solr 6.0.0, and I can see that it was indexed by running the following query:

http://localhost:8983/solr/techproducts/select?indent=on&q=id:doc1&wt=json

{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "q":"id:doc1",
      "indent":"on",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "links":["http://www.education.gov.yk.ca/"],
        "id":"doc1",
        "last_modified":"2008-06-04T22:47:36Z",
        "title":[" PDF Test Page"],
        "content_type":["application/pdf"],
        "author":"Yukon Canada Yukon Department of Education",
        "author_s":"Yukon Canada Yukon Department of Education",
        "content":[" \n \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  PDF Test Page \n \n    \n  \n \nPDF Test File \n \nCongratulations, your computer is equipped with a PDF (Portable Document Format) \nreader!  You should be able to view any of the PDF documents and forms available on \nour site.  PDF forms are indicated by these icons:   or  .   \n \nYukon Department of Education \nBox 2703 \nWhitehorse,Yukon \nCanada \nY1A 2C6 \n \nPlease visit our website at:  http://www.education.gov.yk.ca/\n    \n  \n    \n \n  "],
        "_version_":1533049305513852928}]
  }}

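I can see that the word PDF occurs several times in the content field.

Given that the field is named content and contains PDF, why does the following query return no results?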

select?q=*:*&fq=content:PDF

{
  "responseHeader":{
    "status":0,
    "QTime":4,
    "params":{
      "q":"*:*",
      "indent":"on",
      "fq":"content:PDF",
      "rows":"50",
      "wt":"json"}},
  "response":{"numFound":0,"start":0,"docs":[]
  }}

When I query a different field, for example title, I get the correct result:

select?q=*:*&fq=title:PDF

{
  "responseHeader":{
    "status":0,
    "QTime":3,
    "params":{
      "q":"*:*",
      "indent":"on",
      "fq":"title:PDF",
      "rows":"50",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "links":["http://www.education.gov.yk.ca/"],
        "id":"doc1",
        "last_modified":"2008-06-04T22:47:36Z",
        "title":[" PDF Test Page"],
        "content_type":["application/pdf"],
        "author":"Yukon Canada Yukon Department of Education",
        "author_s":"Yukon Canada Yukon Department of Education",
        "content":[" \n \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  PDF Test Page \n \n    \n  \n \nPDF Test File \n \nCongratulations, your computer is equipped with a PDF (Portable Document Format) \nreader!  You should be able to view any of the PDF documents and forms available on \nour site.  PDF forms are indicated by these icons:   or  .   \n \nYukon Department of Education \nBox 2703 \nWhitehorse,Yukon \nCanada \nY1A 2C6 \n \nPlease visit our website at:  http://www.education.gov.yk.ca/\n    \n  \n    \n \n  "],
        "_version_":1533049305513852928}]
  }}

【Question Comments】:

  • Could you share your schema.xml...

Tags: pdf select search solr field


【Solution 1】:

Check your schema.xml for the field type defined for the content field.

Compare the field type of the content field with that of the title field.

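You may not have defined a suitable field type for the content field. Its field type may produce no tokens for your text, or it may treat the entire text as a single token. That happens if the field uses KeywordTokenizer or the string field type, in which case a query for a single word such as PDF cannot match the much longer indexed value.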
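To make the tokenization point concrete, here is a minimal toy sketch in plain Python. It does not call Solr and does not reproduce Solr's analyzers; `keyword_terms`, `word_terms`, and `matches` are hypothetical helpers that only imitate the effect of an untokenized field versus a tokenized, lowercased one.

```python
# Toy model only: imitates the *effect* of two analysis styles, not Solr's real code.

def keyword_terms(value: str) -> list:
    """Untokenized field (string-like): the whole value becomes one term."""
    return [value]


def word_terms(value: str) -> list:
    """Simplified tokenized field: split on whitespace, trim punctuation, lowercase."""
    terms = []
    for raw in value.split():
        word = raw.strip(".,:;!?()").lower()
        if word:
            terms.append(word)
    return terms


def matches(query: str, indexed_terms: list, lowercase_query: bool) -> bool:
    """A term query matches only when it equals an indexed term exactly."""
    q = query.lower() if lowercase_query else query
    return q in indexed_terms


# Shortened stand-in for the extracted PDF text shown in the question.
content = " \n PDF Test Page \n\nPDF Test File \n"

# Whole value is one term, so the single word "PDF" never equals it.
print(matches("PDF", keyword_terms(content), lowercase_query=False))  # False

# Tokenized and lowercased on both sides, so "pdf" is found among the terms.
print(matches("PDF", word_terms(content), lowercase_query=True))  # True
```

If the content field behaves like the first case, the zero-hit result in the question is what one would expect, while a field behaving like the second case would match.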

You can check this in the Solr admin analysis tool.

There you can see how the text is processed at index time and how it is processed at query time.
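The same check can probably be scripted. The sketch below only builds a request URL; it assumes the core exposes Solr's field analysis request handler at `/analysis/field` and that the host, port, and core name match the ones in the question. The helper name `field_analysis_url` is made up for illustration.

```python
from urllib.parse import urlencode


def field_analysis_url(base_url, core, field, index_text, query_text):
    """Build a URL for the field analysis handler (assumed to live at /analysis/field)."""
    params = {
        "analysis.fieldname": field,        # field whose analyzers should be applied
        "analysis.fieldvalue": index_text,  # sample text, analyzed as at index time
        "analysis.query": query_text,       # sample query, analyzed as at query time
        "analysis.showmatch": "true",       # ask Solr to flag index tokens that match the query
        "wt": "json",
    }
    return f"{base_url}/solr/{core}/analysis/field?{urlencode(params)}"


url = field_analysis_url(
    "http://localhost:8983", "techproducts", "content", "PDF Test Page", "PDF"
)
print(url)
```

Opening the printed URL in a browser (or fetching it with any HTTP client) should show the tokens produced for the content field, which makes it easy to see whether the text is split into words at all.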

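To search on a field, you must set the attribute indexed=true; if you also want Solr to return the field's value, you need to add stored=true.

Together, these two attributes let you search on the field and retrieve its original value.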
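As a rough sketch of what such a change could look like with a managed schema, the snippet below builds a `replace-field` command for Solr's Schema API. It only constructs the JSON body and sends nothing. The field type `text_general` and `multiValued=true` are assumptions (the latter because content appears as an array in the responses above), and `replace_field_command` is a hypothetical helper name.

```python
import json


def replace_field_command(name, field_type, indexed=True, stored=True, multi_valued=True):
    """Build a Schema API 'replace-field' command as a plain dict."""
    return {
        "replace-field": {
            "name": name,
            "type": field_type,           # assumed: a tokenized text type such as text_general
            "indexed": indexed,           # needed to search on the field
            "stored": stored,             # needed to get the original value back
            "multiValued": multi_valued,  # assumed from the array value in the response
        }
    }


command = replace_field_command("content", "text_general")
body = json.dumps(command, indent=2)
print(body)
```

The body would be sent as a POST with `Content-Type: application/json` to the core's schema endpoint, which for the setup in the question would presumably be `http://localhost:8983/solr/techproducts/schema`. Documents indexed before a schema change keep their old index entries, so they need to be re-indexed before the content field becomes searchable.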

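【Comments】:

  • I have the following in the managed schema: . So should I change the indexed attribute of the content field to true and restart the Solr server?
  • I changed it to true and nothing changed... I still cannot search the content field.
  • To search on a field, you must set the attribute indexed=true, and if you want Solr to return the field's value, you need to add stored=true... Together these two attributes let you search on the field and retrieve its original value.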