【Question title】: How to search the content of a document attached in an Elasticsearch index
【Posted】: 2016-10-08 12:01:34
【Question】:

I created an index in Elasticsearch:

this.client.CreateIndex("documents", c => c
    .Mappings(mp => mp
        .Map<DocUpload>(m => m
            .Properties(ps => ps
                .Attachment(a => a
                    .Name(o => o.Document)
                    .TitleField(t => t
                        .Name(x => x.Title)
                        .TermVector(TermVectorOption.WithPositionsOffsets)))))));

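The attachment is base64-encoded before it is indexed. I cannot search the content of any document. Could the base64 encoding be causing the problem? Can anyone help?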
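For reference, the `attachment` field type takes the file as a base64 string, so the encoding step on its own is unlikely to be the cause. A minimal sketch of that step, using only the .NET base class library; the class and method names (`AttachmentEncoding`, `ToBase64`) are hypothetical and not from the original post:

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of the original post.
public static class AttachmentEncoding
{
    // Converts raw file bytes into the base64 string that is sent
    // as the value of the attachment field.
    public static string ToBase64(byte[] fileBytes)
    {
        if (fileBytes == null)
        {
            throw new ArgumentNullException("fileBytes");
        }
        return Convert.ToBase64String(fileBytes);
    }

    // Convenience overload: reads a file from disk and encodes it.
    public static string ToBase64(string path)
    {
        return ToBase64(File.ReadAllBytes(path));
    }
}
```

For example, `AttachmentEncoding.ToBase64(Encoding.UTF8.GetBytes("hello"))` returns `"aGVsbG8="`. The resulting string would be assigned to the mapped property (`Document` in the mapping above) before the document is indexed.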

The response in the browser looks like this:

{
  "documents": {
    "aliases": {},
    "mappings": {
      "indexdocument": {
        "properties": {
          "document": {
            "type": "attachment",
            "fields": {
              "content": {
                "type": "string"
              },
              "author": {
                "type": "string"
              },
              "title": {
                "type": "string",
                "term_vector": "with_positions_offsets"
              },
              "name": {
                "type": "string"
              },
              "date": {
                "type": "date",
                "format": "strict_date_optional_time||epoch_millis"
              },
              "keywords": {
                "type": "string"
              },
              "content_type": {
                "type": "string"
              },
              "content_length": {
                "type": "integer"
              },
              "language": {
                "type": "string"
              }
            }
          },
          "documentType": {
            "type": "string"
          },
          "id": {
            "type": "long"
          },
          "lastModifiedDate": {
            "type": "date",
            "format": "strict_date_optional_time||epoch_millis"
          },
          "location": {
            "type": "string"
          },
          "title": {
            "type": "string"
          }
        }
      }
    },
    "settings": {
      "index": {
        "creation_date": "1465193502636",
        "number_of_shards": "5",
        "number_of_replicas": "1",
        "uuid": "5kCRvhmsQAGyndkswLhLrg",
        "version": {
          "created": "2030399"
        }
      }
    },
    "warmers": {}
  }
}

【Comments】:

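  • I think something is wrong with your content mapping. For me, it looks something like this. Your attachment class definition may be the problem.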

Tags: c# elasticsearch nest elasticsearch-plugin


【Solution 1】:

I found a solution by adding an analyzer.

var fullNameFilters = new List<string> { "lowercase", "snowball" };
client.CreateIndex("mydocs", c => c
    .Settings(st => st
        .Analysis(anl => anl
            .Analyzers(h => h
                .Custom("full", ff => ff
                    .Filters(fullNameFilters)
                    .Tokenizer("standard"))
            )
            .TokenFilters(ba => ba
                .Snowball("snowball", sn => sn
                    .Language(SnowballLanguage.English)))
        ))
    .Mappings(mp => mp
        .Map<IndexDocument>(ms => ms
            .AutoMap()
            .Properties(ps => ps
                .Nested<Attachment>(n => n
                    .Name(sc => sc.File)
                    .AutoMap()
                ))
            .Properties(at => at
                .Attachment(a => a.Name(o => o.File)
                    .FileField(fl => fl.Analyzer("full"))
                    .TitleField(t => t.Name(x => x.Title)
                        .Analyzer("full")
                        .TermVector(TermVectorOption.WithPositionsOffsets)
                    )))
        ))
);
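With a mapping like this in place, a search would normally target the extracted-text sub-field of the attachment rather than the attachment field itself. The following is only a sketch against the NEST 2.x client, under several assumptions that are not in the original answer: the node address `http://localhost:9200`, the sub-field name `file.content` (inferred from the `File` property and the `content` sub-field shown in the question's mapping), the query text, and the trimmed-down `IndexDocument` class, which is a hypothetical stand-in for the real one.

```csharp
using System;
using Nest;

// Hypothetical stand-in for the IndexDocument class used in the answer;
// only the properties needed to print results are declared here.
public class IndexDocument
{
    public long Id { get; set; }
    public string Title { get; set; }
}

public static class AttachmentSearch
{
    public static void Main()
    {
        // Assumed node address; "mydocs" is the index name from the answer.
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
            .DefaultIndex("mydocs");
        var client = new ElasticClient(settings);

        // Match query on the extracted text of the attachment.
        // "file.content" is an assumed field path, not confirmed by the post.
        var response = client.Search<IndexDocument>(s => s
            .Query(q => q
                .Match(m => m
                    .Field("file.content")
                    .Query("search text"))));

        // Print the id and relevance score of each hit.
        foreach (var hit in response.Hits)
        {
            Console.WriteLine("{0} (score {1})", hit.Id, hit.Score);
        }
    }
}
```

If the query returns nothing, comparing the field path against the mapping returned by the index (as in the question) is a reasonable first check, since the sub-field name depends on how the attachment property was named.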

【Comments】:
