How to get all field names in an Elasticsearch index

【Question title】: How to get all field names in elasticsearch index
【Posted】: 2018-06-14 18:19:51
【Question】:

I have just started using Elasticsearch 5.2.

I am trying to get all the keys in an index. Suppose I have the following mapping:

"properties": {
  "name": { "type": "text" },
  "article": {
    "properties": {
      "id": { "type": "text" },
      "title": { "type": "text" },
      "abstract": { "type": "text" },
      "author": {
        "properties": {
          "id": { "type": "text" },
          "name": { "type": "text" }
        }
      }
    }
  }
}

Is it possible to get the full names of all fields, like this:

name,
article.id,
article.title,
article.abstract,
article.author.id,
article.author.name

How can I get that?

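【Comments】:

Are you trying to get an aggregation or the documents by these fields?
I am trying to get a list of the field names. Maybe the aggregation attempt was confusing. I will remove it. Thanks
Then you can use source filtering: elastic.co/guide/en/elasticsearch/reference/current/…
I don't understand how that would yield only the field names of the index
ES returns all fields by default; if you want to exclude fields from the source, you can use source filtering. Maybe I am not understanding your question?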

Tags: elasticsearch

【Solution 1】:

You can use the _field_names field.

The _field_names field indexes the names of every field in a document that contains any value other than null.

GET _search
{
  "size": 0,
  "aggs": {
    "Field names": {
      "terms": {
        "field": "_field_names",
        "size": 100
      }
    }
  }
}

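Update: as of ES 5

The _field_names field has been locked down and is only indexed; it does not support fielddata (memory intensive) or doc values, so the terms aggregation above no longer works on it.

Reference: https://github.com/elastic/elasticsearch/issues/22576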

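You can also use the get mapping API.

The get mapping API can be used to get the mappings of more than one index or type with a single call. General usage of the API follows this syntax: host:port/{index}/_mapping/{type}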

$ curl -XGET 'http://localhost:9200/index/_mapping?pretty'

You can then process the response to extract all the field names in the index.
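As a rough illustration of that processing step, here is a minimal sketch that walks a "properties" block and builds the dotted names the question asks for. The function name flatten_properties is made up for this example, and the sample dict is copied from the question's mapping rather than fetched from a cluster. In a real 5.x response the "properties" block would normally sit under the index name, then "mappings", then the type name, but that layout depends on the ES version, so check your own response first.

```python
def flatten_properties(properties, prefix=""):
    """Return dotted leaf field names found in a mapping's "properties" dict."""
    names = []
    for name, definition in properties.items():
        full_name = prefix + name
        children = definition.get("properties")
        if children:
            # Object field: descend and prefix its children with "parent."
            names.extend(flatten_properties(children, full_name + "."))
        else:
            # Leaf field: record its full dotted path.
            names.append(full_name)
    return names


# Sample "properties" block copied from the mapping in the question.
sample_properties = {
    "name": {"type": "text"},
    "article": {
        "properties": {
            "id": {"type": "text"},
            "title": {"type": "text"},
            "abstract": {"type": "text"},
            "author": {
                "properties": {
                    "id": {"type": "text"},
                    "name": {"type": "text"},
                }
            },
        }
    },
}

field_names = flatten_properties(sample_properties)
print(", ".join(field_names))
```

Object fields such as article and article.author are not listed themselves; only their leaf fields are, which matches the list in the question.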

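【Discussion】:

Tried it, but I got: "root_cause" : [ { "type" : "illegal_argument_exception", "reason" : "Fielddata is not supported on field [_field_names] of type [_field_names]" } ],
IIUC I cannot use _field_names in an aggregation as suggested
Yes, you can only query for the existence of a field, but you cannot aggregate on that same field.
You can also use the get _mapping API, but that requires some coding on your side
If you use the join feature, the get mapping approach will not work. If you want the fields of the child documents only, that is not possible, because you cannot match on your join field value.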

【Solution 2】:

The mapping API also allows querying the field names directly. Here is a Python 3 code snippet that should do the job:

import json
import requests

# get mapping fields for a specific index:
index = "INDEX_NAME"
elastic_url = "http://ES_HOSTNAME:9200"
doc_type = "DOC_TYPE"
mapping_fields_request = "_mapping/field/*?ignore_unavailable=false&allow_no_indices=false&include_defaults=true"
mapping_fields_url = "/".join([elastic_url, index, doc_type, mapping_fields_request])
response = requests.get(mapping_fields_url)

# parse the data:
data = response.content.decode()
parsed_data = json.loads(data)
keys = sorted(parsed_data[index]["mappings"][doc_type].keys())
print("index= {} has a total of {} keys".format(index, len(keys)))

# print the keys of the fields:
for i, key in enumerate(keys):
    if i % 43 == 0:
        input()
    print("{:4d}:     {}".format(i, key))

Pretty handy indeed. Note that keys containing a "." in their name may leave you a bit confused about how deeply they are nested in the document...
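To make the nesting question a little easier to reason about, the sketch below reports each key's depth (the number of dots in its path) and its type. It assumes the response layout shown in the Elasticsearch get field mapping documentation, where each entry has a "full_name" and a "mapping" object keyed by the last path segment. The helper name summarize_field_mappings and the sample data are made up for illustration; the sample is hand-written, not captured from a real cluster.

```python
def summarize_field_mappings(type_mappings):
    """Map each full field path to a (nesting_depth, field_type) tuple."""
    summary = {}
    for path, info in type_mappings.items():
        # Assumption: the "mapping" object is keyed by the last path segment.
        leaf = path.split(".")[-1]
        field_type = info.get("mapping", {}).get(leaf, {}).get("type")
        summary[path] = (path.count("."), field_type)
    return summary


# Hand-written sample shaped like parsed_data[index]["mappings"][doc_type];
# it is illustrative only.
sample_type_mappings = {
    "name": {"full_name": "name", "mapping": {"name": {"type": "text"}}},
    "article.title": {
        "full_name": "article.title",
        "mapping": {"title": {"type": "text"}},
    },
    "article.author.id": {
        "full_name": "article.author.id",
        "mapping": {"id": {"type": "text"}},
    },
}

summary = summarize_field_mappings(sample_type_mappings)
for path in sorted(summary):
    depth, field_type = summary[path]
    print("{:20s} depth={} type={}".format(path, depth, field_type))
```

A depth of 0 means a top-level field; article.author.id has depth 2 because it sits inside two object fields.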

【Discussion】:

【Solution 3】:

You can try this, the Get Field Mapping API:

def unique_preserving_order(sequence):
    """
    Remove duplicates while preserving first-seen order.
    :param sequence: list of hashable objects
    :return: new list with duplicates removed
    """

    seen = set()
    return [x for x in sequence if not (x in seen or seen.add(x))]

Get the ES index fields recursively:

def get_fields_recursively(dct, field_types=None):
    """
    Collect dotted field names from a mapping dict that has a 'properties' key.
    :param dct: mapping (or sub-mapping) dict
    :param field_types: optional collection of ES types to keep, e.g. ['text']
    :return: list of field names; leaf keys starting with '@' are skipped
    """
    if not dct or 'properties' not in dct:
        return []

    fields = []
    for key, ndct in dct['properties'].items():
        if 'properties' in ndct:
            # Object field: recurse with the same type filter, then prefix
            # every nested name with the parent key.
            nested = get_fields_recursively(ndct, field_types)
            fields.extend('{0}.{1}'.format(key, f) for f in nested)
            continue

        if key.startswith('@'):
            continue
        if not field_types or ndct.get('type') in field_types:
            fields.append(key)
    return fields

Get the fields from the index mapping; you can also filter the fields by type, e.g. text fields or numeric fields:

def get_mapping_fields(self, field_type=None, index=None, params=None):
    """
    Get field names from the index mapping, optionally filtered by type.

    :param field_type: es field types, filter fields by type
    :param index: elastic index name
    :param params: mapping additional params
    :return: fields

    <https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-get-field-mapping.html>
    - http://eshost:9200/_mapping
    - http://eshost:9200/_all/_mapping
    - http://eshost:9200/index_name/_mapping

    """

    _fields = []
    # Use a fresh dict when no params are given instead of a shared mutable default.
    _mapping = self.esclient.indices.get_mapping(index=index, params=params or {})
    for idx_mapping in _mapping.values():
        mapping = idx_mapping.get('mappings')
        # 'system' and 'doc' are the document type names hardcoded in this
        # answer; the mappings are keyed by type name.
        if 'system' in mapping:
            mapping = mapping.get('system')
        else:
            mapping = mapping.get('doc')
        fields = get_fields_recursively(mapping, field_type)
        if fields:
            _fields.extend(fields)

    # Several indices can share field names, so deduplicate in order.
    return unique_preserving_order(_fields)

【Discussion】: