10_Using.asciidoc 2.83 KB
Looly committed on 2014-09-22 14:58 +08:00. First commit, finished 1.1 and 1.2

Using language analyzers

The built-in language analyzers are available globally and don’t need to be configured before being used. They can be specified directly in the field mapping:

PUT /my_index
{
  "mappings": {
    "blog": {
      "properties": {
        "title": {
          "type":     "string",
          "analyzer": "english" (1)
        }
      }
    }
  }
}
  1. The title field will use the english analyzer instead of the default standard analyzer.

Of course, by passing text through the english analyzer, we lose information:

GET /my_index/_analyze?field=title (1)
I'm not happy about the foxes
  1. Emits tokens: i’m, happi, about, fox

Because foxes is stemmed to fox, we can’t tell whether the document mentions one fox or many. Because the word not is a stopword and is removed, we can’t tell whether the document is happy about foxes or not. By using the english analyzer, we have increased recall, because we can match more loosely, but we have reduced our ability to rank documents accurately.
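The information loss described above can be reproduced with a toy analyzer. The following is a minimal Python sketch, not Elasticsearch code: `standard_tokens`, `toy_stem`, and `toy_english_tokens` are hypothetical helpers, the stopword set is a small subset chosen for this example, and the suffix rules are a crude stand-in for the real English stemmer.

```python
import re

# Small subset of English stopwords, for illustration only (assumption).
STOPWORDS = {"not", "the", "a", "an", "is", "for", "this"}

def standard_tokens(text):
    """Lowercase the text and split it into word tokens, keeping apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())

def toy_stem(token):
    """Crude suffix stripping; NOT the stemmer Elasticsearch actually uses."""
    if token.endswith("es") and len(token) > 4:
        return token[:-2]          # foxes -> fox
    if token.endswith("s") and len(token) > 3:
        return token[:-1]
    if token.endswith("y") and len(token) > 3:
        return token[:-1] + "i"    # happy -> happi
    return token

def toy_english_tokens(text):
    """Drop stopwords, then stem what is left."""
    return [toy_stem(t) for t in standard_tokens(text) if t not in STOPWORDS]

text = "I'm not happy about the foxes"
print(standard_tokens(text))     # ["i'm", 'not', 'happy', 'about', 'the', 'foxes']
print(toy_english_tokens(text))  # ["i'm", 'happi', 'about', 'fox']
```

The second list no longer contains not, and foxes has collapsed to fox, so neither the negation nor the plural can be recovered from the indexed tokens.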

To get the best of both worlds, we can use multi-fields to index the title field twice: once with the english analyzer and once with the standard analyzer:

PUT /my_index
{
  "mappings": {
    "blog": {
      "properties": {
        "title": { (1)
          "type": "string",
          "fields": {
            "english": { (2)
              "type":     "string",
              "analyzer": "english"
            }
          }
        }
      }
    }
  }
}
  1. The main title field uses the standard analyzer.

  2. The title.english sub-field uses the english analyzer.

With this mapping in place, we can index some test documents to demonstrate how to use both fields at query time:

PUT /my_index/blog/1
{ "title": "I'm happy for this fox" }

PUT /my_index/blog/2
{ "title": "I'm not happy about my fox problem" }

GET /_search
{
  "query": {
    "multi_match": {
      "type":     "most_fields", (1)
      "query":    "not happy foxes",
      "fields": [ "title", "title.english" ]
    }
  }
}
  1. Use the most_fields query type to match the same text in as many fields as possible.

Even though neither of our documents contains the word foxes, both documents are returned as results thanks to stemming on the title.english field. The second document is ranked as more relevant, because the word not matches on the title field.
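To see why the second document wins, the ranking can be approximated in a few lines of Python. This is a rough sketch under stated assumptions: the score is a plain count of overlapping terms summed over both fields, not the relevance score Elasticsearch computes, and the helpers (`standard_tokens`, `toy_english_tokens`, `overlap`, `most_fields_score`) are hypothetical names invented for this example.

```python
import re

# Small subset of English stopwords, for illustration only (assumption).
STOPWORDS = {"not", "the", "a", "an", "is", "for", "this"}

def standard_tokens(text):
    """Stand-in for the standard analyzer: lowercase and split into words."""
    return re.findall(r"[a-z0-9']+", text.lower())

def toy_stem(token):
    """Crude suffix stripping; NOT the stemmer Elasticsearch actually uses."""
    if token.endswith("es") and len(token) > 4:
        return token[:-2]
    if token.endswith("s") and len(token) > 3:
        return token[:-1]
    if token.endswith("y") and len(token) > 3:
        return token[:-1] + "i"
    return token

def toy_english_tokens(text):
    """Stand-in for the english analyzer: drop stopwords, then stem."""
    return [toy_stem(t) for t in standard_tokens(text) if t not in STOPWORDS]

def overlap(query, text, analyze):
    """Number of distinct query terms that also occur in the text."""
    return len(set(analyze(query)) & set(analyze(text)))

def most_fields_score(query, title):
    """Add up the matches from the title field and the title.english field."""
    return (overlap(query, title, standard_tokens)
            + overlap(query, title, toy_english_tokens))

docs = {
    1: "I'm happy for this fox",
    2: "I'm not happy about my fox problem",
}
scores = {doc_id: most_fields_score("not happy foxes", title)
          for doc_id, title in docs.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

print(scores)   # {1: 3, 2: 4}
print(ranking)  # [2, 1]
```

Both documents match happi and fox in the stemmed field, which is why both are returned. Document 2 gains one extra match on the unstemmed field, where not survives, which mirrors the ordering described above.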
