脏小强/elasticsearch-definitive-guide-cn
Looly committed on 2014-09-22 14:58 +08:00. First commit, finished 1.1 and 1.2

Getting started with languages

Elasticsearch ships with a collection of language analyzers which provide good, basic, out-of-the-box support for a number of the world’s most common languages:

Arabic, Armenian, Basque, Brazilian, Bulgarian, Catalan, Chinese, Czech, Danish, Dutch, English, Finnish, French, Galician, German, Greek, Hindi, Hungarian, Indonesian, Irish, Italian, Japanese, Korean, Kurdish, Norwegian, Persian, Portuguese, Romanian, Russian, Spanish, Swedish, Turkish, and Thai.

These analyzers typically perform four roles:

  • Tokenize text into individual words:

    The quick brown foxes → [The, quick, brown, foxes]

  • Lowercase tokens:

    The → the

  • Remove common stopwords:

    [The, quick, brown, foxes] → [quick, brown, foxes]

  • Stem tokens to their root form:

    foxes → fox
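The four roles above can be sketched as a small pipeline. This is only a toy illustration in Python, not how Elasticsearch implements its analyzers: the names `STOPWORDS`, `toy_stem`, and `toy_analyze` are hypothetical, the stopword list is deliberately tiny, and the suffix stripping stands in for a real stemming algorithm.

```python
import re

# Tiny illustrative stopword list (an assumption for this sketch);
# real analyzers ship much longer, language-specific lists.
STOPWORDS = {"a", "an", "and", "the", "of", "to"}


def toy_stem(token):
    """Crude suffix stripping, standing in for a real stemmer."""
    for suffix in ("es", "s"):
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def toy_analyze(text):
    tokens = re.findall(r"\w+", text)                    # 1. tokenize
    tokens = [t.lower() for t in tokens]                 # 2. lowercase
    tokens = [t for t in tokens if t not in STOPWORDS]   # 3. remove stopwords
    return [toy_stem(t) for t in tokens]                 # 4. stem


print(toy_analyze("The quick brown foxes"))  # ['quick', 'brown', 'fox']
```

The order of the steps matters in this sketch: lowercasing runs before stopword removal so that `The` matches the lowercase entry `the` in the stopword list.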

Each analyzer may also apply other transformations specific to its language in order to make words from that language more searchable:

  • the english analyzer removes the possessive 's:

    John’s → john

  • the french analyzer removes elisions like l' and qu' and diacritics like ¨ or ^:

    l’église → eglis

  • the german analyzer normalizes terms, replacing ä and ae with a, or ß with ss, among others:

    äußerst → ausserst
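The language-specific transformations listed above can be approximated with a few string operations. The following Python sketch is a simplified illustration under stated assumptions, not the analyzers' actual logic: all function names are hypothetical, the elision rule covers only l' and qu', and the German rule applies just the three replacements mentioned above without the context checks a real normalizer would need.

```python
import re
import unicodedata


def strip_english_possessive(token):
    # Drop a trailing 's, written with a straight or a typographic apostrophe.
    return re.sub(r"['’]s$", "", token)


def strip_french_elision(token):
    # Drop a leading l' or qu' (only these two elisions, for illustration).
    return re.sub(r"^(l|qu)['’]", "", token, flags=re.IGNORECASE)


def strip_diacritics(token):
    # Decompose accented characters, then discard the combining marks.
    decomposed = unicodedata.normalize("NFD", token)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def normalize_german(token):
    # Only the replacements named in the text: ß -> ss, ä -> a, ae -> a.
    return token.replace("ß", "ss").replace("ä", "a").replace("ae", "a")


print(strip_english_possessive("John’s".lower()))          # john
print(strip_diacritics(strip_french_elision("l’église")))  # eglise
print(normalize_german("äußerst"))                         # ausserst
```

The French result here is `eglise` rather than `eglis` because this sketch stops before stemming; the analyzer's final stemming step is what removes the trailing e.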
