IK Analysis Plugin
Plugin home page
https://github.com/medcl/elasticsearch-analysis-ik
List installed plugins
[elon@icloud-store elasticsearch-6.0.0]$ bin/elasticsearch-plugin list
Analyzer test
Create a test index
[elon@icloud-store logs]# curl -XPUT "http://192.168.0.103:9200/index_test?pretty=true"
Test the analysis result
[elon@icloud-store logs]# curl 'http://192.168.0.103:9200/index_test/_analyze?pretty=true' -H 'Content-Type: application/json' -d '{ "analyzer":"ik_max_word", "text":"元旦火车票开售."}'
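Analyzing a Chinese string makes the difference between the two IK modes, ik_smart and ik_max_word, easy to see.
- ik_max_word: splits the text at the finest granularity and exhausts every possible combination. For example, "元旦火车票开售" ("New Year's Day train tickets go on sale") is split into "元旦, 火车票, 火车, 车票, 开售".
- ik_smart: splits the text at the coarsest granularity. For example, "元旦火车票开售" is split into "元旦, 火车票, 开售".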
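To see the two modes side by side, the same text can be sent to the _analyze endpoint once per analyzer. The following is a minimal sketch, assuming the server address used in the commands above and an existing index_test index; the helper names analyze_body, analyze, and compare_ik_modes are hypothetical and not part of the plugin.

```shell
# Assumption: Elasticsearch with the IK plugin is reachable here
# (address taken from the commands above) and index_test exists.
ES_URL="${ES_URL:-http://192.168.0.103:9200}"

# Build the _analyze request body for one analyzer and one text.
# No JSON escaping is done, so the text must not contain quotes or backslashes.
analyze_body() {
  printf '{"analyzer":"%s","text":"%s"}' "$1" "$2"
}

# Send the text through the given analyzer and print the raw JSON response.
analyze() {
  curl -s "$ES_URL/index_test/_analyze?pretty=true" \
    -H 'Content-Type: application/json' \
    -d "$(analyze_body "$1" "$2")"
}

# Run both IK modes on the same input for comparison.
compare_ik_modes() {
  for mode in ik_smart ik_max_word; do
    echo "== $mode =="
    analyze "$mode" "$1"
    echo
  done
}
```

After sourcing these functions, calling compare_ik_modes '元旦火车票开售.' prints the token list for each mode in turn.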
Usage example
Delete the index
[wuyu@icloud-store elasticsearch-6.0.0]$ curl -XDELETE "http://192.168.0.103:9200/index_test*?pretty=true"
Create the index
[elon@icloud-store logs]# curl -XPUT "http://192.168.0.103:9200/index_test?pretty=true" -H 'Content-Type: application/json' -d'
Create the type mapping
[elon@icloud-store elasticsearch-6.0.0]$ curl -XPOST "http://192.168.0.103:9200/index_test/person/_mapping?pretty=true" -H 'Content-Type: application/json' -d'
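The mapping body is not shown in the command above. As an illustration only, a mapping for the person type might assign IK analyzers to the name field as sketched below. The choice of ik_max_word for indexing and ik_smart for searching is an assumption, and the helper names person_mapping_body and put_person_mapping are hypothetical.

```shell
# Assumption: same server address as in the commands above.
ES_URL="${ES_URL:-http://192.168.0.103:9200}"

# Print an illustrative mapping body for the person type.
# Assumption: name is the only field and is analyzed with IK.
person_mapping_body() {
  cat <<'JSON'
{
  "properties": {
    "name": {
      "type": "text",
      "analyzer": "ik_max_word",
      "search_analyzer": "ik_smart"
    }
  }
}
JSON
}

# Send the mapping; curl reads the body from stdin via --data-binary @-.
put_person_mapping() {
  person_mapping_body | curl -s -XPOST "$ES_URL/index_test/person/_mapping?pretty=true" \
    -H 'Content-Type: application/json' \
    --data-binary @-
}
```

Indexing with the fine-grained mode while searching with the coarse-grained mode is one common trade-off between recall and precision; using the same analyzer for both is equally valid.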
Index a document
[elon@icloud-store elasticsearch-6.0.0]$ curl -XPOST "http://192.168.0.103:9200/index_test/person/1?pretty=true" -H 'Content-Type: application/json' -d'{"name":"元旦火车票开售."}'
Get a document (by ID)
[elon@icloud-store elasticsearch-6.0.0]$ curl -XGET "http://192.168.0.103:9200/index_test/person/2?pretty=true"
Update a document
[elon@icloud-store elasticsearch-6.0.0]$ curl -XPOST "http://192.168.0.103:9200/index_test/person/2?pretty=true" -H 'Content-Type: application/json' -d'{"name":"元旦节是每年1月1号"}'
Search (keyword, pagination, highlighting)
[elon@icloud-store elasticsearch-6.0.0]$ curl -XPOST "192.168.0.103:9200/index_test/person/_search?pretty" -H 'Content-Type: application/json' -d'
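The search body is cut off in the command above. A request that combines a keyword, pagination, and highlighting could look roughly like the sketch below. It assumes a match query on the name field, from/size paging, and em tags for highlights; the helper names search_body and search_person are hypothetical.

```shell
# Assumption: same server address as in the commands above.
ES_URL="${ES_URL:-http://192.168.0.103:9200}"

# Build a search body: match query on name, from/size paging, highlight on name.
# Arguments: keyword, offset of the first hit, page size.
# No JSON escaping is done, so the keyword must not contain quotes or backslashes.
search_body() {
  printf '{"query":{"match":{"name":"%s"}},"from":%d,"size":%d,"highlight":{"pre_tags":["<em>"],"post_tags":["</em>"],"fields":{"name":{}}}}' "$1" "$2" "$3"
}

# Run the search; offset defaults to 0 and page size to 10.
search_person() {
  curl -s -XPOST "$ES_URL/index_test/person/_search?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(search_body "$1" "${2:-0}" "${3:-10}")"
}
```

For example, search_person '元旦' 0 5 would request the first five hits and wrap each matched term in the name field with em tags.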
Pinyin Analysis Plugin
Plugin home page
https://github.com/medcl/elasticsearch-analysis-pinyin
List installed plugins
[elon@icloud-store elasticsearch-6.0.0]$ bin/elasticsearch-plugin list
Analyzer test
Create a test index
[elon@icloud-store elasticsearch-6.0.0]$ curl -XPUT "http://192.168.0.103:9200/index_test?pretty=true" -H 'Content-Type: application/json' -d'
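The index settings are not shown in the command above. For the pinyin_analyzer used in the next step to exist, the index has to define it. The sketch below is one plausible definition, modeled on the example in the plugin's README; the tokenizer name my_pinyin, the option values, and the helper names pinyin_settings_body and create_pinyin_index are assumptions.

```shell
# Assumption: same server address as in the commands above.
ES_URL="${ES_URL:-http://192.168.0.103:9200}"

# Print illustrative index settings that define pinyin_analyzer
# on top of a pinyin tokenizer (tokenizer name my_pinyin is hypothetical).
pinyin_settings_body() {
  cat <<'JSON'
{
  "settings": {
    "analysis": {
      "analyzer": {
        "pinyin_analyzer": {
          "tokenizer": "my_pinyin"
        }
      },
      "tokenizer": {
        "my_pinyin": {
          "type": "pinyin",
          "keep_separate_first_letter": false,
          "keep_full_pinyin": true,
          "keep_original": true,
          "limit_first_letter_length": 16,
          "lowercase": true,
          "remove_duplicated_term": true
        }
      }
    }
  }
}
JSON
}

# Create the index with these settings; the body is read from stdin.
create_pinyin_index() {
  pinyin_settings_body | curl -s -XPUT "$ES_URL/index_test?pretty=true" \
    -H 'Content-Type: application/json' \
    --data-binary @-
}
```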
Test the analysis result
[elon@icloud-store elasticsearch-6.0.0]$ curl "http://192.168.0.103:9200/index_test/_analyze?pretty=true" -H 'Content-Type: application/json' -d'{ "analyzer":"pinyin_analyzer", "text":"元旦火车票开售."}'
Usage example (IK + Pinyin)
This example shows how an index analyzes text when IK and Pinyin are combined.
Delete the index
[elon@icloud-store elasticsearch-6.0.0]$ curl -XDELETE "http://192.168.0.103:9200/index_test*?pretty=true"
Create the index
[elon@icloud-store elasticsearch-6.0.0]$ curl -XPUT "http://192.168.0.103:9200/index_test?pretty=true" -H 'Content-Type: application/json' -d'
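The settings body is cut off in the command above. One way to combine the two plugins is a custom analyzer that tokenizes with IK and then passes each token through a pinyin token filter. The sketch below assumes that approach; only the analyzer name ik_pinyin_analyzer comes from the commands in this section, while the filter name my_pinyin, the option values, and the helper names are hypothetical.

```shell
# Assumption: same server address as in the commands above.
ES_URL="${ES_URL:-http://192.168.0.103:9200}"

# Print illustrative settings: IK tokenizer followed by a pinyin token filter.
# keep_original keeps the Chinese tokens next to their pinyin forms.
ik_pinyin_settings_body() {
  cat <<'JSON'
{
  "settings": {
    "analysis": {
      "analyzer": {
        "ik_pinyin_analyzer": {
          "type": "custom",
          "tokenizer": "ik_max_word",
          "filter": ["my_pinyin"]
        }
      },
      "filter": {
        "my_pinyin": {
          "type": "pinyin",
          "keep_first_letter": true,
          "keep_full_pinyin": true,
          "keep_original": true,
          "lowercase": true
        }
      }
    }
  }
}
JSON
}

# Create the index with these settings; the body is read from stdin.
create_ik_pinyin_index() {
  ik_pinyin_settings_body | curl -s -XPUT "$ES_URL/index_test?pretty=true" \
    -H 'Content-Type: application/json' \
    --data-binary @-
}
```

Keeping the original tokens is what would let a single field answer both Chinese and pinyin queries; dropping keep_original would leave only the pinyin forms in the index.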
Test the analysis result
[elon@icloud-store elasticsearch-6.0.0]$ curl "http://192.168.0.103:9200/index_test/_analyze?pretty=true" -H 'Content-Type: application/json' -d'{ "analyzer":"ik_pinyin_analyzer", "text":"元旦节是公历新一年的第一天."}'
Create the type mapping
[elon@icloud-store elasticsearch-6.0.0]$ curl "http://192.168.0.103:9200/index_test/article/_mapping?pretty=true" -H 'Content-Type: application/json' -d'
Index a document
[elon@icloud-store elasticsearch-6.0.0]$ curl -XPOST "http://192.168.0.103:9200/index_test/article/1?pretty=true" -H 'Content-Type: application/json' -d'{"name":"元旦火车票 开售."}'
Get a document (by ID)
[elon@icloud-store elasticsearch-6.0.0]$ curl -XGET "http://192.168.0.103:9200/index_test/article/2?pretty=true"
Search (Chinese)
[elon@icloud-store elasticsearch-6.0.0]$ curl -XPOST "http://192.168.0.103:9200/index_test/article/_search?pretty" -H 'Content-Type: application/json' -d'
Search (pinyin)
[elon@icloud-store elasticsearch-6.0.0]$ curl -XPOST "http://192.168.0.103:9200/index_test/article/_search?pretty" -H 'Content-Type: application/json' -d'
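Both search bodies are cut off in the commands above. If the name field of the article type is mapped with ik_pinyin_analyzer, which is an assumption here, the same match query could serve Chinese and pinyin keywords alike. A minimal sketch, with hypothetical helper names match_body and search_article:

```shell
# Assumption: same server address as in the commands above.
ES_URL="${ES_URL:-http://192.168.0.103:9200}"

# Build a plain match query on the name field.
# No JSON escaping is done, so the keyword must not contain quotes or backslashes.
match_body() {
  printf '{"query":{"match":{"name":"%s"}}}' "$1"
}

# Search the article type with the given keyword (Chinese or pinyin).
search_article() {
  curl -s -XPOST "$ES_URL/index_test/article/_search?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(match_body "$1")"
}
```

Calling search_article '火车票' and then search_article 'huochepiao' would show whether both forms reach the same document; the outcome depends on the actual mapping and analyzer options.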