
The difference between term and match in Elasticsearch

2024-07-18 02:34 | Source: compiled from the web | Views: 265

term and match: a summary

In real projects, term and match are the two most commonly used queries, yet it is easy to be unsure how they differ. This post summarizes the difference.

term usage

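Start with the definition: term means exact matching, that is, a precise query. The search text is not analyzed (split into tokens) before searching; it is compared as-is against the terms stored in the index.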

An example makes this concrete. First, store some data:

{ "title": "love China", "content": "people very love China", "tags": ["China", "love"] }
{ "title": "love HuBei", "content": "people very love HuBei", "tags": ["HuBei", "love"] }

Now run a term query:

{
  "query": {
    "term": { "title": "love" }
  }
}

Both documents above are returned:

{
  "took": 1,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 2,
    "max_score": 0.6931472,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "8",
        "_score": 0.6931472,
        "_source": { "title": "love HuBei", "content": "people very love HuBei", "tags": ["HuBei", "love"] }
      },
      {
        "_index": "test",
        "_type": "doc",
        "_id": "7",
        "_score": 0.6931472,
        "_source": { "title": "love China", "content": "people very love China", "tags": ["China", "love"] }
      }
    ]
  }
}

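Every document whose title contains the keyword love is returned. But suppose I only want an exact match on love China. Let's see whether the following query finds it: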

{
  "query": {
    "term": { "title": "love China" }
  }
}

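It returns no data. Conceptually, term is an exact match and looks up a single term: the whole string love China is treated as one term, and no such term exists in the index because the title was split into separate words when it was stored. How can term match several words? Use terms: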

{
  "query": {
    "terms": { "title": ["love", "China"] }
  }
}

The result is:

{
  "took": 1,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 2,
    "max_score": 0.6931472,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "8",
        "_score": 0.6931472,
        "_source": { "title": "love HuBei", "content": "people very love HuBei", "tags": ["HuBei", "love"] }
      },
      {
        "_index": "test",
        "_type": "doc",
        "_id": "7",
        "_score": 0.6931472,
        "_source": { "title": "love China", "content": "people very love China", "tags": ["China", "love"] }
      }
    ]
  }
}

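Everything is returned. Why? Because the values inside the [ ] of terms are combined with OR: matching any one of them is enough. To require both words at the same time, combine term queries under bool with must, as follows: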

{
  "query": {
    "bool": {
      "must": [
        { "term": { "title": "love" } },
        { "term": { "title": "china" } }
      ]
    }
  }
}
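The difference between terms (OR) and bool with must (AND) can be pictured as set union versus set intersection over an inverted index. The following is only a minimal sketch of that idea, not how Elasticsearch is implemented: TITLE_INDEX and the helper functions are hypothetical names, and the index contents are an assumption about how the two example titles were stored.

```python
# Hypothetical inverted index for the "title" field of the two example
# documents (ids "7" and "8"); the stored tokens are assumed to be lowercase.
TITLE_INDEX = {
    "love": {"7", "8"},
    "china": {"7"},
    "hubei": {"8"},
}


def term(value):
    """Exact lookup of a single term; the value is used as-is."""
    return set(TITLE_INDEX.get(value, set()))


def terms(values):
    """OR semantics: a document matches if it contains any of the values."""
    result = set()
    for value in values:
        result |= term(value)
    return result


def bool_must(clauses):
    """AND semantics: a document must appear in every clause's result set."""
    result = None
    for clause in clauses:
        result = clause if result is None else result & clause
    return result or set()


print(sorted(terms(["love", "China"])))                    # ['7', '8']
print(sorted(bool_must([term("love"), term("china")])))    # ['7']
```

In this sketch, "China" contributes nothing to the terms result, but "love" alone is enough to return both documents, which mirrors the result shown above.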

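Note that china above is lowercase. Searching with the capitalized China returns nothing. Why? The title field was analyzed when it was stored, and here the default analyzer did the analysis. Let's look at what the analyzer produces: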

Analyzer

GET test/_analyze
{
  "text": "love China"
}

The result is:

{
  "tokens": [
    { "token": "love", "start_offset": 0, "end_offset": 4, "type": "<ALPHANUM>", "position": 0 },
    { "token": "china", "start_offset": 5, "end_offset": 10, "type": "<ALPHANUM>", "position": 1 }
  ]
}

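The analysis produces two terms, love and china. A term query matches those terms exactly as they are stored, with no modification, so a query for China fails. A later section covers analyzers in detail.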
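The analysis step above can be approximated in a few lines of Python. This is a rough sketch under stated assumptions: toy_standard_analyze is a hypothetical helper that splits on word characters and lowercases, which is only an approximation of the default analyzer, and it omits the "type" field of the real response.

```python
import re


def toy_standard_analyze(text):
    """Approximate the _analyze output: lowercase tokens with offsets and positions."""
    tokens = []
    for position, match in enumerate(re.finditer(r"\w+", text)):
        tokens.append({
            "token": match.group().lower(),
            "start_offset": match.start(),
            "end_offset": match.end(),
            "position": position,
        })
    return tokens


tokens = toy_standard_analyze("love China")

# The set of terms that would be stored for this title.
indexed_terms = {t["token"] for t in tokens}

# A term query compares its value to the stored terms without changing it.
print("love" in indexed_terms)        # True
print("china" in indexed_terms)       # True
print("China" in indexed_terms)       # False: stored term is lowercase
print("love China" in indexed_terms)  # False: no single term holds both words
```

The last two lookups correspond to the two term queries that returned nothing earlier: one fails on letter case, the other because the whole string is treated as a single term.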

match usage

First, match against love China.

GET test/doc/_search
{
  "query": {
    "match": { "title": "love China" }
  }
}

The result is:

{
  "took": 1,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 2,
    "max_score": 1.3862944,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "7",
        "_score": 1.3862944,
        "_source": { "title": "love China", "content": "people very love China", "tags": ["China", "love"] }
      },
      {
        "_index": "test",
        "_type": "doc",
        "_id": "8",
        "_score": 0.6931472,
        "_source": { "title": "love HuBei", "content": "people very love HuBei", "tags": ["HuBei", "love"] }
      }
    ]
  }
}

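Both documents are returned. Why? Because match first analyzes the search text into terms and only then matches them. The title terms of the two documents above are love, china and hubei. The search text love China is analyzed into love and china, and the terms are combined with OR, so a document matches as long as it contains any one of them. What if love and China must both match? Use match_phrase.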
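The behaviour described above (analyze the search text first, then OR the resulting terms) can be sketched as follows. This is an illustration only: DOCS, toy_analyze and match_query are hypothetical names, the analyzer is a simple lowercase-and-split stand-in, and the "score" is just the number of matched terms, not the relevance score Elasticsearch computes.

```python
# Hypothetical stand-in for the titles of the two example documents.
DOCS = {"7": "love China", "8": "love HuBei"}


def toy_analyze(text):
    """Rough stand-in for the default analyzer: lowercase, split on whitespace."""
    return text.lower().split()


def match_query(docs, query_text):
    """Analyze the query text, then return (id, matched term count) for every
    document containing at least one query term, most matches first."""
    query_terms = set(toy_analyze(query_text))
    hits = []
    for doc_id, title in docs.items():
        matched = query_terms & set(toy_analyze(title))
        if matched:
            hits.append((doc_id, len(matched)))
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


print(match_query(DOCS, "love China"))  # [('7', 2), ('8', 1)]
```

Document 7 matches both terms and document 8 only one, which is consistent with document 7 ranking first in the real response.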

match_phrase usage

match_phrase is phrase search: all analyzed terms must appear in the document, and they must be adjacent and in the same order as in the query.

GET test/doc/_search
{
  "query": {
    "match_phrase": { "title": "love china" }
  }
}

The result is:

{
  "took": 5,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 1,
    "max_score": 1.3862944,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "7",
        "_score": 1.3862944,
        "_source": { "title": "love China", "content": "people very love China", "tags": ["China", "love"] }
      }
    ]
  }
}

This time the result fits our need: only one record is returned.
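The phrase requirement (all terms present, adjacent, and in order) can be sketched as a sliding-window comparison over the analyzed tokens. As before, this is a simplified illustration under assumptions: DOCS, toy_analyze and match_phrase are hypothetical names, and the analyzer is a lowercase-and-split stand-in for the default one.

```python
# Hypothetical stand-in for the titles of the two example documents.
DOCS = {"7": "love China", "8": "love HuBei"}


def toy_analyze(text):
    """Rough stand-in for the default analyzer: lowercase, split on whitespace."""
    return text.lower().split()


def match_phrase(docs, phrase):
    """Return ids of documents whose title contains the analyzed phrase
    tokens consecutively and in the same order."""
    wanted = toy_analyze(phrase)
    if not wanted:
        return []
    hits = []
    for doc_id, title in docs.items():
        tokens = toy_analyze(title)
        # Slide a window of len(wanted) tokens across the title.
        for start in range(len(tokens) - len(wanted) + 1):
            if tokens[start:start + len(wanted)] == wanted:
                hits.append(doc_id)
                break
    return hits


print(match_phrase(DOCS, "love china"))  # ['7']
print(match_phrase(DOCS, "china love"))  # []
```

The second call shows why order matters: both words occur in document 7, but not in the order the phrase asks for.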


