Field types: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html
[TOC]
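When a document is written to an index that does not exist, Elasticsearch creates the index automatically.
With Dynamic Mapping there is no need to define mappings by hand: Elasticsearch infers each field's type from the document content.
The inference is sometimes wrong, and a wrongly typed field stops some features from working (for example, range queries on a number that was inferred as text).
Changing a mapping falls into two cases:
Changing the type of an existing field: this cannot be done in place. Create a new index with the correct mapping and rebuild it with the Reindex API.
The reason is that changing a field's data type would leave the data that has already been indexed unsearchable.
Adding a new field has no such impact and can be done directly: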
PUT /mytest_user/_mapping
{
"properties": {
"add_new2": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
}
}
}
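// Set dynamic to false (new fields are not indexed, though they stay in _source), strict (documents with unknown fields are rejected with an error), or true (new fields are added automatically, possibly with a type we did not want)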
PUT dynamic_mapping_test/_mapping
{
"dynamic":false
}
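The three `dynamic` settings can be compared with a toy model. This is a minimal sketch in plain Python, not Elasticsearch code: the function name `apply_dynamic` and the use of Python type names as stand-ins for inferred field types are assumptions made for illustration.

```python
def apply_dynamic(mapping, doc, dynamic):
    """Toy model of the `dynamic` mapping setting.

    Returns (updated_mapping, indexed_fields). `mapping` maps field names
    to type names; `dynamic` is "true", "false" or "strict".
    """
    known = dict(mapping)
    indexed = {}
    for field, value in doc.items():
        if field in known:
            indexed[field] = value
        elif dynamic == "true":
            # Stand-in for type inference: record the Python type name.
            known[field] = type(value).__name__
            indexed[field] = value
        elif dynamic == "false":
            # The value stays in _source but is not indexed or searchable.
            continue
        elif dynamic == "strict":
            raise ValueError(f"unknown field not allowed: {field}")
        else:
            raise ValueError(f"unsupported dynamic setting: {dynamic}")
    return known, indexed


mapping = {"name": "keyword"}
doc = {"name": "a", "age": 3}
print(apply_dynamic(mapping, doc, "true"))
print(apply_dynamic(mapping, doc, "false"))
```

With `"strict"` the same call raises instead of returning, which mirrors the rejected write.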
https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html
PUT /my_indexxxx
{
"mappings": {
"properties": {
"city": {
"type": "text"
}
}
}
}
PUT /my_indexxxx/_mapping
{
"properties": {
"city": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
}
}
}
The city field can be used for full text search.
The city.raw field can be used for sorting and aggregations.
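A search body that uses both sub-fields might look like the sketch below. It only builds the JSON with the standard library; the helper name `city_search` and the aggregation name `cities` are hypothetical, and the body would be sent to `GET /my_indexxxx/_search`.

```python
import json


def city_search(text):
    """Build a search body: full-text match on `city`, sort and aggregate on `city.raw`."""
    return {
        # Analyzed text field: used for the full-text query.
        "query": {"match": {"city": text}},
        # keyword sub-field: used for sorting ...
        "sort": [{"city.raw": "asc"}],
        # ... and for a terms aggregation ("cities" is an arbitrary name).
        "aggs": {"cities": {"terms": {"field": "city.raw"}}},
    }


print(json.dumps(city_search("york"), indent=2))
```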
PUT /my_index
{
"mappings": {
"properties": {
"user_identifier": {
"type": "keyword"
}
}
}
}
PUT /my_index/_mapping
{
"properties": {
"user_id": {
"type": "alias",
"path": "user_identifier"
}
}
}
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-date-format.html
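By default a date field accepts two formats:
strict_date_optional_time: the ISO 8601 format, for example 2018-08-31T14:56:18.000+08:00
epoch_millis: milliseconds since the epoch, for example 1515150699465. A 10-digit value such as 1515150699 is in seconds; under epoch_millis it is read as milliseconds and lands in January 1970, so add epoch_second to the format if such values are expected.
In testing, only "yyyy-MM-dd", "yyyyMMdd", "yyyyMMddHHmmss", "yyyy-MM-ddTHH:mm:ss", "yyyy-MM-ddTHH:mm:ss.SSS" and "yyyy-MM-ddTHH:mm:ss.SSSZ" were accepted. The common "yyyy-MM-dd HH:mm:ss" was not accepted unless it is declared in the field's format, as in the add_new2 mapping above.
Note that "T" and "Z" are literal characters. To produce "yyyy-MM-ddTHH:mm:ss", "yyyy-MM-ddTHH:mm:ss.SSS" or "yyyy-MM-ddTHH:mm:ss.SSSZ" strings, these patterns cannot be handed to a date formatter as written: either quote the literals (in Java, yyyy-MM-dd'T'HH:mm:ss) or format the date and time parts separately and concatenate them.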
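As an illustration of producing values in the two default formats, here is a minimal Python sketch using only the standard library. The helper names `to_iso8601` and `to_epoch_millis` are hypothetical; the sample timestamp is the one from the note above.

```python
from datetime import datetime, timedelta, timezone


def to_iso8601(dt):
    """ISO 8601 with milliseconds and UTC offset (fits strict_date_optional_time)."""
    return dt.isoformat(timespec="milliseconds")


def to_epoch_millis(dt):
    """Milliseconds since 1970-01-01T00:00:00Z (fits epoch_millis)."""
    return int(dt.timestamp() * 1000)


# A timezone-aware timestamp at UTC+8.
dt = datetime(2018, 8, 31, 14, 56, 18, tzinfo=timezone(timedelta(hours=8)))

print(to_iso8601(dt))
print(to_epoch_millis(dt))
# In strftime the "T" needs no quoting because it is not a format directive.
print(dt.strftime("%Y-%m-%dT%H:%M:%S"))
```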
Full sync: curl 127.0.0.1:8081/etl/es7/xx_index.yml -X POST -d "params=2018-10-21 00:00:00"
select a.LogId, a.OrderType,a.CompchterId,a.WarehouseCode,a.xx_name,a.BarCode,a.GoodsName,a.CreateTime from xx_index a
PUT /goods-stock-log
{
"aliases" : {
"xx_index" : {}
},
"settings" : {
"number_of_shards" : 5,
"number_of_replicas" : 1
},
"mappings":{
"properties":{
"LogId": {
"type": "keyword"
},
"OrderType": {
"type": "long"
},
"CompchterId": {
"type": "long"
},
"WarehouseCode": {
"type": "keyword"
},
"xx_name": {
"type": "keyword"
},
"BarCode": {
"type": "keyword"
},
"GoodsName": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_smart"
},
"CreateTime": {
"type": "date"
}
}
}
}
merchant SecurityCharacter StoreBasic CompanyBasic
One-to-one, many-to-one
Here the join can only be a left outer join, and the first table must be the main table!
SecurityUser SecurityVita SecurityRole
select a.UserId,a.RoleId, a.CharacterId,a.LoginId,b.TrueName,b.UniteNote,c.RoleName,a.CreateTime from SecurityUser a left join SecurityVita b on b.UserId = a.UserId left join SecurityRole c on a.RoleId = c.RoleId
Full sync: curl 127.0.0.1:8081/etl/es7/SecurityUser.yml -X POST
PUT /security-user
{
"aliases" : {
"SecurityUser" : {}
},
"settings" : {
"number_of_shards" : 1,
"number_of_replicas" : 1
},
"mappings":{
"properties":{
"UserId": {
"type": "long"
},
"CharacterId": {
"type": "long"
},
"LoginId": {
"type": "keyword"
},
"TrueName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 50
}
}
},
"UniteNote": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_smart"
},
"RoleName": {
"type": "keyword"
},
"CreateTime": {
"type": "date"
}
}
}
}
xxorderBasic xxorderItem JunkBasic
If a joined secondary table is a subquery, the subquery cannot contain more than one table.
Full sync:
curl 127.0.0.1:8081/etl/es7/xxorderBasic.yml -X POST
curl 127.0.0.1:8081/etl/es7/xxorderItem.yml -X POST
PUT /xx-order
{
"aliases" : {
"xxorder" : {}
},
"settings" : {
"number_of_shards" : 5,
"number_of_replicas" : 1
},
"mappings":{
"properties":{
"PrintId": {
"type": "long"
},
"CharacterId": {
"type": "long"
},
"PrintCode": {
"type": "keyword"
},
"MaterialTypes": {
"type": "text"
},
"CheckNote": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_smart"
},
"CreateTime": {
"type": "date"
},
"Item_JunkId": {
"type": "long"
},
"Item_PrintQty": {
"type": "long"
},
"basic_item":{
"type":"join",
"relations":{
"basic": ["item"]
}
}
}
}
}
Write a parent document
PUT /my-index/_doc/1?refresh
{
"basic-text": "This is a parent document.",
"basic_item": "basic"
}
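Write a child document (routing must be the parent ID so that the child lands on the same shard as its parent)
PUT /my-index/_doc/2?routing=1
{
"basic_item": {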
"name": "item",
"parent": "1"
},
"item-text": "This is a item document."
}
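The two writes can be generated from application code along these lines. This is a sketch only: `parent_request` and `child_request` are hypothetical helpers, the join field `basic_item` and the relation names `basic` / `item` come from the xx-order mapping above, and the sample field values are made up.

```python
def parent_request(index, doc_id, source, join_field="basic_item", parent_type="basic"):
    """Return (path, body) for indexing a parent document."""
    body = dict(source)
    body[join_field] = parent_type
    return f"/{index}/_doc/{doc_id}", body


def child_request(index, doc_id, parent_id, source, join_field="basic_item", child_type="item"):
    """Return (path, body) for indexing a child document.

    The routing value is the parent ID, so parent and child share a shard.
    """
    body = dict(source)
    body[join_field] = {"name": child_type, "parent": str(parent_id)}
    return f"/{index}/_doc/{doc_id}?routing={parent_id}", body


print(parent_request("xx-order", 53094, {"PrintId": 53094, "PrintCode": "P-001"}))
print(child_request("xx-order", "53094-1", 53094, {"Item_JunkId": 7, "Item_PrintQty": 2}))
```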
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html
POST /symbol/_close
POST /symbol/_open
PUT /goods-stock-log-v2
{
"settings" : {
"number_of_shards" : 5,
"number_of_replicas" : 1
},
"mappings":{
"properties":{
"LogId": {
"type": "keyword"
},
"OrderType": {
"type": "long"
},
"CompchterId": {
"type": "long"
},
"WarehouseCode": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 50
}
}
},
"xx_name": {
"type": "keyword"
},
"BarCode": {
"type": "keyword"
},
"GoodsName": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_smart"
},
"CreateTime": {
"type": "date"
}
}
}
}
POST /_aliases
{
"actions": [
{ "remove": {
"alias": "xx_index",
"index": "goods-stock-log"
}},
{ "add": {
"alias": "xx_index",
"index": "goods-stock-log-v2"
}}
]
}
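If the new index already contains data and conflicts are possible, set "version_type": "internal" on dest, or leave it unset. Elasticsearch then dumps the documents into the destination unconditionally, overwriting any document that has the same ID: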
POST _reindex
{
"source": {
"index": "old_index"
},
"dest": {
"index": "new_index",
"version_type": "internal"
}
}
POST /_reindex
{
"source": {"index": "goods-stock-log"},
"dest": {"index": "goods-stock-log-v2"}
}
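The requests in this section can be put into one ordered plan. The sketch below assumes one common ordering (create the new index, copy the data, then switch the alias in a single `_aliases` call so that readers of the alias never see an empty index); the notes above do not state an order, and `reindex_plan` is a hypothetical helper that only builds the requests.

```python
import json


def reindex_plan(alias, old_index, new_index, new_index_body):
    """Return the (method, path, body) requests for a rebuild, in execution order."""
    return [
        # 1. Create the new index with the corrected settings and mappings.
        ("PUT", f"/{new_index}", new_index_body),
        # 2. Copy every document from the old index into the new one.
        ("POST", "/_reindex", {
            "source": {"index": old_index},
            "dest": {"index": new_index},
        }),
        # 3. Move the alias; both actions are applied atomically.
        ("POST", "/_aliases", {
            "actions": [
                {"remove": {"alias": alias, "index": old_index}},
                {"add": {"alias": alias, "index": new_index}},
            ]
        }),
    ]


plan = reindex_plan(
    "xx_index",
    "goods-stock-log",
    "goods-stock-log-v2",
    {"mappings": {"properties": {"LogId": {"type": "keyword"}}}},  # shortened for the example
)
for method, path, body in plan:
    print(method, path, json.dumps(body))
```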
https://www.elastic.co/guide/en/elasticsearch/reference/current/term-level-queries.html
GET /xx_index/_search
{
"query": { "match_all": {} }
}
GET /xx_index/_search
{
"query": { "match": { "xx_name": "半成品" } }
}
GET /xx_index/_search
{
"query": { "match_phrase": { "GoodsName": "半 成品" } }
}
GET /xx_index/_search
{
"query": { "match_phrase_prefix": { "GoodsName": {
"query": "johnnie walker bl",
"slop":5,
"max_expansions": 1
} } }
}
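This query behaves like match_phrase, except that it treats the last term of the query string as a prefix. In other words, the example above reads as:
johnnie
followed by walker
followed by a term that starts with bl
Other parameters: slop is the number of positions by which the terms may be out of place and still match (default 0); max_expansions caps how many terms the final prefix may expand to (default 50).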
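The matching rule described above can be sketched in a few lines of Python. This is a toy model, not how Elasticsearch implements it: it assumes whitespace tokenization, a slop of 0 and no max_expansions limit, and `phrase_prefix_match` is a hypothetical name.

```python
def phrase_prefix_match(doc_text, query):
    """True if the document contains the query terms in order and adjacent,
    with the last query term treated as a prefix (slop = 0)."""
    tokens = doc_text.lower().split()
    terms = query.lower().split()
    if not terms:
        return False
    *phrase, prefix = terms
    n = len(terms)
    for start in range(len(tokens) - n + 1):
        window = tokens[start:start + n]
        # All but the last term must match exactly; the last one by prefix.
        if window[:-1] == phrase and window[-1].startswith(prefix):
            return True
    return False


for text in ["Johnnie Walker Black Label", "Johnnie Walker Red Label", "Walker Johnnie Black"]:
    print(text, phrase_prefix_match(text, "johnnie walker bl"))
```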
1. term: exact match; the query value is not analyzed. For example, given the query:
{
"query":{
"term":{
"foo": "hello world"
}
}
}
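Only documents whose field holds the exact term “hello world” are returned. If the field was analyzed at index time, the original text “I say hello world” is stored as separate tokens, the single term “hello world” does not exist, and nothing is returned.
With “hello” as the query value, however, every document containing the token “hello” is returned. Analysis therefore has a large effect on this query.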
{ "foo":"I just said hello world" }
{ "foo":"Hello world" }
{ "foo":"World Hello" }
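2. match_phrase: the query is analyzed, and the resulting terms must appear in the document adjacent and in the same order. Given the three documents above, the query: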
{
"query": {
"match_phrase": {
"foo": "Hello World"
}
}
}
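returns the first two documents.
3. match: loose matching. The input is analyzed first and the resulting terms are searched; a document is returned as long as it contains any part of the query.
4. query_string: query-syntax search. Like match_phrase, the input is analyzed; unlike match_phrase, the terms do not have to appear in the document in the same order as in the query.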
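The difference between term, match and match_phrase can be reproduced with a toy analyzer. The sketch below is plain Python and only approximates the behaviour: it assumes an analyzer that lowercases and splits on non-word characters, and the function names are illustrative rather than any Elasticsearch API.

```python
import re


def analyze(text):
    """Toy analyzer: lowercase, then split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def term(doc, value):
    # term: the query value is NOT analyzed; it must equal one indexed token.
    return value in analyze(doc)


def match(doc, query):
    # match: the query is analyzed; sharing any one token is enough.
    tokens = set(analyze(doc))
    return any(t in tokens for t in analyze(query))


def match_phrase(doc, query):
    # match_phrase: all query tokens must appear adjacent and in order.
    tokens, wanted = analyze(doc), analyze(query)
    n = len(wanted)
    return n > 0 and any(tokens[i:i + n] == wanted for i in range(len(tokens) - n + 1))


docs = ["I just said hello world", "Hello world", "World Hello"]
cases = [
    ("term", term, "hello world"),
    ("term", term, "hello"),
    ("match", match, "Hello World"),
    ("match_phrase", match_phrase, "Hello World"),
]
for name, fn, query in cases:
    print(name, repr(query), [d for d in docs if fn(d, query)])
```

query_string is left out because its query syntax (operators, field names, wildcards) goes beyond what a toy matcher can show faithfully.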
GET /_search
{
"query": {
"ids" : {
"values" : ["1", "4", "100"]
}
}
}
GET _search
{
"query": {
"range" : {
"age" : {
"gte" : 10,
"lte" : 20,
"boost" : 2.0
}
}
}
}
GET /_search
{
"query": {
"term": {
"user": {
"value": "Kimchy",
"boost": 1.0
}
}
}
}
GET /_search
{
"query" : {
"terms" : {
"user" : ["kimchy", "elasticsearch"],
"boost" : 1.0
}
}
}
GET /job-candidates/_search
{
"query": {
"terms_set": {
"programming_languages": {
"terms": ["c++", "java", "php"],
"minimum_should_match_field": "required_matches"
}
}
}
}
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html
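prefix query: the inverted index has to be scanned for every term that starts with the prefix, rather than looked up for a single term.
The shorter the prefix, the more documents have to be processed and the worse the performance; search with as long a prefix as possible.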
GET /xx_index/_search
{
"query": { "prefix": { "xx_name": "半成" } }
}
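wildcard query: ? matches any single character
* matches zero or more characters
Performance is just as poor: the inverted index has to be scanned.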
GET /xx_index/_search
{
"query": { "wildcard": { "xx_name": "?成品" } }
}
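regexp query syntax:
[0-9]: a digit in the given range
[a-z]: a letter in the given range
.: any single character
+: the preceding expression may occur one or more times
Performance is just as poor: the inverted index has to be scanned.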
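No regexp request is shown above, so here is a hedged sketch of one. The body shape `{"query": {"regexp": {field: pattern}}}` is the regexp query; the field `BarCode`, the pattern and the sample terms are assumptions for illustration. The second helper uses Python's `re.fullmatch` to imitate, for the simple operators listed above only, how the pattern must match a whole term while every term is scanned.

```python
import json
import re


def regexp_query(field, pattern):
    """Build the search body for a regexp query on one field."""
    return {"query": {"regexp": {field: pattern}}}


def matching_terms(terms, pattern):
    """Scan every term and keep those the pattern matches in full.

    The scan over all terms is the reason this kind of query is slow.
    """
    compiled = re.compile(pattern)
    return [t for t in terms if compiled.fullmatch(t)]


pattern = "[A-Z][0-9]+"
print(json.dumps(regexp_query("BarCode", pattern)))
print(matching_terms(["A100", "B7", "100", "A100x"], pattern))
```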
GET /xx-order/_search
{
"query": { "match_all": {} },
"size": 20,
"from": 30
}
GET /xx-order/_search
{
"query": { "match_all": {} },
"sort": [
{ "PrintId": "asc" }
]
}
GET /xx-order/_search
{
"query": { "match": { "MaterialTypes": "27" } },
"sort": [
{ "PrintId": "asc" }
]
}
Find all item documents whose parent is PrintId=53094
GET /xx-order/_search
{
"query": {
"parent_id": {
"type": "item",
"id": "53094"
}
}
}
GET /xx-order/_search
{
"query": {
"has_child" : {
"type" : "item",
"query" : {
"match_all" : {}
},
"max_children": 10,
"min_children": 2,
"score_mode" : "min"
}
}
}
GET /xx-order/_search
{
"query": {
"has_child" : {
"type" : "item",
"query" : {
"match_all" : {}
},
"max_children": 10,
"min_children": 2,
"score_mode" : "min",
"inner_hits": {}
}
}
}
GET /xx-order/_search
{
"query": {
"has_parent" : {
"parent_type" : "basic",
"query" : {
"match_all" : {}
}
}
}
}
inner_hits can also be added inside has_parent to return the matching parent documents:
"inner_hits": {}
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-nested-query.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-multi-get.html
POST /_sql?format=json
{
"query": "SELECT * FROM xx_index WHERE xx_name LIKE '半成品'"
}
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update-by-query.html
Remove a field
POST /index/_update/1
{
"script" : "ctx._source.remove(\"State\")"
}
Update a field
POST /index/_update_by_query
{
"script": {
"source": "ctx._source['State']=ctx._source['Id']"
}
}
PUT /xxx
{
"aliases": {
"xxx": {}
},
"settings": {
"analysis": {
"analyzer": {
"rebuilt_cjk": {
"tokenizer": "standard",
"filter": ["cjk_bigram","ngram"]
}
}
}
},
"mappings": {
"properties": {
"ProductNote": {
"type": "text",
"analyzer" : "rebuilt_cjk"
}
}
}
}