Elastic Stack Workshop | 刘征 Developer Advocate @ Elastic

Version 0.2.5

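Hands-on tutorials for the three solution areas (Search, Observability, and Security), with learning materials of every level and type. You are welcome to learn and contribute together.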

Elasticsearch Core Concepts Review

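Preparation

Prepare three Linux virtual machines and upload the Elasticsearch 8 installation package to the home directory of a non-root user on each of them. Upload the Kibana 8 installation package to one of the virtual machines.

Refer to this tutorial: Running Elasticsearch 8.1.0 on Linux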

Index Management

core concepts

Creating, deleting, and viewing indices

PUT blog

DELETE blog

# Fails: index names must be lowercase
PUT Blog

PUT blog

GET blog
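
The `PUT Blog` request above is rejected because index names must be lowercase. As a rough illustration, the following Python sketch checks a candidate name against the naming rules documented for Elasticsearch. The function name `is_valid_index_name` is hypothetical, and this is a client-side approximation, not the server's own validation.

```python
# Characters that Elasticsearch does not allow in an index name.
FORBIDDEN_CHARS = set('\\/*?"<>| ,#:')


def is_valid_index_name(name: str) -> bool:
    """Approximate client-side check of Elasticsearch index naming rules."""
    if not name or name in (".", ".."):
        return False
    if name != name.lower():           # must be lowercase
        return False
    if name[0] in "-_+":               # must not start with -, _ or +
        return False
    if any(ch in FORBIDDEN_CHARS for ch in name):
        return False
    return len(name.encode("utf-8")) <= 255   # at most 255 bytes


print(is_valid_index_name("blog"))   # True
print(is_valid_index_name("Blog"))   # False
```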

Viewing and updating index settings

GET blog/_settings

PUT blog/_settings
{
  "number_of_replicas": 2
}

GET blog/_settings

# Fails: number_of_shards is a static setting and cannot be changed on an existing index
PUT blog/_settings
{
  "number_of_shards": 2
}

GET blog/_settings
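
The two updates above behave differently: `number_of_replicas` is a dynamic setting that can be changed on a live index, while `number_of_shards` is static and fixed when the index is created. A minimal sketch of that distinction, assuming a hypothetical helper `split_settings` and a deliberately incomplete list of static settings:

```python
# Intentionally incomplete: only the static setting used in this section.
STATIC_SETTINGS = {"number_of_shards"}


def split_settings(requested: dict):
    """Split a settings update into (dynamic, static) parts."""
    dynamic = {k: v for k, v in requested.items() if k not in STATIC_SETTINGS}
    static = {k: v for k, v in requested.items() if k in STATIC_SETTINGS}
    return dynamic, static


dynamic, static = split_settings({"number_of_replicas": 2, "number_of_shards": 2})
print(dynamic)  # {'number_of_replicas': 2}
print(static)   # {'number_of_shards': 2}
```

Anything that ends up in the static part has to be chosen when the index is created.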

Creating an index with the desired settings

DELETE blog

PUT blog
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 2
  }
}

GET blog
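
With 3 primary shards and 2 replicas of each, the cluster has to place 3 × (1 + 2) = 9 shard copies. A short sketch of that arithmetic, assuming the three-node cluster from the preparation step and an even spread across nodes (the function name is hypothetical):

```python
def total_shard_copies(primaries: int, replicas: int) -> int:
    """Each primary shard has `replicas` additional copies."""
    return primaries * (1 + replicas)


copies = total_shard_copies(primaries=3, replicas=2)
print(copies)        # 9
print(copies // 3)   # 3 shard copies per node on a three-node cluster
```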

Indexing documents into an index

POST blog/_doc/1
{
  "title": "Java虚拟机"
}

GET blog/_doc/1

PUT blog/_settings
{
  "blocks.write": true
}

GET blog/_settings

GET blog/_doc/1

# Rejected while the write block is enabled
POST blog/_doc/2
{
  "title": "AWS EC2 虚拟机"
}

PUT blog/_settings
{
  "blocks.write": false
}

POST blog/_doc/2
{
  "title": "AWS EC2 虚拟机"
}

GET blog/_doc/2

POST blog/_search

Opening and closing an index

POST blog/_search

POST blog/_doc/3
{
  "title": "Azure 虚拟机"
}

POST blog/_search

POST blog/_close

# Fails: a closed index cannot be searched
POST blog/_search

# Fails: a closed index does not accept writes
POST blog/_doc/4
{
  "title": "GCP 虚拟机"
}

POST blog/_open

POST blog/_doc/4
{
  "title": "GCP 虚拟机"
}

POST blog/_search

Copying an index

POST blog/_search

POST _reindex
{
  "source": {"index": "blog"},
  "dest": {"index": "blog_old_readonly"}
}

GET blog_old_readonly

GET blog_old_readonly/_search

DELETE blog_old_readonly

PUT blog_old_readonly
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  }
}

GET blog_old_readonly

POST _reindex
{
  "source": {"index": "blog"},
  "dest": {"index": "blog_old_readonly"}
}

POST blog_old_readonly/_search

Accessing indices through aliases

GET blog/_settings

GET blog/_search

GET blog_old_readonly/_search

DELETE blog

PUT blog_new
{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1
  }
}

GET blog_new

POST blog_new/_search

POST blog_new/_doc/5
{
  "title": "聊一聊云主机性价比哪家强"
}

POST blog_new/_search

POST blog_new/_doc/6
{
  "title": " 为什么要使用对象存储?"
}

POST blog_new/_search
{
  "explain": true
}

GET blog_new/

POST /_aliases
{
  "actions": [
    {
      "add": {
        "index": "blog_new",
        "alias": "blog_all"
      }
    }
  ]
}

GET blog_all/_search

POST /_aliases
{
  "actions": [
    {
      "add": {
        "index": "blog_old_readonly",
        "alias": "blog_all"
      }
    }
  ]
}

GET blog_all/_search
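
The two `_aliases` requests above could also be sent as a single request, since all actions in one `_aliases` call are applied atomically. A minimal sketch that builds such a request body in Python; `build_alias_actions` is a hypothetical helper, and the resulting body would still have to be sent to `POST /_aliases`:

```python
import json


def build_alias_actions(alias, add=(), remove=()):
    """Build the body for POST /_aliases with add and remove actions."""
    actions = [{"add": {"index": idx, "alias": alias}} for idx in add]
    actions += [{"remove": {"index": idx, "alias": alias}} for idx in remove]
    return {"actions": actions}


body = build_alias_actions("blog_all", add=["blog_new", "blog_old_readonly"])
print(json.dumps(body, indent=2))
```

Putting a `remove` and an `add` in the same body is the usual way to switch an alias from one index to another without a gap.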

GET blog_new/

GET blog_old_readonly

GET blog_all/_alias

GET blog_all/_search

GET _cat/nodes

GET _cat/shards

POST blog_new/_search
{
  "explain": true
}

POST blog_new/_search
{
  "explain": true,
  "query": {
    "match_all": {}
  }
}

Managing documents in an index

PUT blog_now
{
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "title": { "type": "text" },
      "content": { "type": "text" },
      "posttime": { "type": "date" }
    }
  },
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1
  }
}
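
With `"dynamic": "strict"`, Elasticsearch rejects any document that contains a field not declared in the mapping. The sketch below imitates that check on the client side for the `blog_now` fields. `unknown_fields` is a hypothetical helper, and it only looks at top-level keys.

```python
BLOG_NOW_FIELDS = {"title", "content", "posttime"}


def unknown_fields(doc: dict, allowed=BLOG_NOW_FIELDS) -> list:
    """Return the top-level fields that a strict mapping would reject."""
    return sorted(set(doc) - set(allowed))


print(unknown_fields({"title": "Java虚拟机", "posttime": "2022-09-18"}))  # []
print(unknown_fields({"title": "Java虚拟机", "author": "zheng"}))         # ['author']
```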

PUT movies
{
  "mappings": {
    "properties": {
      "movieid": { "type": "integer" },
      "title": { "type": "text" },
      "genres": { "type": "text" },
      "search_index": { "type": "text" }
    }
  },
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  }
}

curl -X DELETE "https://zheng.liu:[email protected]cp.cloud.es.io:9243/martin-stats?pretty"

curl -XPUT "https://zheng.liu:[email protected]cp.cloud.es.io:9243/martin-stats?pretty" -H "Content-Type: application/json" -d'
{
  "mappings": {
    "properties": {
      "weeks": { "type": "integer" },
      "workshop": {
        "properties": {
          "pv": { "type": "integer" },
          "uv": { "type": "integer" }
        }
      },
      "bilibili": {
        "properties": {
          "follower": { "type": "integer" },
          "arc_passed_total": { "type": "integer" },
          "inc_click": { "type": "integer" },
          "inc_dm": { "type": "integer" },
          "icn_reply": { "type": "integer" },
          "inc_fav": { "type": "integer" },
          "inc_coin": { "type": "integer" },
          "inc_share": { "type": "integer" },
          "inc_like": { "type": "integer" },
          "inc_elec": { "type": "integer" }
        }
      }
    }
  }
}'

curl -XPUT "https://zheng.liu:[email protected]cp.cloud.es.io:9243/martin-stats?pretty" -H "Content-Type: application/json" -d'
{
  "weeks": 38,
  "workshop.pv": 6306,
  "workshop.uv": 5602
}'

curl -XPOST "https://zheng.liu:[email protected]cp.cloud.es.io:9243/martin-stats/_doc/" -H "Content-Type: application/json" -d'
{
  "weeks": 38,
  "workshop.pv": 6306,
  "workshop.uv": 5602
}'

curl -XGET "https://zheng.liu:[email protected]cp.cloud.es.io:9243/martin-stats?pretty"

curl -XPOST "https://zheng.liu:[email protected]cp.cloud.es.io:9243/martin-stats/_doc?pretty" -H "Content-Type: application/json" -d'
{
  "weeks": 38,
  "arc_passed_total": 141,
  "follower": 2699,
  "icn_reply": 155,
  "inc_click": 56417,
  "inc_coin": 272,
  "inc_dm": 27,
  "inc_elec": 0,
  "inc_fav": 1506,
  "inc_like": 631,
  "inc_share": 171
}'

curl --insecure --user admin:admin -H "Content-Type: application/json" -XPOST "https://localhost:9200/movie/_bulk?pretty&refresh" --data-binary "@movies.json"
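
The `_bulk` endpoint expects newline-delimited JSON: one action line followed by one source line per document, with a newline at the very end. The content of `movies.json` is not shown here, so the sketch below only illustrates the general shape. The helper name and the sample documents are made up for illustration.

```python
import json


def to_bulk_ndjson(docs, index):
    """Serialize documents into the NDJSON body expected by the _bulk API."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))   # action line
        lines.append(json.dumps(doc, ensure_ascii=False))        # source line
    return "\n".join(lines) + "\n"   # the body must end with a newline


sample = [
    {"movieid": 1, "title": "Example Movie A", "genres": "Comedy"},
    {"movieid": 2, "title": "Example Movie B", "genres": "Action"},
]
print(to_bulk_ndjson(sample, "movies"), end="")
```

Sending the file with `--data-binary` rather than `-d` matters because `-d` strips newlines, which would break this format.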

curl -XGET "https://zheng.liu:[email protected]cp.cloud.es.io:9243/martin-stats/_search?pretty"

./opensearch -Ecluster.name=opensearch-cluster -Enode.name=opensearch-node1 -Ehttp.host=0.0.0.0 -Ediscovery.type=single-node

import json
from time import time

import ipywidgets as widgets
import requests
import requests.packages.urllib3
from IPython.display import clear_output, Markdown, display

# Suppress the warnings triggered by verify=False on a self-signed certificate
requests.packages.urllib3.disable_warnings()

Define the access information for the local OpenSearch server

aws_endpoint = 'https://192.168.2.18:9200'
aws_username = 'admin'
aws_password = 'admin'

Build the prefix search request

def search_prefix(keywords):
    """Run a prefix search against the movies index and return the JSON response."""
    if not keywords:
        return None
    keywords = keywords.split(' ')
    if len(keywords) > 1:
        # The last word is treated as a prefix; the preceding words must all match
        prefix = keywords[-1]
        phrase = ' '.join(keywords[:-1])
        query = {
            'query': {
                'bool': {
                    'must': [
                        {'prefix': {'search_index': prefix}},
                        {'match': {'search_index': {
                            'query': phrase,
                            'minimum_should_match': '100%'
                        }}}
                    ]
                }
            }
        }
    else:
        # A single word is searched as a prefix only
        keywords = ' '.join(keywords)
        query = {'query': {'prefix': {'search_index': keywords}}}

    query = json.dumps(query)

    url = aws_endpoint + '/movies/_search'
    r = requests.post(url,
                      auth=(aws_username, aws_password),
                      data=query,
                      headers={'Content-type': 'application/json'},
                      verify=False)
    return r.json()
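
To see what `search_prefix` sends for a given input without contacting a server, the query-building part can be isolated. The sketch below mirrors the logic above in a hypothetical helper `build_prefix_query`: the last word becomes a `prefix` clause, and the preceding words become a `match` clause in which every word must match. Unlike the original, it splits on any whitespace, so repeated spaces do not produce empty terms.

```python
def build_prefix_query(keywords: str):
    """Build the search body: last word as prefix, earlier words as a full match."""
    words = keywords.split()
    if not words:
        return None
    if len(words) == 1:
        return {"query": {"prefix": {"search_index": words[0]}}}
    prefix, phrase = words[-1], " ".join(words[:-1])
    return {
        "query": {
            "bool": {
                "must": [
                    {"prefix": {"search_index": prefix}},
                    {"match": {"search_index": {
                        "query": phrase,
                        "minimum_should_match": "100%",
                    }}},
                ]
            }
        }
    }


print(build_prefix_query("harry pot"))
```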

Bold the search terms in the results

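def bold(result, keywords):
    """Bold keywords in the result"""
    # The markup around {keyword} was lost in the source; HTML <b> tags are assumed here.
    # split() without arguments skips the empty strings produced by repeated spaces.
    for keyword in keywords.split():
        result = result.replace(keyword, f'<b>{keyword}</b>')
        titled = keyword.title()
        if titled != keyword:  # avoid wrapping the same text twice
            result = result.replace(titled, f'<b>{titled}</b>')
    return result


def printmd(string):
    """Wrapper function to display markdown"""
    display(Markdown(string))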

Send a new search request whenever the search term changes, imitating search-as-you-type

def text_change(change):
    """Widget event when the value changes"""
    global output
    with output:
        clear_output()
        t0 = time()
        results = search_prefix(change['new'])
        if results:
            hits = results['hits']['hits']
            print_text = []
            for res in hits[:10]:
                res = res['_source']
                print_text.append(bold(res['title'], change['new']))
            # The separator was lost in the source; an HTML line break is assumed here
            printmd('<br>'.join(print_text))
            if hits:
                print(f'\n{round(time() - t0, 4)} seconds')

Display an interactive search box

box = widgets.Text(
    value='',
    placeholder='Type something',
    description='Query:',
    disabled=False
)
output = widgets.Output()
display(box, output)
box.observe(text_change, names='value')

GET _search
{
  "query": {
    "match_all": {}
  }
}

GET _cat/indices

GET movies/_search

Return any document that matches

POST movies/_search
{
  "query": {
    "match": { "title": "harry" }
  }
}

Prefix search that follows the order of the input

POST movies/_search
{
  "query": {
    "match_phrase_prefix": { "title": "harry" }
  }
}

POST movies/_search
{
  "query": {
    "match_phrase_prefix": { "title": "harry potter" }
  }
}

POST movies/_search
{
  "query": {
    "match_phrase_prefix": { "title": "potter harry" }
  }
}
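
`match_phrase_prefix` requires the terms to appear in the order typed, with the last term treated as a prefix, which is why `harry potter` and `potter harry` give different results. The following is a simplified model of that behaviour on lowercase, whitespace-split tokens. It ignores analyzers, `slop`, and `max_expansions`, and the function name and sample title are only for illustration.

```python
def phrase_prefix_match(title: str, query: str) -> bool:
    """True if the query terms occur consecutively, with the last term as a prefix."""
    tokens = title.lower().split()
    terms = query.lower().split()
    if not terms:
        return False
    for start in range(len(tokens) - len(terms) + 1):
        window = tokens[start:start + len(terms)]
        if window[:-1] == terms[:-1] and window[-1].startswith(terms[-1]):
            return True
    return False


title = "Harry Potter and the Chamber of Secrets"
print(phrase_prefix_match(title, "harry pot"))     # True
print(phrase_prefix_match(title, "potter harry"))  # False
```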

Use the last word typed as the prefix; the other words must also match, regardless of input order

POST movies/_search
{
  "query": {
    "bool": {
      "must": [
        { "prefix": { "title": { "value": "harry" } } },
        { "match": { "title": { "query": "potter", "minimum_should_match": 1 } } }
      ]
    }
  }
}

POST movies/_search
{
  "query": {
    "bool": {
      "must": [
        { "prefix": { "title": { "value": "potter" } } },
        { "match": { "title": { "query": "harry", "minimum_should_match": 1 } } }
      ]
    }
  }
}

To let other fields take part in the search, a shortcut is to search a combined field

POST movies/_search
{
  "query": {
    "bool": {
      "must": [
        { "prefix": { "search_index": { "value": "potter" } } },
        { "match": { "search_index": { "query": "adventure", "minimum_should_match": 1 } } }
      ]
    }
  }
}

POST movies/_search
{
  "query": {
    "bool": {
      "must": [
        { "prefix": { "search_index": { "value": "harry" } } },
        { "match": { "search_index": { "query": "adventure", "minimum_should_match": 1 } } }
      ]
    }
  }
}

POST movies/_search
{
  "query": {
    "bool": {
      "must": [
        { "prefix": { "search_index": { "value": "harry" } } },
        { "match": { "search_index": { "query": "comedy", "minimum_should_match": 1 } } }
      ]
    }
  }
}

POST movies/_search
{
  "query": {
    "bool": {
      "must": [
        { "prefix": { "search_index": { "value": "2000" } } },
        { "match": { "search_index": { "query": "comedy", "minimum_should_match": 1 } } }
      ]
    }
  }
}

POST movies/_search
{
  "query": {
    "bool": {
      "must": [
        { "prefix": { "search_index": { "value": "2000" } } },
        { "match": { "search_index": { "query": "action", "minimum_should_match": 1 } } }
      ]
    }
  }
}

Last updated on 2022-09-18
Published on 2022-09-18