Merge branch 'main' into refactoring/composite_collector
javanna committed Jun 30, 2023
2 parents 3830a7a + e4f75a3 commit 799dc4b
Showing 266 changed files with 4,195 additions and 1,321 deletions.
2 changes: 2 additions & 0 deletions .ci/bwcVersions
@@ -60,6 +60,7 @@ BWC_VERSION:
- "7.17.9"
- "7.17.10"
- "7.17.11"
- "7.17.12"
- "8.0.0"
- "8.0.1"
- "8.1.0"
@@ -90,5 +91,6 @@ BWC_VERSION:
- "8.8.0"
- "8.8.1"
- "8.8.2"
- "8.8.3"
- "8.9.0"
- "8.10.0"
4 changes: 2 additions & 2 deletions .ci/snapshotBwcVersions
@@ -1,5 +1,5 @@
BWC_VERSION:
- "7.17.11"
- "8.8.2"
- "7.17.12"
- "8.8.3"
- "8.9.0"
- "8.10.0"
@@ -150,3 +150,6 @@ org.elasticsearch.cluster.service.ClusterService#submitUnbatchedStateUpdateTask(
@defaultMessage Reacting to the published cluster state is an obstruction to batching cluster state tasks which leads to performance and stability bugs. Use the variants that accept a Runnable instead.
org.elasticsearch.cluster.ClusterStateTaskExecutor$TaskContext#success(java.util.function.Consumer)
org.elasticsearch.cluster.ClusterStateTaskExecutor$TaskContext#success(java.util.function.Consumer, org.elasticsearch.cluster.ClusterStateAckListener)

@defaultMessage ClusterState#transportVersions are for internal use only. Use ClusterState#getMinTransportVersion or a different version. See TransportVersion javadocs for more info.
org.elasticsearch.cluster.ClusterState#transportVersions()
5 changes: 5 additions & 0 deletions docs/changelog/93545.yaml
@@ -0,0 +1,5 @@
pr: 93545
summary: Improve error message when aggregation doesn't support counter field
area: Aggregations
type: enhancement
issues: []
25 changes: 25 additions & 0 deletions docs/changelog/96161.yaml
@@ -4,3 +4,28 @@ area: "Search"
type: enhancement
issues:
- 95541
highlight:
title: Better indexing and search performance under concurrent indexing and search
body: "When a query like a match phrase query or a terms query targets a constant keyword field we can skip\
query execution on shards where the query is rewritten to match no documents. We take advantage of index mappings\
including constant keyword fields and rewrite queries in such a way that, if a constant keyword field does not\
match the value defined in the index mapping, we rewrite the query to match no document. This will result in the\
shard level request to return immediately, before the query is executed on the data node and, as a result, skipping\
the shard completely. Here we leverage the ability to skip shards whenever possible to avoid unnecessary shard\
refreshes and improve query latency (by not doing any search-related I/O). Avoiding such unnecessary shard refreshes\
improves query latency since the search thread does not need to wait anymore for unnecessary shard refreshes. Shards\
not matching the query criteria will remain in a search-idle state and indexing throughput will not be negatively\
affected by a refresh. Before introducing this change a query hitting multiple shards, including those with no\
documents matching the search criteria (think about using index patterns or data streams with many backing indices),\
would potentially result in a \"shard refresh storm\" increasing query latency as a result of the search thread\
waiting on all shard refreshes to complete before being able to initiate and carry out the search operation.\
After introducing this change the search thread will just need to wait for refreshes to be completed on shards\
including relevant data. Note that execution of the shard pre-filter and the corresponding \"can match\" phase where\
rewriting happens, depends on the overall number of shards involved and on whether there is at least one of them\
returning a non-empty result (see 'pre_filter_shard_size' setting to understand how to control this behaviour).\
Elasticsearch does the rewrite operation on the data node in the so called \"can match\" phase, taking advantage of\
the fact that, at that moment, we can access index mappings and extract information about constant keyword fields\
and their values. This means we still \"fan-out\" search queries from the coordinator node to involved data nodes.\
Rewriting queries based on index mappings is not possible on the coordinator node because the coordinator node is\
missing index mapping information."
notable: true
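
As a rough sketch of the rewrite described in this changelog entry (the index name `logs-eu`, the field `region`, and the query value are illustrative assumptions, not taken from this change), a `constant_keyword` mapping combined with a `term` query on a non-matching value is the kind of request that can be rewritten to match no documents during the "can match" phase:

[source,console]
--------------------------------------------------
PUT logs-eu
{
  "mappings": {
    "properties": {
      "region": {
        "type": "constant_keyword",
        "value": "eu"
      }
    }
  }
}

GET logs-*/_search
{
  "query": {
    "term": { "region": "us" }
  }
}
--------------------------------------------------

On the shards of `logs-eu` the `term` query on `region` can never match `us`, so the request is rewritten to match no documents and the shard is skipped without waiting for a refresh; only shards whose constant keyword value matches take part in the search.
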
5 changes: 0 additions & 5 deletions docs/changelog/96243.yaml

This file was deleted.

5 changes: 0 additions & 5 deletions docs/changelog/96251.yaml

This file was deleted.

6 changes: 0 additions & 6 deletions docs/changelog/96540.yaml

This file was deleted.

5 changes: 0 additions & 5 deletions docs/changelog/96551.yaml

This file was deleted.

6 changes: 0 additions & 6 deletions docs/changelog/96606.yaml

This file was deleted.

6 changes: 0 additions & 6 deletions docs/changelog/96668.yaml

This file was deleted.

5 changes: 0 additions & 5 deletions docs/changelog/96738.yaml

This file was deleted.

5 changes: 0 additions & 5 deletions docs/changelog/96782.yaml

This file was deleted.

6 changes: 0 additions & 6 deletions docs/changelog/96785.yaml

This file was deleted.

6 changes: 0 additions & 6 deletions docs/changelog/96821.yaml

This file was deleted.

6 changes: 0 additions & 6 deletions docs/changelog/96843.yaml

This file was deleted.

5 changes: 5 additions & 0 deletions docs/changelog/97041.yaml
@@ -0,0 +1,5 @@
pr: 97041
summary: Introduce downsampling configuration for data stream lifecycle
area: Data streams
type: feature
issues: []
5 changes: 5 additions & 0 deletions docs/changelog/97159.yaml
@@ -0,0 +1,5 @@
pr: 97159
summary: Improve exists query rewrite
area: Search
type: enhancement
issues: []
5 changes: 5 additions & 0 deletions docs/changelog/97203.yaml
@@ -0,0 +1,5 @@
pr: 97203
summary: Fix possible NPE when transportversion is null in `MainResponse`
area: Infra/REST API
type: bug
issues: []
5 changes: 5 additions & 0 deletions docs/changelog/97224.yaml
@@ -0,0 +1,5 @@
pr: 97224
summary: Remove exception wrapping in `BatchedRerouteService`
area: Allocation
type: bug
issues: []
5 changes: 5 additions & 0 deletions docs/changelog/97234.yaml
@@ -0,0 +1,5 @@
pr: 97234
summary: Add "operator" field to authenticate response
area: Authorization
type: enhancement
issues: []
2 changes: 1 addition & 1 deletion docs/plugins/analysis-nori.asciidoc
@@ -305,7 +305,7 @@ Which responds with:

The `nori_part_of_speech` token filter removes tokens that match a set of
part-of-speech tags. The list of supported tags and their meanings can be found here:
{lucene-core-javadoc}/../analyzers-nori/org/apache/lucene/analysis/ko/POS.Tag.html[Part of speech tags]
{lucene-core-javadoc}/../analysis/nori/org/apache/lucene/analysis/ko/POS.Tag.html[Part of speech tags]

It accepts the following setting:

23 changes: 19 additions & 4 deletions docs/reference/data-management.asciidoc
@@ -20,17 +20,32 @@ so you can move it to less expensive, less performant hardware.
For your oldest data, what matters is that you have access to the data.
It's ok if queries take longer to complete.

To help you manage your data, {es} enables you to:
To help you manage your data, {es} offers you:

* <<index-lifecycle-management, {ilm-cap}>> ({ilm-init}) to manage both indices and data streams and it is fully customisable, and
* <<data-stream-lifecycle, Data stream lifecycle>> which is the built-in lifecycle of data streams and addresses the most
common lifecycle management needs.

preview::["The built-in data stream lifecycle is in technical preview and may be changed or removed in a future release. Elastic will apply best effort to fix any issues, but this feature is not subject to the support SLA of official GA features."]

**{ilm-init}** can be used to manage both indices and data streams and it allows you to:

* Define the retention period of your data. The retention period is the minimum time your data will be stored in {es}.
Data older than this period can be deleted by {es}.
* Define <<data-tiers, multiple tiers>> of data nodes with different performance characteristics.
* Automatically transition indices through the data tiers according to your performance needs and retention policies
with <<index-lifecycle-management, {ilm}>> ({ilm-init}).
* Automatically transition indices through the data tiers according to your performance needs and retention policies.
* Leverage <<searchable-snapshots, searchable snapshots>> stored in a remote repository to provide resiliency
for your older indices while reducing operating costs and maintaining search performance.
* Perform <<async-search-intro, asynchronous searches>> of data stored on less-performant hardware.
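
For illustration only, a minimal {ilm-init} policy covering the first bullet above (data retention) might look like the following sketch (the policy name and the `7d`/`30d` values are assumptions, not part of this commit):

[source,console]
--------------------------------------------------
PUT _ilm/policy/my-logs-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_age": "7d" }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
--------------------------------------------------

With such a policy, backing indices roll over after roughly seven days in the hot phase and become eligible for deletion 30 days after rollover.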

**Data stream lifecycle** is less feature rich but is focused on simplicity, so it allows you to easily:

* Define the retention period of your data. The retention period is the minimum time your data will be stored in {es}.
Data older than this period can be deleted by {es} at a later time.
* Improve the performance of your data stream by performing background operations that will optimise the way your data
stream is stored.
--
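
Similarly, a minimal sketch of configuring the built-in data stream lifecycle described above (the data stream name `my-data-stream` and the `7d` retention value are hypothetical):

[source,console]
--------------------------------------------------
PUT _data_stream/my-data-stream/_lifecycle
{
  "data_retention": "7d"
}
--------------------------------------------------

Data in the stream is then kept for at least seven days, and anything older becomes eligible for deletion by {es} in the background.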

include::ilm/index.asciidoc[]

include::datatiers.asciidoc[]

16 changes: 16 additions & 0 deletions docs/reference/data-streams/data-stream-apis.asciidoc
@@ -12,6 +12,14 @@ The following APIs are available for managing <<data-streams,data streams>>:
* <<promote-data-stream-api>>
* <<modify-data-streams-api>>

[[data-stream-lifecycle-api]]
The following APIs are available for managing the built-in lifecycle of data streams:

* <<data-streams-put-lifecycle,Update data stream lifecycle>> preview:[]
* <<data-streams-get-lifecycle,Get data stream lifecycle>> preview:[]
* <<data-streams-delete-lifecycle,Delete data stream lifecycle>> preview:[]
* <<data-streams-explain-lifecycle,Explain data stream lifecycle>> preview:[]
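
The update, get, and delete endpoints operate on the `_data_stream/<data-stream>/_lifecycle` path, while explain targets backing indices via `<target>/_lifecycle/explain`. As a hypothetical example, retrieving the lifecycle of a data stream named `my-data-stream`:

[source,console]
--------------------------------------------------
GET _data_stream/my-data-stream/_lifecycle
--------------------------------------------------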

The following API is available for <<tsds,time series data streams>>:

* <<indices-downsample-data-stream>>
@@ -33,4 +41,12 @@ include::{es-repo-dir}/data-streams/promote-data-stream-api.asciidoc[]

include::{es-repo-dir}/data-streams/modify-data-streams-api.asciidoc[]

include::{es-repo-dir}/data-streams/lifecycle/apis/put-lifecycle.asciidoc[]

include::{es-repo-dir}/data-streams/lifecycle/apis/get-lifecycle.asciidoc[]

include::{es-repo-dir}/data-streams/lifecycle/apis/delete-lifecycle.asciidoc[]

include::{es-repo-dir}/data-streams/lifecycle/apis/explain-lifecycle.asciidoc[]

include::{es-repo-dir}/indices/downsample-data-stream.asciidoc[]
1 change: 1 addition & 0 deletions docs/reference/data-streams/data-streams.asciidoc
@@ -135,3 +135,4 @@ include::set-up-a-data-stream.asciidoc[]
include::use-a-data-stream.asciidoc[]
include::change-mappings-and-settings.asciidoc[]
include::tsds.asciidoc[]
include::lifecycle/index.asciidoc[]
@@ -1,10 +1,10 @@
[[dlm-delete-lifecycle]]
[[data-streams-delete-lifecycle]]
=== Delete the lifecycle of a data stream
++++
<titleabbrev>Delete Data Stream Lifecycle</titleabbrev>
++++

experimental::[]
preview::[]

Deletes the lifecycle from a set of data streams.

@@ -14,18 +14,18 @@ Deletes the lifecycle from a set of data streams.
* If the {es} {security-features} are enabled, you must have the `manage_data_stream_lifecycle` index privilege or higher to
use this API. For more information, see <<security-privileges>>.

[[dlm-delete-lifecycle-request]]
[[data-streams-delete-lifecycle-request]]
==== {api-request-title}

`DELETE _data_stream/<data-stream>/_lifecycle`

[[dlm-delete-lifecycle-desc]]
[[data-streams-delete-lifecycle-desc]]
==== {api-description-title}

Deletes the lifecycle from the specified data streams. If multiple data streams are provided but at least one of them
does not exist, then the deletion of the lifecycle will fail for all of them and the API will respond with `404`.

[[dlm-delete-lifecycle-path-params]]
[[data-streams-delete-lifecycle-path-params]]
==== {api-path-parms-title}

`<data-stream>`::
@@ -41,7 +41,7 @@ include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=ds-expand-wildcards]
+
Defaults to `open`.

[[dlm-delete-lifecycle-example]]
[[data-streams-delete-lifecycle-example]]
==== {api-examples-title}

////
@@ -1,41 +1,39 @@
[[dlm-explain-lifecycle]]
=== Explain Lifecycle API
[[data-streams-explain-lifecycle]]
=== Explain data stream lifecycle
++++
<titleabbrev>Explain Data Lifecycle</titleabbrev>
<titleabbrev>Explain Data Stream Lifecycle</titleabbrev>
++++

experimental::[]
preview::[]

Retrieves the current data lifecycle status for one or more data stream backing indices.

[[explain-lifecycle-api-prereqs]]
==== {api-prereq-title}

If the {es} {security-features} are enabled, you must have at least the `manage_data_stream_lifecycle` index privilege or
`view_index_metadata` index privilege to use this API. For more information, see <<security-privileges>>.

[[dlm-explain-lifecycle-request]]
[[data-streams-explain-lifecycle-request]]
==== {api-request-title}

`GET <target>/_lifecycle/explain`

[[dlm-explain-lifecycle-desc]]
[[data-streams-explain-lifecycle-desc]]
==== {api-description-title}

Retrieves information about the index's current DLM lifecycle state, such as
Retrieves information about the index or data stream's current data stream lifecycle state, such as
time since index creation, time since rollover, the lifecycle configuration
managing the index, or any error that {es} might've encountered during the lifecycle
execution.

[[dlm-explain-lifecycle-path-params]]
[[data-streams-explain-lifecycle-path-params]]
==== {api-path-parms-title}

`<target>`::
(Required, string) Comma-separated list of indices.

[[dlm-explain-lifecycle-query-params]]
[[data-streams-explain-lifecycle-query-params]]
==== {api-query-parms-title}

`include_defaults`::
@@ -44,7 +42,7 @@ execution.

include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=timeoutparms]

[[dlm-explain-lifecycle-example]]
[[data-streams-explain-lifecycle-example]]
==== {api-examples-title}

The following example retrieves the lifecycle state of the index `.ds-metrics-2023.03.22-000001`:
@@ -53,9 +51,9 @@ The following example retrieves the lifecycle state of the index `.ds-metrics-2023.03.22-000001`:
--------------------------------------------------
GET .ds-metrics-2023.03.22-000001/_lifecycle/explain
--------------------------------------------------
// TEST[skip:we're not setting up DLM in these tests]
// TEST[skip:we're not setting up data stream lifecycle in these tests]

If the index is managed by DLM `explain` will show the `managed_by_lifecycle` field
If the index is managed by a data stream lifecycle `explain` will show the `managed_by_lifecycle` field
set to `true` and the rest of the response will contain information about the
lifecycle execution status for this index:

@@ -77,8 +75,8 @@ lifecycle execution status for this index:
--------------------------------------------------
// TESTRESPONSE[skip:the result is for illustrating purposes only]

<1> Shows if the index is being managed by DLM. If the index is not managed by
DLM the other fields will not be shown
<1> Shows if the index is being managed by data stream lifecycle. If the index is not managed by
a data stream lifecycle the other fields will not be shown
<2> When the index was created, this timestamp is used to determine when to
rollover
<3> The time since the index creation (used for calculating when to rollover