From 891359ee7657bf416ba2a89684f8055cee6f53cc Mon Sep 17 00:00:00 2001 From: Kaituo Li Date: Thu, 13 Jun 2024 18:02:59 -0700 Subject: [PATCH 01/18] Enhancements in Version 2.15 Starting from version 2.15, we have introduced several enhancements: 1. Custom Index Management: * Added support for custom index management. For more details, watch the video on the create detector page and detector detail page anomaly-detection-dashboards-plugin#770. * Custom result indices are now managed as aliases. Consequently, additional security permissions are required for management. 2. New JVM Heap Usage Threshold Setting: * Introduced a new setting, plugins.anomaly_detection.jvm_heap_usage_threshold, to manage the memory circuit breaker threshold. 3. Documentation Improvements: * Added examples for DSL filters to enhance the documentation. 4. Ruby Version Update: * Updated Ruby version in CONTRIBUTING.md from 3.2 to 3.2.4 as version 3.2 is not available. Testing done: * built and viewed the changed website locally Signed-off-by: Kaituo Li --- CONTRIBUTING.md | 2 +- _observing-your-data/ad/index.md | 52 ++++++++++++++++++++++++++--- _observing-your-data/ad/settings.md | 3 +- 3 files changed, 50 insertions(+), 7 deletions(-) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index f9f1a23f51..de44bbe4ee 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -82,7 +82,7 @@ Follow these steps to set up your local copy of the repository: ``` curl -sSL https://get.rvm.io | bash -s stable - rvm install 3.2 + rvm install 3.2.4 ruby -v ``` diff --git a/_observing-your-data/ad/index.md b/_observing-your-data/ad/index.md index 139b63d199..af14818576 100644 --- a/_observing-your-data/ad/index.md +++ b/_observing-your-data/ad/index.md @@ -30,7 +30,38 @@ A detector is an individual anomaly detection task. You can define multiple dete - Enter a name and brief description. Make sure the name is unique and descriptive enough to help you to identify the purpose of the detector. 1. 
Specify the data source. - For **Data source**, choose the index you want to use as the data source. You can optionally use index patterns to choose multiple indexes. - - (Optional) For **Data filter**, filter the index you chose as the data source. From the **Data filter** menu, choose **Add data filter**, and then design your filter query by selecting **Field**, **Operator**, and **Value**, or choose **Use query DSL** and add your own JSON filter query. + - (Optional) For **Data filter**, filter the index you chose as the data source. From the **Data filter** menu, choose **Add data filter**, and then design your filter query by selecting **Field**, **Operator**, and **Value**, or choose **Use query DSL** and add your own JSON filter query. Only [Boolean query]({{site.url}}{{site.baseurl}}/query-dsl/compound/bool/) are supported in the DSL. + + **Example filter in DSL** + The query is designed to match documents that have specific values in the urlPath.keyword field. Specifically, it will match documents where the urlPath.keyword field is equal to one of the following values: + - /domain/{id}/short + - /sub_dir/{id}/short + - /abcd/123/{id}/xyz + + ```json + { + "bool": { + "should": [ + { + "term": { + "urlPath.keyword": "/domain/{id}/short" + } + }, + { + "term": { + "urlPath.keyword": "/sub_dir/{id}/short" + } + }, + { + "term": { + "urlPath.keyword": "/abcd/123/{id}/xyz" + } + } + ] + } + } + ``` + 1. Specify a timestamp. - Select the **Timestamp field** in your index. 1. Define operation settings. @@ -45,15 +76,17 @@ A detector is an individual anomaly detection task. You can define multiple dete - This value tells the detector that the data is not ingested into OpenSearch in real time but with a certain delay. Set the window delay to shift the detector interval to account for this delay. - For example, say the detector interval is 10 minutes and data is ingested into your cluster with a general delay of 1 minute. Assume the detector runs at 2:00. 
The detector attempts to get the last 10 minutes of data from 1:50 to 2:00, but because of the 1-minute delay, it only gets 9 minutes of data and misses the data from 1:59 to 2:00. Setting the window delay to 1 minute shifts the interval window to 1:49--1:59, so the detector accounts for all 10 minutes of the detector interval time. 1. Specify custom result index. - - If you want to store the anomaly detection results in your own index, choose **Enable custom result index** and specify the custom index to store the result. The anomaly detection plugin adds an `opensearch-ad-plugin-result-` prefix to the index name that you input. For example, if you input `abc` as the result index name, the final index name is `opensearch-ad-plugin-result-abc`. + - If you want to store the anomaly detection results in your own index, choose **Enable custom result index** and specify the custom index to store the result. The Anomaly Detection plugin automatically prefixes the index name you input with `opensearch-ad-plugin-result-`. For example, if you enter `abc` as the result index name, the final alias name will be `opensearch-ad-plugin-result-abc`. This alias points to an index with a name that includes the date and a sequence number, such as `opensearch-ad-plugin-result-abc-history-2024.06.12-000002`. You can use the dash “-” sign to separate the namespace to manage custom result index permissions. For example, if you use `opensearch-ad-plugin-result-financial-us-group1` as the result index, you can create a permission role based on the pattern `opensearch-ad-plugin-result-financial-us-*` to represent the "financial" department at a granular level for the "us" area. {: .note } - - If the custom index you specify doesn’t already exist, the Anomaly Detection plugin creates this index when you create the detector and start your real-time or historical analysis. 
- - If the custom index already exists, the plugin checks if the index mapping of the custom index matches the anomaly result file. You need to make sure the custom index has valid mapping as shown here: [anomaly-results.json](https://github.com/opensearch-project/anomaly-detection/blob/main/src/main/resources/mappings/anomaly-results.json). + - Using a custom result index allows you to build customized dashboards. When the Security plugin (also known as Fine-grained access control) is enabled, our default result index becomes a system index. As a result, the default result index is not accessible through the standard index/search API. You must use the anomaly detection RESTful API or the Dashboard to access its content. Consequently, you cannot build a customized dashboard using the default result index if the Security plugin is enabled. + - If the custom index you specify doesn’t already exist, the Anomaly Detection plugin creates this index when you create the detector and start your real-time or historical analysis. + - If the custom index already exists, the plugin checks if the index mapping of the custom index matches the anomaly result file. You need to make sure the custom index has valid mapping as shown here: [anomaly-results.json](https://github.com/opensearch-project/anomaly-detection/blob/main/src/main/resources/mappings/anomaly-results.json). - To use the custom result index option, you need the following permissions: - - `indices:admin/create` - If the custom index already exists, you don't need this. + - `indices:admin/create` - Required for the Anomaly Detection plugin to create and roll over the custom index. + - `indices:admin/aliases` - Required for the Anomaly Detection plugin to create and access an alias for the custom index. - `indices:data/write/index` - You need the `write` permission for the Anomaly Detection plugin to write results into the custom index for a single-entity detector. 
- `indices:data/read/search` - You need the `search` permission because the Anomaly Detection plugin needs to search custom result indexes to show results on the anomaly detection UI. - `indices:data/write/delete` - Because the detector might generate a large number of anomaly results, you need the `delete` permission to delete old data and save disk space. @@ -61,6 +94,15 @@ A detector is an individual anomaly detection task. You can define multiple dete - Managing the custom result index: - The anomaly detection dashboard queries all detectors’ results from all custom result indexes. Having too many custom result indexes might impact the performance of the Anomaly Detection plugin. - You can use [Index State Management]({{site.url}}{{site.baseurl}}/im-plugin/ism/index/) to rollover old result indexes. You can also manually delete or archive any old result indexes. We recommend reusing a custom result index for multiple detectors. + - The Anomaly Detection plugin can also be used to manage the lifecycle of custom indexes. It rolls an alias over to a new index when the custom result index meets any of the following conditions: + + + Parameter | Description | Type | Unit | Example | Required + :--- | :--- |:--- |:--- |:--- |:--- + `result_index_min_size` | Specifies the minimum size of total primary shard storage (excluding replicas) required to roll over the index. For example, if `result_index_min_size` is set to 100 GiB and the index has 5 primary shards and 5 replica shards of 20 GiB each, the total size of all primary shards is 100 GiB, triggering the rollover. | `integer` | `MB` | `51200` | No + `result_index_min_age` | Specifies the minimum age of the index required to roll over. The index age is calculated from its creation time to the current time. | `integer` |`day` | `7` | No + `result_index_ttl` | Specifies the minimum age required to permanently delete rolled over indexes. | `integer` | `day` | `60` | No + 1. Choose **Next**. 
After you define the detector, the next step is to configure the model. diff --git a/_observing-your-data/ad/settings.md b/_observing-your-data/ad/settings.md index 19099441ad..44f4526976 100644 --- a/_observing-your-data/ad/settings.md +++ b/_observing-your-data/ad/settings.md @@ -49,4 +49,5 @@ plugins.anomaly_detection.dedicated_cache_size | 10 | If the real-time analysis plugins.anomaly_detection.max_concurrent_preview | 2 | The maximum number of concurrent previews. You can use this setting to limit resource usage. plugins.anomaly_detection.model_max_size_percent | 0.1 | The upper bound of the memory percentage for a model. plugins.anomaly_detection.door_keeper_in_cache.enabled | False | When set to `true`, OpenSearch places a bloom filter in front of an inactive entity cache to filter out items that are not likely to appear more than once. -plugins.anomaly_detection.hcad_cold_start_interpolation.enabled | False | When set to `true`, enables interpolation in high-cardinality anomaly detection (HCAD) cold start. \ No newline at end of file +plugins.anomaly_detection.hcad_cold_start_interpolation.enabled | False | When set to `true`, enables interpolation in high-cardinality anomaly detection (HCAD) cold start. +plugins.anomaly_detection.jvm_heap_usage_threshold | 95 | The JVM memory usage threshold at which anomaly detectors are disabled. Defaults to 95% of the JVM heap. 
\ No newline at end of file From 42841c727b1c8a4afe98e72802a23cd9add73151 Mon Sep 17 00:00:00 2001 From: Kaituo Li Date: Fri, 14 Jun 2024 12:26:49 -0700 Subject: [PATCH 02/18] Update _observing-your-data/ad/index.md Co-authored-by: Melissa Vagi Signed-off-by: Kaituo Li --- _observing-your-data/ad/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_observing-your-data/ad/index.md b/_observing-your-data/ad/index.md index af14818576..c993f77393 100644 --- a/_observing-your-data/ad/index.md +++ b/_observing-your-data/ad/index.md @@ -32,7 +32,7 @@ A detector is an individual anomaly detection task. You can define multiple dete - For **Data source**, choose the index you want to use as the data source. You can optionally use index patterns to choose multiple indexes. - (Optional) For **Data filter**, filter the index you chose as the data source. From the **Data filter** menu, choose **Add data filter**, and then design your filter query by selecting **Field**, **Operator**, and **Value**, or choose **Use query DSL** and add your own JSON filter query. Only [Boolean query]({{site.url}}{{site.baseurl}}/query-dsl/compound/bool/) are supported in the DSL. - **Example filter in DSL** +#### Example filter using query DSL The query is designed to match documents that have specific values in the urlPath.keyword field. 
Specifically, it will match documents where the urlPath.keyword field is equal to one of the following values: - /domain/{id}/short - /sub_dir/{id}/short From bcc8a16da6a5ad3016fd73d498c5cf5abe4f3a6b Mon Sep 17 00:00:00 2001 From: Kaituo Li Date: Fri, 14 Jun 2024 12:27:15 -0700 Subject: [PATCH 03/18] Update _observing-your-data/ad/index.md Co-authored-by: Melissa Vagi Signed-off-by: Kaituo Li --- _observing-your-data/ad/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_observing-your-data/ad/index.md b/_observing-your-data/ad/index.md index c993f77393..fc842711b3 100644 --- a/_observing-your-data/ad/index.md +++ b/_observing-your-data/ad/index.md @@ -30,7 +30,7 @@ A detector is an individual anomaly detection task. You can define multiple dete - Enter a name and brief description. Make sure the name is unique and descriptive enough to help you to identify the purpose of the detector. 1. Specify the data source. - For **Data source**, choose the index you want to use as the data source. You can optionally use index patterns to choose multiple indexes. - - (Optional) For **Data filter**, filter the index you chose as the data source. From the **Data filter** menu, choose **Add data filter**, and then design your filter query by selecting **Field**, **Operator**, and **Value**, or choose **Use query DSL** and add your own JSON filter query. Only [Boolean query]({{site.url}}{{site.baseurl}}/query-dsl/compound/bool/) are supported in the DSL. + - (Optional) For **Data filter**, filter the index you chose as the data source. From the **Data filter** menu, choose **Add data filter**, and then design your filter query by selecting **Field**, **Operator**, and **Value**, or choose **Use query DSL** and add your own JSON filter query. Only a [Boolean query]({{site.url}}{{site.baseurl}}/query-dsl/compound/bool/) is supported for query domain-specific language (DSL). 
#### Example filter using query DSL The query is designed to match documents that have specific values in the urlPath.keyword field. Specifically, it will match documents where the urlPath.keyword field is equal to one of the following values: From 21cac7d67250869593d7bf01dd7245fe7056f687 Mon Sep 17 00:00:00 2001 From: Kaituo Li Date: Fri, 14 Jun 2024 12:29:28 -0700 Subject: [PATCH 04/18] Update _observing-your-data/ad/index.md Co-authored-by: Melissa Vagi Signed-off-by: Kaituo Li --- _observing-your-data/ad/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_observing-your-data/ad/index.md b/_observing-your-data/ad/index.md index fc842711b3..d7c7ebf515 100644 --- a/_observing-your-data/ad/index.md +++ b/_observing-your-data/ad/index.md @@ -76,7 +76,7 @@ A detector is an individual anomaly detection task. You can define multiple dete - This value tells the detector that the data is not ingested into OpenSearch in real time but with a certain delay. Set the window delay to shift the detector interval to account for this delay. - For example, say the detector interval is 10 minutes and data is ingested into your cluster with a general delay of 1 minute. Assume the detector runs at 2:00. The detector attempts to get the last 10 minutes of data from 1:50 to 2:00, but because of the 1-minute delay, it only gets 9 minutes of data and misses the data from 1:59 to 2:00. Setting the window delay to 1 minute shifts the interval window to 1:49--1:59, so the detector accounts for all 10 minutes of the detector interval time. 1. Specify custom result index. - - If you want to store the anomaly detection results in your own index, choose **Enable custom result index** and specify the custom index to store the result. The Anomaly Detection plugin automatically prefixes the index name you input with `opensearch-ad-plugin-result-`. For example, if you enter `abc` as the result index name, the final alias name will be `opensearch-ad-plugin-result-abc`. 
This alias points to an index with a name that includes the date and a sequence number, such as `opensearch-ad-plugin-result-abc-history-2024.06.12-000002`. + - The Anomaly Detection plugin allows you to store anomaly detection results in a custom index of your choice. To enable this, select **Enable custom result index** and provide a name for your index, for example, `abc`. The plugin then creates an alias prefixed with `opensearch-ad-plugin-result-` followed by your chosen name, for example, `opensearch-ad-plugin-result-abc`. This alias points to an actual index with a name containing the date and a sequence number, like `opensearch-ad-plugin-result-abc-history-2024.06.12-000002`, where your results are stored. You can use the dash “-” sign to separate the namespace to manage custom result index permissions. For example, if you use `opensearch-ad-plugin-result-financial-us-group1` as the result index, you can create a permission role based on the pattern `opensearch-ad-plugin-result-financial-us-*` to represent the "financial" department at a granular level for the "us" area. {: .note } From 3c2fc82e8f8fa3b36fa499794f864edad4f72356 Mon Sep 17 00:00:00 2001 From: Kaituo Li Date: Fri, 14 Jun 2024 12:29:58 -0700 Subject: [PATCH 05/18] Update _observing-your-data/ad/index.md Co-authored-by: Melissa Vagi Signed-off-by: Kaituo Li --- _observing-your-data/ad/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_observing-your-data/ad/index.md b/_observing-your-data/ad/index.md index d7c7ebf515..10a369abbf 100644 --- a/_observing-your-data/ad/index.md +++ b/_observing-your-data/ad/index.md @@ -81,7 +81,7 @@ A detector is an individual anomaly detection task. You can define multiple dete You can use the dash “-” sign to separate the namespace to manage custom result index permissions. 
For example, if you use `opensearch-ad-plugin-result-financial-us-group1` as the result index, you can create a permission role based on the pattern `opensearch-ad-plugin-result-financial-us-*` to represent the "financial" department at a granular level for the "us" area. {: .note } - - Using a custom result index allows you to build customized dashboards. When the Security plugin (also known as Fine-grained access control) is enabled, our default result index becomes a system index. As a result, the default result index is not accessible through the standard index/search API. You must use the anomaly detection RESTful API or the Dashboard to access its content. Consequently, you cannot build a customized dashboard using the default result index if the Security plugin is enabled. + - When the Security plugin (fine-grained access control) is enabled, the default result index becomes a system index and is no longer accessible through the standard Index or Search API. To access its content, you must use the anomaly detection RESTful API or the dashboard. As a result, you cannot build customized dashboards using the default result index if the Security plugin is enabled. However, you can create a custom result index to build customized dashboards. - If the custom index you specify doesn’t already exist, the Anomaly Detection plugin creates this index when you create the detector and start your real-time or historical analysis. - If the custom index already exists, the plugin checks if the index mapping of the custom index matches the anomaly result file. You need to make sure the custom index has valid mapping as shown here: [anomaly-results.json](https://github.com/opensearch-project/anomaly-detection/blob/main/src/main/resources/mappings/anomaly-results.json). 
- To use the custom result index option, you need the following permissions: From 311f4d0b116afce4cb08e6eb8101a97479c65308 Mon Sep 17 00:00:00 2001 From: Kaituo Li Date: Fri, 14 Jun 2024 12:31:19 -0700 Subject: [PATCH 06/18] Update _observing-your-data/ad/index.md Co-authored-by: Melissa Vagi Signed-off-by: Kaituo Li --- _observing-your-data/ad/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_observing-your-data/ad/index.md b/_observing-your-data/ad/index.md index 10a369abbf..221e1099fe 100644 --- a/_observing-your-data/ad/index.md +++ b/_observing-your-data/ad/index.md @@ -82,7 +82,7 @@ A detector is an individual anomaly detection task. You can define multiple dete {: .note } - When the Security plugin (fine-grained access control) is enabled, the default result index becomes a system index and is no longer accessible through the standard Index or Search API. To access its content, you must use the anomaly detection RESTful API or the dashboard. As a result, you cannot build customized dashboards using the default result index if the Security plugin is enabled. However, you can create a custom result index to build customized dashboards. - - If the custom index you specify doesn’t already exist, the Anomaly Detection plugin creates this index when you create the detector and start your real-time or historical analysis. + - If the custom index you specify does not exist, the Anomaly Detection plugin will create it when you create the detector and start your real-time or historical analysis. - If the custom index already exists, the plugin checks if the index mapping of the custom index matches the anomaly result file. You need to make sure the custom index has valid mapping as shown here: [anomaly-results.json](https://github.com/opensearch-project/anomaly-detection/blob/main/src/main/resources/mappings/anomaly-results.json). 
- To use the custom result index option, you need the following permissions: - `indices:admin/create` - Required for the Anomaly Detection plugin to create and roll over the custom index. From 03f964cc0985a166b2e8727feb34ba67c6e81b12 Mon Sep 17 00:00:00 2001 From: Kaituo Li Date: Fri, 14 Jun 2024 12:31:49 -0700 Subject: [PATCH 07/18] Update _observing-your-data/ad/index.md Co-authored-by: Melissa Vagi Signed-off-by: Kaituo Li --- _observing-your-data/ad/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_observing-your-data/ad/index.md b/_observing-your-data/ad/index.md index 221e1099fe..a02c69369d 100644 --- a/_observing-your-data/ad/index.md +++ b/_observing-your-data/ad/index.md @@ -83,7 +83,7 @@ A detector is an individual anomaly detection task. You can define multiple dete - When the Security plugin (fine-grained access control) is enabled, the default result index becomes a system index and is no longer accessible through the standard Index or Search API. To access its content, you must use the anomaly detection RESTful API or the dashboard. As a result, you cannot build customized dashboards using the default result index if the Security plugin is enabled. However, you can create a custom result index to build customized dashboards. - If the custom index you specify does not exist, the Anomaly Detection plugin will create it when you create the detector and start your real-time or historical analysis. - - If the custom index already exists, the plugin checks if the index mapping of the custom index matches the anomaly result file. You need to make sure the custom index has valid mapping as shown here: [anomaly-results.json](https://github.com/opensearch-project/anomaly-detection/blob/main/src/main/resources/mappings/anomaly-results.json). + - If the custom index already exists, the plugin will verify that the index mapping matches the required structure for anomaly results. 
In this case, ensure that the custom index has a valid mapping as defined in the [`anomaly-results.json`](https://github.com/opensearch-project/anomaly-detection/blob/main/src/main/resources/mappings/anomaly-results.json) file. - To use the custom result index option, you need the following permissions: - `indices:admin/create` - Required for the Anomaly Detection plugin to create and roll over the custom index. - `indices:admin/aliases` - Required for the Anomaly Detection plugin to create and access an alias for the custom index. From 6c23223d32569ea89d0640245474f214ac1338fc Mon Sep 17 00:00:00 2001 From: Kaituo Li Date: Fri, 14 Jun 2024 12:32:09 -0700 Subject: [PATCH 08/18] Update _observing-your-data/ad/index.md Co-authored-by: Melissa Vagi Signed-off-by: Kaituo Li --- _observing-your-data/ad/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_observing-your-data/ad/index.md b/_observing-your-data/ad/index.md index a02c69369d..2fc8252855 100644 --- a/_observing-your-data/ad/index.md +++ b/_observing-your-data/ad/index.md @@ -85,7 +85,7 @@ A detector is an individual anomaly detection task. You can define multiple dete - If the custom index you specify does not exist, the Anomaly Detection plugin will create it when you create the detector and start your real-time or historical analysis. - If the custom index already exists, the plugin will verify that the index mapping matches the required structure for anomaly results. In this case, ensure that the custom index has a valid mapping as defined in the [`anomaly-results.json`](https://github.com/opensearch-project/anomaly-detection/blob/main/src/main/resources/mappings/anomaly-results.json) file. - To use the custom result index option, you need the following permissions: - - `indices:admin/create` - Required for the Anomaly Detection plugin to create and roll over the custom index. 
+ - `indices:admin/create` - The Anomaly Detection plugin requires the ability to create and roll over the custom index. - `indices:admin/aliases` - Required for the Anomaly Detection plugin to create and access an alias for the custom index. - `indices:data/write/index` - You need the `write` permission for the Anomaly Detection plugin to write results into the custom index for a single-entity detector. - `indices:data/read/search` - You need the `search` permission because the Anomaly Detection plugin needs to search custom result indexes to show results on the anomaly detection UI. From c4dcec6c13970484426166d0bf077d26003cceb8 Mon Sep 17 00:00:00 2001 From: Kaituo Li Date: Fri, 14 Jun 2024 12:32:26 -0700 Subject: [PATCH 09/18] Update _observing-your-data/ad/index.md Co-authored-by: Melissa Vagi Signed-off-by: Kaituo Li --- _observing-your-data/ad/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_observing-your-data/ad/index.md b/_observing-your-data/ad/index.md index 2fc8252855..83ffe31a17 100644 --- a/_observing-your-data/ad/index.md +++ b/_observing-your-data/ad/index.md @@ -99,7 +99,7 @@ A detector is an individual anomaly detection task. You can define multiple dete Parameter | Description | Type | Unit | Example | Required :--- | :--- |:--- |:--- |:--- |:--- - `result_index_min_size` | Specifies the minimum size of total primary shard storage (excluding replicas) required to roll over the index. For example, if `result_index_min_size` is set to 100 GiB and the index has 5 primary shards and 5 replica shards of 20 GiB each, the total size of all primary shards is 100 GiB, triggering the rollover. | `integer` | `MB` | `51200` | No + `result_index_min_size` | The minimum total size of primary shards (excluding replicas) required for index rollover. If set to 100 GiB and the index has 5 primary and 5 replica shards of 20 GiB each, the total primary shard size is 100 GiB, triggering the rollover. 
| `integer` | `MB` | `51200` | No `result_index_min_age` | Specifies the minimum age of the index required to roll over. The index age is calculated from its creation time to the current time. | `integer` |`day` | `7` | No `result_index_ttl` | Specifies the minimum age required to permanently delete rolled over indexes. | `integer` | `day` | `60` | No From 5731616ab0dfd700a2851e7e9336880ed6e53f70 Mon Sep 17 00:00:00 2001 From: Kaituo Li Date: Fri, 14 Jun 2024 12:32:41 -0700 Subject: [PATCH 10/18] Update _observing-your-data/ad/index.md Co-authored-by: Melissa Vagi Signed-off-by: Kaituo Li --- _observing-your-data/ad/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_observing-your-data/ad/index.md b/_observing-your-data/ad/index.md index 83ffe31a17..3fc4383100 100644 --- a/_observing-your-data/ad/index.md +++ b/_observing-your-data/ad/index.md @@ -100,7 +100,7 @@ A detector is an individual anomaly detection task. You can define multiple dete Parameter | Description | Type | Unit | Example | Required :--- | :--- |:--- |:--- |:--- |:--- `result_index_min_size` | The minimum total size of primary shards (excluding replicas) required for index rollover. If set to 100 GiB and the index has 5 primary and 5 replica shards of 20 GiB each, the total primary shard size is 100 GiB, triggering the rollover. | `integer` | `MB` | `51200` | No - `result_index_min_age` | Specifies the minimum age of the index required to roll over. The index age is calculated from its creation time to the current time. | `integer` |`day` | `7` | No + `result_index_min_age` | The minimum age of the index required for rollover, calculated from its creation time to the current time. | `integer` |`day` | `7` | No `result_index_ttl` | Specifies the minimum age required to permanently delete rolled over indexes. | `integer` | `day` | `60` | No 1. Choose **Next**. 
From 5fb09c711619e1b3b8232ede5a974a06d53a3f3f Mon Sep 17 00:00:00 2001 From: Kaituo Li Date: Fri, 14 Jun 2024 12:33:03 -0700 Subject: [PATCH 11/18] Update _observing-your-data/ad/index.md Co-authored-by: Melissa Vagi Signed-off-by: Kaituo Li --- _observing-your-data/ad/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_observing-your-data/ad/index.md b/_observing-your-data/ad/index.md index 3fc4383100..f260b5ffbe 100644 --- a/_observing-your-data/ad/index.md +++ b/_observing-your-data/ad/index.md @@ -101,7 +101,7 @@ A detector is an individual anomaly detection task. You can define multiple dete :--- | :--- |:--- |:--- |:--- |:--- `result_index_min_size` | The minimum total size of primary shards (excluding replicas) required for index rollover. If set to 100 GiB and the index has 5 primary and 5 replica shards of 20 GiB each, the total primary shard size is 100 GiB, triggering the rollover. | `integer` | `MB` | `51200` | No `result_index_min_age` | The minimum age of the index required for rollover, calculated from its creation time to the current time. | `integer` |`day` | `7` | No - `result_index_ttl` | Specifies the minimum age required to permanently delete rolled over indexes. | `integer` | `day` | `60` | No + `result_index_ttl` | The minimum age required to permanently delete rolled over indexes. | `integer` | `day` | `60` | No 1. Choose **Next**. From b0ed8a7141c2a419765d15f2c8e259f930e6682e Mon Sep 17 00:00:00 2001 From: Kaituo Li Date: Fri, 14 Jun 2024 12:33:21 -0700 Subject: [PATCH 12/18] Update _observing-your-data/ad/index.md Co-authored-by: Melissa Vagi Signed-off-by: Kaituo Li --- _observing-your-data/ad/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_observing-your-data/ad/index.md b/_observing-your-data/ad/index.md index f260b5ffbe..071a2487e5 100644 --- a/_observing-your-data/ad/index.md +++ b/_observing-your-data/ad/index.md @@ -86,7 +86,7 @@ A detector is an individual anomaly detection task. 
You can define multiple dete - If the custom index already exists, the plugin will verify that the index mapping matches the required structure for anomaly results. In this case, ensure that the custom index has a valid mapping as defined in the [`anomaly-results.json`](https://github.com/opensearch-project/anomaly-detection/blob/main/src/main/resources/mappings/anomaly-results.json) file. - To use the custom result index option, you need the following permissions: - `indices:admin/create` - The Anomaly Detection plugin requires the ability to create and roll over the custom index. - - `indices:admin/aliases` - Required for the Anomaly Detection plugin to create and access an alias for the custom index. + - `indices:admin/aliases` - The Anomaly Detection plugin requires access to create and manage an alias for the custom index. - `indices:data/write/index` - You need the `write` permission for the Anomaly Detection plugin to write results into the custom index for a single-entity detector. - `indices:data/read/search` - You need the `search` permission because the Anomaly Detection plugin needs to search custom result indexes to show results on the anomaly detection UI. - `indices:data/write/delete` - Because the detector might generate a large number of anomaly results, you need the `delete` permission to delete old data and save disk space. 
From f91091258c4184928bcd7041bf41defb72a9c653 Mon Sep 17 00:00:00 2001 From: Kaituo Li Date: Fri, 14 Jun 2024 12:33:42 -0700 Subject: [PATCH 13/18] Update _observing-your-data/ad/settings.md Co-authored-by: Melissa Vagi Signed-off-by: Kaituo Li --- _observing-your-data/ad/settings.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_observing-your-data/ad/settings.md b/_observing-your-data/ad/settings.md index 44f4526976..62987dc82f 100644 --- a/_observing-your-data/ad/settings.md +++ b/_observing-your-data/ad/settings.md @@ -49,5 +49,5 @@ plugins.anomaly_detection.dedicated_cache_size | 10 | If the real-time analysis plugins.anomaly_detection.max_concurrent_preview | 2 | The maximum number of concurrent previews. You can use this setting to limit resource usage. plugins.anomaly_detection.model_max_size_percent | 0.1 | The upper bound of the memory percentage for a model. plugins.anomaly_detection.door_keeper_in_cache.enabled | False | When set to `true`, OpenSearch places a bloom filter in front of an inactive entity cache to filter out items that are not likely to appear more than once. -plugins.anomaly_detection.hcad_cold_start_interpolation.enabled | False | When set to `true`, enables interpolation in high-cardinality anomaly detection (HCAD) cold start. +plugins.anomaly_detection.hcad_cold_start_interpolation.enabled | False | When set to true, enables interpolation for high-cardinality anomaly detection (HCAD) during the initial cold start period. plugins.anomaly_detection.jvm_heap_usage_threshold | 95 | The JVM memory usage threshold at which anomaly detectors are disabled. Defaults to 95% of the JVM heap. 
\ No newline at end of file From 4d22b55574b66cdc56f1507d5acba9715c4ca881 Mon Sep 17 00:00:00 2001 From: Kaituo Li Date: Fri, 14 Jun 2024 12:34:01 -0700 Subject: [PATCH 14/18] Update _observing-your-data/ad/settings.md Co-authored-by: Melissa Vagi Signed-off-by: Kaituo Li --- _observing-your-data/ad/settings.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_observing-your-data/ad/settings.md b/_observing-your-data/ad/settings.md index 62987dc82f..e272435a63 100644 --- a/_observing-your-data/ad/settings.md +++ b/_observing-your-data/ad/settings.md @@ -50,4 +50,4 @@ plugins.anomaly_detection.max_concurrent_preview | 2 | The maximum number of con plugins.anomaly_detection.model_max_size_percent | 0.1 | The upper bound of the memory percentage for a model. plugins.anomaly_detection.door_keeper_in_cache.enabled | False | When set to `true`, OpenSearch places a bloom filter in front of an inactive entity cache to filter out items that are not likely to appear more than once. plugins.anomaly_detection.hcad_cold_start_interpolation.enabled | False | When set to true, enables interpolation for high-cardinality anomaly detection (HCAD) during the initial cold start period. -plugins.anomaly_detection.jvm_heap_usage_threshold | 95 | The JVM memory usage threshold at which anomaly detectors are disabled. Defaults to 95% of the JVM heap. \ No newline at end of file +plugins.anomaly_detection.jvm_heap_usage_threshold | 95 | Specifies the JVM memory usage threshold, as a percentage, at which anomaly detectors will be disabled. The default value is 95%, meaning detectors will be disabled when JVM heap usage reaches 95%. 
\ No newline at end of file From 8ed9c923ef892f6c3275db1e6d5f72c6b60469b5 Mon Sep 17 00:00:00 2001 From: Melissa Vagi Date: Fri, 14 Jun 2024 14:05:19 -0600 Subject: [PATCH 15/18] Update _observing-your-data/ad/index.md Signed-off-by: Melissa Vagi --- _observing-your-data/ad/index.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/_observing-your-data/ad/index.md b/_observing-your-data/ad/index.md index 071a2487e5..7ad2c15e8a 100644 --- a/_observing-your-data/ad/index.md +++ b/_observing-your-data/ad/index.md @@ -33,7 +33,8 @@ A detector is an individual anomaly detection task. You can define multiple dete - (Optional) For **Data filter**, filter the index you chose as the data source. From the **Data filter** menu, choose **Add data filter**, and then design your filter query by selecting **Field**, **Operator**, and **Value**, or choose **Use query DSL** and add your own JSON filter query. Only a [Boolean query]({{site.url}}{{site.baseurl}}/query-dsl/compound/bool/) is supported for query domain-specific language (DSL). #### Example filter using query DSL - The query is designed to match documents that have specific values in the urlPath.keyword field. 
Specifically, it will match documents where the urlPath.keyword field is equal to one of the following values: +The query is designed to retrieve documents where the `urlPath.keyword` field matches one of the following specified values: + - /domain/{id}/short - /sub_dir/{id}/short - /abcd/123/{id}/xyz From 9e1b687dc2a06c1e466c32c42af72cd488fe3f98 Mon Sep 17 00:00:00 2001 From: Melissa Vagi Date: Fri, 14 Jun 2024 14:05:28 -0600 Subject: [PATCH 16/18] Update _observing-your-data/ad/index.md Signed-off-by: Melissa Vagi --- _observing-your-data/ad/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_observing-your-data/ad/index.md b/_observing-your-data/ad/index.md index 7ad2c15e8a..5144f0b801 100644 --- a/_observing-your-data/ad/index.md +++ b/_observing-your-data/ad/index.md @@ -95,7 +95,7 @@ The query is designed to retrieve documents where the `urlPath.keyword` field ma - Managing the custom result index: - The anomaly detection dashboard queries all detectors’ results from all custom result indexes. Having too many custom result indexes might impact the performance of the Anomaly Detection plugin. - You can use [Index State Management]({{site.url}}{{site.baseurl}}/im-plugin/ism/index/) to rollover old result indexes. You can also manually delete or archive any old result indexes. We recommend reusing a custom result index for multiple detectors. - - The Anomaly Detection plugin can also be used to manage the lifecycle of custom indexes. It rolls an alias over to a new index when the custom result index meets any of the following conditions: + - The Anomaly Detection plugin also provides lifecycle management for custom indexes. 
It rolls an alias over to a new index when the custom result index meets any of the following conditions: Parameter | Description | Type | Unit | Example | Required From a175c586942f3feaf872726a12b90eb326842362 Mon Sep 17 00:00:00 2001 From: Melissa Vagi Date: Mon, 17 Jun 2024 10:24:50 -0600 Subject: [PATCH 17/18] Update index.md Signed-off-by: Melissa Vagi Signed-off-by: Melissa Vagi --- _observing-your-data/ad/index.md | 31 +++++++++++++++---------------- 1 file changed, 15 insertions(+), 16 deletions(-) diff --git a/_observing-your-data/ad/index.md b/_observing-your-data/ad/index.md index 5144f0b801..5dfa1b8f1a 100644 --- a/_observing-your-data/ad/index.md +++ b/_observing-your-data/ad/index.md @@ -30,10 +30,10 @@ A detector is an individual anomaly detection task. You can define multiple dete - Enter a name and brief description. Make sure the name is unique and descriptive enough to help you to identify the purpose of the detector. 1. Specify the data source. - For **Data source**, choose the index you want to use as the data source. You can optionally use index patterns to choose multiple indexes. - - (Optional) For **Data filter**, filter the index you chose as the data source. From the **Data filter** menu, choose **Add data filter**, and then design your filter query by selecting **Field**, **Operator**, and **Value**, or choose **Use query DSL** and add your own JSON filter query. Only a [Boolean query]({{site.url}}{{site.baseurl}}/query-dsl/compound/bool/) is supported for query domain-specific language (DSL). + - (Optional) For **Data filter**, filter the index you chose as the data source. From the **Data filter** menu, choose **Add data filter**, and then design your filter query by selecting **Field**, **Operator**, and **Value**, or choose **Use query DSL** and add your own JSON filter query. Only [Boolean queries]({{site.url}}{{site.baseurl}}/query-dsl/compound/bool/) are supported for query domain-specific language (DSL). 
#### Example filter using query DSL -The query is designed to retrieve documents where the `urlPath.keyword` field matches one of the following specified values: +The query is designed to retrieve documents in which the `urlPath.keyword` field matches one of the following specified values: - /domain/{id}/short - /sub_dir/{id}/short @@ -76,33 +76,32 @@ The query is designed to retrieve documents where the `urlPath.keyword` field ma - (Optional) To add extra processing time for data collection, specify a **Window delay** value. - This value tells the detector that the data is not ingested into OpenSearch in real time but with a certain delay. Set the window delay to shift the detector interval to account for this delay. - For example, say the detector interval is 10 minutes and data is ingested into your cluster with a general delay of 1 minute. Assume the detector runs at 2:00. The detector attempts to get the last 10 minutes of data from 1:50 to 2:00, but because of the 1-minute delay, it only gets 9 minutes of data and misses the data from 1:59 to 2:00. Setting the window delay to 1 minute shifts the interval window to 1:49--1:59, so the detector accounts for all 10 minutes of the detector interval time. -1. Specify custom result index. - - The Anomaly Detection plugin allows you to store anomaly detection results in a custom index of your choice. To enable this, select **Enable custom result index** and provide a name for your index, for example, `abc`. The plugin then creates an alias prefixed with `opensearch-ad-plugin-result-` followed by your chosen name, for example, `opensearch-ad-plugin-result-abc`. This alias points to an actual index with a name containing the date and a sequence number, like `opensearch-ad-plugin-result-abc-history-2024.06.12-000002`, where your results are stored. +1. Specify custom results index. + - The Anomaly Detection plugin allows you to store anomaly detection results in a custom index of your choice. 
To enable this, select **Enable custom results index** and provide a name for your index, for example, `abc`. The plugin then creates an alias prefixed with `opensearch-ad-plugin-result-` followed by your chosen name, for example, `opensearch-ad-plugin-result-abc`. This alias points to an actual index with a name containing the date and a sequence number, like `opensearch-ad-plugin-result-abc-history-2024.06.12-000002`, where your results are stored. - You can use the dash “-” sign to separate the namespace to manage custom result index permissions. For example, if you use `opensearch-ad-plugin-result-financial-us-group1` as the result index, you can create a permission role based on the pattern `opensearch-ad-plugin-result-financial-us-*` to represent the "financial" department at a granular level for the "us" area. + You can use the dash “-” sign to separate the namespace to manage custom results index permissions. For example, if you use `opensearch-ad-plugin-result-financial-us-group1` as the results index, you can create a permission role based on the pattern `opensearch-ad-plugin-result-financial-us-*` to represent the "financial" department at a granular level for the "us" area. {: .note } - - When the Security plugin (fine-grained access control) is enabled, the default result index becomes a system index and is no longer accessible through the standard Index or Search API. To access its content, you must use the anomaly detection RESTful API or the dashboard. As a result, you cannot build customized dashboards using the default result index if the Security plugin is enabled. However, you can create a custom result index to build customized dashboards. + - When the Security plugin (fine-grained access control) is enabled, the default results index becomes a system index and is no longer accessible through the standard Index or Search APIs. To access its content, you must use the Anomaly Detection RESTful API or the dashboard. 
As a result, you cannot build customized dashboards using the default results index if the Security plugin is enabled. However, you can create a custom results index in order to build customized dashboards. - If the custom index you specify does not exist, the Anomaly Detection plugin will create it when you create the detector and start your real-time or historical analysis. - If the custom index already exists, the plugin will verify that the index mapping matches the required structure for anomaly results. In this case, ensure that the custom index has a valid mapping as defined in the [`anomaly-results.json`](https://github.com/opensearch-project/anomaly-detection/blob/main/src/main/resources/mappings/anomaly-results.json) file. - - To use the custom result index option, you need the following permissions: + - To use the custom results index option, you need the following permissions: - `indices:admin/create` - The Anomaly Detection plugin requires the ability to create and roll over the custom index. - `indices:admin/aliases` - The Anomaly Detection plugin requires access to create and manage an alias for the custom index. - `indices:data/write/index` - You need the `write` permission for the Anomaly Detection plugin to write results into the custom index for a single-entity detector. - - `indices:data/read/search` - You need the `search` permission because the Anomaly Detection plugin needs to search custom result indexes to show results on the anomaly detection UI. + - `indices:data/read/search` - You need the `search` permission because the Anomaly Detection plugin needs to search custom results indexes to show results on the Anomaly Detection UI. - `indices:data/write/delete` - Because the detector might generate a large number of anomaly results, you need the `delete` permission to delete old data and save disk space. 
- `indices:data/write/bulk*` - You need the `bulk*` permission because the Anomaly Detection plugin uses the bulk API to write results into the custom index. - - Managing the custom result index: - - The anomaly detection dashboard queries all detectors’ results from all custom result indexes. Having too many custom result indexes might impact the performance of the Anomaly Detection plugin. - - You can use [Index State Management]({{site.url}}{{site.baseurl}}/im-plugin/ism/index/) to rollover old result indexes. You can also manually delete or archive any old result indexes. We recommend reusing a custom result index for multiple detectors. - - The Anomaly Detection plugin also provides lifecycle management for custom indexes. It rolls an alias over to a new index when the custom result index meets any of the following conditions: - + - Managing the custom results index: + - The anomaly detection dashboard queries all detectors’ results from all custom results indexes. Having too many custom results indexes might impact the performance of the Anomaly Detection plugin. + - You can use [Index State Management]({{site.url}}{{site.baseurl}}/im-plugin/ism/index/) to rollover old results indexes. You can also manually delete or archive any old results indexes. We recommend reusing a custom results index for multiple detectors. + - The Anomaly Detection plugin also provides lifecycle management for custom indexes. It rolls an alias over to a new index when the custom results index meets any of the conditions in the following table. Parameter | Description | Type | Unit | Example | Required :--- | :--- |:--- |:--- |:--- |:--- - `result_index_min_size` | The minimum total size of primary shards (excluding replicas) required for index rollover. If set to 100 GiB and the index has 5 primary and 5 replica shards of 20 GiB each, the total primary shard size is 100 GiB, triggering the rollover. 
| `integer` | `MB` | `51200` | No - `result_index_min_age` | The minimum age of the index required for rollover, calculated from its creation time to the current time. | `integer` |`day` | `7` | No - `result_index_ttl` | The minimum age required to permanently delete rolled over indexes. | `integer` | `day` | `60` | No + `result_index_min_size` | The minimum total primary shard size (excluding replicas) required for index rollover. If set to 100 GiB and the index has 5 primary and 5 replica shards of 20 GiB each, then the total primary shard size is 100 GiB, triggering the rollover. | `integer` | `MB` | `51200` | No + `result_index_min_age` | The minimum index age required for rollover, calculated from its creation time to the current time. | `integer` |`day` | `7` | No + `result_index_ttl` | The minimum age required to permanently delete rolled-over indexes. | `integer` | `day` | `60` | No 1. Choose **Next**. From 998e77acec93f3b57923934606cd6baebc0b7c6d Mon Sep 17 00:00:00 2001 From: Melissa Vagi Date: Mon, 17 Jun 2024 10:25:12 -0600 Subject: [PATCH 18/18] Update _observing-your-data/ad/settings.md Co-authored-by: Nathan Bower Signed-off-by: Melissa Vagi --- _observing-your-data/ad/settings.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_observing-your-data/ad/settings.md b/_observing-your-data/ad/settings.md index e272435a63..e3d45ebd28 100644 --- a/_observing-your-data/ad/settings.md +++ b/_observing-your-data/ad/settings.md @@ -50,4 +50,4 @@ plugins.anomaly_detection.max_concurrent_preview | 2 | The maximum number of con plugins.anomaly_detection.model_max_size_percent | 0.1 | The upper bound of the memory percentage for a model. plugins.anomaly_detection.door_keeper_in_cache.enabled | False | When set to `true`, OpenSearch places a bloom filter in front of an inactive entity cache to filter out items that are not likely to appear more than once. 
plugins.anomaly_detection.hcad_cold_start_interpolation.enabled | False | When set to true, enables interpolation for high-cardinality anomaly detection (HCAD) during the initial cold start period. -plugins.anomaly_detection.jvm_heap_usage_threshold | 95 | Specifies the JVM memory usage threshold, as a percentage, at which anomaly detectors will be disabled. The default value is 95%, meaning detectors will be disabled when JVM heap usage reaches 95%. \ No newline at end of file +plugins.anomaly_detection.jvm_heap_usage_threshold | 95 | Specifies the JVM memory usage threshold, as a percentage, at which anomaly detectors will be disabled. The default value is 95%, meaning that detectors will be disabled when JVM heap usage reaches 95%. \ No newline at end of file
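The rollover table in PATCH 17/18 mixes units: the worked example is stated in GiB, while `result_index_min_size` is given in `MB`. The sketch below works through that arithmetic and the "any of the conditions" rule. It illustrates the documented behavior only and is not the plugin's implementation; the names `RolloverConditions`, `total_primary_size_mb`, and `should_roll_over` are hypothetical, and it assumes that `MB` is interpreted as 1/1024 of a GiB.

```python
from dataclasses import dataclass

MB_PER_GIB = 1024  # Assumption: "MB" in the table is treated as MiB.


@dataclass
class RolloverConditions:
    result_index_min_size_mb: int   # for example, 51200
    result_index_min_age_days: int  # for example, 7


def total_primary_size_mb(primary_shard_sizes_gib):
    """Sum primary shard sizes in MB; replica shards must not be passed in."""
    return sum(size * MB_PER_GIB for size in primary_shard_sizes_gib)


def should_roll_over(primary_shard_sizes_gib, index_age_days, conditions):
    """Return True when any one of the rollover conditions is met."""
    size_met = (
        total_primary_size_mb(primary_shard_sizes_gib)
        >= conditions.result_index_min_size_mb
    )
    age_met = index_age_days >= conditions.result_index_min_age_days
    return size_met or age_met


if __name__ == "__main__":
    # The documented example: 5 primary shards of 20 GiB each, 100 GiB threshold.
    conditions = RolloverConditions(
        result_index_min_size_mb=100 * MB_PER_GIB,
        result_index_min_age_days=7,
    )
    print(should_roll_over([20] * 5, index_age_days=1, conditions=conditions))
```

Under the same assumption, the table's example value of `51200` corresponds to 50 GiB of primary shard data. `result_index_ttl` is left out on purpose because it governs deletion of rolled-over indexes, not the rollover itself.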