diff --git a/explore-analyze/machine-learning/anomaly-detection/anomaly-detection-scale.md b/explore-analyze/machine-learning/anomaly-detection/anomaly-detection-scale.md index b5aa75ff7b..be458e3a00 100644 --- a/explore-analyze/machine-learning/anomaly-detection/anomaly-detection-scale.md +++ b/explore-analyze/machine-learning/anomaly-detection/anomaly-detection-scale.md @@ -76,13 +76,17 @@ This consideration only applies to {{dfeeds}} that **do not** use aggregations. The `model_memory_limit` job configuration option sets the approximate maximum amount of memory resources required for analytical processing. When you create an {{anomaly-job}} in {{kib}}, it provides an estimate for this limit. The estimate is based on the analysis configuration details for the job and cardinality estimates, which are derived by running aggregations on the source indices as they exist at that specific point in time. -If you change the resources available on your {{ml}} nodes or make significant changes to the characteristics or cardinality of your data, the model memory requirements might also change. You can update the model memory limit for a job while it is closed. If you want to decrease the limit below the current model memory usage, however, you must clone and re-run the job. +If you change the resources available on your {{ml}} nodes or make significant changes to the characteristics or cardinality of your data, the model memory requirements might also change. You can update the model memory limit for a job while it is closed. + +{applies_to}`stack: ga 9.5`{applies_to}`serverless: ga` Navigate to the **Anomaly Detection Jobs** page, open the edit flyout for the closed job, and select **Apply** under the **Model memory limit** field to apply the estimate without running the estimate API manually. If you want to decrease the limit to less than the current model memory usage, however, you must recreate the job. 
::::{tip} -You can view the current model size statistics with the [get {{anomaly-job}} stats]({{es-apis}}operation/operation-ml-get-job-stats) and [get model snapshots]({{es-apis}}operation/operation-ml-get-model-snapshots) APIs. You can also obtain a model memory limit estimate at any time by running the [estimate {{anomaly-jobs}} model memory API]({{es-apis}}operation/operation-ml-estimate-model-memory). However, you must provide your own cardinality estimates. +You can view the current model size statistics with the [get {{anomaly-job}} stats]({{es-apis}}operation/operation-ml-get-job-stats) and [get model snapshots]({{es-apis}}operation/operation-ml-get-model-snapshots) APIs. + +{applies_to}`stack: ga 9.5`{applies_to}`serverless: ga` To estimate `model_memory_limit` for an existing job in {{kib}}, select **Apply** under the **Model memory limit** field in the edit job flyout. Alternatively, use the [estimate {{anomaly-jobs}} model memory API]({{es-apis}}operation/operation-ml-estimate-model-memory) to get a model memory limit estimate. However, you must provide your own cardinality estimates when using the API. :::: -As a job approaches its model memory limit, the memory status is `soft_limit` and older models are more aggressively pruned to free up space. If you have categorization jobs, no further examples are stored. When a job exceeds its limit, the memory status is `hard_limit` and the job no longer models new entities. It is therefore important to have appropriate memory model limits for each job. If you reach the hard limit and are concerned about the missing data, ensure that you have adequate resources then clone and re-run the job with a larger model memory limit. +As a job approaches its model memory limit, the memory status is `soft_limit` and older models are more aggressively pruned to free up space. If you have categorization jobs, no further examples are stored. 
When a job exceeds its limit, the memory status is `hard_limit` and the job no longer models new entities. It is therefore important to have appropriate model memory limits for each job. If you reach the hard limit and are concerned about the missing data, ensure that you have adequate resources, then recreate the job with a larger model memory limit. ## 8. Pre-aggregate your data [pre-aggregate-data] diff --git a/explore-analyze/machine-learning/anomaly-detection/ml-ad-run-jobs.md b/explore-analyze/machine-learning/anomaly-detection/ml-ad-run-jobs.md index 0ca849b8e6..fdeff456d1 100644 --- a/explore-analyze/machine-learning/anomaly-detection/ml-ad-run-jobs.md +++ b/explore-analyze/machine-learning/anomaly-detection/ml-ad-run-jobs.md @@ -100,12 +100,14 @@ For each {{anomaly-job}}, you can optionally specify a `model_memory_limit`, whi You can also optionally specify the `xpack.ml.max_model_memory_limit` setting. By default, it’s not set, which means there is no upper bound on the acceptable `model_memory_limit` values in your jobs. ::::{tip} -If you set the `model_memory_limit` too high, it will be impossible to open the job; jobs cannot be allocated to nodes that have insufficient memory to run them. +If you set the `model_memory_limit` too high, the job can't open because it can't be allocated to a node with insufficient memory to run it. :::: +{applies_to}`stack: ga 9.5`{applies_to}`serverless: ga` When you edit an existing {{anomaly-job}} on the **Anomaly Detection Jobs** page, the edit job flyout shows a memory estimate under the **Model memory limit** field. Select **Apply** to update `model_memory_limit` without recreating the job. + If the estimated model memory limit for an {{anomaly-job}} is greater than the model memory limit for the job or the maximum model memory limit for the cluster, the job creation wizards in {{kib}} generate a warning. 
If the estimated memory requirement is only a little higher than the `model_memory_limit`, the job will probably produce useful results. Otherwise, the actions you take to address these warnings vary depending on the resources available in your cluster: -* If you are using the default value for the `model_memory_limit` and the {{ml}} nodes in the cluster have lots of memory, the best course of action might be to simply increase the job’s `model_memory_limit`. Before doing this, however, double-check that the chosen analysis makes sense. The default `model_memory_limit` is relatively low to avoid accidentally creating a job that uses a huge amount of memory. +* If you are using the default value for the `model_memory_limit` and the {{ml}} nodes in the cluster have lots of memory, the best course of action might be to increase the job’s `model_memory_limit`. Before doing this, however, double-check that the chosen analysis makes sense. The default `model_memory_limit` is relatively low to avoid accidentally creating a job that uses a huge amount of memory. * If the {{ml}} nodes in the cluster do not have sufficient memory to accommodate a job of the estimated size, the only options are: * Add bigger {{ml}} nodes to the cluster, or * Accept that the job will hit its memory limit and will not necessarily find all the anomalies it could otherwise find. diff --git a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-limitations.md b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-limitations.md index e445b47838..50727800f7 100644 --- a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-limitations.md +++ b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-limitations.md @@ -35,12 +35,18 @@ Nested fields are not supported for {{dfanalytics-jobs}}. These fields are ignor ### {{dfanalytics-jobs-cap}} cannot be updated [dfa-update-limitations] -You cannot update {{dfanalytics}} configurations. 
Instead, delete the {{dfanalytics-job}} and create a new one. +After you create a job, you cannot update most {{dfanalytics}} configuration settings, such as the analysis type, dependent variable, or source index. To change those settings, delete the {{dfanalytics-job}} and create a new one. + +You can update `model_memory_limit` on a stopped job using the [update {{dfanalytics-jobs}} API]({{es-apis}}operation/operation-ml-update-data-frame-analytics). {applies_to}`stack: ga 9.5`{applies_to}`serverless: ga` In {{kib}}, navigate to the **Data Frame Analytics** page, open the edit flyout for the stopped job, and select **Apply** under the **Model memory limit** field. ### {{dfanalytics-cap}} memory limitation [dfa-dataframe-size-limitations] {{dfanalytics-cap}} can only perform analyses that fit into the memory available for {{ml}}. Overspill to disk is not currently possible. For general {{ml}} settings, see [{{ml-cap}} settings in {{es}}](elasticsearch://reference/elasticsearch/configuration-reference/machine-learning-settings.md). +For each {{dfanalytics-job}}, you can optionally specify a `model_memory_limit`, which is the approximate maximum amount of memory resources required for training and analysis. When you create a {{dfanalytics-job}} in {{kib}}, the job creation wizard can estimate this limit based on your data and analysis configuration. You can also obtain a memory estimate at any time by running the [explain {{dfanalytics}} API]({{es-apis}}operation/operation-ml-explain-data-frame-analytics). + +If a job fails to start because it requires more memory than the configured limit, stop the job and increase `model_memory_limit` using the [update {{dfanalytics-jobs}} API]({{es-apis}}operation/operation-ml-update-data-frame-analytics). {applies_to}`stack: ga 9.5`{applies_to}`serverless: ga` Alternatively, select **Apply** under the **Model memory limit** field in the edit job flyout. 
For more guidance, refer to [Working with {{dfanalytics}} at scale](ml-dfa-scale.md#set-model-memory-limit). + When you create a {{dfanalytics-job}} and the inference step of the process fails due to the model is too large to fit into JVM, follow the steps in [this GitHub issue](https://github.com/elastic/elasticsearch/issues/76093) for a workaround. ### {{dfanalytics-jobs-cap}} cannot use more than 2^32^ documents for training [dfa-training-docs] diff --git a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-scale.md b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-scale.md index 8abe9970bd..e80157b721 100644 --- a/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-scale.md +++ b/explore-analyze/machine-learning/data-frame-analytics/ml-dfa-scale.md @@ -81,6 +81,16 @@ If your data is large and you do not need to test and train on the whole source (This step only applies to {{regression}} and {{classification}} jobs.) -[Hyperparameter optimization](hyperparameters.md) is the most complicated mathematical process during model training and may take a long time. +[Hyperparameter optimization](hyperparameters.md) is the most complicated mathematical process during model training and might take a long time. -By default, optimized hyperparameter values are chosen automatically. It is possible to reduce the time taken at this step by manually configuring hyperparameters – if you fully understand the purpose of the hyperparameters and have a sensible value for any or all of them. This reduces the computing load and therefore decreases training time. +By default, optimized hyperparameter values are chosen automatically. It is possible to reduce the time taken at this step by manually configuring hyperparameters – if you fully understand the purpose of the hyperparameters and have a sensible value for any or all. This reduces the computing load and therefore decreases training time. + +## 7. 
Set the model memory limit [set-model-memory-limit] + +The `model_memory_limit` job configuration option sets the approximate maximum amount of memory resources required for training and analysis. When you create a {{dfanalytics-job}} in {{kib}}, the job creation wizard can estimate this limit based on your data and analysis configuration. + +If a job fails to start because it requires more memory than the configured limit, or your data characteristics change, you can update `model_memory_limit` on a stopped job using the [update {{dfanalytics-jobs}} API]({{es-apis}}operation/operation-ml-update-data-frame-analytics). {applies_to}`stack: ga 9.5`{applies_to}`serverless: ga` In {{kib}}, navigate to the **Data Frame Analytics** page, open the edit flyout for the stopped job, and select **Apply** under the **Model memory limit** field to apply the estimate without running the explain API manually. + +::::{tip} +To get a memory estimate, use the [explain {{dfanalytics}} API]({{es-apis}}operation/operation-ml-explain-data-frame-analytics), which reports how much memory the analysis might require. {applies_to}`stack: ga 9.5`{applies_to}`serverless: ga` Alternatively, select **Apply** under the **Model memory limit** field in the edit job flyout. +::::