Update docs and improve code
maxhniebergall committed May 17, 2024
1 parent 244375e commit c4c1805
Showing 2 changed files with 11 additions and 8 deletions.
start-trained-model-deployment.asciidoc
@@ -25,9 +25,9 @@ Currently only `pytorch` models are supported for deployment. Once deployed
 the model can be used by the <<inference-processor,{infer-cap} processor>>
 in an ingest pipeline or directly in the <<infer-trained-model>> API.
 
-A model can be deployed multiple times by using deployment IDs. A deployment ID
-must be unique and should not match any other deployment ID or model ID, unless
-it is the same as the ID of the model being deployed. If `deployment_id` is not
+A model can be deployed multiple times by using deployment IDs. A deployment ID
+must be unique and should not match any other deployment ID or model ID, unless
+it is the same as the ID of the model being deployed. If `deployment_id` is not
 set, it defaults to the `model_id`.
 
 Scaling inference performance can be achieved by setting the parameters
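
For illustration (a sketch of the default described above, not part of this
diff), a request that omits `deployment_id` creates a deployment whose ID is
the model ID, here `my_model`:

[source,console]
--------------------------------------------------
POST _ml/trained_models/my_model/deployment/_start
--------------------------------------------------
// TEST[skip:TBD]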
@@ -61,7 +61,7 @@ include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=model-id]
 `cache_size`::
 (Optional, <<byte-units,byte value>>)
 The inference cache size (in memory outside the JVM heap) per node for the
-model. The default value is the size of the model as reported by the
+model. In serverless, the cache is disabled by default. Otherwise, the default value is the size of the model as reported by the
 `model_size_bytes` field in the <<get-trained-models-stats>>. To disable the
 cache, `0b` can be provided.
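
For illustration (a sketch based on the parameter description above, not part
of this diff), passing `0b` as the `cache_size` query parameter disables the
cache when starting a deployment:

[source,console]
--------------------------------------------------
POST _ml/trained_models/my_model/deployment/_start?cache_size=0b
--------------------------------------------------
// TEST[skip:TBD]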

@@ -165,8 +165,8 @@ The API returns the following results:
 [[start-trained-model-deployment-deployment-id-example]]
 === Using deployment IDs
 
-The following example starts a new deployment for the `my_model` trained model
-with the ID `my_model_for_ingest`. The deployment ID can be used in {infer} API
+The following example starts a new deployment for the `my_model` trained model
+with the ID `my_model_for_ingest`. The deployment ID can be used in {infer} API
 calls or in {infer} processors.
 
 [source,console]
@@ -181,4 +181,4 @@ The `my_model` trained model can be deployed again with a different ID:
 --------------------------------------------------
 POST _ml/trained_models/my_model/deployment/_start?deployment_id=my_model_for_search
 --------------------------------------------------
-// TEST[skip:TBD]
+// TEST[skip:TBD]
RestStartTrainedModelDeploymentAction.java
@@ -40,10 +40,13 @@ public RestStartTrainedModelDeploymentAction(boolean disableInferenceProcessCache
         super();
         if (disableInferenceProcessCache) {
             this.defaultCacheSize = ByteSizeValue.ZERO;
+        } else {
+            // Don't set the default cache size yet
+            defaultCacheSize = null;
         }
     }
 
-    private ByteSizeValue defaultCacheSize;
+    private final ByteSizeValue defaultCacheSize;
 
     @Override
     public String getName() {
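
The new `else` branch is what lets `defaultCacheSize` become `final`: Java's
definite-assignment rules require a final field to be assigned exactly once on
every constructor path, so the "not resolved yet" case has to be written out
explicitly as `null`. A standalone sketch of the same pattern (hypothetical
class and field names, not code from this commit):

// Sketch of the definite-assignment pattern used above; the names
// CacheConfig and defaultCacheSizeBytes are hypothetical.
public class CacheConfig {
    // A final field must be assigned exactly once on every constructor path.
    private final Long defaultCacheSizeBytes;

    public CacheConfig(boolean cacheDisabled) {
        if (cacheDisabled) {
            this.defaultCacheSizeBytes = 0L; // cache explicitly disabled
        } else {
            // Leave the default unresolved; callers treat null as "not set".
            this.defaultCacheSizeBytes = null;
        }
        // Without the else branch the compiler would reject the final field,
        // since the constructor could reach this point with it unassigned.
    }

    public Long defaultCacheSizeBytes() {
        return defaultCacheSizeBytes;
    }
}

Marking the field `final` turns the "disabled vs. not yet resolved" decision
into a one-time, construction-time choice instead of mutable state.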
