Update docs and improve code
maxhniebergall committed May 17, 2024
1 parent 244375e commit c4c1805
Showing 2 changed files with 11 additions and 8 deletions.
start-trained-model-deployment.asciidoc
@@ -25,9 +25,9 @@ Currently only `pytorch` models are supported for deployment. Once deployed
 the model can be used by the <<inference-processor,{infer-cap} processor>>
 in an ingest pipeline or directly in the <<infer-trained-model>> API.
 
-A model can be deployed multiple times by using deployment IDs. A deployment ID
-must be unique and should not match any other deployment ID or model ID, unless
-it is the same as the ID of the model being deployed. If `deployment_id` is not
+A model can be deployed multiple times by using deployment IDs. A deployment ID
+must be unique and should not match any other deployment ID or model ID, unless
+it is the same as the ID of the model being deployed. If `deployment_id` is not
 set, it defaults to the `model_id`.
 
 Scaling inference performance can be achieved by setting the parameters
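
For illustration (a sketch of the default described above, not part of this
diff), a request that omits `deployment_id` creates a deployment whose ID is
the model ID, here `my_model`:

[source,console]
--------------------------------------------------
POST _ml/trained_models/my_model/deployment/_start
--------------------------------------------------
// TEST[skip:TBD]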
@@ -61,7 +61,7 @@ include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=model-id]
 `cache_size`::
 (Optional, <<byte-units,byte value>>)
 The inference cache size (in memory outside the JVM heap) per node for the
-model. The default value is the size of the model as reported by the
+model. In serverless, the cache is disabled by default. Otherwise, the default value is the size of the model as reported by the
 `model_size_bytes` field in the <<get-trained-models-stats>>. To disable the
 cache, `0b` can be provided.
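
For illustration (a sketch based on the parameter description above, not part
of this diff), passing `0b` as the `cache_size` query parameter disables the
cache when starting a deployment:

[source,console]
--------------------------------------------------
POST _ml/trained_models/my_model/deployment/_start?cache_size=0b
--------------------------------------------------
// TEST[skip:TBD]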

@@ -165,8 +165,8 @@ The API returns the following results:
 [[start-trained-model-deployment-deployment-id-example]]
 === Using deployment IDs
 
-The following example starts a new deployment for the `my_model` trained model
-with the ID `my_model_for_ingest`. The deployment ID can be used in {infer} API
+The following example starts a new deployment for the `my_model` trained model
+with the ID `my_model_for_ingest`. The deployment ID can be used in {infer} API
 calls or in {infer} processors.
 
 [source,console]
@@ -181,4 +181,4 @@ The `my_model` trained model can be deployed again with a different ID:
 --------------------------------------------------
 POST _ml/trained_models/my_model/deployment/_start?deployment_id=my_model_for_search
 --------------------------------------------------
-// TEST[skip:TBD]
+// TEST[skip:TBD]
RestStartTrainedModelDeploymentAction.java
@@ -40,10 +40,13 @@ public RestStartTrainedModelDeploymentAction(boolean disableInferenceProcessCache
         super();
         if (disableInferenceProcessCache) {
             this.defaultCacheSize = ByteSizeValue.ZERO;
+        } else {
+            // Don't set the default cache size yet
+            defaultCacheSize = null;
         }
     }
 
-    private ByteSizeValue defaultCacheSize;
+    private final ByteSizeValue defaultCacheSize;
 
     @Override
     public String getName() {
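
The new `else` branch is what lets `defaultCacheSize` become `final`: Java's
definite-assignment rules require a final field to be assigned exactly once on
every constructor path, so the "not resolved yet" case has to be written out
explicitly as `null`. A standalone sketch of the same pattern (hypothetical
class and field names, not code from this commit):

// Sketch of the definite-assignment pattern used above; the names
// CacheConfig and defaultCacheSizeBytes are hypothetical.
public class CacheConfig {
    // A final field must be assigned exactly once on every constructor path.
    private final Long defaultCacheSizeBytes;

    public CacheConfig(boolean cacheDisabled) {
        if (cacheDisabled) {
            this.defaultCacheSizeBytes = 0L; // cache explicitly disabled
        } else {
            // Leave the default unresolved; callers treat null as "not set".
            this.defaultCacheSizeBytes = null;
        }
        // Without the else branch the compiler would reject the final field,
        // since the constructor could reach this point with it unassigned.
    }

    public Long defaultCacheSizeBytes() {
        return defaultCacheSizeBytes;
    }
}

Marking the field `final` turns the "disabled vs. not yet resolved" decision
into a one-time, construction-time choice instead of mutable state.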
