aws-step-function-tasks: Allow removing launchType from RunEcsEc2Task #7967

mb-dev · 2020-05-13T21:09:14Z

Capacity providers allow using ECS with dynamic capacity (https://aws.amazon.com/tw/about-aws/whats-new/2019/12/amazon-ecs-capacity-providers-now-available/), yet CloudFormation and CDK do not support this feature yet (#5471).

As a stop-gap solution, I propose allowing users to submit EC2 tasks without launchType:
https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_RunTask.html#ECS-RunTask-request-launchType

If launchType is provided, there is no way to get the task to use capacity providers that are created outside CDK.

Use Case

I want to create capacity providers using boto3 and use them when launching EC2 tasks using step functions.

Proposed Solution

Provide a way to make launchType optional here:
https://github.com/aws/aws-cdk/blob/master/packages/%40aws-cdk/aws-stepfunctions-tasks/lib/ecs/run-ecs-ec2-task.ts#L62

Or to change it after construction.

Other

👋 I may be able to implement this feature request
⚠️ This feature might incur a breaking change

This is a 🚀 Feature Request


shivlaks · 2020-06-30T01:45:15Z

Since RunEcsEc2Task and RunEcsFargateTask are going to be marked as @deprecated, this feature should be introduced in RunTask.

shivlaks · 2020-06-30T06:51:24Z

@mb-dev PR incoming once I merge in #8451 to make launchType optional. Step Functions does not support specifying capacityProviderStrategy as a parameter.

As per the documentation you've referenced:

If a capacityProviderStrategy is specified, the launchType parameter must be omitted. If no capacityProviderStrategy or launchType is specified, the defaultCapacityProviderStrategy for the cluster is used.

Although this unlocks some usage, it's still incomplete until users can supply a capacity provider strategy. Since we're limited by what Step Functions accepts, I'm inclined to agree with your suggested stop-gap. It will still be compatible when capacityProviderStrategy can be supplied.
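
The rule quoted from the RunTask documentation can be sketched as a small check. This is a minimal TypeScript illustration and not CDK code: `resolvePlacement`, `RunTaskPlacement`, and `PlacementSource` are hypothetical names, and treating an empty strategy array as "not specified" is an assumption of this sketch.

```typescript
// Subset of the ECS RunTask request fields discussed above.
interface CapacityProviderStrategyItem {
  capacityProvider: string;
  weight?: number;
  base?: number;
}

interface RunTaskPlacement {
  launchType?: "EC2" | "FARGATE" | "EXTERNAL";
  capacityProviderStrategy?: CapacityProviderStrategyItem[];
}

type PlacementSource = "launchType" | "capacityProviderStrategy" | "clusterDefault";

// Hypothetical helper: reports which mechanism decides where the task runs.
function resolvePlacement(request: RunTaskPlacement): PlacementSource {
  // Assumption: an empty array counts as "no strategy specified".
  const hasStrategy = (request.capacityProviderStrategy?.length ?? 0) > 0;
  if (hasStrategy && request.launchType !== undefined) {
    throw new Error("launchType must be omitted when capacityProviderStrategy is specified");
  }
  if (hasStrategy) {
    return "capacityProviderStrategy";
  }
  if (request.launchType !== undefined) {
    return "launchType";
  }
  // Neither field is set: the cluster's defaultCapacityProviderStrategy is used.
  return "clusterDefault";
}
```

Making launchType optional only enables the last branch; the middle branch still depends on Step Functions accepting a capacity provider strategy.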

edit: added following question:

What happens if a cluster does not have a default capacity provider strategy? I believe that's optional too?

shivlaks · 2020-06-30T07:18:00Z

Per @mb-dev's comment:

@shivlaks what alarmed me is that the PR makes the definition depend more on launch type than before. launchTarget sounds like launchType, and it's tied to either EC2 or Fargate. I am not sure it will be clear how to express an undefined launchType while still specifying parameters to be used at creation.

Currently, omitting launchType might look like:

const runTask = new tasks.EcsRunTask(stack, 'Run', {
  integrationPattern: sfn.IntegrationPattern.RUN_JOB,
  cluster,
  taskDefinition,
  launchTarget: new tasks.EcsEc2LaunchTarget({
    launchType: undefined,
  }),
});

If launchTarget would be named something like clusterConfiguration it might be easier to parse.

shivlaks · 2020-06-30T07:39:12Z

@mb-dev launchType was never really exposed as a property (with the old classes or with the new implementation).
The thinking was to associate launchTarget closely with launchType.

I was thinking of introducing something like IEcsCapacityProvider to add the capability to specify capacity providers. Users would specify either capacity providers or a launch target (which would become optional).

const runTask = new tasks.EcsRunTask(stack, 'Run', {
  integrationPattern: sfn.IntegrationPattern.RUN_JOB,
  cluster,
  taskDefinition,
  capacityProviders: [new tasks.EcsEc2CapacityProvider({
    ...
  })],
});

The plan would also add EcsFargateCapacityProvider and EcsEc2CapacityProvider classes to help populate that list of capacity providers.

What do you think?

mb-dev · 2020-06-30T15:23:04Z

@shivlaks thanks for moving the discussion here. Capacity providers are a property of the ECS cluster, while the task-level property chooses one of the providers specified in the cluster definition. Users might create the ECS cluster or the capacity providers outside of CDK, and there is also the option of a default capacity provider (as this stop-gap suggests).

So there are actually 3 scenarios for using runTask in Step Functions:

users specifying launchType
users specifying capacityProvider
users specifying neither a capacity provider nor launchType - in this case the default capacity provider is used. See:
https://docs.aws.amazon.com/AmazonECS/latest/userguide/cluster-capacity-providers.html

Default capacity provider strategy
A default capacity provider strategy is associated with each Amazon ECS cluster. This determines the capacity provider strategy the cluster will use if no other capacity provider strategy or launch type is specified when running a task or creating a service.

One idea is to expose a DEFAULT_CAPACITY_PROVIDER constant that can be assigned to launchType or launchTarget, making it clear that we want the default capacity provider.
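
As a rough illustration of the constant idea, the three scenarios could be modeled as one union type and rendered into the placement-related task parameters. Everything here is hypothetical: `DEFAULT_CAPACITY_PROVIDER`, `TaskPlacement`, and `renderPlacementParameters` are not CDK exports, the fixed `Weight` of 1 is an assumption, and the capacity provider branch only helps once Step Functions accepts a CapacityProviderStrategy parameter.

```typescript
// Hypothetical marker suggested in this thread; not an existing CDK export.
const DEFAULT_CAPACITY_PROVIDER = "DEFAULT_CAPACITY_PROVIDER";

// The three scenarios: explicit launch type, explicit provider, cluster default.
type TaskPlacement =
  | { launchType: "EC2" | "FARGATE" }
  | { capacityProvider: string }
  | typeof DEFAULT_CAPACITY_PROVIDER;

// Renders only the placement-related part of the task's Parameters.
function renderPlacementParameters(placement: TaskPlacement): Record<string, unknown> {
  if (placement === DEFAULT_CAPACITY_PROVIDER) {
    // Emit neither field so the cluster's default strategy applies.
    return {};
  }
  if ("launchType" in placement) {
    return { LaunchType: placement.launchType };
  }
  return {
    // Assumption: a single provider with weight 1.
    CapacityProviderStrategy: [{ CapacityProvider: placement.capacityProvider, Weight: 1 }],
  };
}
```

With this shape, the default case is an explicit choice in user code rather than an accidental omission.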

mb-dev · 2020-07-01T21:15:32Z

Capacity providers were added to CloudFormation:
4ce27f4#diff-4a4e15a081904ee16b4d84e2f5cf5aee

So now it's even more important to be able to omit launchType.

Related: #5471

shivlaks · 2020-07-02T01:26:59Z

@mb-dev

The recommended workaround for coverage that isn't yet supported in the aws-stepfunctions-tasks module is to supply the needed ASL as a custom state. You can create your task and pass the output of .toStateJson() as the custom state's input (you'd need to adjust any parameters you want to add, remove, or modify).
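
The parameter adjustment for this issue could be sketched as below. The helper works on the plain JSON object only; `TaskStateJson` and `withoutLaunchType` are hypothetical names, and the assumption is that the task's JSON carries the launch type under `Parameters.LaunchType`.

```typescript
// Minimal shape of a Task state as a plain JSON object.
interface TaskStateJson {
  Type: string;
  Resource?: string;
  Parameters?: Record<string, unknown>;
  [key: string]: unknown;
}

// Hypothetical helper: returns a copy of the state with Parameters.LaunchType removed.
// The input object is left untouched.
function withoutLaunchType(state: TaskStateJson): TaskStateJson {
  if (state.Parameters === undefined) {
    return { ...state };
  }
  const parameters: Record<string, unknown> = { ...state.Parameters };
  delete parameters.LaunchType;
  return { ...state, Parameters: parameters };
}
```

In a CDK app, the returned object would then be passed as the `stateJson` of an `sfn.CustomState`, with the input coming from the original task's `.toStateJson()`.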

We're still discussing options, but are not convinced that the stop-gap needs to be introduced directly into the task or folded into the EcsEc2LaunchTarget / EcsFargateLaunchTarget classes yet.

launchTarget as a whole should likely be something we can omit. Can you help me better understand which EC2 / Fargate specific parameters would still need to be supplied when the launch target is omitted? I don't mind the constant idea, but I want to better understand what the ASL looks like for these use cases.

mb-dev · 2020-07-02T03:48:55Z

Ooo, I totally missed the escape hatch in the changelog. I will try that tomorrow, and maybe that will be enough until full capacity provider support is added.

My pre-refactor EC2 integration is straightforward (in Python):

sft.RunEcsEc2Task(
    cluster=cluster,
    task_definition=task_definition,
    integration_pattern=sfn.ServiceIntegrationPattern.SYNC,
    container_overrides={...},
)

So I personally won't need launchTarget; I just need launchType to be omitted.

shivlaks · 2020-07-07T14:08:40Z

@mb-dev let me know if I can help with checking out the implementation of the custom state or if it would help for me to whip up an example!

PierreKiwi · 2021-02-03T23:00:45Z

Hello!
I have been using the CustomState to avoid this problem but now I am facing the problem of losing ResultPath during the "translation" (cf. this issue #8754).

Not really sure there is an easy way to avoid the problem (I can wrap my step in a parallel step) but annoying...

trobert2 · 2022-06-27T12:18:43Z

Is there any progress here? What does the roadmap look like?
One year later, I still can't get the task to run on the correct capacity provider.

ferrarijefferson · 2023-05-29T18:46:46Z

@trobert2 wrote: "is there any progress here? what does the roadmap look like? One year later, I still can't get the task to run on the correct capacity provider."

You can use FARGATE_SPOT with Step Functions, as shown in my tests. You just need to omit the LaunchType field in your state machine definition.

Remove the field "LaunchType": "FARGATE" from the parameters.

{
  "Comment": "State machine integrated with ECS",
  "StartAt": "Initial",
  "States": {
    "Initial": {
      "Type": "Task",
      "Resource": "arn:aws:states:::ecs:runTask.sync",
      "Parameters": {
        "Cluster": "${ECS_CLUSTER_ARN}",
        "PlatformVersion": "LATEST",
        "TaskDefinition": "${ECS_TASK_DEFINITION_ARN}",
        "NetworkConfiguration": {
          "AwsvpcConfiguration": {
            "Subnets": ${SUBNETS},
            "AssignPublicIp": "ENABLED"
          }
        },
        "Overrides": {
          "ContainerOverrides": ${ECS_CONTAINER_OVERRIDES}
        }
      },
      "TimeoutSeconds": 3600,
      "End": true
    }
  }
}
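
To check an existing definition for states that still pin a launch type, a small scan along these lines may help. It is a sketch under assumptions: `findPinnedLaunchTypes` and `StateMachineDefinition` are hypothetical names, the definition has already been parsed into an object (template placeholders such as `${SUBNETS}` substituted), and only top-level states are inspected, not states nested in Parallel or Map.

```typescript
interface AslState {
  Type: string;
  Resource?: string;
  Parameters?: Record<string, unknown>;
}

interface StateMachineDefinition {
  StartAt: string;
  States: Record<string, AslState>;
}

// Hypothetical helper: names of top-level ECS runTask states that set LaunchType.
function findPinnedLaunchTypes(definition: StateMachineDefinition): string[] {
  const pinned: string[] = [];
  for (const name of Object.keys(definition.States)) {
    const state = definition.States[name];
    if (state === undefined) {
      continue;
    }
    // Matches both "ecs:runTask" and its ".sync" / ".waitForTaskToken" variants.
    const isEcsRunTask =
      state.Type === "Task" && (state.Resource ?? "").indexOf(":ecs:runTask") !== -1;
    if (isEcsRunTask && state.Parameters !== undefined && "LaunchType" in state.Parameters) {
      pinned.push(name);
    }
  }
  return pinned;
}
```

An empty result means no top-level ECS task forces a launch type, so the cluster's default capacity provider strategy can take effect.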

kevinbader · 2023-08-18T09:00:38Z

Just bumped into this as well. My use case is spawning long-running tasks on (expensive) GPU instances that are scaled to zero most of the time. When the Step Functions workflow reaches EcsRunTask, the expectation is that the cluster spins up a GPU instance to do the work. But that only works if the default capacity provider is used; currently, the execution always fails with:

No Container Instances were found in your cluster. (Service: AmazonECS; Status Code: 400; Error Code: InvalidParameterException; Request ID: 5652eb66-58cf-4720-a4d5-40bedf7ddf22; Proxy: null)

danw-mpl · 2023-11-23T22:20:14Z

I created a class which implements IEcsLaunchTarget so this is a drop-in workaround:

// Assumes CDK v2 import paths.
import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as sfnTasks from 'aws-cdk-lib/aws-stepfunctions-tasks';

/**
 * Launch the ECS task using Fargate Spot.
 *
 * This functionality is not built into CDK yet.
 */
export class EcsFargateSpotLaunchTarget implements sfnTasks.IEcsLaunchTarget {
    constructor(private readonly cluster: ecs.Cluster, private readonly options?: sfnTasks.EcsFargateLaunchTargetOptions) { }

    /**
     * Called when this launch target is configured on EcsRunTask.
     */
    public bind(_task: sfnTasks.EcsRunTask, launchTargetOptions: sfnTasks.LaunchTargetBindOptions): sfnTasks.EcsLaunchTargetConfig {
        // Make the Fargate capacity providers available on the cluster.
        this.cluster.enableFargateCapacityProviders();

        if (!launchTargetOptions.taskDefinition.isFargateCompatible) {
            throw new Error('Supplied TaskDefinition is not compatible with Fargate');
        }

        return {
            parameters: {
                // LaunchType is deliberately left out so the strategy below is used.
                PlatformVersion: this.options?.platformVersion,
                CapacityProviderStrategy: [
                    {
                        CapacityProvider: 'FARGATE_SPOT',
                        Weight: 1,
                    },
                ],
            },
        };
    }
}

const task = new sfnTasks.EcsRunTask(this, 'RunTask', {
    ...
    launchTarget: new EcsFargateSpotLaunchTarget(ecsCluster)
});
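
If a mix of providers is wanted instead of a single hard-coded entry, the strategy array could be built by a helper such as the one below. This is only a sketch: `toAslCapacityProviderStrategy` and both interfaces are hypothetical names, and the check that at most one provider defines a non-zero base reflects the ECS API documentation as understood here, so treat it as an assumption.

```typescript
interface StrategyItemInput {
  capacityProvider: string;
  weight?: number;
  base?: number;
}

// PascalCase keys, matching the parameter style used in the class above.
interface AslStrategyItem {
  CapacityProvider: string;
  Weight?: number;
  Base?: number;
}

// Hypothetical helper: converts a list of providers into the CapacityProviderStrategy array.
function toAslCapacityProviderStrategy(items: StrategyItemInput[]): AslStrategyItem[] {
  if (items.length === 0) {
    throw new Error("at least one capacity provider is required");
  }
  // Assumption: only one provider in a strategy may define a non-zero base.
  const withBase = items.filter((item) => item.base !== undefined && item.base > 0);
  if (withBase.length > 1) {
    throw new Error("only one capacity provider may define a base");
  }
  return items.map((item) => {
    const rendered: AslStrategyItem = { CapacityProvider: item.capacityProvider };
    // Leave out unset fields so the rendered JSON stays minimal.
    if (item.weight !== undefined) {
      rendered.Weight = item.weight;
    }
    if (item.base !== undefined) {
      rendered.Base = item.base;
    }
    return rendered;
  });
}
```

The returned array could replace the literal `CapacityProviderStrategy` value in `bind()`, for example with one FARGATE entry carrying a base and a FARGATE_SPOT entry carrying most of the weight.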

Muppets · 2023-11-24T08:47:54Z

I like @danw-mpl's solution. I went for a slightly different approach and removed the LaunchType property entirely, allowing the ECS task to take the default capacity provider settings from the cluster.

C# example:

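    using Amazon.CDK.AWS.StepFunctions.Tasks;
    using Amazon.JSII.Runtime.Deputy;
    using Constructs;
    using Newtonsoft.Json.Linq;

    // EcsRunTask variant that drops LaunchType from the rendered state.
    public class EcsRunTaskWithoutLaunchType : EcsRunTask
    {
        public EcsRunTaskWithoutLaunchType(Construct scope, string id, IEcsRunTaskProps props) : base(scope, id, props)
        {
        }

        // Pass-through constructors used by the jsii runtime.
        protected EcsRunTaskWithoutLaunchType(ByRefValue reference) : base(reference)
        {
        }

        protected EcsRunTaskWithoutLaunchType(DeputyProps props) : base(props)
        {
        }

        public override JObject ToStateJson()
        {
            var stateJson = base.ToStateJson();

            // Without LaunchType, the cluster's default capacity provider strategy applies.
            ((JObject)stateJson["Parameters"]!).Remove("LaunchType");

            return stateJson;
        }
    }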

danw-mpl · 2023-11-24T08:51:30Z

@Muppets I tried that initially and ran into an error from Step Functions, something like 'no container instances found in your cluster'.

I'd assume that's a problem with my cluster config though, even though I had Fargate capacity providers enabled.

mb-dev added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels May 13, 2020

SomayaB added the @aws-cdk/aws-stepfunctions-tasks label May 18, 2020

SomayaB assigned nija-at May 18, 2020

SomayaB added the @aws-cdk/aws-ecs Related to Amazon Elastic Container label May 18, 2020

nija-at assigned shivlaks and unassigned nija-at May 26, 2020

shivlaks mentioned this issue Jun 30, 2020

feat(stepfunctions-tasks): task construct to call RunJob on ECS #8451

Merged

shivlaks added effort/medium Medium work item – several days of effort and removed needs-triage This issue or PR still needs to be triaged. labels Jun 30, 2020

shivlaks added effort/small Small work item – less than a day of effort and removed effort/medium Medium work item – several days of effort labels Jun 30, 2020

shivlaks added the p2 label Aug 21, 2020

ericzbeard unassigned shivlaks Jun 18, 2021

vobornij mentioned this issue Jun 21, 2021

(aws_ecs): Add support for DefaultCapacityProviderStrategy to the Cluster L2 construct #15230

Closed

BenChaimberg removed the @aws-cdk/aws-ecs Related to Amazon Elastic Container label Jul 21, 2021

philiptromans mentioned this issue Apr 21, 2022

(stepfunctions-tasks): Allow specifying CapacityProviderStrategy in EcsRunTask #20013

Open

2 tasks
