Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(stepfunctions): distributed map state #24331

Closed
wants to merge 38 commits into from

Conversation

beck3905
Copy link

Adds support for Step Functions Map state in Distributed mode. I decided to implement this as a new state type rather than as properties on the existing Map state. I thought that would make for a simpler user experience rather than having a bunch of complicated combinations of configuration properties.

Closes #23265.
Closes #23216.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@github-actions github-actions bot added beginning-contributor [Pilot] contributed between 0-2 PRs to the CDK p2 labels Feb 24, 2023
@aws-cdk-automation aws-cdk-automation requested a review from a team February 24, 2023 21:01
@beck3905
Copy link
Author

I decided to create this as draft because it's my first PR here and my first time working with Typescript (I'm a Python guy). I appreciate any feedback you can provide and please let me know if you think I can mark this ready for official review.

Copy link
Collaborator

@aws-cdk-automation aws-cdk-automation left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pull request linter has failed. See the aws-cdk-automation comment below for failure reasons. If you believe this pull request should receive an exemption, please comment and provide a justification.

A comment requesting an exemption should contain the text Exemption Request. Additionally, if clarification is needed add Clarification Request to a comment.

@beck3905
Copy link
Author

Also, thanks to @ayush987goyal and their PR #23306 for inspiration and guidance. I borrowed some of this work from their PR.

@beck3905 beck3905 changed the title feat(stepfunctions): Distributed Map State feat(stepfunctions): distributed map state Feb 24, 2023
@aws-cdk-automation aws-cdk-automation dismissed their stale review February 24, 2023 21:14

✅ Updated pull request passes all PRLinter validations. Dissmissing previous PRLinter review.

@beck3905 beck3905 marked this pull request as ready for review February 24, 2023 21:20
@beck3905
Copy link
Author

I just decided to promote this to "ready for review". I think I've covered everything in the contributing guide and design guide. I look forward to your feedback.

@beck3905 beck3905 force-pushed the distributed-map branch 2 times, most recently from 8b67a01 to 8a968e3 Compare February 27, 2023 15:41
@beck3905
Copy link
Author

@kaizencc tagging you here because I see you were involved with the PR flow chart in the contributing guide. According to the flow chart it seems that this PR should be labelled p1 but it is labelled as p2. I'm curious what the disconnect is there. Thanks.

Copy link
Contributor

@comcalvi comcalvi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution, looks good overall

packages/@aws-cdk/aws-stepfunctions/README.md Outdated Show resolved Hide resolved

In this mode, the Map state supports up to 40 concurrent iterations.

A Map state set to Inline is known as an Inline Map state. Use the Map state in Inline mode if your workflow's execution history won't exceed 25,000 entries, or if you don't require more than 40 concurrent iterations.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
A Map state set to Inline is known as an Inline Map state. Use the Map state in Inline mode if your workflow's execution history won't exceed 25,000 entries, or if you don't require more than 40 concurrent iterations.
Use the Map state in Inline mode if your workflow's execution history won't exceed 25,000 entries, or if you don't require more than 40 concurrent iterations.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to clean this up if you feel strongly about it, but this was also copy/pasted directly from the docs. https://docs.aws.amazon.com/step-functions/latest/dg/concepts-asl-use-map-state-inline.html

packages/@aws-cdk/aws-stepfunctions/README.md Outdated Show resolved Hide resolved
packages/@aws-cdk/aws-stepfunctions/README.md Outdated Show resolved Hide resolved
// Leaving this commented for review, as I want to make sure this truly isn't necessary.
// My understanding is that the `Ref` of the role should be enough to create an implicit dependency.
// However, this was adding both a dependency on the role and the policy, but the policy
// depends on the state machine thus creating a circular dependency. By removing
// this statement, the circular dependency is no longer an issue.
// If removing this is approved in the PR process, I will delete it before merge.
resource.node.addDependency(this.role);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This currently creates a circular dependency? Does this mean that every state machine currently fails to deploy with an error?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was able to clean this up a little bit and I removed this comment. The issue here is that Distributed Map states require that the state machine role have execute permissions on the state machine itself. So this resulted in the policy depending on the state machine, the state machine depending on the state machine role and the state machine role depending on the policy. Thus, a circular dependency. I was able to get around this by using some wildcards instead of the specific state machine arn in the policy document. This is only an issue for the Distributed Map state because I had to add the execution permissions to the policy for the state machine in that case.

packages/@aws-cdk/aws-stepfunctions/lib/states/map.ts Outdated Show resolved Hide resolved
@mergify mergify bot dismissed comcalvi’s stale review February 27, 2023 22:46

Pull request has been modified.

@beck3905
Copy link
Author

Quick question from PR review comments. There were many suggestions regarding fixing whitespace in the code. I setup my dev environment in VSCode as described in the contributing guide. I guess I expected the tooling to format the code according to the desired style guide. Does the CDK repo not have a code formatter tool? Did I miss that? I did notice that code would change on save in my IDE to add missing punctuation and adjust whitespace, but apparently it didn't get all of it for some reason or it wasn't in line with the desired coding style.

@beck3905 beck3905 force-pushed the distributed-map branch 3 times, most recently from 8a0704d to 827ef17 Compare February 28, 2023 18:07
Copy link
Collaborator

@aws-cdk-automation aws-cdk-automation left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pull request linter has failed. See the aws-cdk-automation comment below for failure reasons. If you believe this pull request should receive an exemption, please comment and provide a justification.

A comment requesting an exemption should contain the text Exemption Request. Additionally, if clarification is needed add Clarification Request to a comment.

@aws-cdk-automation aws-cdk-automation added the pr/needs-cli-test-run This PR needs CLI tests run against it. label Feb 28, 2023
@aws-cdk-automation aws-cdk-automation dismissed their stale review February 28, 2023 18:12

✅ Updated pull request passes all PRLinter validations. Dissmissing previous PRLinter review.

@TheRealAmazonKendra
Copy link
Contributor

TheRealAmazonKendra commented Feb 28, 2023

Quick question from PR review comments. There were many suggestions regarding fixing whitespace in the code. I setup my dev environment in VSCode as described in the contributing guide. I guess I expected the tooling to format the code according to the desired style guide. Does the CDK repo not have a code formatter tool? Did I miss that? I did notice that code would change on save in my IDE to add missing punctuation and adjust whitespace, but apparently it didn't get all of it for some reason or it wasn't in line with the desired coding style.

Running yarn lint --fix will typically fix these.

Copy link
Contributor

@TheRealAmazonKendra TheRealAmazonKendra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started working my way through this thinking that it was just adding functionality to an existing L2, but upon getting part of the way though, I'm seeing that distributed-map is a new L2. Given that and the size of this PR, I think this needs to go through the RFC process to ensure we're getting the design correct.

packages/@aws-cdk/aws-stepfunctions/lib/state-graph.ts Outdated Show resolved Hide resolved
packages/@aws-cdk/aws-stepfunctions/lib/state-graph.ts Outdated Show resolved Hide resolved
packages/@aws-cdk/aws-stepfunctions/lib/state-machine.ts Outdated Show resolved Hide resolved
@mergify mergify bot dismissed TheRealAmazonKendra’s stale review March 1, 2023 02:37

Pull request has been modified.

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildv2Project1C6BFA3F-wQm2hXv2jqQv
  • Commit ID: e8b7352
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@ronyyosef
Copy link

Any news about this pr?

@aws-cdk-automation
Copy link
Collaborator

This PR has been in the MERGE CONFLICTS state for 3 weeks, and looks abandoned. To keep this PR from being closed, please continue work on it. If not, it will automatically be closed in a week.

@alexandr2110pro
Copy link

Hey! Will it ever be finished? Or is there another implementation of DistributedMap in CDK somewhere else?

@mcacker
Copy link

mcacker commented Jul 25, 2023

Any progress on pushing this PR / feature through your process? I'd really like to see a feature to address DistributedMap in the CDK as embedding CF JSON in the code is less than ideal.

@aws-cdk-automation
Copy link
Collaborator

This PR has been deemed to be abandoned, and will be automatically closed. Please create a new PR for these changes if you think this decision has been made in error.

@aws-cdk-automation aws-cdk-automation added the closed-for-staleness This issue was automatically closed because it hadn't received any attention in a while. label Jul 28, 2023
@mrgrain mrgrain added p1 and removed p2 labels Oct 13, 2023
mergify bot pushed a commit that referenced this pull request Feb 9, 2024
Adds support for Step Functions Map state in Distributed mode. Currently, in order to create a Distributed Map in CDK, CDK users have to define a Custom State containing their Amazon States Language definition. 

This solution consists of the creation of a new L2 construct, `DistributedMap`. This design decision was made due to the fact that some fields are exclusive to Distributed Maps, such as `ItemReader`. Adding support for it through the existing `Map` L2 construct would lead to some fields being conditionally available.

Some design decisions that were made:
- I created an abstract class `MapBase` that encapsulates all fields currently supported by both `inline` and `distributed` maps. This includes all currently supported fields in the CDK except for `iterator` and `parameters` (deprecated fields). Those are now part of the Map subclass which extends `MapBase`. All new Distributed Maps fields are part of the new `DistributedMap` construct (also a subclass of `MapBase`)
- Permissions specific to Distributed Maps are added as part of this new construct

Thanks to @beck3905 and their PR #24331 for inspiration. A lot of the ideas here are re-used from the PR cited.

Closes #23216 

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
TheRealAmazonKendra pushed a commit that referenced this pull request Feb 9, 2024
Adds support for Step Functions Map state in Distributed mode. Currently, in order to create a Distributed Map in CDK, CDK users have to define a Custom State containing their Amazon States Language definition. 

This solution consists of the creation of a new L2 construct, `DistributedMap`. This design decision was made due to the fact that some fields are exclusive to Distributed Maps, such as `ItemReader`. Adding support for it through the existing `Map` L2 construct would lead to some fields being conditionally available.

Some design decisions that were made:
- I created an abstract class `MapBase` that encapsulates all fields currently supported by both `inline` and `distributed` maps. This includes all currently supported fields in the CDK except for `iterator` and `parameters` (deprecated fields). Those are now part of the Map subclass which extends `MapBase`. All new Distributed Maps fields are part of the new `DistributedMap` construct (also a subclass of `MapBase`)
- Permissions specific to Distributed Maps are added as part of this new construct

Thanks to @beck3905 and their PR #24331 for inspiration. A lot of the ideas here are re-used from the PR cited.

Closes #23216 

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
beginning-contributor [Pilot] contributed between 0-2 PRs to the CDK closed-for-staleness This issue was automatically closed because it hadn't received any attention in a while. p1 pr/needs-cli-test-run This PR needs CLI tests run against it.
Projects
None yet
10 participants