
(core): new tagging of existing AWS::EKS::AccessEntry resources is failing #29393

Closed
diranged opened this issue Mar 7, 2024 · 14 comments
Labels
@aws-cdk/aws-eks: Related to Amazon Elastic Kubernetes Service
blocked: Work is blocked on this issue for this codebase. Other labels or comments may indicate why.
bug: This issue is a bug.
effort/medium: Medium work item (several days of effort)
p2

Comments

diranged commented Mar 7, 2024

Describe the bug

Upgrading from 2.130.0 to 2.131.0 pulls in #28989, and while the new tagging behavior might work for newly created resources, it fails on upgrades of existing resources. The error returned by the AWS CloudFormation API is:

 Resource handler returned message: "This operation can only be performed on Access Entries with a type of "STANDARD". (Service: Eks, Status Code: 400, Request ID: 04b5dbe3-6ed0-4cb6-a3c7-a2c69c3e52e9)"

Expected Behavior

Certain types of AWS::EKS::AccessEntry apparently cannot be updated in-place, so we can't just add Tags to resources like this without versioning and replacing them.
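
Until that's addressed, a minimal consumer-side workaround sketch (assuming the tags come in through CDK's Tags aspect, and that `stack` is the tagging scope; the tag key/value here are hypothetical) is to exclude the resource type from tagging entirely:

    import { Tags } from 'aws-cdk-lib';

    // Keep aspect-driven tags off AWS::EKS::AccessEntry, which rejects
    // in-place tag updates for non-STANDARD entry types.
    Tags.of(stack).add('team', 'platform', {
      excludeResourceTypes: ['AWS::EKS::AccessEntry'],
    });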

Current Behavior

The current behavior is to add the tags to all the existing AWS::EKS::AccessEntry resources. We saw this change come through in our PRs from a Dependabot upgrade:

[screenshot: Dependabot PR diff adding Tags to the AWS::EKS::AccessEntry resources]

We ran the change through the integration tests (which launch all new resources) and they worked fine ... but when we merged the change to existing environments, the deploy failed:

[screenshot: cdk deploy failure on the existing AWS::EKS::AccessEntry]

This resource cannot be updated in-place when the type is EC2_LINUX.

Reproduction Steps

Create an EC2_LINUX access entry with the old version, upgrade to 2.131.0, then try to deploy.

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.131.0

Framework Version

No response

Node.js Version

18

OS

Linux

Language

TypeScript

Language Version

No response

Other information

No response

@diranged diranged added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Mar 7, 2024
@github-actions github-actions bot added the @aws-cdk/core Related to core CDK functionality label Mar 7, 2024
@diranged diranged changed the title (core): new tagging of AWS::EKS::AccessEntry resources is failing (core): new tagging of existing AWS::EKS::AccessEntry resources is failing Mar 7, 2024
pahud (Contributor) commented Mar 7, 2024

Hi Matt,

I think we do not have an AccessEntry L2 construct yet, so you might be using the L1 for that. Can you share some code snippets with minimal props so we can see how you provision the cluster as well as the access entries?

@pahud pahud added @aws-cdk/aws-eks Related to Amazon Elastic Kubernetes Service p2 and removed needs-triage This issue or PR still needs to be triaged. labels Mar 7, 2024
pahud (Contributor) commented Mar 7, 2024

BTW, it looks like you have successfully provisioned clusters with AccessConfig support using the AccessEntry L1 construct. I will be working on #28588, so your code snippets would be highly appreciated.

@pahud pahud added the effort/medium Medium work item – several days of effort label Mar 7, 2024
diranged (Author) commented Mar 8, 2024

@pahud Sure... it's obviously not directly relevant to this ticket, but as soon as the AWS::EKS::AccessEntry resource came out, we transitioned away from the existing eks.Cluster construct and moved to using the eks.CfnCluster resource, then wrapping it in an ImportedCluster via the Cluster.fromClusterAttributes backdoor.

(Please do not judge this code... I stripped out most of the comments, and in our codebase it's broken up into cleaner pieces; this is just a conglomeration of a number of pieces so you can see the basic idea.)

Roughly speaking, here's our OurClusterNative construct:

/** @format */

import { Annotations, Stack } from 'aws-cdk-lib';
import { ISubnet, IVpc, SecurityGroup } from 'aws-cdk-lib/aws-ec2';
import * as eks from 'aws-cdk-lib/aws-eks';
import { IOpenIdConnectProvider, IRole, ManagedPolicy, PolicyStatement, Role, ServicePrincipal } from 'aws-cdk-lib/aws-iam';
import { IKey, Key } from 'aws-cdk-lib/aws-kms';
import { Construct, IConstruct } from 'constructs';
import { OurVpc } from '../aws-vpc';
// Other internal pieces referenced below (OurKubectlProvider, EksAccessPolicyArn,
// EksAccessScopeType, OUR_OWN_CUSTOM_TAGS) are omitted from this excerpt.

export interface OurCluster extends IConstruct {
  readonly cluster: eks.ICluster;
  readonly oidcProvider: IOpenIdConnectProvider;
  readonly kubernetesVersion: eks.KubernetesVersion;
  readonly kmsKey: IKey;
  readonly nodeSecurityGroup: SecurityGroup;
  readonly nodeRole: Role;
  readonly vpc: IVpc;

  grantNodeRole(scope: Construct, role: IRole): undefined;
  grantArn(scope: Construct, id: string, arn: string, groups: string[], k8sGroups?: string[]): undefined;
}

export interface OurClusterProps {
  readonly clusterName?: string;
  readonly kubernetesVersion?: eks.KubernetesVersion;
  readonly vpc?: IVpc;
  readonly controlPlaneSubnets?: ISubnet[];
  readonly tags?: Record<string, string>;
}

export class OurClusterNative extends Construct {
  public static DEFAULT_SERVICE_IPV4_CIDR = '172.20.0.0/16';
  public static DEFAULT_KUBERNETES_VERSION = eks.KubernetesVersion.V1_28;

  public readonly cluster: eks.ICluster;
  public readonly oidcProvider: IOpenIdConnectProvider;
  public readonly kubernetesVersion: eks.KubernetesVersion;
  public readonly kmsKey: IKey;
  public readonly vpc: IVpc;
  public readonly tags: Record<string, string>;

  private readonly cfnCluster: eks.CfnCluster;
  private readonly clusterRole: Role;
  private readonly adminRole: Role;

  constructor(scope: Construct, id: string, props: OurClusterProps) {
    super(scope, id);
    
    /** Most of this is actually stored in a "base" class for us ... but I put it in here for completeness */
    Annotations.of(this).acknowledgeWarning(
      '@aws-cdk/aws-eks:clusterMustManuallyTagSubnet',
      'We know that pre-created subnets must be tagged with kubernetes.io/role/elb=1',
    );
    this.kubernetesVersion = props.kubernetesVersion || OurClusterNative.DEFAULT_KUBERNETES_VERSION;
    this.tags = { ...props.tags, ...OUR_OWN_CUSTOM_TAGS };
    this.kmsKey = new Key(this, 'ClusterSecretEncryptionKey', { enabled: true, enableKeyRotation: true });
    this.vpc = props.vpc || new OurVpc(this, 'ClusterVPC').resource;

    /**
     * First create the IAM Role for the native {@link eks.CfnCluster} resource
     * - we do this exactly the way that the AWS CDK L2 {@link eks.Cluster}
     * code does.
     */
    this.clusterRole = this._clusterRole();

    this.cfnCluster = new eks.CfnCluster(this, 'Default', {
      /**
       * Default top level settings
       */
      version: this.kubernetesVersion.version,
      roleArn: this.clusterRole.roleArn,
      logging: {
        clusterLogging: {
          enabledTypes: [
            { type: eks.ClusterLoggingTypes.API },
            { type: eks.ClusterLoggingTypes.AUDIT },
            { type: eks.ClusterLoggingTypes.AUTHENTICATOR },
            { type: eks.ClusterLoggingTypes.CONTROLLER_MANAGER },
            { type: eks.ClusterLoggingTypes.SCHEDULER },
          ],
        },
      },

      /**
       * https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-eks-accessentry.html
       *
       * This is what allows us to grant access into the cluster via native
       * CloudFormation resources, rather than relying on the "aws-auth"
       * ConfigMap.
       */
      accessConfig: {
        authenticationMode: 'API_AND_CONFIG_MAP',
        // this would be the CDK deploy role itself, which cannot be used to
        // manage the cluster.
        bootstrapClusterCreatorAdminPermissions: false,
      },

      /** Networking configuration for the Kubernetes control plane */
      resourcesVpcConfig: {
        endpointPublicAccess: true,
        endpointPrivateAccess: true,
        subnetIds: props.controlPlaneSubnets
          ? props.controlPlaneSubnets.map((s) => s.subnetId)
          : this.vpc.privateSubnets.map((s) => s.subnetId),
      },

      /** IMMUTABLE FIELDS - CHANGES HERE REQUIRE CREATING NEW CLUSTERS */
      name: props.clusterName,
      encryptionConfig: [{ provider: { keyArn: this.kmsKey.keyArn }, resources: ['secrets'] }],
      kubernetesNetworkConfig: {
        ipFamily: eks.IpFamily.IP_V4,
        serviceIpv4Cidr: OurClusterNative.DEFAULT_SERVICE_IPV4_CIDR,
      },

      /**
       * Let the cluster create its own control plane security group. We can
       * access it later ourselves via an attribute.
       *
       * securityGroup:...
       */
    });

    /**
     * Apply our tags to the L1 construct. Surprisingly the `AWS::EKS::Cluster`
     * resource doesn't even include the default `aws::cloudformation::...`
     * tags.
     */
    Object.entries(this.tags).forEach(([key, val]) => {
      this.cfnCluster.tags.setTag(key, val);
    });

    /**
     * Create the cluster administration IAM role, then create the Kubernetes
     * IAM Role for the {@link eks.KubectlProvider} that will be used to manage
     * the cluster.
     */
    this.adminRole = this._adminRole();
    const kubectlRole = this._kubectlRole(this.adminRole);

    /**
     * Pre-create the KubectlProvider for the cluster to make sure we are using
     * the latest base layers. See {@link OurKubectlProvider} for why this is in
     * its own custom construct.
     */
    const ourKubectlProvider = OurKubectlProvider.getOrCreate(this, {
      clusterName: this.cfnCluster.ref,
      kubectlRoleArn: this.adminRole.roleArn,
      kubectlLambdaRole: kubectlRole,
    });

    /**
     * Generate an L2 {@link eks.ICluster} compatible `ImportedCluster` from
     * the L1 construct. This is what allows us to add in all of the
     * interesting constructs like the {@link eks.KubernetesManifest} resource
     * and the {@link eks.KubectlProvider} to a native CloudFormation resource.
     */
    this.cluster = eks.Cluster.fromClusterAttributes(this, 'ClusterL2', {
      vpc: this.vpc,
      clusterName: this.cfnCluster.ref,
      clusterEndpoint: this.cfnCluster.attrEndpoint,
      clusterCertificateAuthorityData: this.cfnCluster.attrCertificateAuthorityData,
      clusterSecurityGroupId: this.cfnCluster.attrClusterSecurityGroupId,
      clusterEncryptionConfigKeyArn: this.kmsKey.keyArn,
      ipFamily: eks.IpFamily.IP_V4,

      /**
       * https://github.com/aws/aws-cdk/blob/v2.123.0/packages/aws-cdk-lib/aws-eks/lib/cluster.ts#L1863-L1868
       */
      openIdConnectProvider: new eks.OpenIdConnectProvider(this, 'OpenIdConnectProvider', {
        url: this.cfnCluster.attrOpenIdConnectIssuerUrl,
      }),

      /**
       * Configure the {@link KubectlProvider} so that resources like the
       * `KubernetesManifest` can be used.
       */
      kubectlLambdaRole: ourKubectlProvider.lambdaRole,
      kubectlLayer: ourKubectlProvider.layer,
      kubectlProvider: ourKubectlProvider.provider,
      kubectlRoleArn: ourKubectlProvider.roleArn,
      prune: true,
    });

    /**
     * Trigger the lazily-instantiated openIdConnectProvider() get function to
     * provision OIDC IRSA configuration for the cluster.
     */
    this.oidcProvider = this.cluster.openIdConnectProvider;

    /** Create a security group and IAM Role for the nodes */
    // ... and some more things internal to us
  }

  /** Grants a {@link Role} "system:nodes" access into the cluster. */
  public grantNodeRole(scope: Construct, role: IRole): undefined {
    new eks.CfnAccessEntry(scope, `${role.node.id}-Access`, {
      clusterName: this.cfnCluster.ref,
      type: 'EC2_LINUX',
      principalArn: role.roleArn,
    });
  }

  /** Grants a {@link Role} access to the cluster with the supplied list of permissions ARNs. */
  public grantArn(scope: Construct, id: string, arn: string, groups: string[], k8sGroups?: string[]): undefined {
    new eks.CfnAccessEntry(scope, id, {
      clusterName: this.cfnCluster.ref,
      accessPolicies: this._groupsToAccessPolicies(groups),
      type: 'STANDARD',
      principalArn: arn,
      kubernetesGroups: k8sGroups,
    });
  }

  /**
   * Create the IAM Role that the EKS Control Plane itself will use
   *
   * https://github.com/aws/aws-cdk/blob/v2.123.0/packages/aws-cdk-lib/aws-eks/lib/cluster.ts#L1496-L1503
   */
  private _clusterRole(): Role {
    const role = new Role(this, 'ClusterRole', {
      assumedBy: new ServicePrincipal('eks.amazonaws.com'),
      managedPolicies: [ManagedPolicy.fromAwsManagedPolicyName('AmazonEKSClusterPolicy')],
    });
    return role;
  }

  /**
   * Creates the IAM Role that the {@link KubectlProvider} will use to assume
   * the {@link OurClusterNative.adminRole}.
   */
  private _kubectlRole(adminRole: Role): Role {
    const role = new Role(this, 'KubectlRole', {
      assumedBy: new ServicePrincipal('lambda.amazonaws.com'),
      managedPolicies: [ManagedPolicy.fromAwsManagedPolicyName('service-role/AWSLambdaBasicExecutionRole')],
    });

    // https://github.com/aws/aws-cdk/blob/v2.123.0/packages/aws-cdk-lib/aws-eks/lib/cluster.ts#L1652-L1655
    adminRole.assumeRolePolicy!.addStatements(
      new PolicyStatement({ actions: ['sts:AssumeRole'], principals: [role] }),
    );

    return role;
  }

  /**
   * Creates a cluster administration role that will be used to interface with
   * the cluster. This is assumed by the Kubectl Lambda role primarily. See
   * {@link _kubectlRole}).
   */
  private _adminRole(): Role {
    const role = new Role(this, 'AdminRole', {
      assumedBy: new ServicePrincipal('cloudformation.amazonaws.com', { region: Stack.of(this).region }),
    });
    this.grantArn(role, 'AdminRole-Access', role.roleArn, [EksAccessPolicyArn.AMAZON_EKS_CLUSTER_ADMIN_POLICY]);
    return role;
  }

  /**
   * Converts a list of group strings into Access Policy Properties. Also
   * validates the inputs to make sure that the strings look right at
   * compilation time.
   */
  private _groupsToAccessPolicies(groups: string[]): eks.CfnAccessEntry.AccessPolicyProperty[] {
    const policies: eks.CfnAccessEntry.AccessPolicyProperty[] = [];
    groups.forEach((g) => {
      if (!g.startsWith('arn:aws:eks::aws:cluster-access-policy/')) {
        throw new Error(/* ... */);
      }
      policies.push({ policyArn: g, accessScope: { type: EksAccessScopeType.CLUSTER } });
    });
    return policies;
  }
}
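
For reference, here's roughly how we consume the construct (a sketch with hypothetical role variables, not our exact code):

    // Inside a Stack: create the cluster, then grant roles access to it.
    const ourCluster = new OurClusterNative(this, 'Cluster', { clusterName: 'prod' });

    // EC2_LINUX access entry for a node IAM role (the entry type that
    // rejects in-place tag updates).
    ourCluster.grantNodeRole(this, nodeRole);

    // STANDARD access entry granting a CI role cluster-admin access.
    ourCluster.grantArn(this, 'CiRole-Access', ciRole.roleArn, [
      'arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy',
    ]);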

pahud (Contributor) commented Mar 8, 2024

Thank you @diranged, this is a very helpful sample. I have no idea how to fix the tag issue off the top of my head, as we are still looking at access entries. If you have any proposed solutions, please let us know.

pahud (Contributor) commented Mar 8, 2024

OK, back to the tag issue. Looking at your screenshot, it sounds like you just can't add new tags to an existing AccessEntry with type EC2_LINUX. Is that correct?

diranged (Author) commented Mar 8, 2024

OK, back to the tag issue. Looking at your screenshot, it sounds like you just can't add new tags to an existing AccessEntry with type EC2_LINUX. Is that correct?

Correct... which means that, in some way or another, CDK needs to understand that. We've run into this before with other kinds of issues where we need some way for CDK to not change a resource once it's been created, because the update will fail... I am not sure what the right answer here is.
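
One stopgap sketch on our side (untested, just an idea; `cluster` and `nodeRole` are hypothetical) would be an escape hatch that deletes the Tags property from the synthesized entry, so CFN never attempts the in-place update:

    const entry = new eks.CfnAccessEntry(this, 'NodeAccess', {
      clusterName: cluster.clusterName,
      principalArn: nodeRole.roleArn,
      type: 'EC2_LINUX',
    });
    // Strip whatever Tags CDK's tagging aspects rendered onto this resource.
    entry.addPropertyDeletionOverride('Tags');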

pahud (Contributor) commented Mar 12, 2024

@diranged

This is weird, as it seems CFN does not allow you to add Tags on some existing resources, which doesn't make any sense. I am trying to reproduce it on my end.

pahud (Contributor) commented Mar 12, 2024

OK, I can reproduce this bug now, but this appears to be a CloudFormation bug rather than a CDK bug.

This is how I create the cluster and the AccessEntry with CfnAccessEntry:

    // custom function that returns eks.Cluster
    const cluster = getEksCluster(this);
    const ng = new eks.Nodegroup(this, 'NG', {
      cluster,
      desiredSize: 1,
    })
    const entry = new eks.CfnAccessEntry(this, 'AccessEntry', {
      clusterName: cluster.clusterName,
      principalArn: ng.role.roleArn,
      type: 'EC2_LINUX',
    })

Now, if I add Tags like this in the CfnAccessEntry:

    const entry = new eks.CfnAccessEntry(this, 'AccessEntry', {
      clusterName: cluster.clusterName,
      principalArn: ng.role.roleArn,
      type: 'EC2_LINUX',
      // add tags like this after initial deployment
      tags: [
        {
          key: 'foo',
          value: 'bar',
        }
      ]
    })

cdk diff

Resources
[~] AWS::EKS::AccessEntry AccessEntry AccessEntry 
 └─ [+] Tags
     └─ [{"Key":"foo","Value":"bar"}]

cdk synth

 "AccessEntry": {
   "Type": "AWS::EKS::AccessEntry",
   "Properties": {
    "ClusterName": {
     "Ref": "EksClusterFAB68BDB"
    },
    "PrincipalArn": {
     "Fn::GetAtt": [
      "NGNodeGroupRoleD3B1A43D",
      "Arn"
     ]
    },
    "Tags": [
     {
      "Key": "foo",
      "Value": "bar"
     }
    ],
    "Type": "EC2_LINUX"
   },

And cdk deploy to update the deployed AccessEntry:

7:08:06 PM | UPDATE_FAILED | AWS::EKS::AccessEntry | AccessEntry
Resource handler returned message: "This operation can only be performed on Access Entries with a type of "STANDARD". (Service: Eks, Status
Code: 400, Request ID: 0d4360f2-d81c-4d60-ad0a-4db7e107b910)" (RequestToken: ed98adda-7e8c-e04b-bc75-d87ed8dd0319, HandlerErrorCode: Invalid
Request)

I think it's CFN that blocks the Tags update. I'll create an internal ticket to clarify.
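
For what it's worth, one way to confirm that the EKS API itself accepts the tag (which would point at the CFN resource handler rather than the service) is to call TagResource directly. A sketch with the AWS SDK for JavaScript v3, assuming you know the entry's ARN:

    import { EKSClient, TagResourceCommand } from '@aws-sdk/client-eks';

    // Hypothetical ARN of the existing EC2_LINUX access entry.
    const accessEntryArn = 'arn:aws:eks:us-east-1:111122223333:access-entry/my-cluster/...';

    new EKSClient({})
      .send(new TagResourceCommand({ resourceArn: accessEntryArn, tags: { foo: 'bar' } }))
      .then(() => console.log('eks:TagResource accepted the tag'))
      .catch((err) => console.error('eks:TagResource rejected the tag:', err));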

pahud (Contributor) commented Mar 12, 2024

internal tracking: V1294749525

@pahud pahud removed the @aws-cdk/core Related to core CDK functionality label Mar 12, 2024
@pahud pahud self-assigned this Mar 12, 2024
@pahud pahud added the blocked Work is blocked on this issue for this codebase. Other labels or comments may indicate why. label Mar 12, 2024
@evgenyka evgenyka added the needs-cfn This issue is waiting on changes to CloudFormation before it can be addressed. label Mar 14, 2024
pahud (Contributor) commented Mar 19, 2024

This is a CFN bug. I will work with relevant teams internally and update here when we have new progress.

pahud (Contributor) commented Apr 8, 2024

@diranged Hi Matt

This bug should have been fixed. Can you please verify?

@pahud pahud added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. and removed needs-cfn This issue is waiting on changes to CloudFormation before it can be addressed. labels Apr 8, 2024
@pahud pahud removed their assignment Apr 8, 2024
diranged (Author) commented Apr 9, 2024

@diranged Hi Matt

This bug should have been fixed. Can you please verify?

Yes, working on testing it today.

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Apr 9, 2024
diranged (Author) commented Apr 9, 2024

Confirmed, this is fixed in CFN now.

@diranged diranged closed this as completed Apr 9, 2024
github-actions bot commented Apr 9, 2024

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.
