Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(aws): multiple zone roles #5057

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

hjoshi123
Copy link
Contributor

@hjoshi123 hjoshi123 commented Feb 3, 2025

Description

For more context I am trying to solve the problem where one external dns instance can manage multiple aws hosted zones through role assumption. So ideally we wouldn't need multiple instances or an umbrella chart if my proposal is accepted.
As far the --aws-profile= that would still be valid since external dns would need a profile to have permissions to do the sts:AssumeRole. For backwards compatibility, we could still keep the --aws-assume-role= too although my current change would create a breaking change for it.

My change currently uses the map:

aws-domain-roles=zzz.com=aws:arn...
Once the map is ready then the configured profile through the aws profile or the default profile is used to assume roles for different domains and then whenever the change matches to the hosted zone those creds are used.

Fixes #4526

Checklist

  • Unit tests updated
  • End user documentation updated

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 3, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign szuecs for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link

linux-foundation-easycla bot commented Feb 3, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot
Copy link
Contributor

Welcome @hjoshi123!

It looks like this is your first PR to kubernetes-sigs/external-dns 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/external-dns has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 3, 2025
@k8s-ci-robot
Copy link
Contributor

Hi @hjoshi123. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Feb 3, 2025
@ivankatliarchuk
Copy link
Contributor

Hey. It will not fixes the issue, but it relates. As the issue is more about HOW to, and not WHY something is not supported.

The problem should be clear enough, so that the maintainers will decide where or not it worth to be added to master (disclaimer, I'm not one of them)

You may need to start or look at documentation. As not clear how and what this pull request is solving. It's more like a new feature. So worth to explain you case and setup. As well as share how this was tested, specifically need to make sure this not break current functionality and what is wrong with current flags like --aws-profile=, --aws-assume-role= and etc.

@hjoshi123
Copy link
Contributor Author

For more context @ivankatliarchuk I am trying to solve the problem where one external dns instance can manage multiple aws hosted zones through role assumption. So ideally we wouldn't need multiple instances or an umbrella chart if my proposal is accepted.
As far the --aws-profile= that would still be valid since external dns would need a profile to have permissions to do the sts:AssumeRole. For backwards compatibility, we could still keep the --aws-assume-role= too although my current change would create a breaking change for it.

My change currently uses the map:

aws-domain-roles=zzz.com=aws:arn...

Once the map is ready then the configured profile through the aws profile or the default profile is used to assume roles for different domains and then whenever the change matches to the hosted zone those creds are used.
I agree with you that maintainers might have to decide if this is something that is needed to be supported or not.

@szuecs or @mloiseleur do you guys feel this is within the scope? If yes I could start adding unit tests and make the PR cleaner

@ivankatliarchuk
Copy link
Contributor

Worth adding to the header of the pull request a description

@@ -347,7 +347,7 @@ func TestAWSZones(t *testing.T) {
func TestAWSZonesWithTagFilterError(t *testing.T) {
client := NewRoute53APIStub(t)
provider := &AWSProvider{
clients: map[string]Route53API{defaultAWSProfile: client},
clients: map[string][]*AWSZoneConfig{defaultAWSProfile: {{Route53Config: client}}},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm that reads not obvious enough to me for being a "clients".

type AWSZoneConfig struct {
Config awsv2.Config
HostedZoneName string
Route53Config Route53API
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On my phone so no lookup possible, so maybe I am wrong.
The name seems to be wrong. I think elsewhere we call it client iirc.

@ivankatliarchuk
Copy link
Contributor

ivankatliarchuk commented Feb 20, 2025

For more context I am trying to solve the problem where one external dns instance can manage multiple aws hosted zones through role assumption. So ideally we wouldn't need multiple instances or an umbrella chart if my proposal is accepted.

Not sure what this statement even means. In the issue itself I have described a way, as all the AWS IAM roles should be least privilege, all the external-dns deployments are isolated and deployed to a required location. Umbrella chart was given as an example tool that could be used, but not required. So it's not a disadvantage, it's one of the options how to manage multiple AWS accounts and zones in a secure way.

Initial issue #4526 is about HOW to setup, as there are missing documentation, not missing capabilities.

At the moment external-dns support profiles, how proposed solution differ from profiles setup?

And issue #4526 is about multiple accounts not multiple Domains to be managed by different IAM roles.

I would love to see a tutorial, currently not too clear how this will work when there are 100+ zones in account and let's say are a dozen of AWS accounts.

@hjoshi123
Copy link
Contributor Author

@ivankatliarchuk this solution handles the problem of multiple accounts. It allows you mention the roles you want to use for the hosted zone/domain you want. It could be delegated domains in different accounts or diff domains altogether. As long as the profile you mention has permission to assume the role(s), this solution can manage it. As @szuecs also pointed during my discussion with him in slack, I would also add documentation around this and that if there are multiple roles within the same account then there could be rate limit issues.
I am attaching a trial run that I did on my local machine using my code. Part of is redacted due to the accounts being work related.
Screenshot 2025-02-19 at 08 09 24

@ivankatliarchuk
Copy link
Contributor

Just few concerns

  • From a security standpoint, I have concerns about using a single external-dns instance to manage multiple accounts. This approach seems risky. While I understand there might be a use case for this, I'm very concerned about the security implications and I'm not convinced it's a sound practice. But it's fine, the users could decide.
  • AWS STS API rate limits. At the moment there are issues with Route53 rate limits, this is another dimension on top. external-dns could lock all IAM related actions in the account.....
  • aws-domain-roles=zzz.com=aws:arn... Why not just to pass a list of roles to assume, we already have all the flags for domains, tags, zones and etc? Have a look at filters

@hjoshi123
Copy link
Contributor Author

@ivankatliarchuk to address some of your concerns:

  • Even if a single external dns manages a lot of them all of them would be through assume roles, given an end user sets the role to use least trust policies it should alleviate that. Although that would be good to be included in the documentation.
  • The reason I made it a map is because, not all roles could have the same permissions to manage different hosted zones. There could be a scenario where domain xxx is only allowed to be edited by role1. Hence, the map would provide them a way to map hosted zones to roles.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 24, 2025
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 24, 2025
@hjoshi123 hjoshi123 marked this pull request as ready for review February 25, 2025 03:38
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 25, 2025
@hjoshi123 hjoshi123 changed the title feat(multiple-aws-zone-roles): multiple aws zone roles [WIP] feat(multiple-aws-zone-roles): multiple aws zone roles Feb 25, 2025
@hjoshi123 hjoshi123 requested a review from szuecs February 25, 2025 03:39
@mloiseleur
Copy link
Contributor

/retitle feat(aws): multiple zone roles

@k8s-ci-robot k8s-ci-robot changed the title feat(multiple-aws-zone-roles): multiple aws zone roles feat(aws): multiple zone roles Feb 25, 2025
@mloiseleur
Copy link
Contributor

@hjoshi123 Would you please update documentation accordingly to show how it can be used ?

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 25, 2025
@hjoshi123
Copy link
Contributor Author

@mloiseleur should I also edit the helm chart to add the option to the schema and values file as part of the PR?

@ivankatliarchuk
Copy link
Contributor

ivankatliarchuk commented Feb 28, 2025

High level though and observations that needs to be addressed

  • What this pull request is solving, not clear. The initial issue is about HOW to do multi account and this already supported, more or less. I cold not be 100% sure, but there are missing maybe tutorials at first place?
  • Potential Technical debt. Multiple roles assumption per single external-dns instance. All cloud tools like crollplane-providers, external-secrets, aws-load-balancer-controller, aws-ebs-csi-driver.... all follow model single instance - single role
  • Potential Technical debt. Clash with --aws-assume-role, how this should behave now if --aws-assume-role and this new way is specified? Why one could not simply specify --aws-assume-role, multiple times?
  • Potential Technical debt. Clash with --profile how is going to behave with multiple profiles?
  • Potential Technical debt. AWS STS Api Rate limits, they are high enough. Current situation, external-dns throttle Route53 rate limits in whole account time to time, with this change it could throttle API requests. external-dns is not a single citizen in AWS accounts and we not rate limiting this AWS IAM API requests
  • Potential Technical debt. Let's say we have it set, DOMAIN-A=ROLE,DOMAIN-B=ROLE, DOMAIN-C=ROLE. Are this domains include child, or only current domains?
  • Potential Technical debt. Let's say we have it set, DOMAIN-A=ROLE,DOMAIN-B=ROLE, DOMAIN-C=ROLE. We have --domain-filter=domain-a, how to configure domain filtering for specific accounts, or any other filters? Configuration could became quite difficult to manage quiet quickly
  • Potential Technical debt. Size of the kubernetes manifest is limited. Domains and roles quite lengthy, how many accounts we could fit before the size limit exited?

I'm not sure how is all set. But what stopping an engineer, let's say when there are accounts A,B,C,D,E to simply configure an IAM roles and trusts for account X where external-dns is running, so that single role could have access to Route53 in accounts A,B,C,D,E.

I may misunderstood something, same time releasing as is, sounds quite risky option.

@hjoshi123 hjoshi123 requested a review from mloiseleur February 28, 2025 15:46
@hjoshi123
Copy link
Contributor Author

@ivankatliarchuk Clarifying some points as you told:

  • --aws-assume-role is a string parameter and cannot be provided multiple times because that's not how its written to handle. I have also integrated backwards compatibility meaning that if someone does mention --aws-assume-role and not given in the new field it still works as is. Nothing breaks. If they do provide both, then the map has higher priority.
  • Multiple accounts werent inherently supported in a single external dns unless you deploy multiple instances of them or create an umbrella chart. My code changes doesn't really change the security profile of the code since we already do the role assumption part. It's just an extension to that code.
  • From what I know, domain filters restrict external dns from making changes for other domains, so we have --domain-filter=DOMAIN-A and DOMAIN-A=ROLE,DOMAIN-B=ROLE, DOMAIN-C=ROLE it shouldnt do anything on other domains except domain a. I am not changing any of that behavior.
  • Let's say we have it set, DOMAIN-A=ROLE,DOMAIN-B=ROLE, DOMAIN-C=ROLE. Are this domains include child, or only current domains? The behavior remains same for this. I haven't changed anything. As long as it matches the filter, it uses the particular client.
  • For rate limiting I agree about what you said, and also have added it to documentation. We could revisit it again as part of technical debt.
I'm not sure how is all set. But what stopping an engineer, let's say when there are accounts A,B,C,D,E to simply configure an IAM roles and trusts for account X where external-dns is running, so that single role could have access to Route53 in accounts A,B,C,D,E.

This is not possible as that would be role chaining.. let's say the primary external dns has a pod identity, it can now assume only one role which can have access to a particular account. We can't give the pod identity access to multiple accounts because AWS doesn't allow that. This option would be good if AWS allowed it.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 3, 2025
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 4, 2025
@ivankatliarchuk
Copy link
Contributor

This pull request is a bit controversial to me. There is no matching issue, linked issue is asking for HOW to do it.

On aws, I could create a role A, attach to external DNS and create roles B,C,D,E,.... in other accounts and configure a trust to role A. Hence external-dns could have access to all route53 endpoints in account B,C,D,E without code changes in current codebase.

More important, we not following the model for other k8s-sigs products, where single instance-single account. I could be wrong here, but not found other examples.

@hjoshi123
Copy link
Contributor Author

hjoshi123 commented Mar 5, 2025

@ivankatliarchuk I am not sure if the approach you were mentioning works. Let's say external dns has credentials through pod identity or IRSA it can be told to assume only one role. That role can only have access to route53 of that account. Role chaining is not possible with the current setup.

About the second point, I am not sure. I would defer it to you guys to decide @ivankatliarchuk @szuecs @mloiseleur. I do know in external secrets you can deploy multiple secret stores in a declarative way with different roles on a single deployment without using umbrella charts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Setting UP external-dns for Multiple Accounts in AWS
5 participants