Migrate from rusoto to aws-sdk-rust #849

Closed
wants to merge 65 commits into from

josb
Contributor

@josb josb commented Jul 3, 2022

Fixes #575. Posting this for feedback.

@josb josb requested a review from a team July 3, 2022 19:03
@josb josb marked this pull request as draft July 3, 2022 19:04
@josb josb changed the title from "WIP: Migrate from rusoto to aws-sdk-rust" to "Migrate from rusoto to aws-sdk-rust" Jul 3, 2022
@codecov-commenter

codecov-commenter commented Jul 3, 2022

Codecov Report

Base: 64.91% // Head: 64.80% // Decreases project coverage by -0.10% ⚠️

Coverage data is based on head (d0c1d2a) compared to base (fb0734a).
Patch coverage: 34.22% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #849      +/-   ##
==========================================
- Coverage   64.91%   64.80%   -0.11%     
==========================================
  Files          68       68              
  Lines       11790    11755      -35     
==========================================
- Hits         7653     7618      -35     
  Misses       4137     4137              
| Impacted Files | Coverage Δ |
| --- | --- |
| .../symbolicator-service/src/services/download/gcs.rs | 42.36% <0.00%> (ø) |
| ...mbolicator-service/src/services/download/sentry.rs | 22.11% <0.00%> (ø) |
| ...s/symbolicator-service/src/services/download/s3.rs | 42.51% <30.23%> (+2.34%) ⬆️ |
| crates/symbolicator-sources/src/sources.rs | 88.20% <62.50%> (-3.82%) ⬇️ |
| ...symbolicator-service/src/services/download/http.rs | 82.51% <100.00%> (ø) |
| .../symbolicator-service/src/services/download/mod.rs | 87.03% <100.00%> (ø) |

Contributor

@flub flub left a comment

thanks for picking this up again!

#[error("S3 error code: {1} (http status: {0})")]
S3WithCode(StatusCode, String),
#[error("aws-sdk: failed to fetch data from S3")]
S3SDK(#[from] aws_smithy_http::byte_stream::Error),
Contributor

I believe this should be S3Sdk? Doesn't clippy complain about this?

Contributor Author

I don't see any complaints from Clippy. It's only used here. Do you want me to rename it?

Contributor

yeah, let's rename this

Contributor Author

Renamed.

self.create_s3_client(provider, region)
let provider = LazyCachingCredentialsProvider::builder()
.load(provide_credentials_fn(|| async {
aws_config::ecs::EcsCredentialsProvider::builder()
Contributor

we should check with the folks who added the AutoRefreshingProvider and ask them to test this to see if it works equivalently. I recall at some point they had an outage around this. git blame should point in the right direction.

Contributor Author

I'll investigate this. Once I have this branch working well enough, I can also test this myself to some extent.

Contributor Author

This is still pending me looking into this.

Contributor Author

I am keeping an eye on the seemingly-related awslabs/aws-sdk-rust#629.

tracing::debug!("Skipping response from s3://{}/{}: {}", bucket, &key, err0);
return match &err0 {
ConstructionFailure(err) => {
println!("ERROR: ConstructionFailure: {:?}", err);
Contributor

looks like you're still debugging all this. But you could capture all in a single match arm and maybe capture it as a sentry error before throwing cancelled up the chain?

Contributor Author

Sounds like a good idea but I am not sure how to go about that, any pointers? Doesn't it require access to a Sentry installation and corresponding API keys?

Contributor Author

Looking for some guidance here as I know nothing about Sentry. Some of these can probably be grouped into a single response, but the mapping from Rusoto errors to AWS SDK errors isn't clear to me (although granted I have not spent much time investigating this yet).

Also, the Rusoto code does some XML parsing of the AWS response; I wonder how much of that is still required with the AWS SDK.

let pos = AWS_REGIONS.iter().position(|&x| x == value);
match pos {
Some(_) => Ok(Region::new(String::from(value))),
None => Err(E::custom(format!("unknown region: {}", value))),
Contributor

IIUC this means we need to update our code every time they add a region? That might not be ideal?

Contributor Author

I updated this code to use the EC2 describe_regions function and cache the result. Things I don't like about my solution:

  • I'm calling tokio::runtime::Runtime::new() to grab a runtime so I can call block_on on it. This is needed because the describe_regions call uses an async send wrapper. Would it not be more appropriate to use one of the existing runtimes created in server.rs?
  • I'm creating a shared_config using aws_config::from_env().load().await; not even sure this will work, given that it should really use the credentials_provider (Container or Static) associated with the current source. Thoughts?

Also, does the use of the #[cached] macro look OK?

Contributor Author

Ugh, this does indeed break the unit tests. Have to rethink this, because we need the region list during serde config parsing yet we won't have the credentials until after parsing. And there's no way to get the region list anonymously beforehand so we can pre-populate the list.

Contributor

can you just accept any string in serde and just deal with the error once you try and use an invalid region?

Contributor Author

Yes, I'll do that.

By the way, Rusoto currently also hardcodes the list of known regions, so if a new region is added, a recompile is needed anyway once Rusoto is updated to include it.

Contributor Author

How about I add a simple test at startup to see if sts get-caller-identity works and if it fails, error out with an appropriate error message?


  fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
-     formatter.write_str("string or tuple")
+     formatter.write_str("string")
Contributor

is this type change going to be a problem for some configurations?

Contributor Author

Yes; custom regions, which use tuples per the docs, don't currently work, as the AWS Rust SDK doesn't support them.

Contributor

ouch, that's quite a regression. this custom region support was also contributed by someone who needed it, so this would probably break for them. is there no way at all to support this? the exact way of configuring doesn't necessarily have to stay the same, but somehow supporting it would be needed i think

Contributor Author

I don't think this is in-scope for the AWS SDK, at any rate it doesn't support it today so if we want to offer this functionality we'll have to add it ourselves. :-/

@Swatinem
Member

Yes, we introduced moka for some in-memory LRU caches like the client_cache. It has a moka::future variant with almost the same API that can accept futures.

Member

@Swatinem Swatinem left a comment

This does look very good, thanks for sticking with it for so long!

The only missing thing would be to correctly map the errors. But I wouldn’t spend too much effort on that.
If we correctly flag connection timeout and notfound, we should be good. everything else can be an opaque S3Error.
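
For illustration, a minimal sketch of the mapping suggested above. It deliberately uses a hypothetical stand-in enum instead of the SDK's real `SdkError<GetObjectError>`, whose exact shape is not shown in this thread; `FetchError`, `DownloadError`, and `classify` are made-up names for this example, not Symbolicator's or the SDK's API. The only assumption about S3 itself is that a missing object is reported with the `NoSuchKey` error code or an HTTP 404 status.

```rust
/// Hypothetical stand-in for the failures an S3 `GetObject` call can produce.
/// Only the cases relevant to the mapping are modelled.
#[derive(Debug)]
enum FetchError {
    /// The request timed out before a response arrived.
    Timeout,
    /// The service answered with an error code and HTTP status.
    Service { code: String, http_status: u16 },
    /// Anything else: construction, dispatch, or response-parsing failures.
    Other(String),
}

/// The outcomes the downloader needs to tell apart.
#[derive(Debug, PartialEq, Eq)]
enum DownloadError {
    Timeout,
    NotFound,
    /// Opaque catch-all carrying only a human-readable description.
    S3(String),
}

/// Flags timeouts and missing objects; everything else becomes opaque.
fn classify(err: FetchError) -> DownloadError {
    match err {
        FetchError::Timeout => DownloadError::Timeout,
        // A missing object, identified by error code...
        FetchError::Service { code, .. } if code == "NoSuchKey" => DownloadError::NotFound,
        // ...or by HTTP status alone.
        FetchError::Service { http_status: 404, .. } => DownloadError::NotFound,
        // Any other service error keeps its code and status in the message.
        FetchError::Service { code, http_status } => DownloadError::S3(format!(
            "S3 error code: {} (http status: {})",
            code, http_status
        )),
        FetchError::Other(msg) => DownloadError::S3(msg),
    }
}
```

With a shape like this, the many per-variant arms in `download_source()` could collapse into one conversion step, at the cost of losing detail for the opaque cases.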

assert_eq!(cfg.source_key.access_key, "the-access-key");
assert_eq!(cfg.source_key.secret_key, "the-secret-key");
}
_ => unreachable!(),
}
}

/*
Member

we should probably continue supporting custom regions. I commented above

Comment on lines 169 to 173
.bucket(bucket.clone())
.key(key.clone())
.send();

let source = RemoteDif::from(file_source.clone());
Member

I believe if you clean up all the println!s below, you can avoid these clones as well.

Contributor Author

@josb josb Nov 25, 2022

Fixed by converting the clones to references. I'll deal with the println!s separately.

To get rid of the file_source.clone() I implemented a trait, please let me know if I should just keep the existing approach.

@@ -264,6 +257,7 @@ impl S3Downloader {
}
}

/*
Member

I believe you can leave these tests in, they should be noops in case our credentials are not set.

Contributor Author

I'll have to spend some time to convert the tests from rusoto to aws-sdk-rust, please stay tuned.

Contributor Author

The tests now compile.

@Swatinem
Member

If the tests are giving troubles due to credentials, I can also pull this into a local branch manually and merge from there, that way it should have the right test credentials.

Jos Backus added 11 commits November 23, 2022 15:29
Rusoto hardcoded the available regions, aws-sdk-rust does not. But
there's no way to obtain the current list of valid regions without
making API calls, which means having working credentials and either
defaulting to some region (like us-east-1) or having the user specify a
valid region anyway.
@josb
Contributor Author

josb commented Nov 25, 2022

At this point, as far as I know the major issues are:

  • Custom region support; not sure how to go about this.
  • SdkError<GetObjectError> handling in download_source(); I might be able to clean this up a bit but was hoping for some better guidance from "[guide section]: Error handling" (awslabs/aws-sdk-rust#191).
  • The failing no_permission integration test, related to the previous item.

Anything else you can think of?

@Swatinem
Member

> Custom region support; not sure how to go about this.

I believe it's okay to be as lenient as possible. People will get NotFound / NoPermission if they provide an invalid region.

@josb
Contributor Author

josb commented Nov 28, 2022

BTW by "custom region support" I mean this:

> In order to use a custom region for an S3 compatible service such as Ceph or minio, specify a tuple: ["custom-region-name", "http://minio-address/"].

My patch took away support for this because, unlike Rusoto, the Rust AWS SDK doesn't support a Custom Region variant. Perhaps this can be implemented by wrapping the AWS SDK Region struct in an enum with a variant for the regular Region and one for the custom one; not sure how much work that would be.

@Swatinem
Member

Hm, I don’t think we really support those. But I could be wrong. First time I read about it.

Jos Backus added 3 commits November 30, 2022 13:16
Unlike Rusoto, the Rust AWS SDK doesn't hardcode the currently known
regions, and no validation is provided. API requests with invalid region
names will fail.
@josb
Contributor Author

josb commented Dec 3, 2022

This discussion seems relevant; the answer given suggests that a custom endpoint resolver can be used, as I guess the region is ultimately used to synthesize the URI of the wanted service endpoint.

@Swatinem
Member

Swatinem commented Dec 5, 2022

Ahhh, now I understand what this is about. I should have read the comment more closely (which needs updating btw :-)

> This is a Visitor that forwards string types to rusoto_core::Region's FromStr impl and forwards tuples to rusoto_core::Region's Deserialize impl.

That enum variant indeed consists of name + endpoint:

https://docs.rs/rusoto_core/latest/rusoto_core/enum.Region.html#variant.Custom

I think this is where my confusion was coming from.

I believe in this case we can mimic that behavior and have a:

enum Region {
  Builtin(String),
  Custom {
    name: String,
    endpoint: String,
  }
}

Then on usage one would have to match to then create the appropriate endpoint.
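
To make the "match on usage" step concrete, here is a rough sketch built on the enum proposed above. The helper methods `name` and `endpoint_override` are hypothetical names chosen for this example; how the returned values get wired into the SDK client configuration is left open, since that API is not shown in this thread.

```rust
/// Region configuration as proposed above: either a plain region name or a
/// name plus an explicit endpoint for S3-compatible services.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Region {
    Builtin(String),
    Custom { name: String, endpoint: String },
}

impl Region {
    /// The region name, regardless of variant.
    fn name(&self) -> &str {
        match self {
            Region::Builtin(name) => name.as_str(),
            Region::Custom { name, .. } => name.as_str(),
        }
    }

    /// The explicit endpoint for custom regions. `None` means the client
    /// should derive the endpoint from the region name as usual.
    fn endpoint_override(&self) -> Option<&str> {
        match self {
            Region::Builtin(_) => None,
            Region::Custom { endpoint, .. } => Some(endpoint.as_str()),
        }
    }
}
```

A caller building a client would always pass `name()` as the region and only set a custom endpoint when `endpoint_override()` returns `Some`, which keeps the tuple form from the docs (`["custom-region-name", "http://minio-address/"]`) representable.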

@Swatinem
Member

Thanks @josb for the great work here!

I pulled in your changes and created a new PR: #954

That merges the latest changes, adds back support for custom regions and tries to clean up the error handling a bit, though the error handling from the upstream SDK is truly horrific :-(

@Swatinem Swatinem closed this Dec 28, 2022