-
Notifications
You must be signed in to change notification settings - Fork 10.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[EventEngine] PosixEventEngine DNS Resolver #32701
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really good! Remaining comments are mostly minor.
Please let me know if you have any questions. Thanks!
kHealthCheckRecordName}, | ||
" "); | ||
// TODO(yijiem): build a portable solution for Windows | ||
FILE* f = popen(command.c_str(), "r"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like we might still be duplicating some stuff unnecessarily here, but I might be wrong, and I don't have time to really dig in to understand the details. I'm going to defer to @apolcyn for review of the tests here, with particular focus on making sure that we are not needlessly reinventing any wheels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, I found just a few small things to address.
#include "src/core/lib/iomgr/sockaddr.h" | ||
#ifdef GRPC_POSIX_SOCKET_ARES_EV_DRIVER | ||
#include "src/core/lib/event_engine/posix_engine/tcp_socket_utils.h" | ||
#elif defined(GRPC_WINDOWS_SOCKET_ARES_EV_DRIVER) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: maybe add a // TODO(yijiem)
here for the next step, so the empty block doesn't look like a mistake.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually the #elif
here doesn't really make sense. We would include something here for Windows when porting the Windows' version of IsIpv6LoopbackAvailable()
from iomgr to EE. But there is no need to reserve it here now.
namespace grpc_event_engine { | ||
namespace experimental { | ||
|
||
grpc_core::TraceFlag grpc_trace_ares_resolver(false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: WDYT about just reusing cares_resolver
for the trace flag now that we're landing it in production? Since one implementation will be replaced by the other, and no process will be running both ares versions concurrently, it might make sense to not introduce a separate trace flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Done.
// Don't query for TXT records if the target is "localhost" | ||
if (absl::EqualsIgnoreCase(host, "localhost")) { | ||
event_engine_->Run([callback = std::move(callback)]() mutable { | ||
callback(std::vector<std::string>()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this just be callback({});
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's interesting, callback
takes a absl::StatusOr
which has an explicit default constructor which prohibits list initialization: https://source.corp.google.com/piper///depot/google3/third_party/absl/status/statusor.h;l=216-221?q=file:third_party%2Fabsl%20absl::StatusOr&sq=package:piper%20file:%2F%2Fdepot%2Fgoogle3%20-file:google3%2Fexperimental
fd_node_list_.begin(), fd_node_list_.end(), | ||
[sock = socks[i]](const auto& node) { return node->as == sock; }); | ||
if (iter == fd_node_list_.end()) { | ||
new_list.emplace_back(std::make_unique<FdNode>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: since this is an std::list<std::unique_ptr<FdNode>>
, you don't need emplace here, push_bach
is preferred. I think we have linting for that internally, so you may get that same recommendation on import.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
BTW love the push_bach
since JSBach is my favorite musician :).
@@ -487,9 +490,62 @@ EventEngine::TaskHandle PosixEventEngine::RunAfterInternal( | |||
return handle; | |||
} | |||
|
|||
#if GRPC_ARES == 1 && defined(GRPC_POSIX_SOCKET_TCP) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for the future - something you already know, but worth commenting - that this engine is expected to run on platforms that don't have Ares enabled (Android is the only supported platform I'm aware of anymore if iOS works with Ares, but possibly other platforms will use it well). Before the EE dns functionality can be enabled generally, a basic A/AAAA-only native resolver needs to be implemented as well.
It's probably worth having a TODO for that here or below next to the Crash("unimplemented")
bit. If I missed a TODO for this somewhere already, apologies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! Added a TODO below next to the Crash("unimplemented")
.
@@ -138,14 +140,23 @@ class PosixEventEngine final : public PosixEventEngineWithFdSupport, | |||
public: | |||
class PosixDNSResolver : public EventEngine::DNSResolver { | |||
public: | |||
~PosixDNSResolver() override; | |||
#if GRPC_ARES == 1 && defined(GRPC_POSIX_SOCKET_TCP) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On platforms that don't support ares, this class will have an auto-generated public default constructor. Making a private, no-op default constructor, either in non-ares builds or just always having it, would prevent this from being instantiated on platforms that don't support ares. It's a bit of defensive programming, but I think it conveys the intention a bit better at compile time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! I explicitly deleted the default constructor which should convey the same meaning here.
@drfloob @apolcyn @markdroth, continue the discussion on the concerns in using c-ares on iOS: I'm drafting an implementation with DNSService API (#33233). As I mentioned earlier, this API doesn't support specifying custom DNS servers but can only use which ever servers configured in the system. I took a look at the iomgr implementation, looks like name server is not actually used. So I wonder what's the use case of specifying a different dns server in event engine? is it OK for iOS I always use the system DNS? The other thing that's not supported is using a custom timeout, given the new DNS API removed the deadline parameter, I assume it's OK to rely on system default timeout (5s in my test)? CC: @sampajano |
Sorry for the delay, I just returned from vacation.
It is OK for gRPC on iOS to always use the built-in DNS system. Customizing the DNS authority is an advanced feature which has never been supported on iOS.
I think this is OK too. I'm not aware of any requests for advanced DNS resolution capabilities on iOS, so the default timeout should also be fine. The same is true for Android, to the best of my knowledge. |
I'm fine with iOS not supporting custom DNS servers. But we need to make sure that if a user provides a URI with a custom DNS server, we do not just silently ignore the custom DNS server; we need to reject such a URI if we cannot honor it. I think this means that the EE impl should return a failure when creating the |
I agree. @yijiem can you add a bit to the documentation in your API update here: #33744. Something like "If the caller requests a custom DNS resolver, and the EventEngine implementation does not support it, this must return an error." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really great! The only blocking issue is merging the changes from #33744 into this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent work, Yijie!
This PR implements a c-ares based DNS resolver for EventEngine with the reference from the original [grpc_ares_wrapper.h](../blob/master/src/core/ext/filters/client_channel/resolver/dns/c_ares/grpc_ares_wrapper.h). The PosixEventEngine DNSResolver is implemented on top of that. Tests which use the client channel resolver API ([resolver.h](../blob/master/src/core/lib/resolver/resolver.h#L54)) are ported, namely the [resolver_component_test.cc](../blob/master/test/cpp/naming/resolver_component_test.cc) and the [cancel_ares_query_test.cc](../blob/master/test/cpp/naming/cancel_ares_query_test.cc). The WindowsEventEngine DNSResolver will use the same EventEngine's grpc_ares_wrapper and will be worked on next. The [resolve_address_test.cc](https://github.com/grpc/grpc/blob/master/test/core/iomgr/resolve_address_test.cc) which uses the iomgr [DNSResolver](../blob/master/src/core/lib/iomgr/resolve_address.h#L44) API has been ported to EventEngine's dns_test.cc. That leaves only 2 tests which use iomgr's API, notably the [dns_resolver_cooldown_test.cc](../blob/master/test/core/client_channel/resolvers/dns_resolver_cooldown_test.cc) and the [goaway_server_test.cc](../blob/master/test/core/end2end/goaway_server_test.cc) which probably need to be restructured to use EventEngine DNSResolver (for one thing they override the original grpc_ares_wrapper's free functions). I will try to tackle these in the next step. <!-- If you know who should review your pull request, please assign it to that person, otherwise the pull request would get assigned randomly. If your pull request is for a specific language, please add the appropriate lang label. -->
This PR implements a c-ares based DNS resolver for EventEngine with the reference from the original grpc_ares_wrapper.h. The PosixEventEngine DNSResolver is implemented on top of that. Tests which use the client channel resolver API (resolver.h) are ported, namely the resolver_component_test.cc and the cancel_ares_query_test.cc. The WindowsEventEngine DNSResolver will use the same EventEngine's grpc_ares_wrapper and will be worked on next.
The resolve_address_test.cc which uses the iomgr DNSResolver API has been ported to EventEngine's dns_test.cc. That leaves only 2 tests which use iomgr's API, notably the dns_resolver_cooldown_test.cc and the goaway_server_test.cc which probably need to be restructured to use EventEngine DNSResolver (for one thing they override the original grpc_ares_wrapper's free functions). I will try to tackle these in the next step.