New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add server selection log messages #1221
Conversation
JAVA-4754
JAVA-4754
@@ -118,6 +119,7 @@ public final class Entities { | |||
private final Map<String, TestLoggingInterceptor> clientLoggingInterceptors = new HashMap<>(); | |||
private final Map<String, TestConnectionPoolListener> clientConnectionPoolListeners = new HashMap<>(); | |||
private final Map<String, TestServerListener> clientServerListeners = new HashMap<>(); | |||
private final Map<String, TestClusterListener> clientClusterListeners = new HashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new code using clientClusterListeners
in this PR was written based on the existing code that uses clientServerListeners
.
private ClusterOpeningEvent clusterOpeningEvent; | ||
private ClusterClosedEvent clusterClosingEvent; | ||
private final List<ClusterDescriptionChangedEvent> clusterDescriptionChangedEvents = new ArrayList<>(); | ||
public final class TestClusterListener implements ClusterListener { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes in TestClusterListener
are needed for the spec tests added in this PR.
public void waitForServerDescriptionChangedEvents( | ||
final Predicate<ServerDescriptionChangedEvent> matcher, final int count, final Duration duration) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The signature of this method in TestServerListener
was changed to make the method similar to TestClusterListener.waitForClusterDescriptionChangedEvents
.
The rest of the changes in TestServerListener
is meant to fix concurrency issues in the code, which we had because we previously mistakenly decided that ClusterListener
/ServerListener
/ServerMonitorListener
don't need to care about concurrency (see https://jira.mongodb.org/browse/JAVA-5185).
ReadPreference primaryPreferred = ReadPreference.primaryPreferred(); | ||
List<ServerDescription> primaryPreferredServers = new ReadPreferenceServerSelector(primaryPreferred) | ||
.select(cluster.getCurrentDescription()); | ||
if (primaryPreferred.isEmpty()) { | ||
if (primaryPreferredServers.isEmpty()) { | ||
// Skip doing server selection if we anticipate that no server is readily selectable. | ||
// This approach is racy, and it is still possible to become blocked selecting a server | ||
// even if `primaryPreferredServers` is not empty. | ||
return; | ||
} | ||
|
||
Connection connection = null; | ||
try { | ||
StaticBindingContext context = new StaticBindingContext(NoOpSessionContext.INSTANCE, serverApi, | ||
IgnorableRequestContext.INSTANCE, new OperationContext()); | ||
connection = cluster.selectServer(clusterDescription -> { | ||
for (ServerDescription cur : clusterDescription.getServerDescriptions()) { | ||
if (cur.getAddress().equals(primaryPreferred.get(0).getAddress())) { | ||
return Collections.singletonList(cur); | ||
} | ||
} | ||
return Collections.emptyList(); | ||
}, context.getOperationContext()).getServer().getConnection(context.getOperationContext()); | ||
connection = cluster.selectServer( | ||
new ServerSelector() { | ||
@Override | ||
public List<ServerDescription> select(final ClusterDescription clusterDescription) { | ||
for (ServerDescription cur : clusterDescription.getServerDescriptions()) { | ||
if (cur.getAddress().equals(primaryPreferredServers.get(0).getAddress())) { | ||
return Collections.singletonList(cur); | ||
} | ||
} | ||
return Collections.emptyList(); | ||
} | ||
|
||
@Override | ||
public String toString() { | ||
return "ReadPreferenceServerSelector{" | ||
+ "readPreference=" + primaryPreferred | ||
+ '}'; | ||
} | ||
}, | ||
context.getOperationContext()).getServer().getConnection(context.getOperationContext()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is just to implement toString
for this custom ServerSelector
. Also left a comment explaining the idea behind this custom selector and the issue with it.
logger.debug(message, exception); | ||
} | ||
} | ||
logUnstructured(logMessage, Logger::isDebugEnabled, Logger::debug, Logger::debug); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inference magic is strong in this code, when it comes to the last two arguments.
driver-core/src/main/com/mongodb/internal/logging/LogMessage.java
Outdated
Show resolved
Hide resolved
throw createTimeoutException(serverSelector, curDescription); | ||
Long remainingTimeNanos = maxWaitTimeNanos == null ? null : maxWaitTimeNanos - (curTimeNanos - startTimeNanos); | ||
|
||
if (remainingTimeNanos != null && remainingTimeNanos <= 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note how now remainingTimeNanos <= 0
is used, while previously curTimeNanos - startTimeNanos > maxWaitTimeNanos
was used, which was equivalent to remainingTimeNanos < 0
. The reason we must use <=
instead of <
is that when maxWaitTimeNanos
is 0, we must not wait (as per the getServerSelectionTimeout
docs), and <= 0
makes sure we don't wait, while <
is not guaranteed to give us that behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Javadoc says:
A value of 0 means that it will timeout immediately if no server is available.
A negative value means to wait indefinitely.
The spec only says:
A serverSelectionTimeoutMS of zero MAY have special meaning in some drivers;
zero's meaning is not defined in this spec, but all drivers SHOULD document the meaning of zero.
So this looks good to me.
/** | ||
* Not supported. | ||
*/ | ||
OPERATION("operation"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The final decision on whether to support operation
and to what extent will be done in https://jira.mongodb.org/browse/JAVA-5197.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Just a couple of questions and suggestions. Additionally, could you please confirm if manual testing was performed for unstructured logging?
driver-core/src/main/com/mongodb/internal/logging/StructuredLogger.java
Outdated
Show resolved
Hide resolved
driver-core/src/main/com/mongodb/internal/logging/LogMessage.java
Outdated
Show resolved
Hide resolved
JAVA-4754
Yes, I've just done that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, just a question.
We might run into rebasing issues with CSOT that would require an understanding of this task, even without refactoring, but I don't see a way around that.
|
||
long startTimeNanos = System.nanoTime(); | ||
long curTimeNanos = startTimeNanos; | ||
long maxWaitTimeNanos = getMaxWaitTimeNanos(); | ||
Long maxWaitTimeNanos = getMaxWaitTimeNanos(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this method and related code need to be refactored in order to accomplish the logging spec changes? Note that much of this has been refactored already in the CSOT branch. It seems that we only need to either add methods that log messages, or remove existing logs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remainingTimeNanos
has to be computed and logged, unless it's infinite. This required changing the code.- This PR also fixes a bug in the code that works with the timeout. See Add server selection log messages #1221 (comment) for the details.
- Creating exceptions and logging them is related, which means that the line
throwIfIncompatible(curDescription);
needed to be changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that much of this has been refactored already in the CSOT branch.
I won't be surprised if merging this code from master
with CSOT will effectively require reintroducing the spec changes in the selectServer
methods (not from scratch, though, as the methods that create exceptions and the methods that log them and other messages don't need changes)
throw createTimeoutException(serverSelector, curDescription); | ||
Long remainingTimeNanos = maxWaitTimeNanos == null ? null : maxWaitTimeNanos - (curTimeNanos - startTimeNanos); | ||
|
||
if (remainingTimeNanos != null && remainingTimeNanos <= 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Javadoc says:
A value of 0 means that it will timeout immediately if no server is available.
A negative value means to wait indefinitely.
The spec only says:
A serverSelectionTimeoutMS of zero MAY have special meaning in some drivers;
zero's meaning is not defined in this spec, but all drivers SHOULD document the meaning of zero.
So this looks good to me.
driver-core/src/main/com/mongodb/internal/connection/BaseCluster.java
Outdated
Show resolved
Hide resolved
driver-core/src/main/com/mongodb/internal/connection/BaseCluster.java
Outdated
Show resolved
Hide resolved
JAVA-4754
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One optional change request.
// This hardcoded message guarantees that the `FAILURE` entry for `MongoTimeoutException` does not include | ||
// any information that is specified via other entries, e.g., `SELECTOR` and `TOPOLOGY_DESCRIPTION`. | ||
// The logging spec requires us to avoid such duplication of information. | ||
? MongoTimeoutException.class.getName() + ": Timed out while waiting for a suitable server" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Including the class name seems unnecessary in this context. But if you have a good reason for it, I'm ok if it stays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other branch (failure.toString()
) includes it, so I mimicked it for consistency.
The implemented spec changes: mongodb/specifications@c773534.
JAVA-4754