Double free in tls_release_read_buffer() #24427
Comments
Another occurrence of the error, again in Program output:
macOS exception report:
Unfortunately I don't have one of those! A double free is quite perplexing, since the location where we free the buffer is in

    static int tls_release_read_buffer(OSSL_RECORD_LAYER *rl)
    {
        TLS_BUFFER *b;

        b = &rl->rbuf;
        if ((rl->options & SSL_OP_CLEANSE_PLAINTEXT) != 0)
            OPENSSL_cleanse(b->buf, b->len);
        OPENSSL_free(b->buf);
        b->buf = NULL;
        return 1;
    }

As you can see, immediately after freeing the buffer we set the pointer to it to NULL. So any subsequent attempt to free the buffer on the same thread (while erroneous) would just be a harmless free of a NULL pointer rather than a double free. This suggests one of two things:

1) There is some other reference to the buffer hanging around somewhere else, which is being freed in some other location in the code (but if so, where? I can't think why that would occur).
2) Possibly a multi-threaded issue? If two threads happen to have references to the same SSL object, that could definitely cause spurious failures like this.

Is your app multi-threaded? I note that the only call to

Do you have any tools such as asan or equivalent on this platform which would help identify the location in the code where the original free occurred?
The app is indeed multi-threaded, but I believe that only one of the threads uses the SSL connection. SSL_MODE_RELEASE_BUFFERS is not used in the app, but I don't know whether it is set by Python. I have not used asan so far, but can have a look at it. I'll look into all of that when I'm back from vacation (week after next).
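One way to check the belief that only a single thread touches the connection is to record which threads call into the socket object. The following is a minimal sketch, not part of the application or of stomp-py: `ThreadAccessRecorder` and `FakeSocket` are hypothetical names introduced here, and wrapping the real `ssl.SSLSocket` this way is an assumption about how the app could be instrumented.

```python
import threading


class ThreadAccessRecorder:
    """Hypothetical debugging wrapper: records which threads call methods on an object."""

    def __init__(self, wrapped):
        self._wrapped = wrapped
        self._lock = threading.Lock()
        self.calls = {}  # thread name -> set of method names called from that thread

    def __getattr__(self, name):
        # Only reached for attributes not found on the recorder itself.
        attr = getattr(self._wrapped, name)
        if not callable(attr):
            return attr

        def recorded(*args, **kwargs):
            with self._lock:
                thread_name = threading.current_thread().name
                self.calls.setdefault(thread_name, set()).add(name)
            return attr(*args, **kwargs)

        return recorded

    def shared_across_threads(self):
        """True if methods were called from more than one thread."""
        with self._lock:
            return len(self.calls) > 1


class FakeSocket:
    """Stand-in for an SSL socket, used only to demonstrate the recorder."""

    def recv(self, n):
        return b""

    def shutdown(self):
        return None
```

If `shared_across_threads()` returns True after a test run, the `calls` mapping shows which thread invoked which method, which would support the multi-threading hypothesis.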
Could you condense this into a reproducer that you can share here? Barring that, running your crashing client under helgrind to detect potential thread races would be useful. Note that you may need to set the PYTHONMALLOC environment variable to make Python behave in a way amenable to helgrind.
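As a rough sketch of that suggestion, the helper below builds the command line and environment for running a Python script under helgrind. It assumes Valgrind is installed and usable on the platform, and it assumes `PYTHONMALLOC=malloc` is the intended setting (it makes CPython use the C library allocator instead of pymalloc). `helgrind_command` and the script name in the usage note are hypothetical.

```python
import os
import sys


def helgrind_command(script, *script_args):
    """Return (argv, env) for running a Python script under Valgrind's helgrind tool."""
    env = dict(os.environ)
    # Assumption: route all CPython allocations through the C library malloc
    # so that Valgrind tracks each allocation individually.
    env["PYTHONMALLOC"] = "malloc"
    argv = ["valgrind", "--tool=helgrind", sys.executable, script, *script_args]
    return argv, env
```

The result could then be passed to `subprocess.run(argv, env=env)`, for example with `helgrind_command("client.py")`.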
When using an SSL connection for the STOMP protocol and testing recovery from a network disconnect, I get an exception in the thread that uses OpenSSL, reporting a double free in tls_release_read_buffer, presumably this one: https://github.com/openssl/openssl/blob/master/ssl/record/methods/tls_common.c#L284

This is in a Python environment on macOS, and the stomp-py Python package is used for support of the STOMP protocol.
The OpenSSL API function called by Python is ssl3_shutdown, and the Python function called by the stomp-py Python package is ssl.SSLSocket.shutdown(). Note the update in my comment, below, where the Python function and OpenSSL API function are different, but the OpenSSL function that traps is the same.

The OpenSSL was installed with Homebrew.
OpenSSL: 3.3.0
macOS: Ventura 13.6.6
Python: 3.12.3
stomp-py: 8.1.2
The network between the Python client application and the STOMP server is via VPN, and when disconnecting the network in this test, I simply disable the VPN.
The error does not always happen, but it occurs in more than 10% of attempts.
Important: As far as I can see, the error so far happened only when I had two instances of the program running. They run in separate Python processes, but on the same OS. They both target the same STOMP server. I had one case where both programs trapped (where I did not collect the exception reports), but in the two cases I reported here, only one of the programs trapped and the other did not.
The Python application that establishes this connection is zhmc_log_forwarder from https://github.com/zhmcclient/zhmc-log-forwarder, but to reproduce the issue with that application, an IBM Z HMC would be necessary :-)
For reproduction with that application:
The error happens when the retry logic attempts to reconnect while the network is still disconnected.
Output of the program:
The call stack with the failure can be seen in Thread 2 in the exception report, below.
Full macOS report of the exception: