[Python] Segfault when reading CSV inside Flight server #28374

Closed
asfimport opened this issue May 1, 2021 · 3 comments

@asfimport

Using pyarrow.csv.read_csv inside a Flight server results in a segfault. This did not happen in pyarrow 3.0.0.

The CI build of a library we're building failed and made us aware of the issue.

The attached CSV and Python server/client script demonstrate the problem.

  • Run the server with python crash.py server.

  • Run the client with python crash.py client. The server segfaults with 'Segmentation fault (core dumped)'.

The crash does not happen when just reading the CSV (python crash.py).

This is the stack trace generated by coredumpctl debug of a debug build of commit 2746266:
{code:java}
#0  0x00007f9275cffedc in __gnu_cxx::__atomic_add (__val=1, __mem=0x10) at /usr/include/c++/10.2.0/ext/atomicity.h:55
#1  __gnu_cxx::__atomic_add_dispatch (__val=1, __mem=0x10) at /usr/include/c++/10.2.0/ext/atomicity.h:96
#2  std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_add_ref_copy (this=0x8)
   at /usr/include/c++/10.2.0/bits/shared_ptr_base.h:142
#3  0x00007f9275cfe0a5 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count (this=0x7f92735a2778,
   __r=...) at /usr/include/c++/10.2.0/bits/shared_ptr_base.h:740
#4  0x00007f9275cfd01f in std::__shared_ptr<arrow::StopSourceImpl, (__gnu_cxx::_Lock_policy)2>::__shared_ptr (
   this=0x7f92735a2770) at /usr/include/c++/10.2.0/bits/shared_ptr_base.h:1181
#5  0x00007f9275cfd045 in std::shared_ptr<arrow::StopSourceImpl>::shared_ptr (this=0x7f92735a2770)
   at /usr/include/c++/10.2.0/bits/shared_ptr.h:149
#6  0x00007f9275cfd06b in arrow::StopToken::StopToken (this=0x7f92735a2770)
   at /home/jeroen/dev/python/apache-arrow/dist/include/arrow/util/cancel.h:57
#7  0x00007f9275ce96f7 in __pyx_pf_7pyarrow_4_csv_read_csv (__pyx_self=0x0, __pyx_v_input_file=0x7f929e9f28b0,
   __pyx_v_read_options=0x7f929f49ee80 <_Py_NoneStruct>, __pyx_v_parse_options=0x7f929f49ee80 <_Py_NoneStruct>,
   __pyx_v_convert_options=0x7f929f49ee80 <_Py_NoneStruct>, __pyx_v_memory_pool=0x7f929f49ee80 <_Py_NoneStruct>)
   at /home/jeroen/dev/python/apache-arrow/arrow/python/build/temp.linux-x86_64-3.8/_csv.cpp:14208
#8  0x00007f9275ce8b92 in __pyx_pw_7pyarrow_4_csv_1read_csv (__pyx_self=0x0, __pyx_args=0x7f929ea64be0, __pyx_kwds=0x0)
   at /home/jeroen/dev/python/apache-arrow/arrow/python/build/temp.linux-x86_64-3.8/_csv.cpp:14036
#9  0x00007f929f22cf98 in ?? () from /usr/lib/libpython3.8.so.1.0
#10 0x00007f929f22d5f8 in _PyObject_MakeTpCall () from /usr/lib/libpython3.8.so.1.0
{code}

Based on my limited understanding of the code, it looks like the error is here:
[https://github.com/apache/arrow/blob/master/python/pyarrow/_csv.pyx#L799]
{code:java}
with SignalStopHandler() as stop_handler:
    io_context = CIOContext(
        maybe_unbox_memory_pool(memory_pool),
        (<StopToken> stop_handler.stop_token).stop_token)
{code}

Here stop_token is null because the SignalStopHandler had an empty list of signals on creation, so the branch that creates the StopToken was never taken (https://github.com/apache/arrow/blob/master/python/pyarrow/error.pxi#L191):

{code:java}
if (signal_handlers_enabled and
        threading.current_thread() is threading.main_thread()):
    self._signals = [
        sig for sig in (signal.SIGINT, signal.SIGTERM)
        if signal.getsignal(sig) not in (signal.SIG_DFL,
                                         signal.SIG_IGN, None)]
if not self._signals.empty():
    self._stop_token = StopToken()
    self._stop_token.init(GetResultValue(
        SetSignalStopSource()).token())
    self._enabled = True
{code}
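
The condition that leaves the signal list empty can be reproduced with the standard library alone. The following is a minimal sketch, not pyarrow code: `collect_signals`, `collect_in_worker` and `StopHandlerSketch` are hypothetical names that only mirror the logic quoted above. It also assumes that the Flight server runs its RPC handlers on non-main threads, which would be consistent with the crash appearing only when the client connects, but which this report does not confirm.

```python
import signal
import threading


def collect_signals(signal_handlers_enabled=True):
    """Mirror the selection logic quoted above (hypothetical helper)."""
    signals = []
    if (signal_handlers_enabled and
            threading.current_thread() is threading.main_thread()):
        signals = [
            sig for sig in (signal.SIGINT, signal.SIGTERM)
            if signal.getsignal(sig) not in (signal.SIG_DFL,
                                             signal.SIG_IGN, None)]
    return signals


def collect_in_worker():
    """Run collect_signals() on a non-main thread and return its result."""
    result = []
    worker = threading.Thread(target=lambda: result.append(collect_signals()))
    worker.start()
    worker.join()
    return result[0]


class StopHandlerSketch:
    """Pure-Python analogue of the conditional initialisation quoted above."""

    def __init__(self, signals):
        self.stop_token = None          # stays None when no signals qualify
        self.enabled = False
        if signals:
            self.stop_token = object()  # placeholder for a real stop token
            self.enabled = True


if __name__ == "__main__":
    worker_signals = collect_in_worker()
    print("signals seen from a worker thread:", worker_signals)
    print("stop token built from them:",
          StopHandlerSketch(worker_signals).stop_token)
```

On a worker thread the main-thread check fails, the list stays empty, and the token is never created. In this Python analogue that only yields None, whereas the stack trace above shows the equivalent C++ member being copied while still null.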

Environment: Arch Linux 5.11.16-arch1-1
Originally found on GitHub Actions Ubuntu 20.04.2
Python 3.8 and Python 3.9
Reporter: Jeroen Hoekx
Assignee: David Li / @lidavidm


Note: This issue was originally created as ARROW-12622. Please see the migration documentation for further details.

@asfimport

David Li / @lidavidm:
Thanks for reporting this & digging into it. Both the Flight server and the CSV reader try to install signal handlers to properly react to Ctrl-C while in C++ code, so it looks like they conflict; I'll take a look.
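
For illustration only, here is a generic standard-library sketch of how two components can each install a SIGINT handler without clobbering the other, by saving the previous handler and restoring it on exit. It is not the approach taken in pyarrow or in the eventual fix. `temporary_handler`, `outer_handler`, `inner_handler` and `demo` are hypothetical names, and the code must run on the main thread because `signal.signal` raises `ValueError` elsewhere.

```python
import signal
from contextlib import contextmanager


@contextmanager
def temporary_handler(signum, handler):
    """Install `handler` for `signum`, restoring the previous one on exit."""
    previous = signal.signal(signum, handler)  # returns the old handler
    try:
        yield previous
    finally:
        signal.signal(signum, previous)


def outer_handler(signum, frame):
    """Stand-in for a handler installed by a long-running server."""


def inner_handler(signum, frame):
    """Stand-in for a handler installed by a blocking read."""


def demo():
    """Return the original SIGINT handler and those seen at each level."""
    seen = []
    original = signal.getsignal(signal.SIGINT)
    with temporary_handler(signal.SIGINT, outer_handler):
        seen.append(signal.getsignal(signal.SIGINT))
        with temporary_handler(signal.SIGINT, inner_handler):
            seen.append(signal.getsignal(signal.SIGINT))
        seen.append(signal.getsignal(signal.SIGINT))
    seen.append(signal.getsignal(signal.SIGINT))
    return original, seen
```

Each level sees its own handler while active, and the outer handler is back in place once the inner block exits.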

@asfimport

Antoine Pitrou / @pitrou:
Issue resolved by pull request #10227.

@asfimport

Jeroen Hoekx:
I can confirm that 5259d2b fixes the issue. Our integration tests complete successfully with that version.

Thanks for the quick fix. We look forward to seeing it in a release, although the issue is mitigated for now because we have excluded pyarrow 4.0.0 from our requirements list.

asfimport added this to the 4.0.1 milestone on Jan 11, 2023