MAINT, BUG: bump OpenBLAS #20362
* Fixes scipy#20294.
* Pull in new OpenBLAS binaries as discussed in the above issue, to solve a bug that affects downstream consumers like `scikit-learn`. The OpenBLAS binary builds happened via MacPython/openblas-libs#149 with CI results at links below; the single s390x arch failure should be safe to ignore, we don't distribute binaries for that arch:
  https://app.travis-ci.com/github/MacPython/openblas-libs/builds/269741262
  https://github.com/MacPython/openblas-libs/actions/runs/8493017997
  https://github.com/MacPython/openblas-libs/actions/runs/8493018404
* I had an issue when testing locally with `CIBW_BUILD=cp311-* cibuildwheel --platform linux --archs x86_64`. Let's see if the regular CI catches anything before I try a full blown wheel build here, hopefully that's just some local issue I have.

```
[210/1479] Linking target scipy/special/_ellip_harm_2.cpython-311-x86_64-linux-gnu.so
FAILED: scipy/special/_ellip_harm_2.cpython-311-x86_64-linux-gnu.so
cc -o scipy/special/_ellip_harm_2.cpython-311-x86_64-linux-gnu.so scipy/special/_ellip_harm_2.cpython-311-x86_64-linux-gnu.so.p/meson-generated__ellip_harm_2.c.o scipy/special/_ellip_harm_2.cpython-311-x86_64-linux-gnu.so.p/sf_error.c.o -Wl,--as-needed -Wl,--allow-shlib-undefined -Wl,-O1 -shared -fPIC -Wl,--start-group -lm -Wl,--version-script=/project/scipy/_build_utils/link-version-pyinit.map -L/usr/local/lib '-l$(libprefix}openblas' -Wl,--end-group
/opt/rh/devtoolset-10/root/usr/libexec/gcc/x86_64-redhat-linux/10/ld: cannot find -l$(libprefix}openblas
collect2: error: ld returned 1 exit status
```

[skip circle]
OK, the CI replicates my failure, so maybe it's the hash size/format of the new strings then.
Ah, the shared lib naming scheme probably just changed a bit.
Current layout from the
Last known working layout from a non-stable release:
I confirmed that reverting to the original versions/hashes before gh-20215 allows |
On Windows the error looks like:
On local Linux box:
Of course There are quite a few hash/version/searching related commits at: https://github.com/MacPython/openblas-libs/commits/scipy/
On the assumption that the linker stuff gets propagated by pkg-config/
Probably not a coincidence that there's a change in substitution pattern at the first observed failure (last line above/version used in this branch). There are some suffix changes recently upstream here: |
Maybe a stupid question but in your earlier comment is it really correct to have |
I think you may be right. The smoke seems to be coming from the pkg-config
* attempt to deal with this issue in `*.pc` `Libs` formatting: scipy/scipy#20362 (comment) * this initial attempt aims to effectively revert back to the original `Libs:` line format that was last functional in SciPy wheel builds
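The linker error earlier in the thread (`cannot find -l$(libprefix}openblas`) suggests a variable reference in the `Libs:` line that pkg-config never expanded, since pkg-config only substitutes the `${name}` form. A minimal sketch of how such a reference could be detected is shown below; the helper name `malformed_refs` and the sample `.pc` contents are hypothetical and only modelled on the error message, not taken from the actual upstream file.

```python
import re

# pkg-config expands variables written as ${name}. Any other "$(" or "${"
# token is flagged here as suspicious (the "$$" escape is not handled).
_WELL_FORMED = re.compile(r"\$\{[A-Za-z_][A-Za-z0-9_.]*\}")
_ANY_REF = re.compile(r"\$[({][^\s)}]*[)}]?")


def malformed_refs(pc_text):
    """Return (line_number, token) pairs for references that are not ${name}."""
    problems = []
    for lineno, line in enumerate(pc_text.splitlines(), start=1):
        for match in _ANY_REF.finditer(line):
            token = match.group(0)
            if not _WELL_FORMED.fullmatch(token):
                problems.append((lineno, token))
    return problems


# Hypothetical file contents, modelled on the linker error in this thread.
sample = """\
libprefix=
libdir=/usr/local/lib
Libs: -L${libdir} -l$(libprefix}openblas
"""

print(malformed_refs(sample))  # [(3, '$(libprefix}')]
```

A check like this could run right after the OpenBLAS tarball is unpacked, so that a broken substitution is caught before it reaches the linker command line.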
I've started a first attempt to rebuild the OpenBLAS binaries upstream with this patch: MacPython/openblas-libs@082affc. Wheel builds are at:
This assumes that the uploads will be able to overwrite the previous binaries. |
* `openblas_support` module now contains a shim to change the pkg-config `openblas.pc` `Libs:` line back to its previous formatting in an attempt to deal with: scipy#20362 (comment) [skip circle]
Although my attempt to fix this upstream failed miserably (the So, I'll flush the CI to see if the Windows version is also happy here, and if so then I'll try full blown wheel builds here after that. Perhaps the increasingly-ugly |
* `openblas_support` module now contains a shim to change the pkg-config `openblas.pc` `Libs:` line to avoid a typo in `libprefix` specification in an attempt to deal with: scipy#20362 (comment) [skip circle]
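For readers following along, a shim of this kind can be quite small. The sketch below is not the code from `openblas_support`; the function name `patch_libs_line` is hypothetical, and it assumes the only defect is the mismatched `$(libprefix}` token seen in the linker error, which it rewrites to the `${libprefix}` form that pkg-config expands.

```python
def patch_libs_line(pc_text):
    """Repair the malformed libprefix reference on the Libs: line only."""
    patched = []
    for line in pc_text.splitlines(keepends=True):
        # Only the public "Libs:" line is touched; other lines pass through.
        if line.startswith("Libs:"):
            line = line.replace("$(libprefix}", "${libprefix}")
        patched.append(line)
    return "".join(patched)


# Hypothetical openblas.pc contents, modelled on the error in this thread.
broken = (
    "libprefix=\n"
    "libdir=/usr/local/lib\n"
    "Libs: -L${libdir} -l$(libprefix}openblas\n"
)

fixed = patch_libs_line(broken)
print(fixed.splitlines()[-1])  # Libs: -L${libdir} -l${libprefix}openblas
```

In a real build step the text would be read from and written back to the unpacked `openblas.pc`; the path to that file depends on where the binaries are extracted and is left out here. Restricting the replacement to the `Libs:` line keeps the shim a no-op once upstream ships a corrected file.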
c93f00e to 8511570
* `openblas_support` module now contains a shim to change the pkg-config `openblas.pc` `Libs:` line to avoid a typo in `libprefix` specification in an attempt to deal with: scipy#20362 (comment) [skip circle]
8511570 to aa22e80
* `openblas_support` module and `cibw_before_build_win.sh` now contain shims to change the pkg-config `openblas.pc` `Libs:` line to avoid a typo in `libprefix` specification in an attempt to deal with: scipy#20362 (comment) [skip circle] [skip cirrus]
aa22e80 to 408681a
* shims needed for pkg-config file modification on Windows [skip circle] [skip cirrus]
The only failing CI check here now is unrelated (gh-20365). So, I'll push in an empty commit to probe the wheel builds proper with the new OpenBLAS + shims in this branch. |
* empty commit to test wheel builds [wheel build]
Summarizing current status here, after solving some of the initial problems with shims in this PR branch:
This certainly shows why I don't like to bump OpenBLAS frequently; I guess some upstream changes have already made things trickier since the fairly recent last bump for the non-wheel infra.
No, those are the ILP64 builds, they're not usable for SciPy. We need the ones without |
The
Just rerunning the failed builds on that job should fix the problem. (Side note: I don't have permissions to do that; could I be added, @mattip?)
The macOS arm64 failures look like a
The runners get picked at random; a re-run made one of the two jobs pass. I retriggered the second one three times, but had no luck getting the older runner image.
@tylerjereddy I bet an upgrade to |
Ah, I ignored that because the job had already passed in the previous iteration, but now I see not only did it fail to upload, it also deleted the correct asset before it did:
@rgommers Thanks for the suggestions, I added you as an admin on that repo, so that should do the trick. Once the job that failed to upload completes I'll push a |
* Based on reviewer feedback, bump the `cibuildwheel` version in an attempt to deal with GitHub Actions macOS ARM wheel build flakiness.
* Also fixed an upstream OpenBLAS build/CI issue that prevented a binary from being uploaded for our consumption in our wheel builds.

[wheel build]
Upstream CI upload seems fixed now, and I'm retrying the wheel builds with the latest |
I think the one Cirrus Linux ARM job just timed out at exactly the 1 hour mark; I believe we've seen that before on occasion, and it is not related to any changes here. The rest of the CI/wheel builds are green, so I may proceed with squash-merging.
The shims in |
* Fixes scipy#20294.
* Pull in new OpenBLAS binaries as discussed in the above issue, to solve a bug that affects downstream consumers like `scikit-learn`.
* Some additional shims were needed: bumping `cibuildwheel` to improve GitHub actions MacOS M1 runner support, and `openblas_support` module needed an adjustment to process out an apparent syntax error in the pkg-config file for openblas that we pull in from upstream.
Fixes BUG: Hang on Windows in scikit-learn with 1.13rc1 and 1.14.dev (maybe due to OpenBLAS 0.3.26 update?) #20294.