Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reimplement two_qubit_decompose.num_basis_gates in rust #11019

Merged
merged 40 commits into from Jan 29, 2024

Conversation

jlapeyre
Copy link
Contributor

@jlapeyre jlapeyre commented Oct 15, 2023

EDIT: This PR introduces dependency on faer which depends on rustc 1.67, which was released in January 2023. This would cause difficulties for some users who must use older versions of rustc. So this PR remains in draft mode at present.

This PR does two things

I looked at profiling with @mtreinish. It shows that a not-insignificant fraction of CPU time is spent here

  • At the same time, introduce a dependency on the linear algebra library faer. We are looking for a linear library to call from rust. faer is one of several options. A big advantage of faer is that it is pure rust so it incurs no additional build/package/distribute burden. The author prioritizes performance and some benchmarks look good. The project was started, or at least became public, less than a year ago.

    This PR uses faer compute a determinant and eigenvalues of a 4x4 unitary matrix.

Todo:

  • Done make weyl_coordinates return correct values. Or at least agree with current Python implementation.
  • Done make num_basis_get return the correct values. It is currently broken. I just got the code to compile, and have not debugged. Five tests fail if you do tox -epy -- test.python.quantum_info.test_synthesis. No other python tests fail.
  • Done Currently this PR copies numpy complex matrices element by element to faer matrices. This should not be required. The author of faer told me that a version of faer that supports efficient conversion should be released tomorrow. To be clear, the version released tomorrow should support reading numpy arrays without copying.
  • Release notes, formatting, etc. will be done after the above is straightened out.
  • There are some inefficiencies and inelegant things in this PR as stands. We may want to correct these now, or put this off till later.

This partially addresses #8774 . Some of the obvious-looking numerical code in two-qubit synthesis is not in fact a bottleneck. However, this PR would improve performance a bit.

@jlapeyre
Copy link
Contributor Author

jlapeyre commented Oct 16, 2023

Existing tests pass. The following shows that num_basis_gate is 13 times faster with this PR.

from qiskit.providers.fake_provider import FakeVigo
from qiskit.quantum_info.synthesis.two_qubit_decompose import TwoQubitBasisDecomposer
from qiskit.circuit.library import CXGate
from qiskit.transpiler.passes.synthesis.unitary_synthesis import _choose_euler_basis
from qiskit.quantum_info import random_unitary

backend = FakeVigo()
conf = backend.configuration()
basis = conf.basis_gates

basis_fidelity = 1.0
euler_basis = _choose_euler_basis(basis)
pulse_optimize = None
gate = CXGate()

def make_twoq_decomposer():
    backup_optimizer = TwoQubitBasisDecomposer(
        gate,
        basis_fidelity=basis_fidelity,
        euler_basis=euler_basis,
        pulse_optimize=pulse_optimize,
    )
    return backup_optimizer

u = random_unitary(4)
decomp = make_twoq_decomposer()

Then

In [2]: %timeit decomp.old_num_basis_gates(u)
96.3 µs ± 838 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [3]: %timeit decomp.num_basis_gates(u)
7.91 µs ± 48.5 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)


#[pyfunction]
#[pyo3(text_signature = "(basis_b, basis_fidelity, unitary, /")]
pub fn _num_basis_gates(basis_b : f64, basis_fidelity : f64, unitary : PyReadonlyArray2<Complex<f64>>) -> usize {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing we can do here is call blocks_to_matrix (https://github.com/Qiskit/qiskit/blob/main/crates/accelerate/src/convert_2q_block_matrix.rs#L25 ) directly from rust. The output of blocks_to_matrix (via a layer of indirection) gets wrapped in a UnitaryGate and then calls num_basis_gates here: https://github.com/Qiskit/qiskit/blob/main/qiskit/transpiler/passes/optimization/consolidate_blocks.py#L134 to evaluate one of the conditions on whether synthesis is worth it or not (and if it is do the gate consolidation). So we could avoid some python call overhead by just returning this optionally directly from rust.

Although we probably can save this for a follow up to this PR.

Copy link
Contributor Author

@jlapeyre jlapeyre Oct 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can try to put it in here. It looks pretty straightforward, and should be done. But there are a few other issues with this PR to take care of.

@jlapeyre jlapeyre marked this pull request as ready for review October 16, 2023 18:38
@qiskit-bot
Copy link
Collaborator

One or more of the the following people are requested to review this:

  • @Eric-Arellano
  • @Qiskit/terra-core
  • @ikkoham
  • @kevinhartman
  • @levbishop
  • @mtreinish

@jlapeyre
Copy link
Contributor Author

jlapeyre commented Oct 16, 2023

Several tests are failing. Many that call num_basis_gates from consolidate_blocks. Fixed

Cargo.toml Outdated Show resolved Hide resolved
@jlapeyre
Copy link
Contributor Author

jlapeyre commented Oct 17, 2023

EDIT: This bug has been fixed in faer 0.13.1

Some of the matrices encountered in the test suite trigger a bug (or bugs) in faer.
sarah-ek/faer-rs#78

@jlapeyre
Copy link
Contributor Author

Calculating eigenvalues of 4 x 4 matrices is faster with faer. Here eigvals is from numpy, and eigenvalues is a Python interface (in this PR) to the faer routine.

In [16]: u = random_unitary(4).to_matrix()

In [17]: %timeit eigenvalues(u)  # faer
3.83 µs ± 24.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [18]: %timeit eigvals(u)  # numpy
12.5 µs ± 65.4 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [19]: 12.5 / 3.83
Out[19]: 3.2637075718015667

* Moved `eigenvalues` into module `utils`.
* Access num_basis_gates via module two_qubit_decompose
@coveralls
Copy link

coveralls commented Oct 17, 2023

Pull Request Test Coverage Report for Build 7700497765

Warning: This coverage report may be inaccurate.

We've detected an issue with your CI configuration that might affect the accuracy of this pull request's coverage report.
To ensure accuracy in future PRs, please see these guidelines.
A quick fix for this PR: rebase it; your next report should be accurate.

  • -14 of 135 (89.63%) changed or added relevant lines in 4 files are covered.
  • 113 unchanged lines in 15 files lost coverage.
  • Overall coverage increased (+0.009%) to 89.535%

Changes Missing Coverage Covered Lines Changed/Added Lines %
crates/accelerate/src/two_qubit_decompose.rs 107 108 99.07%
crates/accelerate/src/utils.rs 9 22 40.91%
Files with Coverage Reduction New Missed Lines %
crates/qasm2/src/expr.rs 1 93.81%
qiskit/passmanager/passmanager.py 1 93.62%
qiskit/primitives/containers/bindings_array.py 1 92.59%
qiskit/transpiler/passes/layout/set_layout.py 1 95.65%
crates/qasm2/src/lex.rs 2 91.94%
qiskit/utils/parallel.py 3 81.48%
qiskit/test/base.py 5 81.75%
crates/qasm2/src/parse.rs 6 97.62%
crates/accelerate/src/sabre_swap/layer.rs 7 95.27%
qiskit/circuit/gate.py 7 90.91%
Totals Coverage Status
Change from base Build 7659269267: 0.009%
Covered Lines: 59661
Relevant Lines: 66634

💛 - Coveralls

jlapeyre and others added 3 commits January 23, 2024 13:51
Co-authored-by: Matthew Treinish <mtreinish@kortar.org>
Apply several suggested edits made in review

Co-authored-by: Matthew Treinish <mtreinish@kortar.org>
* We introduced to_num_complex, but this required upgrading faer. We chose 0.15
* Upgrade our code, in turn, for the upgrade to faer 0.15
* Remove a function exported via pyo3 that was only used for testing
* Remove `def old_num_basis_gate`
Copy link
Member

@ShellyGarion ShellyGarion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note that in #11460 we deprecated quantum_info/synthesis and moved synthesis code to qiskit/synthesis.

@jlapeyre jlapeyre force-pushed the use-faer-crate branch 2 times, most recently from 1018bb8 to 09b4ecb Compare January 25, 2024 02:24
We recently increased the version of the faer dependency.
The newer version has an abs2 in the api named `faer_abs2`.

So we can delete `myabs2`.
This was previously at 0.15 for accidental reasons.
@mtreinish mtreinish self-assigned this Jan 29, 2024
Copy link
Member

@mtreinish mtreinish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, I think this is basically ready. I just had a few very minor inline comments.

crates/accelerate/Cargo.toml Outdated Show resolved Hide resolved
Comment on lines 77 to 79
let pi = PI;
let pi2 = PI / 2.0;
let pi4 = PI / 4.0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd make these module constants, especially because pi / 4 is reused in multiple functions. I expect it'd be compiled out but it's probably bit better to be explicit here. It'd be something like:

const PI2: f64 = PI / 2.0;
const PI4: f64 = PI / 4.0;
const PI32: f64 = 3.0 * PI2;

crates/accelerate/src/two_qubit_decompose.rs Outdated Show resolved Hide resolved
crates/accelerate/src/two_qubit_decompose.rs Outdated Show resolved Hide resolved
crates/accelerate/src/two_qubit_decompose.rs Outdated Show resolved Hide resolved
Comment on lines +155 to +159
// The originial Python had `np.argmax`, which returns the lowest index in case two or more
// values have a common maximum value.
// `max_by` and `min_by` return the highest and lowest indices respectively, in case of ties.
// So to reproduce `np.argmax`, we use `min_by` and switch the order of the
// arguments in the comparison.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comment here, it explains the logic clearly and otherwise it'd be confusing to see this logic otherwise.

@mtreinish mtreinish added this pull request to the merge queue Jan 29, 2024
Merged via the queue into Qiskit:main with commit 11a826b Jan 29, 2024
12 checks passed
mtreinish added a commit to mtreinish/qiskit-core that referenced this pull request Mar 11, 2024
This commit is a follow up to Qiskit#11019 that uses the rust implementation
of computing the weyl coordinates for a unitary and the
transform_to_magic_basis functions everywhere. Previously they were only
used internally by the num_basis_gates() rust function to estimate the
number of basis gates needed for a decomposition of a given unitary.
This will likely be superseded by Qiskit#11946 when that is working, but this
is mainly an incremental step working towards finishin Qiskit#11946 that
proves the internal rust functions work in the larger decomposer code to
isolate the source of failures in Qiskit#11946.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants