Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement LLVM x86 SSE4.2 intrinsics #3622

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

TDecking
Copy link

SSE4.2 is arguably the least important SIMD extension for the x86 ISA, but it should still be supported for the sake of completeness.

@TDecking TDecking force-pushed the sse4_2 branch 2 times, most recently from 1563021 to 6ee977c Compare May 20, 2024 00:51
@RalfJung
Copy link
Member

@eduardosm you've done all the other x86 intrinsics -- can you help review this one? That would be great. :)
(No pressure, and no rush either.)

@RalfJung
Copy link
Member

RalfJung commented May 20, 2024

@TDecking
Thanks for the PR!

I see you've done a whole lot of back-and-forth with CI. Is running tests locally not working for you? We have instructions for that here. If that does not work, please describe your problem on Zulip; having to wait for CI adds a lot of friction so we should figure out how to get it all running on your machine.

Copy link
Contributor

@eduardosm eduardosm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I never had the nerve to implement SSE4.2 string manipulation intrinsics.

I believe this is still work-in-progress, so I left some comments that I hope you find useful to get things working :)

src/shims/x86/sse42.rs Outdated Show resolved Hide resolved
src/shims/x86/sse42.rs Outdated Show resolved Hide resolved
src/shims/x86/sse42.rs Outdated Show resolved Hide resolved
src/shims/x86/sse42.rs Outdated Show resolved Hide resolved
@TDecking
Copy link
Author

@eduardosm Thanks for your input. The implementation is fixed and ready.

src/shims/x86/sse42.rs Outdated Show resolved Hide resolved
src/shims/x86/sse42.rs Outdated Show resolved Hide resolved
src/shims/x86/sse42.rs Outdated Show resolved Hide resolved
src/shims/x86/sse42.rs Outdated Show resolved Hide resolved
src/shims/x86/sse42.rs Outdated Show resolved Hide resolved
src/shims/x86/sse42.rs Outdated Show resolved Hide resolved
src/shims/x86/sse42.rs Outdated Show resolved Hide resolved
@TDecking
Copy link
Author

@eduardosm Ready.

@TDecking
Copy link
Author

@rustbot ready

@rustbot rustbot added the S-waiting-on-review Status: Waiting for a review to complete label May 25, 2024
Copy link
Member

@RalfJung RalfJung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

I can't really review the actual implementations here, as I don't have any clue what they are even trying to implement. 😂 I think in general that's fine, since we have tests -- but to make future maintenance possible here, I ask you to please add a lot more comments. There's a bunch of concepts that are just assumed to be known here, which you have to explain -- like how a string is passed to these functions, or what the difference is between "implicit" and "explicit" (or what these terms even mean).

Also, how good is the test coverage? If someone added a random bug in the middle of one of your functions, would the tests catch that? If not, then it'd be great if you could add more tests, so that when we have to adjust this code in the future (e.g. for Miri API changes) we don't accidentally break it.

check_shim(unprefixed_name, this, link_name, abi, args)?;
let mask = compare_strings(this, &str1, &str2, len, imm)?;

if imm & 0b100_0000 != 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this comparison about?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Distinguishing between a bit mask and a byte mask. This comparison is now commented.

}

// Used to implement the `_mm_cmpestrm` and the `_mm_cmpistrm` functions.
// These functions compare the input strings and return the resulting mask.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the "resulting mask"?

}

/// The main worker for the string comparison intrinsics, where the given
/// strings are analyzed according to the given immediate byte.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to briefly say what the arguments mean, and link to the full documentation.

Consider that this will have to be maintained by people that have never seen any of these intrinsics ever before. :) Please help them by leaving comments they can use as starting point should there ever be bugs or other questions around these intrinsics.

Ok(result)
}

fn check_shim<'mir, 'tcx: 'mir>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we have a global method called check_shim, please call this something more specific. Also add a doc comment saying what it does, and what all the components in the return type mean.

This seems to be used by all the string functions. So maybe this is a good place to explain what an "input string" even is -- how is the string buffer passed to the function, how is the length determined, etc. Again, assume the reader knows nothing about SSE4.2 (or really, SSE in general), and add enough comments that would give them a basic idea of the structure and goals of the code.

}

#[target_feature(enable = "sse4.2")]
unsafe fn test_crc() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an easily accessible set of test vectors we could use to increase coverage here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that I am aware of it. I decided to attach an additional test case for each of the intrinsics just in case.

}

#[target_feature(enable = "sse4.2")]
unsafe fn test_str() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having some tests is a great way to understand what an API does... but here the tests have absolutely no comments so it's not super helpful unfortunately. Is it possible to add some comments that explain what is being tested (at a high level, i.e. for a reader that doesn't know the SSE4.2 functions)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I increased the test coverage of the string analysis intrinsics to cover the remaining features offered by them.
These as well as the remaining tests have comments attached to them.

@RalfJung
Copy link
Member

@eduardosm thanks a lot for your help with the initial review here! 💛

@RalfJung
Copy link
Member

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: Waiting for the PR author to address review comments and removed S-waiting-on-review Status: Waiting for a review to complete labels May 27, 2024
src/shims/x86/sse42.rs Outdated Show resolved Hide resolved
@TDecking
Copy link
Author

@rustbot ready.

@RalfJung I've added some additional test cases, which should cover every possible code path now.
I've also improved the documentation, but I'm not certain wether it is sufficient. To what extent may I assume that a maintainer
will google the intrinsics when performing a major refactor?

@rustbot rustbot added S-waiting-on-review Status: Waiting for a review to complete and removed S-waiting-on-author Status: Waiting for the PR author to address review comments labels May 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Waiting for a review to complete
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants