-
Notifications
You must be signed in to change notification settings - Fork 295
fix-gcc-10 #1242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
fix-gcc-10 #1242
Conversation
|
Thanks. It would be great to have a CI reproducer first, could you have a look or do you want me to handle it? |
|
I could cherry pick the commit here in case you want to merge them in one go |
yes, please do, first the ci commit then the fix :-) |
eac3479 to
d5ed51f
Compare
|
The fact that it breaks it is a good thing for me. I was chasing bugs in finufft with gcc-10. This might be the cause. flatironinstitute/finufft#780 |
the issue seems to be gcc-10 auto vectorization not xsimd explicit one. Adding that this code too causes issues: https://marco.godbolt.org/z/bjo6o8TfG |
|
if that's a GCC issue, specific to gcc-10, we could add something specific, but is that worth the effort? |
|
gcc-10 has a problem with avx512 but there is also an issue that I introduced when changing the avx512 swizzle: flatironinstitute/finufft#780 (comment) I should fix that. For gcc-10 if this does not pass CI, I am not sure what is best. I just need xsimd to work with gcc-10 as is not too old of a compiler otherwise I need a #error in finufft if avx512 is used to compile finufft. |
test/test_shuffle.cpp
Outdated
| } | ||
| }; | ||
|
|
||
| #if defined(__GNUC__) && (__GNUC__ == 10) && XSIMD_WITH_AVX512F |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang also defines __GNUC__, probably not as __GNUC__ == 10 but && !defined (__clang__) would be safer.
| template <typename ElemT, typename U, U... Vs> | ||
| XSIMD_INLINE constexpr bool is_cross_lane() noexcept | ||
| { | ||
| static_assert(std::is_integral<U>::value, "swizzle mask values must be integral"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you could probably also static_assert one sizeof...(U) to be on the safe side.
| static_assert(sizeof...(Vs) >= 1, "Need at least one lane"); | ||
| constexpr std::size_t N = sizeof...(Vs); | ||
| constexpr std::size_t lane_elems = 16 / sizeof(ElemT); | ||
| return cross_impl128<0, N, lane_elems, U, Vs...>::value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we're in C++14, which has better constexpr support you could probably write this in a more procedural style, using a temporary array and a for loop to perform the check?
e9699a8 to
346301a
Compare
Replace _mm512_cvtsi512_si32 with _mm_cvtsi128_si32(_mm512_castsi512_si128(self)) to fix compilation issues with GCC 10. The _mm512_cvtsi512_si32 intrinsic is not available in older compiler versions.
Public API now checks 128-bit (16-byte) lanes, the standard for SSE/AVX/AVX512. Internal helper available for explicit lane sizes. Uses C++14 constexpr procedural style.
346301a to
1f39847
Compare
Add compiler-specific workaround for GCC 10 with AVX-512F in shuffle tests. Use zip_lo/zip_hi as stable reference for the expected interleave pattern instead of manually constructing the reference arrays.
1f39847 to
df8db98
Compare
On my machine with gcc-10 I get the error:
This fixes it for me.