Improve and update documentation #1207
README.md
| ### Auto detection of the instruction set extension to be used | ||
| The same computation operating on vectors and using the most performant instruction set available: | ||
| The same computation operating on vectors and using the most performant instruction set available at compile time, based on the provided compiler flags (e.g. ``-mavx2`` for GCC and Clang to target AVX2): |
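To make "available at compile time, based on the provided compiler flags" concrete, here is a minimal sketch that does not use xsimd at all. It only relies on the macros that GCC and Clang predefine when flags such as `-mavx2` are passed. The helper name `compile_time_simd_target` is hypothetical, and how xsimd itself picks its default architecture is not shown here.

```cpp
#include <string>

// Hypothetical helper (not part of xsimd): reports the widest x86 SIMD
// extension this translation unit was allowed to target. The macros below
// are predefined by GCC and Clang according to flags such as -mavx2.
inline std::string compile_time_simd_target()
{
#if defined(__AVX512F__)
    return "avx512f";
#elif defined(__AVX2__)
    return "avx2";
#elif defined(__AVX__)
    return "avx";
#elif defined(__SSE4_2__)
    return "sse4.2";
#elif defined(__SSE2__)
    return "sse2";
#else
    // Non-x86 target, or no SIMD-related flag visible through these macros.
    return "generic";
#endif
}
```

The result is fixed when the file is compiled: building the same source with and without `-mavx2` yields two different answers, regardless of the CPU the binary later runs on.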
we could add a reference to this and maybe suggest the arch levels to the user?
I provide an example here: https://github.com/DiamonDinoia/random
Update supported architecture and make various usage scenario more explicit. Fix #1202
372af5c to 19a6d45
> we could add a reference to this and maybe suggest the arch levels to the
> user?

I'm not sure about that: we're in the README, and I assume people should know stuff
about SIMD when using xsimd. One thing we could do is point at an academic
lesson on SIMD (in addition to the Wikipedia page I've already added).
I agree, maybe in the dynamic dispatch page we can mention this.
All these changes look positive; however, they still don't make clear the interaction between Xsimd's capabilities and the compiler target for the application. We have 3 primary variables:
What is the interplay between these variables? Can I do something like compile Xsimd for AVX512 no matter what, and automagically, with one client binary (compiled targeting say SSE2, since that was the former Windows 10 baseline hardware requirement), and if the processor supports say AVX2, Xsimd will leverage that? Or are we capped at SSE2 no matter what in this scenario? If I'm using dispatch to handle multiple architectures, say SSE2 and AVX2, do I target SSE2 or AVX2 in my compiler settings? Answers to these questions are still not made clear. This has little to do with understanding how SIMD works per se, nor processor architecture settings, and everything to do with how the library works under the hood, which is still not crystal clear.
Hi @Jaegermeiste, maybe this presentation helps?
Interesting - so that makes sense: compile the base application and dispatcher at the lowest common denominator (e.g. SSE2), and each "advanced" version of a translation unit needs to be separately compiled with the appropriate flags, prior to the linking stage. That clears the fog for me. I think that's the clarity that needs to be explicitly in the documentation. A nice bonus would be how to reasonably accomplish setting those compiler flags at the per-file level in MSVC and GCC, but I have a feeling you guys would see that as out of scope.
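The model described above can be sketched without xsimd, as a rough illustration only. This is not xsimd's dispatch API; it swaps in the GCC/Clang `target` attribute and `__builtin_cpu_supports` so that everything fits in one file. The attribute stands in for compiling a separate translation unit with `-mavx2`, and the names `sum_baseline`, `sum_avx2`, `select_sum` and the macro `SKETCH_HAS_AVX2_KERNEL` are hypothetical.

```cpp
#include <cstddef>

// Baseline kernel: built with the flags used for the rest of the program
// (the "lowest common denominator", e.g. SSE2 on x86-64).
float sum_baseline(const float* data, std::size_t n)
{
    float acc = 0.f;
    for (std::size_t i = 0; i < n; ++i)
        acc += data[i];
    return acc;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
// "Advanced" kernel: the target attribute lets the compiler emit AVX2
// instructions for this function only, much like building a separate
// file with -mavx2. It must never run on a CPU without AVX2.
__attribute__((target("avx2")))
float sum_avx2(const float* data, std::size_t n)
{
    float acc = 0.f;
    for (std::size_t i = 0; i < n; ++i)
        acc += data[i];
    return acc;
}
#define SKETCH_HAS_AVX2_KERNEL 1
#endif

using sum_fn = float (*)(const float*, std::size_t);

// Dispatcher: compiled at the baseline level, it queries the CPU at
// run time and returns the most capable kernel that is safe to call.
sum_fn select_sum()
{
#ifdef SKETCH_HAS_AVX2_KERNEL
    if (__builtin_cpu_supports("avx2"))
        return sum_avx2;
#endif
    return sum_baseline;
}

// Public entry point: the choice is made once, on first call.
float sum(const float* data, std::size_t n)
{
    static const sum_fn impl = select_sum();
    return impl(data, n);
}
```

Callers only ever see `sum`. The key constraint is the one stated in the comment above: the dispatcher and everything that runs unconditionally must be built for the baseline, and only code reached through the run-time check may use the wider instruction set.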
- Add some note on integration, as a followup to #1207
- Harmonize capitalization of titles

And generic documentation improvements to make things, hopefully, easier to understand.