Dark Bit Factory & Gravity
PROGRAMMING => C / C++ / C# => Topic started by: Raizor on August 02, 2012
-
Does anyone know if it's possible to use SSE2 extensions when linking with the /NODEFAULTLIB flag? I'm just wondering if there's some neat workaround that will give me the speed advantage with a minimal code size increase. Any suggestions would be great, thanks.
-
I don't see why not. The SSE compiler flags generate SSE instructions, not calls to library functions, and the same goes for the intrinsics. You might need to write a few helper functions of your own to fix link problems, e.g. for long->float and float->long conversions. It depends on the compile flags and the VC version.
Jim
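To make the float->long point a bit more concrete, here is a minimal sketch, assuming a 32-bit VC build where a plain cast such as (int)f gets compiled into a call to a CRT helper (commonly reported as _ftol2 or _ftol2_sse, depending on version and flags) that is missing under /NODEFAULTLIB. Doing the conversion with intrinsics keeps it as inline instructions. The names ftoi_sse and itof_sse are made up for illustration.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in the SSE1 ones */

/* Hypothetical helper: truncating float -> 32-bit int, like a C cast,
   but emitted as a single cvttss2si instead of a CRT call. */
static int ftoi_sse(float f)
{
    return _mm_cvttss_si32(_mm_set_ss(f));
}

/* Hypothetical helper: 32-bit int -> float via cvtsi2ss. */
static float itof_sse(int i)
{
    return _mm_cvtss_f32(_mm_cvtsi32_ss(_mm_setzero_ps(), i));
}
```

Whether you actually need these depends on what the linker complains about; if there are no unresolved externals after switching on SSE2, the casts are already being inlined.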
-
When you enable SSE code generation, VC tries to auto-vectorize (and it's not very good at it).
This ends up in one of two possible situations:
- It computes intermediate values in parallel, producing a single output value
- It computes four output values in parallel
The latter produces quite a bit of code-size overhead because it also has to handle element counts that aren't a multiple of 4.
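Written out by hand with intrinsics, the two cases look roughly like the sketch below. This is only an illustration of the shape of the code, not what VC actually emits; dot4 and scale are made-up names. The second loop in scale is the remainder handling that costs the extra bytes.

```c
#include <emmintrin.h>
#include <stddef.h>

/* Case 1: four products computed in parallel, reduced to one output value. */
static float dot4(const float *a, const float *b)
{
    __m128 p    = _mm_mul_ps(_mm_loadu_ps(a), _mm_loadu_ps(b));
    __m128 shuf = _mm_shuffle_ps(p, p, _MM_SHUFFLE(2, 3, 0, 1)); /* swap pairs */
    __m128 sums = _mm_add_ps(p, shuf);      /* lane 0 = p0+p1, lane 2 = p2+p3 */
    shuf = _mm_movehl_ps(shuf, sums);       /* bring lane 2 down to lane 0    */
    sums = _mm_add_ss(sums, shuf);          /* lane 0 = p0+p1+p2+p3           */
    return _mm_cvtss_f32(sums);
}

/* Case 2: four output values per iteration, plus a scalar tail loop
   for the leftover elements when n is not a multiple of 4. */
static void scale(float *dst, const float *src, float k, size_t n)
{
    const __m128 vk = _mm_set1_ps(k);
    size_t i = 0;

    for (; i + 4 <= n; i += 4)              /* vector body */
        _mm_storeu_ps(dst + i, _mm_mul_ps(_mm_loadu_ps(src + i), vk));

    for (; i < n; ++i)                      /* remainder: 0 to 3 elements */
        dst[i] = src[i] * k;
}
```

If you control the data, padding buffers to a multiple of 4 and dropping the tail loop is one way to claw the size back.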
-
Thanks guys, I've managed to get it working quite nicely and code size doesn't seem to have been affected too much. K++