[ros-users] Question about cmake and SSE optimizations

Radu Bogdan Rusu rusu at willowgarage.com
Tue Mar 23 19:37:47 UTC 2010


Ah yes, binaries... darn. Remember when you used to compile your entire OS from scratch? Those were the days.

It's not going to be easy to provide binary packages for all architectures/optimizations. I mean 32bit vs 64bit is one 
thing, but now if we want different packages for SSE, SSE2, SSE3, SSE4... that won't go well.

Something tells me that we might want to keep these things as cmake macros for source packages, and just go with one or 
two optimizations for binary ones (?).
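
To make that concrete, here is a rough sketch of how the "auto-detect for source builds, force a level for binary builds" idea could look. It is only an illustration under assumptions: SSE_LEVEL and add_sse_flags() are made-up names (not existing rosbuild or CMake API), it only covers GCC-style flags, and it relies on the HAS_SSE*_EXTENSIONS variables having been set by check_cxx_source_runs() probes like the ones quoted below.

```cmake
# Sketch only: SSE_LEVEL and add_sse_flags() are hypothetical names.
# Assumes HAS_SSE_EXTENSIONS, HAS_SSE2_EXTENSIONS and HAS_SSE3_EXTENSIONS
# were already set by check_cxx_source_runs() probes (GCC-style flags only).

# "auto" uses the probe results; a packager can pass e.g. -DSSE_LEVEL=sse2.
set(SSE_LEVEL "auto" CACHE STRING "SSE level: auto, none, sse, sse2, sse3")

if(SSE_LEVEL STREQUAL "auto")
  # Source build: pick the highest level the build machine can run.
  if(HAS_SSE3_EXTENSIONS)
    set(SSE_FLAGS -msse3 -mfpmath=sse)
  elseif(HAS_SSE2_EXTENSIONS)
    set(SSE_FLAGS -msse2 -mfpmath=sse)
  elseif(HAS_SSE_EXTENSIONS)
    set(SSE_FLAGS -msse -mfpmath=sse)
  else()
    set(SSE_FLAGS)
  endif()
elseif(SSE_LEVEL STREQUAL "none")
  # Explicitly disabled: add no SSE flags at all.
  set(SSE_FLAGS)
else()
  # Binary package build: trust the requested level, ignore the build host.
  set(SSE_FLAGS -m${SSE_LEVEL} -mfpmath=sse)
endif()

# Apply the chosen flags to a single target (needs CMake 2.8.12 or newer
# for target_compile_options).
function(add_sse_flags target)
  if(SSE_FLAGS)
    target_compile_options(${target} PRIVATE ${SSE_FLAGS})
  endif()
endfunction()
```

A source build would then just call add_sse_flags(my_target) and get whatever the probes found, while a package build would configure with something like "cmake -DSSE_LEVEL=sse2 .." so the result does not depend on the build machine's CPU.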

Cheers,
Radu.

Rosen Diankov wrote:
> i guess since users have to manually add the debian locations to their
> sources.list, the problem can be resolved by keeping sse optimizations
> in different debian package urls. so yes, it would be great to have
> rosbuild_add_sse_flags, although i don't see any reason why we
> couldn't always enable the checks and then have some global flags that
> force a particular sse level for distribution purposes.
> 
> rosen,
> 
> 
> 2010/3/23 Rosen Diankov <rosen.diankov at gmail.com>:
>> awesome work with sse3 jose! i just checked it on a non-sse3 cpu and
>> it behaves as expected.
>>
>> i agree with radu that we should go with the highest sse available,
>> however you have to be very careful when distributing binaries
>> (especially now that ros has debian releases). i'm not sure what would
>> be the appropriate way to distribute packages compiled with no-sse,
>> sse1 only, sse1-2 only, etc. in another, simpler project i was
>> in, we ended up having one executable for every setting... for
>> sse-optimized math libraries, these checks are actually done during
>> runtime, so they can support multiple sse versions without dealing
>> with this problem
>>
>> rosen,
>>
>> 2010/3/23 Jose Gonzalez <de0a100 at gmail.com>:
>>> Here is the version I got with SSE3 (I'm not familiar with MSVC so I
>>> didn't include it)
>>>
>>> # check for SSE extensions
>>> include(CheckCXXSourceRuns)
>>> if( CMAKE_COMPILER_IS_GNUCC OR CMAKE_COMPILER_IS_GNUCXX )
>>>  set(SSE_FLAGS)
>>>
>>>  set(CMAKE_REQUIRED_FLAGS "-msse3")
>>>  check_cxx_source_runs("
>>>   #include <pmmintrin.h>
>>>
>>>   int main()
>>>   {
>>>      __m128d a, b;
>>>      double vals[2] = {0};
>>>      a = _mm_loadu_pd(vals);
>>>      b = _mm_hadd_pd(a,a);
>>>      _mm_storeu_pd(vals, b);
>>>      return 0;
>>>   }"
>>>   HAS_SSE3_EXTENSIONS)
>>>
>>>  set(CMAKE_REQUIRED_FLAGS "-msse2")
>>>  check_cxx_source_runs("
>>>   #include <emmintrin.h>
>>>
>>>   int main()
>>>   {
>>>       __m128d a, b;
>>>       double vals[2] = {0};
>>>       a = _mm_loadu_pd(vals);
>>>       b = _mm_add_pd(a,a);
>>>       _mm_storeu_pd(vals,b);
>>>       return 0;
>>>    }"
>>>    HAS_SSE2_EXTENSIONS)
>>>
>>>  set(CMAKE_REQUIRED_FLAGS "-msse")
>>>  check_cxx_source_runs("
>>>   #include <xmmintrin.h>
>>>   int main()
>>>   {
>>>       __m128 a, b;
>>>       float vals[4] = {0};
>>>       a = _mm_loadu_ps(vals);
>>>       b = a;
>>>       b = _mm_add_ps(a,b);
>>>       _mm_storeu_ps(vals,b);
>>>       return 0;
>>>   }"
>>>   HAS_SSE_EXTENSIONS)
>>>
>>>  set(CMAKE_REQUIRED_FLAGS)
>>>
>>>  if(HAS_SSE3_EXTENSIONS)
>>>   message(STATUS "Using SSE3 extensions")
>>>   set(SSE_FLAGS "-msse3 -mfpmath=sse")
>>>  elseif(HAS_SSE2_EXTENSIONS)
>>>   message(STATUS "Using SSE2 extensions")
>>>   set(SSE_FLAGS "-msse2 -mfpmath=sse")
>>>  elseif(HAS_SSE_EXTENSIONS)
>>>   message(STATUS "Using SSE extensions")
>>>   set(SSE_FLAGS "-msse -mfpmath=sse")
>>>  endif()
>>>
>>>  add_definitions(${SSE_FLAGS})
>>> elseif(MSVC)
>>>  check_cxx_source_runs("
>>>   #include <emmintrin.h>
>>>
>>>   int main()
>>>   {
>>>       __m128d a, b;
>>>       double vals[2] = {0};
>>>       a = _mm_loadu_pd(vals);
>>>       b = _mm_add_pd(a,a);
>>>       _mm_storeu_pd(vals,b);
>>>       return 0;
>>>    }"
>>>    HAS_SSE2_EXTENSIONS)
>>>  if( HAS_SSE2_EXTENSIONS )
>>>   message(STATUS "Using SSE2 extensions")
>>>   add_definitions( "/arch:SSE2 /fp:fast -D__SSE__ -D__SSE2__" )
>>>  endif()
>>> endif()
>>>
>>> On Tue, Mar 23, 2010 at 2:24 PM, Radu Bogdan Rusu <rusu at willowgarage.com> wrote:
>>>> In terms of SSE only, we can safely go to the highest SSE version available.
>>>>
>>>> Brian Gerkey wrote:
>>>>> It would be great to see this functionality wrapped up in a single
>>>>> CMake macro, something like:
>>>>>   rosbuild_add_sse_flags(target)
>>>>> It would do checks, compute flags, then call
>>>>> rosbuild_add_compile_flags(target ...).  Of course, this assumes that
>>>>> you want the highest / best level of SSE available; I'm not familiar
>>>>> enough with these optimizations to know if that's appropriate, or if
>>>>> you'd rather have finer-grained control.
>>>>>
>>>>> I'd happily add such a macro to rosbuild, for general reuse.  Come to
>>>>> think of it, this would be a nice module to contribute to CMake
>>>>> itself.
>>>>>
>>>>>       brian.
>>>>>
>>>>> On Tue, Mar 23, 2010 at 11:14 AM, Radu Bogdan Rusu
>>>>> <rusu at willowgarage.com> wrote:
>>>>>> Great job Rosen! That's exactly what I had in mind.
>>>>>>
>>>>>> We should add SSE3 and SSE4. Gotta love that dot product :)
>>>>>>
>>>>>> Cheers,
>>>>>> Radu.
>>>>>>
>>>>>> Rosen Diankov wrote:
>>>>>>> i'm pasting cmake code i use for sse checks. it checks for both sse1
>>>>>>> and sse2, although similar checks can be used for sse3-5, there's also
>>>>>>> a check for msvc
>>>>>>>
>>>>>>> # check for SSE extensions
>>>>>>> include(CheckCXXSourceRuns)
>>>>>>> if( CMAKE_COMPILER_IS_GNUCC OR CMAKE_COMPILER_IS_GNUCXX )
>>>>>>>   set(SSE_FLAGS)
>>>>>>>
>>>>>>>   set(CMAKE_REQUIRED_FLAGS "-msse2")
>>>>>>>   check_cxx_source_runs("
>>>>>>>     #include <emmintrin.h>
>>>>>>>
>>>>>>>     int main()
>>>>>>>     {
>>>>>>>         __m128d a, b;
>>>>>>>         double vals[2] = {0};
>>>>>>>         a = _mm_loadu_pd(vals);
>>>>>>>         b = _mm_add_pd(a,a);
>>>>>>>         _mm_storeu_pd(vals,b);
>>>>>>>         return 0;
>>>>>>>      }"
>>>>>>>      HAS_SSE2_EXTENSIONS)
>>>>>>>
>>>>>>>   set(CMAKE_REQUIRED_FLAGS "-msse")
>>>>>>>   check_cxx_source_runs("
>>>>>>>     #include <xmmintrin.h>
>>>>>>>     int main()
>>>>>>>     {
>>>>>>>         __m128 a, b;
>>>>>>>         float vals[4] = {0};
>>>>>>>         a = _mm_loadu_ps(vals);
>>>>>>>         b = a;
>>>>>>>         b = _mm_add_ps(a,b);
>>>>>>>         _mm_storeu_ps(vals,b);
>>>>>>>         return 0;
>>>>>>>     }"
>>>>>>>     HAS_SSE_EXTENSIONS)
>>>>>>>
>>>>>>>   set(CMAKE_REQUIRED_FLAGS)
>>>>>>>
>>>>>>>   if(HAS_SSE2_EXTENSIONS)
>>>>>>>     message(STATUS "Using SSE2 extensions")
>>>>>>>     set(SSE_FLAGS "-msse2 -mfpmath=sse")
>>>>>>>   elseif(HAS_SSE_EXTENSIONS)
>>>>>>>     message(STATUS "Using SSE extensions")
>>>>>>>     set(SSE_FLAGS "-msse -mfpmath=sse")
>>>>>>>   endif()
>>>>>>>
>>>>>>>   add_definitions(${SSE_FLAGS})
>>>>>>> elseif(MSVC)
>>>>>>>   check_cxx_source_runs("
>>>>>>>     #include <emmintrin.h>
>>>>>>>
>>>>>>>     int main()
>>>>>>>     {
>>>>>>>         __m128d a, b;
>>>>>>>         double vals[2] = {0};
>>>>>>>         a = _mm_loadu_pd(vals);
>>>>>>>         b = _mm_add_pd(a,a);
>>>>>>>         _mm_storeu_pd(vals,b);
>>>>>>>         return 0;
>>>>>>>      }"
>>>>>>>      HAS_SSE2_EXTENSIONS)
>>>>>>>   if( HAS_SSE2_EXTENSIONS )
>>>>>>>     message(STATUS "Using SSE2 extensions")
>>>>>>>     add_definitions( "/arch:SSE2 /fp:fast -D__SSE__ -D__SSE2__" )
>>>>>>>   endif()
>>>>>>> endif()
>>>>>>>
>>>>>>> 2010/3/24 Jose Gonzalez <de0a100 at gmail.com>:
>>>>>>>> Thanks for the comments!
>>>>>>>>
>>>>>>>> What about  check_cxx_compiler_flag("-msse2" COMPILER_SUPPORT_SSE2)?
>>>>>>>> Have you ever tried it?
>>>>>>>>
>>>>>>>> On Tue, Mar 23, 2010 at 12:16 PM, Radu Bogdan Rusu
>>>>>>>> <rusu at willowgarage.com> wrote:
>>>>>>>>> Depending on what compiler and cpu you use, some of these flags might be already turned on by default. You can check
>>>>>>>>> that using:
>>>>>>>>>
>>>>>>>>> $ gcc -Q -O3 --help=target
>>>>>>>>>
>>>>>>>>> for example.
>>>>>>>>>
>>>>>>>>> Other than that, we don't have a proper way to enable/disable different SSE optimizations for now. Most ROS packages
>>>>>>>>> cram an add_definitions ("-msse... etc") in their CMakeLists.txt.
>>>>>>>>>
>>>>>>>>> Patches to make this more standard across multiple platforms and enable/disable SSE1-5 automatically would be great!
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Radu.
>>>>>>>>>
>>>>>>>>> Jose Gonzalez wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Doing some experiments in the past with OpenCV (linux/g++) I realized
>>>>>>>>>> that there is a significant performance difference between the
>>>>>>>>>> versions compiled with autoconf and cmake. Digging in the compilation
>>>>>>>>>> scripts I came to the conclusion that the only extra flags that were
>>>>>>>>>> added by the autoconf scripts were the SSE optimizations (and openMP).
>>>>>>>>>>
>>>>>>>>>> Talking about the SSE optimizations, I can see how ROS sets the flags
>>>>>>>>>> in a similar way. When the detected platform is i686, all the SSE
>>>>>>>>>> optimizations are disabled by default (there is a comment "SSE/SSE2
>>>>>>>>>> might probably be not available for some old AMD or VIA processors").
>>>>>>>>>> Before finding out a solution by myself I'd like to ask the list if
>>>>>>>>>> somebody knows a standard way of testing for SSE optimizations in
>>>>>>>>>> cmake.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Jose
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> ros-users mailing list
>>>>>>>>>> ros-users at lists.sourceforge.net
>>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/ros-users
>>>>>>>>>> _______________________________________________
>>>>>>>>>> ros-users mailing list
>>>>>>>>>> ros-users at code.ros.org
>>>>>>>>>> https://code.ros.org/mailman/listinfo/ros-users

-- 
| Radu Bogdan Rusu | http://rbrusu.com/


