Intraprocess publishing is only significantly faster if you publish a shared_ptr to the message; otherwise it still has to serialize and deserialize as usual.<div><br></div><div>So instead of keeping an image as a class member, filling it, and then publishing it, you need to allocate a new image each time you publish:</div>

<div>// Allocate a fresh message for each publish; don't modify it after publishing, since subscribers share it.</div><div>sensor_msgs::ImagePtr image(new sensor_msgs::Image);</div><div>// Arguments: encoding, rows, cols, step (bytes per row), pointer to the pixel data.</div><div>sensor_msgs::fillImage(*image, "bgr8", height, width, 3 * width, data);</div><div>pub_.publish(image);</div><div><br></div><div>This doesn't appear to be explained anywhere on the wiki other than in the cturtle release notes: <a href="http://www.ros.org/wiki/ROS/ChangeList/1.2/roscpp_changes#Publish.2BAC8-Subscribe_Changes">http://www.ros.org/wiki/ROS/ChangeList/1.2/roscpp_changes#Publish.2BAC8-Subscribe_Changes</a>. I'll remedy that tomorrow.</div>
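<div><br></div><div>To illustrate why the pointer matters, here is a toy model that builds without ROS. It is only a sketch and not roscpp code: ToyImage, ToyTopic and sameBufferDelivered are made-up names, std::shared_ptr stands in for the boost::shared_ptr that roscpp uses, and a plain copy stands in for the serialize/deserialize round trip.</div>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for sensor_msgs::Image: only the pixel buffer matters here.
struct ToyImage {
  std::vector<std::uint8_t> data;
};

typedef std::shared_ptr<const ToyImage> ToyImageConstPtr;

// Hypothetical stand-in for an intraprocess topic. NOT roscpp's implementation.
class ToyTopic {
public:
  typedef std::function<void(const ToyImageConstPtr&)> Callback;

  void subscribe(const Callback& cb) { callbacks_.push_back(cb); }

  // Pointer path: every subscriber receives the publisher's own heap object.
  void publish(const std::shared_ptr<ToyImage>& msg) {
    assert(msg);  // a null message cannot be delivered
    ToyImageConstPtr shared(msg);
    for (const Callback& cb : callbacks_) cb(shared);
  }

  // Value path: the message is copied before delivery. The copy stands in
  // for the serialize/deserialize work done when no shared_ptr is published.
  void publish(const ToyImage& msg) {
    ToyImageConstPtr copy = std::make_shared<const ToyImage>(msg);
    for (const Callback& cb : callbacks_) cb(copy);
  }

private:
  std::vector<Callback> callbacks_;
};

// Returns true if the subscriber saw the publisher's own pixel buffer,
// i.e. the image was handed over without being copied.
bool sameBufferDelivered(bool publish_pointer) {
  const std::size_t width = 1920, height = 1280;

  // Allocate a new image for this publish, as in the snippet above.
  std::shared_ptr<ToyImage> image = std::make_shared<ToyImage>();
  image->data.resize(width * height * 3);  // bgr8: 3 bytes per pixel
  const std::uint8_t* sent = image->data.data();

  bool same = false;
  ToyTopic topic;
  topic.subscribe([&same, sent](const ToyImageConstPtr& m) {
    same = (m->data.data() == sent);
  });

  if (publish_pointer) {
    topic.publish(image);   // shared_ptr overload, no copy
  } else {
    topic.publish(*image);  // by-value overload, full copy of about 7 MB
  }
  return same;
}
```

<div>In this model, sameBufferDelivered(true) reports that the subscriber got the publisher's buffer, while sameBufferDelivered(false) reports a separate copy. The same ownership rule applies as in the real snippet: once the pointer has been published, treat the image as read-only.</div>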

<div><br></div><div>Josh<br><br><div class="gmail_quote">On Fri, Sep 17, 2010 at 1:01 AM, Cartwright, Joel J <span dir="ltr">&lt;<a href="mailto:J.J.Cartwright@hw.ac.uk">J.J.Cartwright@hw.ac.uk</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<div>

<p><font size="2">Hi all,<br>

<br>

I've been running some rough performance tests on transferring images<br>

using nodelets, and have found the overhead to be higher than I<br>

expected. For reference, I'm running ROS cturtle on Ubuntu 10.04 64 bit<br>

(stock kernel 2.6.32-24-generic) on a 2 GHz Intel Core 2 Duo laptop. CPU<br>

use monitored approximately using htop.<br>

<br>

Exchanging one 1920x1280 bgr8 image at 10 fps incurs ~30% CPU use for<br>

the nodelet manager. As expected, this is better than when the nodelets<br>

are run standalone (then ~25% CPU each), but I wanted to check if I'm<br>

doing something wrong that might affect nodelet performance. I've been<br>

using raw Image publishing rather than image_transport, to keep the test<br>

as simple as possible.<br>

<br>

The image 'source' is just an uninitialised unsigned char array of the<br>

correct size. As a comparison, I created a node that merely copies the<br>

source array a number of times, and found that 5 memcpy operations<br>

produce about the same CPU load as the best nodelet case, around 30%.<br>

Of course CPU caching may improve the speed of those 5 consecutive<br>

memcpy operations, so I'm not saying "ROS is copying the data 5 times".<br>

<br>

I've attached the relevant source files. The question is: Is this the<br>

best we can do at present for image transfer, and if not, are there<br>

modifications on the horizon that will improve performance?<br>

<br>

Joel<br>

--<br>

Research Assistant<br>

Ocean Systems Laboratory<br>

Heriot-Watt University, UK<br>

</font>

</p>

</div>

<br>


<br></blockquote></div><br></div>