That's just what I was looking for - thanks. Performance is vastly
improved now, down to about 12% CPU usage on the same within-nodelet test.


12% still seems a bit high -- I'm running your test now and it's using 0% of my CPU. Then again, I'm on a pretty beefy machine. What hardware are you running on?

Also, I've updated the wiki to mention this form of publishing:
http://www.ros.org/wiki/roscpp/Overview/Publishers%20and%20Subscribers#Intraprocess_Publishing
http://www.ros.org/wiki/nodelet#Publishing_from_a_Nodelet

Josh