 * roscpp: much lower *intra*process latency
 * nodelets: our alternative to shared memory, which takes advantage of the improved roscpp intraprocess performance
 * rospy: much better serialization performance (thanks to James Bowman)

roscpp's marshalling performance has also improved slightly. Deserialization performance should improve significantly once we can break backwards compatibility and remove the Message base class.
