You were right that the problem is not rosout dying. In my latest set of trials, the failure occurred about 200 trials in, and this time rosout did not respawn. On the other hand, there's no sign of deadlock.

The scenario is that node1 sends a ReadyMessage to node2 telling it that it is ready for the next image/depth pair. node2 then sends an image and depth map to node3 (depth_to_cloud), which constructs a PointCloud from the depth map to pass back along to node1. The problem seems to be that the chain of messages gets broken at depth_to_cloud.

Below is a sample of the debug output I'm seeing from depth_to_cloud. This output continues indefinitely until I kill the depth_to_cloud process. Am I right to believe that this output explains the behavior I am seeing? If so, what can I do about it?

Also, I think this output may be responsible for the respawning of rosout. rosout uses quite a lot of memory while these connection failure messages are being logged. Could it be trying to keep all of them in memory as well as writing them to the log files, and eventually running out of memory?
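In case it helps anyone reading the output below, here is a rough sketch of how the retry loop could be tallied from a saved copy of the log. It only assumes the two message formats visible in the sample ("Retrying connection to [host:port] for topic [name]" and "recv() failed with error [...]"); the function name summarize is my own and is not part of roscpp.

```python
import re
from collections import Counter

# Patterns taken from the debug lines in the sample output; other roscpp
# versions may word these messages differently (assumption).
RETRY_RE = re.compile(r"Retrying connection to \[([^\]]+)\] for topic \[([^\]]+)\]")
RECV_ERR_RE = re.compile(r"recv\(\) failed with error \[([^\]]+)\]")


def summarize(log_text):
    """Count retries per (publisher endpoint, topic) and recv() errors per message."""
    # Collapse all whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    retries = Counter(RETRY_RE.findall(flat))
    errors = Counter(RECV_ERR_RE.findall(flat))
    return retries, errors
```

Calling summarize on the contents of the log file returns two counters. If the same endpoint and topic pairs keep accumulating retries with a matching number of "Connection refused" errors, that would be consistent with depth_to_cloud repeatedly reconnecting to publisher ports that are no longer being served.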
Thanks,
Mike

[roscpp_internal] [2010-07-02 09:45:14,190] [thread 0x7f2e9ba8f910]: [DEBUG] Connection to publisher [TCPROS connection to [0.0.0.0:25344 on socket 53]] to topic [/rgbd/depth] dropped
[roscpp_internal] [2010-07-02 09:45:14,314] [thread 0x7f2e9a28c910]: [DEBUG] Retrying connection to [pr-seattle-1:57774] for topic [/rgbd/image]
[roscpp_internal] [2010-07-02 09:45:14,315] [thread 0x7f2e9a28c910]: [DEBUG] Resolved publisher host [pr-seattle-1] to [127.0.1.1]
[roscpp_internal] [2010-07-02 09:45:14,315] [thread 0x7f2e9a28c910]: [DEBUG] Enabling TCP Keepalive on socket [53]
[roscpp_internal] [2010-07-02 09:45:14,315] [thread 0x7f2e9a28c910]: [DEBUG] Connect succeeded to [pr-seattle-1:57774] on socket [53]
[roscpp_internal] [2010-07-02 09:45:14,315] [thread 0x7f2e9a28c910]: [DEBUG] recv() failed with error [Connection refused]
[roscpp_internal] [2010-07-02 09:45:14,319] [thread 0x7f2e9ba8f910]: [DEBUG] Socket [53] received 0/65536 bytes, closing
[roscpp_internal] [2010-07-02 09:45:14,319] [thread 0x7f2e9ba8f910]: [DEBUG] TCP socket [53] closed
[roscpp_internal] [2010-07-02 09:45:14,319] [thread 0x7f2e9ba8f910]: [DEBUG] Connection to publisher [TCPROS connection to [0.0.0.0:25344 on socket 53]] to topic [/rgbd/image] dropped
[roscpp_internal] [2010-07-02 09:45:14,350] [thread 0x7f2e9a28c910]: [DEBUG] Retrying connection to [pr-seattle-1:57774] for topic [/rgbd/depth]
[roscpp_internal] [2010-07-02 09:45:14,350] [thread 0x7f2e9a28c910]: [DEBUG] Resolved publisher host [pr-seattle-1] to [127.0.1.1]
[roscpp_internal] [2010-07-02 09:45:14,350] [thread 0x7f2e9a28c910]: [DEBUG] Enabling TCP Keepalive on socket [53]
[roscpp_internal] [2010-07-02 09:45:14,350] [thread 0x7f2e9a28c910]: [DEBUG] Connect succeeded to [pr-seattle-1:57774] on socket [53]
[roscpp_internal] [2010-07-02 09:45:14,351] [thread 0x7f2e9a28c910]: [DEBUG] recv() failed with error [Connection refused]
[roscpp_internal] [2010-07-02 09:45:14,351] [thread 0x7f2e9ba8f910]: [DEBUG] Socket [53] received 0/65536 bytes, closing
[roscpp_internal] [2010-07-02 09:45:14,351] [thread 0x7f2e9ba8f910]: [DEBUG] TCP socket [53] closed
[roscpp_internal] [2010-07-02 09:45:14,351] [thread 0x7f2e9ba8f910]: [DEBUG] Connection to publisher [TCPROS connection to [0.0.0.0:25344 on socket 53]] to topic [/rgbd/depth] dropped
[roscpp_internal] [2010-07-02 09:45:14,365] [thread 0x7f2e9a28c910]: [DEBUG] Retrying connection to [pr-seattle-1:33946] for topic [/rgbd/depth]
[roscpp_internal] [2010-07-02 09:45:14,365] [thread 0x7f2e9a28c910]: [DEBUG] Resolved publisher host [pr-seattle-1] to [127.0.1.1]
[roscpp_internal] [2010-07-02 09:45:14,365] [thread 0x7f2e9a28c910]: [DEBUG] Enabling TCP Keepalive on socket [53]
[roscpp_internal] [2010-07-02 09:45:14,365] [thread 0x7f2e9a28c910]: [DEBUG] Connect succeeded to [pr-seattle-1:33946] on socket [53]
[roscpp_internal] [2010-07-02 09:45:14,365] [thread 0x7f2e9a28c910]: [DEBUG] recv() failed with error [Connection refused]
[roscpp_internal] [2010-07-02 09:45:14,366] [thread 0x7f2e9ba8f910]: [DEBUG] Socket [53] received 0/65536 bytes, closing
[roscpp_internal] [2010-07-02 09:45:14,366] [thread 0x7f2e9ba8f910]: [DEBUG] TCP socket [53] closed
[roscpp_internal] [2010-07-02 09:45:14,367] [thread 0x7f2e9ba8f910]: [DEBUG] Connection to publisher [TCPROS connection to [0.0.0.0:25344 on socket 53]] to topic [/rgbd/depth] dropped
[roscpp_internal] [2010-07-02 09:45:14,463] [thread 0x7f2e9a28c910]: [DEBUG] Retrying connection to [pr-seattle-1:33946] for topic [/rgbd/image]
[roscpp_internal] [2010-07-02 09:45:14,463] [thread 0x7f2e9a28c910]: [DEBUG] Resolved publisher host [pr-seattle-1] to [127.0.1.1]
[roscpp_internal] [2010-07-02 09:45:14,463] [thread 0x7f2e9a28c910]: [DEBUG] Enabling TCP Keepalive on socket [53]
[roscpp_internal] [2010-07-02 09:45:14,463] [thread 0x7f2e9a28c910]: [DEBUG] Connect succeeded to [pr-seattle-1:33946] on socket [53]
[roscpp_internal] [2010-07-02 09:45:14,463] [thread 0x7f2e9a28c910]: [DEBUG] recv() failed with error [Connection refused]
[roscpp_internal] [2010-07-02 09:45:14,464] [thread 0x7f2e9ba8f910]: [DEBUG] Socket [53] received 0/65536 bytes, closing
[roscpp_internal] [2010-07-02 09:45:14,464] [thread 0x7f2e9ba8f910]: [DEBUG] TCP socket [53] closed

> rosout dying shouldn't affect this unless it's somehow deadlocked those nodes... can you attach gdb to them and get traces from all their threads with "thread apply all bt"?
>
> Josh