[ros-users] Problem synchronizing between nodes

John Daly jmdaly at gmail.com
Sat Dec 4 05:08:03 UTC 2010


Hi Ken,

Thanks for your quick reply!

> I don't write controllers, but the issue I think you're having is that
> you're writing clock-driven code (i.e. rate.sleep()), when you really
> want data-driven code.  A data-driven node receives a message, does
> something to it, and publishes the result(s), w/o attempting to loop
> at a particular rate -- the receipt of the message doubles as a clock.
> You would still need a clock-driven code to determine the rate of the
> data going into the system.

I think I've implemented what you suggested. In Node A, I still have a
while loop that's sleeping for some period of time, and then
publishing the state of the robot. This simulates data coming in from
the robot at a regular interval. Then, instead of a while loop with a
sleep in Node B, I've put all of the control law calculations into a
callback that runs when Node A publishes the robot state.

When I run this, I notice that with Node A publishing at a rate of 10
Hz, things seem ok from a synchronization point of view. But my
algorithms need to run at a faster rate, around 100 Hz, to give
correct results. When I run Node A at 100 Hz, it looks like things are
not synchronizing properly. To test this, I put print statements in
the code - one when Node A publishes the state, and another when the
callback in Node B runs. The output "Just finished integrating the
dynamics..." comes from the Node A while loop, while "callback
controller at..." comes from the Node B callback.

The first output here is from running Node A at 10 Hz:

Just finished integrating the dynamics at t = 		1291437697329539060
callback controller at t = 				1291437697329794883
Just finished integrating the dynamics at t = 		1291437697429383039
callback controller at t = 				1291437697429588079
Just finished integrating the dynamics at t = 		1291437697529381036
callback controller at t = 				1291437697529568910
Just finished integrating the dynamics at t = 		1291437697629343032
callback controller at t = 				1291437697629530906
Just finished integrating the dynamics at t = 		1291437697729335069
callback controller at t = 				1291437697729532957
Just finished integrating the dynamics at t = 		1291437697829340934

Things are definitely running in the right order here. (The times are in
nanoseconds, I believe.) Running Node A at 100 Hz gives the following:

Just finished integrating the dynamics at t = 		1291437442491056919
Just finished integrating the dynamics at t = 		1291437442501096010
Just finished integrating the dynamics at t = 		1291437442511007070
Just finished integrating the dynamics at t = 		1291437442521030902
callback controller at t = 				1291437442523684024
callback controller at t = 				1291437442524015903
callback controller at t = 				1291437442524272918
callback controller at t = 				1291437442524468898
Just finished integrating the dynamics at t = 		1291437442531044960
Just finished integrating the dynamics at t = 		1291437442540982961
Just finished integrating the dynamics at t = 		1291437442551059007
callback controller at t = 				1291437442558670997
callback controller at t = 				1291437442558999061
callback controller at t = 				1291437442559350967
Just finished integrating the dynamics at t = 		1291437442561067104
callback controller at t = 				1291437442561253070
Just finished integrating the dynamics at t = 		1291437442571063995
callback controller at t = 				1291437442571234941
Just finished integrating the dynamics at t = 		1291437442581100940
Just finished integrating the dynamics at t = 		1291437442590981006
Just finished integrating the dynamics at t = 		1291437442600979089
Just finished integrating the dynamics at t = 		1291437442610976934
callback controller at t = 				1291437442612562894
callback controller at t = 				1291437442612710952
callback controller at t = 				1291437442612912893
callback controller at t = 				1291437442613102912

So in this case, Node A will publish a number of times, then the
callback in Node B will run a number of times (I presume working on
messages that have been queued up). The fact that the two processes
aren't in sync anymore is a big problem for this algorithm, as it
means Node A will be working on old data for several timesteps, and
the same with the callback in Node B.
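
To put rough numbers on the two logs, here is a minimal Python sketch that
pairs each callback with the oldest publish that has not been handled yet and
reports the lag. It rests on assumptions: the timestamps are nanoseconds, no
messages were dropped, and the log excerpts start with an empty queue. The
helper name fifo_lags_ns is purely illustrative, and only the first few lines
of each log are copied in.

```python
from collections import deque

# Timestamps copied from the two logs above, ASSUMED to be in nanoseconds.
# "pub" = "Just finished integrating the dynamics" (Node A)
# "cb"  = "callback controller" (Node B)
LOG_10HZ = [
    ("pub", 1291437697329539060),
    ("cb", 1291437697329794883),
    ("pub", 1291437697429383039),
    ("cb", 1291437697429588079),
]
LOG_100HZ = [
    ("pub", 1291437442491056919),
    ("pub", 1291437442501096010),
    ("pub", 1291437442511007070),
    ("pub", 1291437442521030902),
    ("cb", 1291437442523684024),
    ("cb", 1291437442524015903),
    ("cb", 1291437442524272918),
    ("cb", 1291437442524468898),
]


def fifo_lags_ns(events):
    """Pair each callback with the oldest unhandled publish (FIFO order)
    and return the lag of each pair in nanoseconds.

    ASSUMPTION: messages are delivered in order and none are dropped.
    """
    pending = deque()  # publish times still waiting for a callback
    lags = []
    for kind, t_ns in events:
        if kind == "pub":
            pending.append(t_ns)
        elif pending:
            lags.append(t_ns - pending.popleft())
    return lags


if __name__ == "__main__":
    for name, log in (("10 Hz", LOG_10HZ), ("100 Hz", LOG_100HZ)):
        lags_ms = [round(lag / 1e6, 3) for lag in fifo_lags_ns(log)]
        print(name, lags_ms, "ms")
```

Under those assumptions, the 10 Hz callbacks run roughly 0.2 to 0.3 ms after
the matching publish, while the first 100 Hz batch lags by about 33, 23, 13
and 3 ms. In other words, the oldest message in that batch is more than three
10 ms periods stale by the time the callback sees it.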

I'm wondering if maybe this is a process scheduling problem. It seems
like Node A isn't being preempted when I'd like it to be. And I can't
find any mechanism in Python for a process to tell the scheduler that
it's ready to be preempted.

So my goal is to have these two processes running in sync, even at
higher frequencies. If you've got any thoughts on what I might be able
to do, that would be great!
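
To make the ordering being asked for concrete, here is a minimal sketch of
lock-step execution. It is not ROS code and not the code from the two nodes:
two threads and two one-slot queues from the Python 3 standard library stand
in for Node A, Node B and the topics between them, and the dynamics and
control law are placeholders. All names (run_lockstep, state_q, control_q)
are hypothetical.

```python
import queue
import threading


def run_lockstep(n_steps):
    """Run a stand-in 'Node A' and 'Node B' in strict alternation.

    Returns the order in which the two sides ran, as (name, step) tuples.
    """
    state_q = queue.Queue(maxsize=1)    # stands in for the state topic (A -> B)
    control_q = queue.Queue(maxsize=1)  # stands in for the control topic (B -> A)
    trace = []

    def node_b():
        for _ in range(n_steps):
            step, x = state_q.get()     # block until A publishes a state
            trace.append(("B", step))
            control_q.put(-0.5 * x)     # placeholder control law

    worker = threading.Thread(target=node_b)
    worker.start()

    x = 1.0                             # placeholder initial state
    for step in range(n_steps):
        trace.append(("A", step))
        state_q.put((step, x))          # "publish" the robot state
        u = control_q.get()             # block until B has answered
        x = x + 0.01 * u                # placeholder dynamics update

    worker.join()
    return trace


if __name__ == "__main__":
    print(run_lockstep(3))
```

The point of the sketch is only that A does not start its next step until B
has answered, so neither side ever works on stale data. Whether something
equivalent can be done between two ROS nodes at 100 Hz is exactly the open
question here.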

Thanks for your help,

-John


