[ros-users] roscore not starting -- multiple network interfaces a problem?

Patrick Bouffard bouffard at eecs.berkeley.edu
Sun Feb 27 21:32:34 UTC 2011


Thanks Ken,

I had played around with a few combinations of ROS_IP, ROS_HOSTNAME
and ROS_MASTER_URI last night but I guess I didn't hit on the right
one. By setting:

export ROS_IP=10.32.43.1
export ROS_MASTER_URI=http://10.32.43.1:11311

... roscore starts without a hiccup. I also noticed that it works if I
set only ROS_MASTER_URI, though there is a pause between when it prints
"NODES" and "auto-starting new master". So I'm thinking it's best to
have both set, but I'd like a bit more clarity on what the difference
is.
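
For reference, a minimal Python sketch of how the two values above relate.
It assumes the master listens on port 11311, as in the URI above, and
`ros_env_for` is a hypothetical helper for illustration, not part of ROS:

```python
def ros_env_for(ip, port=11311):
    """Return the two settings used above for a given interface address.

    Hypothetical helper: the port default is taken from the URI in this
    thread, not looked up from any ROS API.
    """
    return {
        "ROS_IP": ip,
        "ROS_MASTER_URI": "http://%s:%d" % (ip, port),
    }


if __name__ == "__main__":
    # Print the settings as shell export lines for the eth3 address.
    for name, value in sorted(ros_env_for("10.32.43.1").items()):
        print("export %s=%s" % (name, value))
```

Both values point at the same interface address here; the sketch only shows
how they are built and says nothing about how roscore uses each one.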

For the record here's what happened when I tried your test steps:

{{{
In [1]: import xmlrpclib, os

In [2]: s = xmlrpclib.ServerProxy(os.environ['ROS_MASTER_URI'])

In [3]: s
Out[3]: <ServerProxy for localhost:11311/RPC2>

In [4]: s.getParam('/', '/rosdistro')
^C---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)

/home/bouffard/<ipython console> in <module>()

/usr/lib/python2.6/xmlrpclib.pyc in __call__(self, *args)
   1197         return _Method(self.__send, "%s.%s" % (self.__name, name))
   1198     def __call__(self, *args):
-> 1199         return self.__send(self.__name, args)
   1200
   1201 ##


/usr/lib/python2.6/xmlrpclib.pyc in __request(self, methodname, params)
   1487             self.__handler,
   1488             request,
-> 1489             verbose=self.__verbose
   1490             )
   1491

/usr/lib/python2.6/xmlrpclib.pyc in request(self, host, handler, request_body, verbose)
   1233         self.send_host(h, host)
   1234         self.send_user_agent(h)
-> 1235         self.send_content(h, request_body)
   1236
   1237         errcode, errmsg, headers = h.getreply()

/usr/lib/python2.6/xmlrpclib.pyc in send_content(self, connection, request_body)
   1347         connection.putheader("Content-Type", "text/xml")
   1348         connection.putheader("Content-Length", str(len(request_body)))
-> 1349         connection.endheaders()
   1350         if request_body:
   1351             connection.send(request_body)

/usr/lib/python2.6/httplib.pyc in endheaders(self)
    906             raise CannotSendHeader()
    907
--> 908         self._send_output()
    909
    910     def request(self, method, url, body=None, headers={}):

/usr/lib/python2.6/httplib.pyc in _send_output(self)
    778         msg = "\r\n".join(self._buffer)
    779         del self._buffer[:]
--> 780         self.send(msg)
    781
    782     def putrequest(self, method, url, skip_host=0, skip_accept_encoding=0):

/usr/lib/python2.6/httplib.pyc in send(self, str)
    737         if self.sock is None:
    738             if self.auto_open:
--> 739                 self.connect()
    740             else:
    741                 raise NotConnected()

/usr/lib/python2.6/httplib.pyc in connect(self)
    718         """Connect to the host and port specified in __init__."""
    719         self.sock = socket.create_connection((self.host,self.port),
--> 720                                              self.timeout)
    721
    722         if self._tunnel_host:

/usr/lib/python2.6/socket.pyc in create_connection(address, timeout)
    552             if timeout is not _GLOBAL_DEFAULT_TIMEOUT:
    553                 sock.settimeout(timeout)
--> 554             sock.connect(sa)
    555             return sock
    556

/usr/lib/python2.6/socket.pyc in connect(self, *args)

KeyboardInterrupt:

In [5]:
}}}
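
The traceback above was interrupted inside a blocking TCP connect that had no
timeout. Below is a minimal sketch of a bounded version of the same check,
written for Python 3 rather than the Python 2.6 in the transcript.
`master_reachable` is a hypothetical helper, the 2-second timeout is
arbitrary, and 11311 is assumed as the fallback port only because it appears
in the URI above. It tests only that a TCP connection opens, not that the
master answers XML-RPC, and name resolution itself is not bounded by the
timeout:

```python
import os
import socket
from urllib.parse import urlparse


def master_reachable(master_uri, timeout=2.0):
    """Return True if a TCP connection to the URI's host:port opens in time."""
    parsed = urlparse(master_uri)
    host = parsed.hostname
    if host is None:
        return False  # URI had no usable host part
    port = parsed.port or 11311  # assumed fallback, taken from this thread
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, timeouts and resolution failures.
        return False
    sock.close()
    return True


if __name__ == "__main__":
    uri = os.environ.get("ROS_MASTER_URI", "http://localhost:11311")
    print(uri, "reachable" if master_reachable(uri) else "not reachable")
```

Running it with ROS_MASTER_URI set to different hostnames or IP addresses
should show quickly which ones accept a connection, instead of hanging until
Ctrl+C.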

One thing that's still a bit confusing to me is this statement in the
ROS_IP/ROS_HOSTNAME section of the EnvironmentVariables page:

"""
With the exception of 'localhost', it does not affect the actual bound
address as ROS components bind to all available network interfaces. If
the value is set to localhost, the ROS component will bind only to the
loopback interface. This will prevent remote components from being
able to talk to your local component.
"""

Is this referring only to ROS_HOSTNAME? I was thinking that it would
apply as well to ROS_IP=127.0.0.1. It might be clearer if each of
these variables had its own section.
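
To make the distinction in the quoted passage concrete, here is a small
sketch using plain Python sockets. It is not ROS code and does not show what
ROS does with ROS_IP=127.0.0.1; it only illustrates the difference between
binding to all interfaces and binding to loopback. `bound_address` is a
hypothetical helper:

```python
import socket


def bound_address(bind_host):
    """Bind a throwaway TCP socket to bind_host and return the bound address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((bind_host, 0))  # port 0 lets the OS pick a free port
        return sock.getsockname()[0]
    finally:
        sock.close()


if __name__ == "__main__":
    # An empty host means "all available IPv4 interfaces" (INADDR_ANY).
    print("all interfaces:", bound_address(""))
    # Binding to the loopback address makes the socket reachable locally only.
    print("loopback only: ", bound_address("127.0.0.1"))
```

A listener bound the first way accepts connections arriving on any of eth1,
eth2, eth3 or lo; one bound the second way is invisible to other machines.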

Also, based on what we've seen, is a (low-priority, mind you) ticket
warranted here? I'm not sure if it would be a defect in roscore or
perhaps an enhancement to roswtf to give the hint that ROS_MASTER_URI
(and maybe also ROS_IP/ROS_HOSTNAME) should be set under certain
conditions. Or even an enhancement to roscore so that if it takes
longer than some timeout at that stage of startup, you get a hint as to
what to do.
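
As a rough illustration of the kind of hint meant here, a sketch in Python.
Everything in it is hypothetical: `startup_hint`, the 5-second threshold and
the wording of the messages are made up for illustration and are not taken
from roscore or roswtf:

```python
def startup_hint(env, elapsed, timeout=5.0):
    """Return a hint if the master check took longer than timeout, else None.

    env is a mapping of environment variables; elapsed and timeout are in
    seconds.
    """
    if elapsed <= timeout:
        return None
    suggestions = []
    if not env.get("ROS_MASTER_URI"):
        suggestions.append("ROS_MASTER_URI")
    # ROS_IP and ROS_HOSTNAME are alternatives, so one of them is enough.
    if not (env.get("ROS_IP") or env.get("ROS_HOSTNAME")):
        suggestions.append("ROS_IP or ROS_HOSTNAME")
    if not suggestions:
        return ("Master check took %.1fs; check that the configured "
                "addresses are reachable." % elapsed)
    return ("Master check took %.1fs; consider setting %s."
            % (elapsed, " and ".join(suggestions)))


if __name__ == "__main__":
    print(startup_hint({}, elapsed=12.0))
```

The check is a pure function of the environment and the measured delay, so
it could be called from whichever tool ends up owning the hint.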

Cheers,
Pat


On Sun, Feb 27, 2011 at 10:34 AM, Ken Conley <kwc at willowgarage.com> wrote:
> On Sun, Feb 27, 2011 at 12:59 AM, Patrick Bouffard
> <bouffard at eecs.berkeley.edu> wrote:
>> Hi, I've just set up a new Ubuntu 10.10 box that will be running some
>> ROS nodes, occasionally including roscore. I installed diamondback
>> from debs this evening. This particular machine has a more complex
>> networking setup than others I've set up before and I suspect that is
>> giving me issues with running ROS.
>>
>> I'm pretty sure everything is set up as it ought to be in terms of my
>> .bashrc (just source /opt/ros/diamondback/setup.bash). But when I run
>> roscore it just hangs. After waiting a while and pressing Ctrl+C
>> once, the following is output:
>
> This tells me that something goes wrong whenever anything tries
> to talk to the host described in the master URI.  The only network
> call that occurs by this point is a call to check the existing
> parameter server.
>
> Here is a pure Python script you can use to test this behavior:
>
> import xmlrpclib, os
> s = xmlrpclib.ServerProxy(os.environ['ROS_MASTER_URI'])
> s.getParam('/', '/rosdistro')
>
> You can change the os.environ['ROS_MASTER_URI'] to use different
> hostnames/IP addresses to test the behavior of the network you set up.
>
>>
>> {{{
>> ^C... logging to /home/bouffard/.ros/log/ea02b894-424c-11e0-a499-00226bbd5586/roslaunch-lynx-3561.log
>> Checking log directory for disk usage. This may take awhile.
>> Press Ctrl-C to interrupt
>> Done checking log file disk usage. Usage is <1GB.
>>
>> started roslaunch server http://lynx:52141/
>> ros_comm version 1.4.4
>>
>> SUMMARY
>> ========
>>
>> PARAMETERS
>>  * /rosversion
>>  * /rosdistro
>>
>> NODES
>>
>> auto-starting new master
>> process[master]: started with pid [3576]
>> ROS_MASTER_URI=http://lynx:11311/
>>
>> setting /run_id to ea02b894-424c-11e0-a499-00226bbd5586
>> process[rosout-1]: started with pid [3589]
>> started core service [/rosout]
>> }}}
>>
>> At this point things seem to be working; roswtf returns no errors or
>> warnings, I can run, e.g., rxconsole, rostopic list outputs /rosout
>> and /rosout_agg, etc. But having to hit Ctrl+C is not so great.
>>
>> Also, without roscore running, roswtf hangs as well after displaying:
>>
>> {{{
>> bouffard at lynx:~$ roswtf
>> Loaded plugin tf.tfwtf
>> No package or stack in context
>> ================================================================================
>> Static checks summary:
>>
>> No errors or warnings
>> ================================================================================
>> }}}
>>
>> If I then hit Ctrl+C I get the following traceback:
>>
>> {{{
>> ^CTraceback (most recent call last):
>>  File "/opt/ros/diamondback/ros/bin/roswtf", line 35, in <module>
>>    roswtf.roswtf_main()
>>  File "/opt/ros/diamondback/stacks/ros_comm/utilities/roswtf/src/roswtf/__init__.py", line 93, in roswtf_main
>>    _roswtf_main()
>>  File "/opt/ros/diamondback/stacks/ros_comm/utilities/roswtf/src/roswtf/__init__.py", line 208, in _roswtf_main
>>    master = master_online()
>>  File "/opt/ros/diamondback/stacks/ros_comm/utilities/roswtf/src/roswtf/__init__.py", line 100, in master_online
>>    master.getPid('/roswtf')
>>  File "/usr/lib/python2.6/xmlrpclib.py", line 1199, in __call__
>>    return self.__send(self.__name, args)
>>  File "/usr/lib/python2.6/xmlrpclib.py", line 1489, in __request
>>    verbose=self.__verbose
>>  File "/usr/lib/python2.6/xmlrpclib.py", line 1235, in request
>>    self.send_content(h, request_body)
>>  File "/usr/lib/python2.6/xmlrpclib.py", line 1349, in send_content
>>    connection.endheaders()
>>  File "/usr/lib/python2.6/httplib.py", line 908, in endheaders
>>    self._send_output()
>>  File "/usr/lib/python2.6/httplib.py", line 780, in _send_output
>>    self.send(msg)
>>  File "/usr/lib/python2.6/httplib.py", line 739, in send
>>    self.connect()
>>  File "/usr/lib/python2.6/httplib.py", line 720, in connect
>>    self.timeout)
>>  File "/usr/lib/python2.6/socket.py", line 554, in create_connection
>>    sock.connect(sa)
>>  File "<string>", line 1, in connect
>> KeyboardInterrupt
>> bouffard at lynx:~$
>> }}}
>
> My theory is that this is the same pause as described above.  It's
> hanging in an xmlrpc call to the master (aka Parameter Server).
>
>> Just to check that it wasn't something in the latest diamondback
>> release candidate, I dist-upgraded and tried these same commands on
>> a couple of other machines (that have been running some version of ROS
>> for a while and are similarly configured, Ubuntu 10.0, diamondback
>> debs) with no problems.
>>
>> Based on the roswtf traceback and the main weirdness of the current
>> box being its network config (it has three wired network interfaces),
>> I'm suspecting it has something to do with that. However, I still see
>> the same behaviour if I sudo ifdown all the interfaces besides lo.
>>
>> I'm not a networking expert but I noticed on the EnvironmentVariables
>> wiki page: ".. ROS components bind to all available network
>> interfaces.". Could this have something to do with my issues?
>
> You can change this behavior by setting ROS_IP or ROS_HOSTNAME.  Using
> either tells a particular process to bind to a specific interface.
> All evidence thus far is that something is wrong with how the 'lynx'
> hostname is configured.
>
>  - Ken
>
>> Here's the output of ifconfig -a in case that helps:
>>
>> {{{
>> bouffard at lynx:~$ ifconfig -a
>> eth1      Link encap:Ethernet  HWaddr xx:xx:xx:xx:xx:xx
>>          inet addr:128.32.43.208  Bcast:128.32.43.255  Mask:255.255.255.0
>>          inet6 addr: fe80::218:8bff:fe74:766d/64 Scope:Link
>>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>          RX packets:1066 errors:0 dropped:0 overruns:0 frame:0
>>          TX packets:504 errors:0 dropped:0 overruns:0 carrier:0
>>          collisions:0 txqueuelen:1000
>>          RX bytes:329324 (329.3 KB)  TX bytes:116257 (116.2 KB)
>>          Interrupt:17
>>
>> eth2      Link encap:Ethernet  HWaddr xx:xx:xx:xx:xx:xx
>>          inet addr:192.168.1.1  Bcast:192.168.1.255  Mask:255.255.255.0
>>          inet6 addr: fe80::e291:f5ff:fe94:cc3/64 Scope:Link
>>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>          RX packets:1264 errors:0 dropped:0 overruns:0 frame:0
>>          TX packets:1342 errors:0 dropped:0 overruns:0 carrier:0
>>          collisions:0 txqueuelen:1000
>>          RX bytes:78005 (78.0 KB)  TX bytes:67277 (67.2 KB)
>>          Interrupt:17 Base address:0xef00
>>
>> eth3      Link encap:Ethernet  HWaddr xx:xx:xx:xx:xx:xx
>>          inet addr:10.32.43.1  Bcast:10.32.43.255  Mask:255.255.255.0
>>          inet6 addr: fe80::222:6bff:febd:5586/64 Scope:Link
>>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>          RX packets:235591 errors:0 dropped:0 overruns:0 frame:0
>>          TX packets:451507 errors:0 dropped:0 overruns:0 carrier:0
>>          collisions:0 txqueuelen:1000
>>          RX bytes:17233068 (17.2 MB)  TX bytes:629352191 (629.3 MB)
>>          Interrupt:16 Base address:0x2e00
>>
>> lo        Link encap:Local Loopback
>>          inet addr:127.0.0.1  Mask:255.0.0.0
>>          inet6 addr: ::1/128 Scope:Host
>>          UP LOOPBACK RUNNING  MTU:16436  Metric:1
>>          RX packets:8526 errors:0 dropped:0 overruns:0 frame:0
>>          TX packets:8526 errors:0 dropped:0 overruns:0 carrier:0
>>          collisions:0 txqueuelen:0
>>          RX bytes:817773 (817.7 KB)  TX bytes:817773 (817.7 KB)
>> }}}
>>
>> eth1 is the connection to the internet, eth2 is a crossover cable to
>> another machine, and eth3 is connected to a private subnet. Iptables is
>> configured to allow machines on the 10.32.43.x subnet to access the
>> internet via eth1. It's possible something I did in setting that up
>> had the side effect of messing with ROS; as I said, I'm no networking
>> expert. Hopefully one of you is! :)
>>
>> Thanks,
>> Pat


