Hello Guido,
On Tue, Jul 03, 2012 at 05:07:17PM -0300, Guido Iribarren wrote:
> Hello there again, I have observed a problem since updating to 2012.2 and enabling BLAII.
> I'm compiling logs to understand what's happening, but as always, reading logs only gets me more lost :( So here I am again, begging for help.
There are some debug levels for BLA as well, and you can now get the claim list with batctl (basically the list of clients a gateway feels responsible for), which may help with debugging. But first, we should clarify some more details about your setup.
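For reference, here is a minimal sketch of the batctl calls meant above, wrapped in a small helper. The helper name `bla_debug_snapshot` and the `BATCTL` variable are made up for illustration. The `ll bla` call assumes batman-adv was built with debug log support and that your batctl accepts the `bla` level name.

```shell
# Hypothetical helper, not part of batctl. BATCTL can be overridden for dry runs.
BATCTL="${BATCTL:-batctl}"

bla_debug_snapshot() {
    # Show whether bridge loop avoidance is enabled on this node.
    $BATCTL bl
    # Add BLA messages to the debug log (assumes a debug-enabled build).
    $BATCTL ll bla
    # Print the claim table: the clients claimed by gateways on the backbone.
    $BATCTL cl
}
```

Running `bla_debug_snapshot` as root on both colmena-casa and f8d11504758, followed by `batctl l` to read the log, should give a first picture of who claims which client.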
> The setup is the same one I described in yesterday's attachment, but what's not pictured is an ethernet cable between colmena-casa and f8d11504758. f8d11504758 is the only router that connects to the internet (through a WAN cable), and it's also the only one that has dnsmasq running and gw_mode=server. All the other nodes have gw_mode=client.
> All of the nodes have bridge_loop_avoidance=1 (even though there are no other UTP connections, so it could in fact be enabled only on colmena-casa and f8d11504758).
> With this setup, DHCP requests from the mesh sometimes get "lost": either they don't reach f8d11504758 or the reply doesn't get out.
Questions:
 * Which node runs the DHCP server? colmena-casa, f8d11504758 or something else?
 * At which point is DHCP getting lost? Is the DISCOVER/REQUEST from the client getting lost, or the reply from the server?
 * Can you specify "sometimes" a little bit more? What are the circumstances, and how often does it happen?
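One way to answer the second question is to watch DHCP traffic on several interfaces at once and see where it stops. A rough sketch, assuming tcpdump is installed on the routers; the helper name `dhcp_watch` and the `TCPDUMP` variable are hypothetical:

```shell
# Hypothetical helper. TCPDUMP can be overridden for dry runs.
TCPDUMP="${TCPDUMP:-tcpdump}"

dhcp_watch() {
    # $1: interface to listen on, e.g. bat0 or br-lan.
    # -n disables name resolution, -e prints MAC addresses so that
    # client and server frames are easy to tell apart.
    $TCPDUMP -n -e -i "$1" port 67 or port 68
}
```

For example, `dhcp_watch bat0` on both colmena-casa and f8d11504758 should show whether the DISCOVER/REQUEST arrives and whether a reply leaves again.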
> This didn't happen with batman 2012.1, set up as indicated by the BLAI wiki page (batctl if add br-lan). Furthermore, with batman 2012.2 and BLAII activated, but gw_mode=off on all nodes, DHCP also works fine.
Mhm, that's rather strange ... we had a similar problem when AP isolation was activated. Do you have this feature turned on?
So DHCP is only having problems when gw_mode is enabled on colmena-casa and f8d11504758?
> So, a few questions arise: is it a problem to activate bridge_loop_avoidance=1 on all nodes, regardless of whether they "need" it or not? (That is, it is activated on nodes that don't have any ethernet cables connected and couldn't possibly create a bridge loop.)
No, that is not a problem - you can activate it everywhere. It will just send some additional control packets on bat0, but won't do anything as long as it does not detect other gateways.
> Would it make a difference if I add br-lan to bat0 (batctl if add br-lan), the way I used to do with batman 2012.1?
That won't help, because the design of BLA changed and the old BLA has been removed. Please keep the bridge out of bat0. :)
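If a node still carries the old configuration, something along these lines should undo it and let you check the result. This is only a sketch: the helper name and the `BATCTL`/`BRCTL` variables are hypothetical, and it assumes the bridge is called br-lan as in your mail and that brctl is available.

```shell
# Hypothetical helper. BATCTL and BRCTL can be overridden for dry runs.
BATCTL="${BATCTL:-batctl}"
BRCTL="${BRCTL:-brctl}"

undo_old_bla_setup() {
    # $1: bridge name, defaults to br-lan.
    bridge="${1:-br-lan}"
    # Remove the bridge from the batman-adv interfaces (the old BLA I setup).
    $BATCTL if del "$bridge"
    # List the remaining batman-adv interfaces; the bridge should be gone.
    $BATCTL if
    # Show the bridges and their ports for a final check.
    $BRCTL show
}
```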
Cheers, Simon