Hi!
Here is a more precise explanation of the issues. Is this really a limitation of Batman?
Mitar
---------- Forwarded message ----------
From: max b maxb.personal@gmail.com
Date: Mon, Aug 10, 2015 at 5:01 AM
Subject: Re: [Babel-users] Fwd: Why we switched to Babel
Marc will probably chime in here, as we had plenty of sit-downs trying to figure out how to incorporate BATMAN into our topology, but I hope I can help articulate the issue:
Our topology was the following (very simplified). It mirrored (and was inspired by) the wlan slovenija project.
Cloud (or colocation) hosted exit server running batman
l2tp interface
      |
      V
l2tp interface
OpenWRT wifi router running batman
open0 interface
      |
      V
heterogeneous set of hosts connected wirelessly over an open network.
The routers create l2tp tunnels to the exit server when they have a direct internet connection (i.e. someone is sharing their bandwidth). The open0 and l2tp interfaces were added to batman, along with an adhoc interface which would connect to other routers over wifi. The l2tp interfaces had to have a non-standard MTU, because each one was an l2tp tunnel carrying batman-adv "frames".
The problem is that the hosts connecting to the open network would generally not respect the DHCP option setting an MTU. A significant number of hosts (including, I believe, all Windows machines and at least iOS devices) would just set the MTU of their wireless interface to 1500. Under normal circumstances, this wouldn't be an issue because "When a packet arrives at a node from a client with too large an mtu, what SHOULD happen in a normal forwarding situation (per RFC 1191) is that the node issue a ICMP 'Destination Unreachable' packet with a "Fragmentation required" code. The client then uses this information to reset its mtu." (I'm quoting an email exchange from our mailing list: https://sudoroom.org/pipermail/mesh-dev/2014-October/000019.html)
However, in our situation, we were essentially stuck bridging two interfaces that had different MTUs, because they were both added to the same batman-adv layer 2 forwarding layer. Because there was no layer 3 routing going on, the ICMP message would never be sent. We were just seeing dropped frames in our Wireshark captures: client devices wouldn't adjust their MTU, and they wouldn't fragment because they had no way of knowing they needed to.
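To make the difference concrete, here is a minimal Python sketch of the two forwarding behaviours described above. It is a simplified model, not batman-adv or kernel code: the Packet class, the function names, and the 1400-byte tunnel MTU are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Packet:
    size: int                   # total IP packet size in bytes
    dont_fragment: bool = True  # DF bit, as set by hosts doing path MTU discovery


def l3_forward(packet: Packet, next_hop_mtu: int) -> Tuple[str, Optional[str]]:
    """Model of a layer 3 router following RFC 1191.

    Returns (action, feedback_to_sender).
    """
    if packet.size <= next_hop_mtu:
        return ("forwarded", None)
    if packet.dont_fragment:
        # ICMP Destination Unreachable (type 3), code 4:
        # "fragmentation needed and DF set", carrying the next-hop MTU.
        return ("dropped", f"ICMP type 3 code 4, next-hop MTU {next_hop_mtu}")
    return ("fragmented", None)


def l2_forward(packet: Packet, egress_mtu: int) -> Tuple[str, Optional[str]]:
    """Model of a plain layer 2 forwarder: an oversized frame is dropped
    and the sender is never told why."""
    if packet.size <= egress_mtu:
        return ("forwarded", None)
    return ("dropped", None)


if __name__ == "__main__":
    TUNNEL_MTU = 1400  # hypothetical reduced MTU on the l2tp side
    client_packet = Packet(size=1500)
    print("routed: ", l3_forward(client_packet, TUNNEL_MTU))
    print("bridged:", l2_forward(client_packet, TUNNEL_MTU))
```

In the routed case the sender gets a message it can use to lower its MTU; in the bridged case it gets nothing, which matches the silent drops we saw.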
We tried a couple of workarounds, including an iptables MSS clamping rule, but that only works for TCP traffic. We were eventually stuck with what we came to realize was an unworkable stack+topology.
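A rough sketch of why MSS clamping only covers TCP, assuming IPv4 with no IP or TCP options (20-byte headers each) and an 8-byte UDP header. The 1400-byte path MTU is a hypothetical figure, not the value we actually used, and the function names are made up for this example.

```python
IPV4_HEADER = 20  # bytes, without IP options
TCP_HEADER = 20   # bytes, without TCP options
UDP_HEADER = 8    # bytes


def clamped_mss(path_mtu: int) -> int:
    """Largest TCP segment payload that fits in one IPv4 packet of path_mtu bytes."""
    return path_mtu - IPV4_HEADER - TCP_HEADER


def tcp_packet_size(mss: int) -> int:
    """Size of a full-sized TCP/IPv4 packet for a given MSS."""
    return mss + IPV4_HEADER + TCP_HEADER


def udp_packet_size(payload: int) -> int:
    """Size of a UDP/IPv4 packet for a given payload length."""
    return payload + IPV4_HEADER + UDP_HEADER


if __name__ == "__main__":
    PATH_MTU = 1400  # hypothetical reduced MTU on the tunnel

    mss = clamped_mss(PATH_MTU)
    # TCP: both ends agree on the clamped MSS during the handshake,
    # so full-sized segments still fit through the tunnel.
    print("clamped MSS:", mss, "fits:", tcp_packet_size(mss) <= PATH_MTU)

    # UDP: there is no MSS negotiation, so a client that believes its
    # MTU is 1500 can still emit a 1500-byte packet.
    udp_size = udp_packet_size(1472)
    print("UDP packet:", udp_size, "fits:", udp_size <= PATH_MTU)
```

The clamp rewrites a TCP option in the SYN exchange, so anything that isn't TCP (UDP, ICMP, and so on) is still sized according to the client's own idea of the MTU.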
The email exchange at https://sudoroom.org/pipermail/mesh-dev/2014-October/000019.html may help clear up any remaining confusion.
I still think that a heterogeneous mesh with babel on top of batman-adv could be an excellent combination of dynamic routing with layer 2 mesh roaming, something like this diagram: https://libre-mesh.org/projects/libre-mesh/wiki/Network_Architecture?version...
The libre-mesh folks mentioned that they've been experimenting with that sort of architecture, but I don't think we're really going to need it any time soon (and we weren't exactly in love with bmx6, but that's probably outside of the scope here).
Hope that was reasonably helpful.
Max
b.a.t.m.a.n@lists.open-mesh.org