[B.A.T.M.A.N.] Fwd: [Babel-users] Fwd: Why we switched to Babel - B.A.T.M.A.N - lists.open-mesh.org

10 Aug 2015


Hi!
Here is a more precise explanation of the issues. Is this really a limitation of Batman?
Mitar
---------- Forwarded message ----------
From: max b maxb.personal@gmail.com
Date: Mon, Aug 10, 2015 at 5:01 AM
Subject: Re: [Babel-users] Fwd: Why we switched to Babel
Marc will probably chime in here as we had plenty of sit downs trying
to figure out how to incorporate BATMAN into our topology, but I
(hope) I can help articulate the issue:
Our topology was the following (very simplified). It mirrored (and was
inspired by) the wlan slovenija project.
Cloud (or colocation) hosted exit server running batman
l2tp interface
  |
  V
l2tp interface
OpenWRT wifi router running batman
open0 interface
  |
  V
heterogeneous set of hosts connected wirelessly over an open network.
The routers create l2tp tunnels to the exit server when they have a
direct internet connection (i.e. someone is sharing their bandwidth).
The open0 and l2tp interfaces were added to batman, along with an
adhoc interface which would connect to other routers over wifi. The
l2tp interfaces had to have a non-standard MTU because they were
carrying batman-adv "frames".
The problem is that the hosts connecting to the open network would
generally not respect the DHCP option setting an MTU. A significant
number of hosts - including, I believe, all Windows machines and at
least iOS devices - would just set the MTU of their wireless interface
to 1500. Under normal circumstances, this wouldn't be an issue because
"When a packet arrives at a node from a client with too large an mtu,
what SHOULD happen in a normal forwarding situation (per RFC 1191) is
that the node issue a ICMP 'Destination Unreachable' packet with a
"Fragmentation required" code. The client then uses this information
to reset its mtu." (I'm quoting an email exchange from our mailing
list: https://sudoroom.org/pipermail/mesh-dev/2014-October/000019.html)
However, in our situation, we were essentially stuck bridging two
interfaces that had different MTUs because they were both added to
this batman-adv layer 2 forwarding daemon. Because there was no layer
3 routing going on, the ICMP message would never be sent. We were just
seeing dropped frames in Wireshark captures - client devices wouldn't
adjust their MTU and they wouldn't fragment because they wouldn't know
to.
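To make the difference concrete, here is a minimal Python sketch of the two
forwarding behaviours described above. It is a toy model, not batman-adv or
kernel code; the function names are made up for illustration, and TUNNEL_MTU
is a hypothetical value, since the actual tunnel MTU isn't stated here.

```python
# Toy model of the behaviour described above; not batman-adv code.
CLIENT_MTU = 1500   # what the clients insisted on using
TUNNEL_MTU = 1400   # hypothetical value, the real tunnel MTU is not given

def route_l3(packet_len, dont_fragment, egress_mtu):
    """What an IP router does with a packet (RFC 1191 behaviour)."""
    if packet_len <= egress_mtu:
        return "forwarded"
    if dont_fragment:
        # ICMP type 3 (Destination Unreachable), code 4 (fragmentation
        # needed), carrying the next-hop MTU so the sender can adapt.
        return f"icmp type=3 code=4 next_hop_mtu={egress_mtu}"
    return "fragmented"

def bridge_l2(payload_len, egress_mtu):
    """What a plain layer 2 forwarder does: no IP hop, so no ICMP."""
    if payload_len <= egress_mtu:
        return "forwarded"
    return "dropped silently"

print(route_l3(CLIENT_MTU, True, TUNNEL_MTU))
print(bridge_l2(CLIENT_MTU, TUNNEL_MTU))
```

In the routed case the sender gets a message telling it which MTU to use; in
the bridged case the only signal is the missing frame.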
We tried a couple of workarounds, including an iptables MSS clamping
rule, but that only works for TCP traffic. We were eventually stuck
with what we came to realize was an unworkable stack+topology.
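As a rough illustration of why MSS clamping only helps TCP, here is a small
Python sketch of what such a rule does conceptually. It assumes IPv4 and TCP
headers without options (20 bytes each); clamp_mss is a hypothetical helper
for illustration, not the iptables implementation, and the 1400-byte path MTU
is again a made-up number.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options

def clamp_mss(protocol, advertised_mss, path_mtu):
    """Return the MSS after clamping, or None if there is nothing to clamp."""
    if protocol != "tcp" or advertised_mss is None:
        # UDP, ICMP etc. carry no MSS option, so the rule cannot touch them.
        return None
    return min(advertised_mss, path_mtu - IPV4_HEADER - TCP_HEADER)

# A client with MTU 1500 advertises MSS 1460 in its SYN.
print(clamp_mss("tcp", 1460, 1400))
# A UDP datagram has no MSS to rewrite, so it still goes out at full size.
print(clamp_mss("udp", None, 1400))
```

TCP segments shrink to fit the tunnel, while large UDP datagrams keep hitting
the smaller MTU and get dropped as before.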
The email exchange
https://sudoroom.org/pipermail/mesh-dev/2014-October/000019.html could
help out with any additional confusion.
I still think that a heterogeneous mesh with babel on top of batman-adv
could be an excellent combination of dynamic routing with layer 2 mesh
roaming, something like this diagram:
https://libre-mesh.org/projects/libre-mesh/wiki/Network_Architecture?version...
The libre-mesh folks mentioned that they've been experimenting with
that sort of architecture, but I don't think we're really going to
need it any time soon (and we weren't exactly in love with bmx6, but
that's probably outside of the scope here).
Hope that was reasonably helpful.
Max
-- 
http://mitar.tnode.com/
https://twitter.com/mitar_m