The motivation for using a higher MTU of 1524 at the beginning was the rumour that there might be some client devices (which we would get into the network by bridging bat0 with wlan0, for instance) that cannot handle any MTU smaller than 1500. But in fact it turned out that nowadays basically all devices are able to do both IPv4 and IPv6 PMTU discovery on layer 3. Tests showed that if all BATMAN nodes use an MTU of 1476 on bat0 (or the overlying bridge), everything seems to work fine.
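To make the two numbers concrete: 1524 and 1476 both differ from the usual 1500 by 24 bytes, which suggests a per-packet encapsulation overhead of 24 bytes. That figure is only inferred from the two MTU values above and is not stated anywhere in this thread, so treat it as an assumption. A minimal Python sketch of the two options (function names are made up for illustration):

```python
# Assumption: 24 bytes of per-packet encapsulation overhead, inferred from
# the two MTU values quoted above (1524 - 1500 and 1500 - 1476).
STANDARD_MTU = 1500
ASSUMED_OVERHEAD = 24


def link_mtu_for_full_clients(client_mtu: int, overhead: int) -> int:
    """MTU the mesh links must carry so clients can keep `client_mtu`."""
    return client_mtu + overhead


def client_mtu_for_standard_links(link_mtu: int, overhead: int) -> int:
    """MTU left for bat0 when the mesh links stay at `link_mtu`."""
    return link_mtu - overhead


raised_link_mtu = link_mtu_for_full_clients(STANDARD_MTU, ASSUMED_OVERHEAD)
reduced_bat0_mtu = client_mtu_for_standard_links(STANDARD_MTU, ASSUMED_OVERHEAD)
print(raised_link_mtu, reduced_bat0_mtu)
```

Either the links grow by the overhead or bat0 shrinks by it; the rest of the thread is about what happens to clients in the second case.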
Sorry, I can't follow you here. If the whole network is a switched environment, how could the clients perform working PMTU discovery? Sure, all clients are able to do PMTU discovery (I don't think anybody doubted that), but it won't work. :) The client sends 1500 bytes -> the router receives the frame (no IP!) and drops it. Where should the "fragmentation needed" packet come from? That only works if you route packets instead of switching them.
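The difference between the two cases can be sketched as a toy model in Python. This is not batman-adv or kernel code; the `Packet` class and both function names are hypothetical, and the routed case assumes the sender does not allow in-path fragmentation (DF bit set on IPv4, or IPv6):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    size: int  # total size in bytes
    src: str   # hypothetical label for the sending client


def forward_switched(packet: Packet, link_mtu: int) -> Optional[str]:
    """Layer-2 hop: an oversized frame is dropped and nobody is told."""
    if packet.size > link_mtu:
        return None  # silent drop, the sender never learns the path MTU
    return "forwarded"


def forward_routed(packet: Packet, link_mtu: int) -> str:
    """Layer-3 hop: an oversized packet triggers an ICMP error back to src."""
    if packet.size > link_mtu:
        return f"ICMP fragmentation needed (mtu={link_mtu}) -> {packet.src}"
    return "forwarded"


big = Packet(size=1500, src="client")
print(forward_switched(big, 1476))
print(forward_routed(big, 1476))
```

In the switched case the client just sees its large packets vanish, which is why PMTU discovery has nothing to work with.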
Ah, wait, I forgot one thing: it worked for our hotspots because the CoovaChilli internet gateway had an MTU equal to the PMTU all the way through the mesh. But you are right, we would probably get into trouble with two mesh clients which are bridged to each other and have an MTU set to 1500...
I'm wondering what you think of how tinc currently handles this in switch mode: it just "fakes" an ICMPv4/6 message carrying the address of the destination if a hop gets an IP packet bigger than the link MTU. This might sound like a good idea at first sight, but the disadvantage is that you get into trouble in IPsec-only networks (which are quite rare at the moment, yes :) ).
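For illustration, this is roughly what the ICMPv4 part of such a message looks like (type 3, code 4, with the next-hop MTU field from RFC 1191). It is a minimal Python sketch, not tinc's actual code; the function names are made up, and the outer IP header, where the destination's address would be put in as the source, is left out:

```python
import struct


def inet_checksum(data: bytes) -> int:
    """Standard Internet checksum (one's complement sum of 16-bit words)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_frag_needed(original_ip_packet: bytes, next_hop_mtu: int) -> bytes:
    """Build an ICMPv4 'fragmentation needed' message for an IPv4 packet."""
    ihl_bytes = (original_ip_packet[0] & 0x0F) * 4
    # ICMP errors quote the offending IP header plus 8 bytes of its payload.
    quoted = original_ip_packet[: ihl_bytes + 8]
    # type=3 (destination unreachable), code=4 (fragmentation needed),
    # checksum placeholder, unused field, next-hop MTU.
    header = struct.pack("!BBHHH", 3, 4, 0, 0, next_hop_mtu)
    checksum = inet_checksum(header + quoted)
    return struct.pack("!BBHHH", 3, 4, checksum, 0, next_hop_mtu) + quoted


# Hypothetical oversized packet: minimal 20-byte IPv4 header plus payload.
fake_packet = bytes([0x45]) + bytes(19) + b"A" * 100
message = build_frag_needed(fake_packet, 1476)
print(len(message))
```

The receiver cannot tell that this message was produced by a layer-2 hop, which is exactly the point, and also the reason it clashes with networks that authenticate everything.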
Usually you choose UDP mode in tinc, so any ethernet frame is encapsulated in UDP. But if tinc discovers that the packet is too big to fit through the link, given the PMTU tinc discovered for it, it uses TCP encapsulation instead and lets the kernel split the data automatically.
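The decision described here can be sketched in a few lines of Python. This is only a model of the behaviour as described in this mail, not tinc's implementation; `choose_transport` is a hypothetical name, and the 28 bytes of overhead cover just the IPv4 and UDP headers (20 + 8), ignoring whatever tinc adds on top:

```python
IPV4_UDP_OVERHEAD = 20 + 8  # IPv4 header without options + UDP header


def choose_transport(frame_len: int, discovered_pmtu: int,
                     overhead: int = IPV4_UDP_OVERHEAD) -> str:
    """Pick the encapsulation for one ethernet frame of `frame_len` bytes."""
    if frame_len + overhead <= discovered_pmtu:
        return "udp"  # fits into a single datagram on this path
    return "tcp"      # too big: hand it to the TCP stream instead


print(choose_transport(1400, 1500))
print(choose_transport(1500, 1500))
```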
That is nice but only works because tinc uses IP addresses (unlike batman-adv). AFAIK you use tinc to connect internet endpoints, hence your packet probably looks like this:
[ETHER][IP][UDP/TCP][BATMAN-HDR][PAYLOAD]
whereas the packets sent by batman-adv look like this:
[ETHER][BATMAN-HDR][PAYLOAD]
Nope, tinc is able to create either a TUN (router mode) or a TAP (switch mode) network adapter, so it can actually transport the original ethernet frame as well:
[ETHER][IP][UDP/TCP][ETHER][BATMAN-HDR][PAYLOAD]
As in a mesh network it is usually the wifi rather than an internet uplink that is the bottleneck, the extra overhead created by fragmentation on the internet uplinks might not be "harmful" for the network's average bandwidth. And as the packet loss on an internet uplink not running at its bandwidth limit is very, very low, latencies shouldn't increase too much.
I think you underestimate the performance impact. AFAIK IPv6 does not support the classical IPv4 fragmentation anymore (intermediate routers won't fragment the packets but drop them).
Hmm, yes, true, IPv6 does not support fragmentation by intermediate routers at all... so this would only work over crappy IPv4 anyway, and it is really just a workaround to quickly squeeze some fragmented packets over the IPv4 internet.
I also somehow liked Andrew's suggestion about header compression with segmentation as a fallback mechanism. If this segmentation only occurs in rare situations and is transparent to the upper layers, well, why not :).
Cheers, Linus