On Tuesday, November 13, 2018 2:55:31 PM CET Jake.Harris@zf.com wrote:
Mhm, this is really not much data ... did you try changing the multicast rate as suggested in an earlier reply?
Which earlier reply are you referring to? The only one I can find is the tip to boost the multicast bandwidth, but I can't see it being worthwhile to update the configuration of all 50 nodes when, in the worst case, I'm using less than 1% of the maximum throughput.
One aspect is that the multicast rate also changes the modulation rate of beacons. If you have >50 nodes beaconing at 1 Mbit/s, you are already filling up your airtime with beacons. Do the math - one beacon takes about 1 ms at 1 Mbit/s, each node sends about 10 beacons per second ...
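To spell out that math, here is a rough sketch in Python. It assumes all nodes share one collision domain and that beacons never overlap, and it ignores interframe spacing and contention, so treat the result as a ballpark figure only. The function name is made up for illustration.

```python
def beacon_airtime_fraction(nodes, beacons_per_second, beacon_duration_ms):
    """Fraction of each second occupied by beacons.

    Assumes a single shared channel and no overlapping transmissions.
    """
    busy_ms_per_second = nodes * beacons_per_second * beacon_duration_ms
    return busy_ms_per_second / 1000.0


# Figures from the paragraph above: 50 nodes, ~10 beacons/s each, ~1 ms per beacon.
fraction = beacon_airtime_fraction(nodes=50, beacons_per_second=10, beacon_duration_ms=1.0)
print(f"{fraction:.0%} of airtime used by beacons")
```

With these numbers, roughly half of the airtime goes to beacons before any payload is sent, which is why the throughput figure alone is misleading.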
This is actually very important and will most likely help already. It would be a better fix than changing the protection window.
BATADV_RESET_PROTECTION_MS is a define in the batman-adv C code, so it cannot be set at runtime, only at compile time.
While recompiling and updating the code on all the nodes sounds like an utter pain in the butt, I believe this change has a far better chance of alleviating the issue. I'm looking into how to do this, since I've never compiled anything myself, but I can't see it being too difficult.
One observation I made when rebooting the swarm all at once: about a minute after all the Pis go down, the laptop I work from (running batctl td bat0) reports a whole bunch of what I believe are backbone unannounced messages. I'm assuming there is one message per node, but I have not verified this. My guess is that this is normal and not the cause of these issues?
Again, thank you