Hi,
This is the fifth iteration of the patchset to increase the DAT DHT timeout in order to reduce the number of broadcast ARP Replies.
To increase the timeout only for DAT DHT entries added via DHT-PUT, and not for any other entry in the DAT cache, the DAT cache and DAT DHT concepts are first split into two separate hash tables (PATCH 1/2).
PATCH 2/2 then increases the timeout for DAT DHT entries from 5 to 30 minutes.
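As a rough user-space illustration of the idea (not the code from the patches), the split can be sketched as two tables that share the same entry layout but are purged with different timeouts. All names here (sketch_table, sketch_put, ...) are hypothetical, the tables are plain arrays instead of hash tables, and timestamps are 64-bit milliseconds rather than jiffies:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names and constants, chosen for this sketch only. */
#define SKETCH_CACHE_TIMEOUT_MS (5u * 60u * 1000u)  /* DAT cache: 5 minutes */
#define SKETCH_DHT_TIMEOUT_MS   (30u * 60u * 1000u) /* DAT DHT: 30 minutes */
#define SKETCH_TABLE_SIZE       16

struct sketch_entry {
	bool used;
	uint32_t ip;
	uint8_t mac[6];
	uint64_t last_update_ms;
};

/* One table per concept, each with its own purge timeout. */
struct sketch_table {
	uint64_t timeout_ms;
	struct sketch_entry slots[SKETCH_TABLE_SIZE];
};

void sketch_table_init(struct sketch_table *t, uint64_t timeout_ms)
{
	assert(t != NULL);
	memset(t, 0, sizeof(*t));
	t->timeout_ms = timeout_ms;
}

/* Add or refresh an entry; returns false if the table is full. */
bool sketch_put(struct sketch_table *t, uint32_t ip,
		const uint8_t mac[6], uint64_t now_ms)
{
	struct sketch_entry *slot = NULL;
	size_t i;

	assert(t != NULL);
	for (i = 0; i < SKETCH_TABLE_SIZE; i++) {
		struct sketch_entry *e = &t->slots[i];

		if (e->used && e->ip == ip) {
			slot = e; /* refresh the existing entry */
			break;
		}
		if (!e->used && !slot)
			slot = e; /* remember the first free slot */
	}
	if (!slot)
		return false;

	slot->used = true;
	slot->ip = ip;
	memcpy(slot->mac, mac, sizeof(slot->mac));
	slot->last_update_ms = now_ms;
	return true;
}

struct sketch_entry *sketch_lookup(struct sketch_table *t, uint32_t ip)
{
	size_t i;

	assert(t != NULL);
	for (i = 0; i < SKETCH_TABLE_SIZE; i++)
		if (t->slots[i].used && t->slots[i].ip == ip)
			return &t->slots[i];
	return NULL;
}

/* Drop entries older than this table's timeout; returns how many. */
size_t sketch_purge(struct sketch_table *t, uint64_t now_ms)
{
	size_t i, purged = 0;

	assert(t != NULL);
	for (i = 0; i < SKETCH_TABLE_SIZE; i++) {
		struct sketch_entry *e = &t->slots[i];

		if (e->used && now_ms - e->last_update_ms > t->timeout_ms) {
			e->used = false;
			purged++;
		}
	}
	return purged;
}
```

With one table initialized with the 5 minute timeout and one with the 30 minute timeout, an address stored in both at time 0 is purged from the cache table after 10 minutes but is still found in the DHT table, since each entry carries only a single timestamp checked against its own table's timeout.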
The motivation for this patchset is based on the observations made here: https://www.open-mesh.org/projects/batman-adv/wiki/DAT_DHCP_Snooping
In tests this year at Freifunk Lübeck with ~180 mesh nodes and Gluon this reduced the ARP broadcast overhead, measured over 7 days, as follows:
- Total:
  6677.66 bits/s -> 677.26 bits/s => -89.86%
  11.92 pkts/s -> 1.21 pkts/s => -89.85%
- from gateways:
  5618.02 bits/s -> 212.28 bits/s => -96.22%
  10.03 pkts/s -> 0.38 pkts/s => -96.21%
Also see graphics and a few more test details here:
https://www.open-mesh.org/projects/batman-adv/wiki/DAT_DHCP_Snooping#Result-...
These patches have been applied in this mesh network without issues for 3 months now.
Regards,
Linus
---
Changelog v5:
- rebased to current main branch
  -> removed now obsolete debugfs code

Changelog v4:
- rebased to:
  acfc9a214d01695 ("batman-adv: genetlink: make policy common to family")

Changelog v3:
formerly: "batman-adv: Increase purge timeout on DAT DHT candidates"
(https://patchwork.open-mesh.org/patch/17728/)
- fixed the potential jiffies overflow and jiffies initialization issues by replacing the last_dht_update timeout variable with a split of DAT cache and DAT DHT into two separate hash tables
  -> instead of maintaining two timeouts in one DAT entry, two DAT entries are created and maintained in their respective DAT cache and DAT DHT hash tables

Changelog v2:
formerly: "batman-adv: Increase DHCP snooped DAT entry purge timeout in DHT"
(https://patchwork.open-mesh.org/patch/17364/)
- removed the extended timeouts flag in the DHT-PUT messages introduced in v1 again
- removed DHCP dependency