This patchset increases the DAT DHT timeout to reduce the amount
of broadcasted ARP Replies.
To increase the timeout only for DAT DHT entries added via DHT-PUT, but
not for any other entry in the DAT cache, the DAT cache and DAT DHT
concepts are split into two separate hash tables (PATCH 2/3).
PATCH 3/3 then increases the timeout for DAT DHT entries from 5 to
30 minutes.
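
As a rough illustration of the idea (not the kernel implementation), the split can be modelled as two independent tables that each carry their own purge timeout. Only the 5 and 30 minute values come from the text above; the class, the variable names and the addresses are hypothetical.

```python
# Timeouts in seconds. The 5 and 30 minute values are taken from the
# cover letter; all names here are hypothetical, not batman-adv identifiers.
DAT_CACHE_TIMEOUT = 5 * 60
DAT_DHT_TIMEOUT = 30 * 60


class DatTable:
    """Toy IPv4 -> MAC table whose entries expire after `timeout` seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.entries = {}  # ip -> (mac, last_seen)

    def add(self, ip, mac, now):
        # Adding an existing entry refreshes its timestamp.
        self.entries[ip] = (mac, now)

    def lookup(self, ip, now):
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, last_seen = entry
        if now - last_seen > self.timeout:
            # Entry is older than this table's timeout: purge it.
            del self.entries[ip]
            return None
        return mac


# Entries received via DHT-PUT go into the DHT table, any other
# entry goes into the cache table.
dat_cache = DatTable(DAT_CACHE_TIMEOUT)
dat_dht = DatTable(DAT_DHT_TIMEOUT)

dat_cache.add("10.0.0.1", "02:00:00:00:00:01", now=0)
dat_dht.add("10.0.0.2", "02:00:00:00:00:02", now=0)

# Ten minutes later the cache entry has expired, the DHT entry has not.
cache_hit = dat_cache.lookup("10.0.0.1", now=600)
dht_hit = dat_dht.lookup("10.0.0.2", now=600)
```

Keeping one timestamp per entry and one timeout per table avoids having to track two separate timeouts inside a single entry.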
The motivation for this patchset is based on the observations made here:
https://www.open-mesh.org/projects/batman-adv/wiki/DAT_DHCP_Snooping
In tests this year at Freifunk Lübeck with ~180 mesh nodes and Gluon
this reduced the ARP broadcast overhead, measured over 7 days, as
follows:
- Total:         6677.66 bits/s -> 677.26 bits/s => -89.86%
                   11.92 pkts/s ->   1.21 pkts/s => -89.85%
- from gateways: 5618.02 bits/s -> 212.28 bits/s => -96.22%
                   10.03 pkts/s ->   0.38 pkts/s => -96.21%
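
The percentages above can be rechecked from the before/after rates with a few lines of Python. This is only a sanity check of the arithmetic. The dictionary keys are made-up labels, and the unit of the 212.28 figure is assumed to be bits/s.

```python
def relative_change(before, after):
    """Return the change from `before` to `after` in percent."""
    return (after - before) / before * 100.0


# (before, after) pairs copied from the measurements above.
measurements = {
    "total bits/s": (6677.66, 677.26),
    "total pkts/s": (11.92, 1.21),
    "gateways bits/s": (5618.02, 212.28),  # unit of 212.28 assumed to be bits/s
    "gateways pkts/s": (10.03, 0.38),
}

changes = {
    name: round(relative_change(before, after), 2)
    for name, (before, after) in measurements.items()
}

for name, change in changes.items():
    print(f"{name}: {change:+.2f}%")
```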
Also see graphics and a few more test details here:
- https://www.open-mesh.org/projects/batman-adv/wiki/DAT_DHCP_Snooping#Result…
These patches (v5) have been applied in this mesh network without issues
for 3 months now.
Regards,
Linus
---
Changelog v7:
- adding PATCH 1/3 to add the batadv_netlink_get_softif() wrapper to
reduce the amount of duplicate code, both in the current code base
but also for the next PATCH 2/3
Changelog v6:
- removed renaming+deprecation of BATADV_P_DAT_CACHE_REPLY in PATCH 1/2
- small commit message rewording in PATCH 1/2
Changelog v5:
- rebased to current main branch
-> removed now obsolete debugfs code
Changelog v4:
- rebased to: acfc9a214d01695
("batman-adv: genetlink: make policy common to family")
Changelog v3:
formerly:
"batman-adv: Increase purge timeout on DAT DHT candidates"
https://patchwork.open-mesh.org/patch/17728/
- fixed the potential jiffies overflow and jiffies initialization
issues by replacing the last_dht_update timeout variable with
a split of DAT cache and DAT DHT into two separate hash tables
-> instead of maintaining two timeouts in one DAT entry two DAT
entries are created and maintained in their respective DAT
cache and DAT DHT hash tables
Changelog v2:
formerly:
"batman-adv: Increase DHCP snooped DAT entry purge timeout in DHT"
(https://patchwork.open-mesh.org/patch/17364/)
- removed the extended timeouts flag in the DHT-PUT messages introduced
in v1 again
- removed DHCP dependency
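
For readers unfamiliar with the jiffies overflow concern named in the v3 changelog, here is a generic sketch of why timestamps taken from a wrapping tick counter need wrap-safe comparisons. It does not reproduce the specific issue in the earlier patch version. The 32-bit counter width is an assumption, and `time_after` is a Python model of the kernel macro of the same name.

```python
BITS = 32  # assumption: 32-bit tick counter, as on 32-bit kernels
MASK = (1 << BITS) - 1


def to_signed(value):
    """Interpret the low BITS bits of `value` as a signed integer."""
    value &= MASK
    return value - (1 << BITS) if value >> (BITS - 1) else value


def time_after(a, b):
    """True if tick value `a` is after `b`, tolerating counter wraparound."""
    return to_signed(b - a) < 0


def naive_after(a, b):
    """Plain comparison, which breaks once the counter wraps."""
    return a > b


last_update = MASK - 10          # stamped shortly before the counter wraps
now = (last_update + 20) & MASK  # 20 ticks later the counter has wrapped

naive = naive_after(now, last_update)  # False: misses that time has passed
safe = time_after(now, last_update)    # True: wrap-safe comparison
```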
Hi,
This is the fifth iteration to increase the DAT DHT timeout to reduce the
amount of broadcasted ARP Replies.
To increase the timeout only for DAT DHT entries added via DHT-PUT, but
not for any other entry in the DAT cache, the DAT cache and DAT DHT
concepts are first split into two separate hash tables (PATCH 1/2).
PATCH 2/2 then increases the timeout for DAT DHT entries from 5 to
30 minutes.
The motivation for this patchset is based on the observations made here:
https://www.open-mesh.org/projects/batman-adv/wiki/DAT_DHCP_Snooping
In tests this year at Freifunk Lübeck with ~180 mesh nodes and Gluon
this reduced the ARP broadcast overhead, measured over 7 days, as follows:
- Total:         6677.66 bits/s -> 677.26 bits/s => -89.86%
                   11.92 pkts/s ->   1.21 pkts/s => -89.85%
- from gateways: 5618.02 bits/s -> 212.28 bits/s => -96.22%
                   10.03 pkts/s ->   0.38 pkts/s => -96.21%
Also see graphics and a few more test details here:
- https://www.open-mesh.org/projects/batman-adv/wiki/DAT_DHCP_Snooping#Result…
These patches have been applied in this mesh network without issues
for 3 months now.
Regards,
Linus
---
Changelog v5:
- rebased to current main branch
-> removed now obsolete debugfs code
Changelog v4:
- rebased to: acfc9a214d01695
("batman-adv: genetlink: make policy common to family")
Changelog v3:
formerly:
"batman-adv: Increase purge timeout on DAT DHT candidates"
https://patchwork.open-mesh.org/patch/17728/
- fixed the potential jiffies overflow and jiffies initialization
issues by replacing the last_dht_update timeout variable with
a split of DAT cache and DAT DHT into two separate hash tables
-> instead of maintaining two timeouts in one DAT entry two DAT
entries are created and maintained in their respective DAT
cache and DAT DHT hash tables
Changelog v2:
formerly:
"batman-adv: Increase DHCP snooped DAT entry purge timeout in DHT"
(https://patchwork.open-mesh.org/patch/17364/)
- removed the extended timeouts flag in the DHT-PUT messages introduced
in v1 again
- removed DHCP dependency
The first three patches are actual fixes.
The first two try to avoid sending uninitialized data that could be
interpreted as invalid TT change events in both TT change response and
OGM. When that happens, invalid entries like the following can be seen with
batctl o:
* 00:00:00:00:00:00 -1 [....] ( 0) 88:12:4e:ad:7e:ba (179) (0x45845380)
* 00:00:00:00:78:79 4092 [.W..] ( 0) 88:12:4e:ad:7e:3c (145) (0x8ebadb8b)
The third one fixes an issue that happened when a TT change event list
is too big for the MTU: the list was never actually sent nor freed and
continued to grow indefinitely from this point. That also caused the
OGM TTVN to increase at each OGM interval without any changes ever being
visible to other nodes. This ever-growing TT change event list could be
observed by looking at /sys/kernel/slab/batadv_tt_change_cache/objects,
which sometimes showed unusually high values even after issuing a memcache
shrink.
The next two patches are more cleanup / potential slight improvements.
While patch 4 is mainly cosmetic (having negative tt.local_changes
values is not exactly an issue), patch 5 is here to keep the TT changes
list as short as possible (reducing network overhead).
V4:
- Reword comment on patch 4
- Fix flag assignment position in patch 4
- Fix store tearing with WRITE_ONCE
- Change tt.local_change < 1 to tt.local_change == 0 in patch 4
- Rework/simplify TT event deduplication logic
V3:
- Fix commit message wording
- Update outdated comments
V2:
- This has been tested enough to not be in RFC state anymore
- Add one more uninitialized TT change fix for full table TT responses
Remi Pommarel (5):
batman-adv: Do not send uninitialized TT changes
batman-adv: Remove uninitialized data in full table TT response
batman-adv: Do not let TT changes list grows indefinitely
batman-adv: Remove atomic usage for tt.local_changes
batman-adv: Don't keep redundant TT change events
net/batman-adv/soft-interface.c | 2 +-
net/batman-adv/translation-table.c | 123 ++++++++++++++++-------------
net/batman-adv/types.h | 4 +-
3 files changed, 72 insertions(+), 57 deletions(-)
--
2.40.0
This is the fifth iteration to increase the DAT DHT timeout to reduce
the amount of broadcasted ARP Replies.
To increase the timeout only for DAT DHT entries added via DHT-PUT, but
not for any other entry in the DAT cache, the DAT cache and DAT DHT
concepts are first split into two separate hash tables (PATCH 1/2).
PATCH 2/2 then increases the timeout for DAT DHT entries from 5 to
30 minutes.
The motivation for this patchset is based on the observations made here:
https://www.open-mesh.org/projects/batman-adv/wiki/DAT_DHCP_Snooping
In tests this year at Freifunk Lübeck with ~180 mesh nodes and Gluon
this reduced the ARP broadcast overhead, measured over 7 days, as
follows:
- Total:         6677.66 bits/s -> 677.26 bits/s => -89.86%
                   11.92 pkts/s ->   1.21 pkts/s => -89.85%
- from gateways: 5618.02 bits/s -> 212.28 bits/s => -96.22%
                   10.03 pkts/s ->   0.38 pkts/s => -96.21%
Also see graphics and a few more test details here:
- https://www.open-mesh.org/projects/batman-adv/wiki/DAT_DHCP_Snooping#Result…
These patches (v5) have been applied in this mesh network without issues
for 3 months now.
Regards,
Linus
---
Changelog v6:
- removed renaming+deprecation of BATADV_P_DAT_CACHE_REPLY in PATCH 1/2
- small commit message rewording in PATCH 1/2
Changelog v5:
- rebased to current main branch
-> removed now obsolete debugfs code
Changelog v4:
- rebased to: acfc9a214d01695
("batman-adv: genetlink: make policy common to family")
Changelog v3:
formerly:
"batman-adv: Increase purge timeout on DAT DHT candidates"
https://patchwork.open-mesh.org/patch/17728/
- fixed the potential jiffies overflow and jiffies initialization
issues by replacing the last_dht_update timeout variable with
a split of DAT cache and DAT DHT into two separate hash tables
-> instead of maintaining two timeouts in one DAT entry two DAT
entries are created and maintained in their respective DAT
cache and DAT DHT hash tables
Changelog v2:
formerly:
"batman-adv: Increase DHCP snooped DAT entry purge timeout in DHT"
(https://patchwork.open-mesh.org/patch/17364/)
- removed the extended timeouts flag in the DHT-PUT messages introduced
in v1 again
- removed DHCP dependency
The first three patches are actual fixes.
The first two try to avoid sending uninitialized data that could be
interpreted as invalid TT change events in both TT change response and
OGM. When that happens, invalid entries like the following can be seen with
batctl o:
* 00:00:00:00:00:00 -1 [....] ( 0) 88:12:4e:ad:7e:ba (179) (0x45845380)
* 00:00:00:00:78:79 4092 [.W..] ( 0) 88:12:4e:ad:7e:3c (145) (0x8ebadb8b)
The third one fixes an issue that happened when a TT change event list
is too big for the MTU: the list was never actually sent nor freed and
continued to grow indefinitely from this point. That also caused the
OGM TTVN to increase at each OGM interval without any changes ever being
visible to other nodes. This ever-growing TT change event list could be
observed by looking at /sys/kernel/slab/batadv_tt_change_cache/objects,
which sometimes showed unusually high values even after issuing a memcache
shrink.
The next two patches are more cleanup / potential slight improvements.
While patch 4 is mainly cosmetic (having negative tt.local_changes
values is not exactly an issue), patch 5 is here to keep the TT changes
list as short as possible (reducing network overhead).
V3:
- Fix commit message wording
- Update outdated comments
V2:
- This has been tested enough to not be in RFC state anymore
- Add one more uninitialized TT change fix for full table TT responses
Remi Pommarel (5):
batman-adv: Do not send uninitialized TT changes
batman-adv: Remove uninitialized data in full table TT response
batman-adv: Do not let TT changes list grows indefinitely
batman-adv: Remove atomic usage for tt.local_changes
batman-adv: Don't keep redundant TT change events
net/batman-adv/soft-interface.c | 2 +-
net/batman-adv/translation-table.c | 92 ++++++++++++++++++------------
net/batman-adv/types.h | 4 +-
3 files changed, 60 insertions(+), 38 deletions(-)
--
2.40.0
On Tuesday, 19 November 2024 15:59:58 CET Mu De wrote:
> "batctl o" shows empty table
Which means that the underlying link is not working. Either because the
interface is down, not connected or is manipulating the transmitted OGMs
(broadcast) in some form. The latter can for example happen if the underlying
link is converting broadcasts to unicast - some access points are doing that. I
just mention this because you seem to use station interfaces as underlying
interface for bat0.
You can use tcpdump on the underlying interface to see whether the submitted
packets (OGMs) on one side are received exactly the same on the other side.
Wireshark has support for batman-adv. So, you could easily check the recorded
PCAPs with it.
You can also check "batctl n" to see on which side you can correctly "hear"
OGM packets. If only one side cannot hear them, then you should check
that side first (but continue with the rest).
Kind regards,
Sven