On 27/01/2024 19:50, Linus Lüssing wrote:
The original idea of the delay_time check was to not apply multicast snooping too early when an MLD querier appears. And to instead wait at least for MLD reports to arrive before switching from flooding to group based, MLD snooped forwarding, to avoid temporary packet loss.
However in a batman-adv mesh network it was noticed that after 248 days of uptime 32bit MIPS based devices would start to signal that they had stopped applying multicast snooping due to missing queriers - even though they were the elected querier and still sending MLD queries themselves.
While time_is_before_jiffies() generally is safe against jiffies wrap-arounds, like the code comments in jiffies.h explain, it won't be able to track a difference larger than ULONG_MAX/2. With a 32bit large jiffies and one jiffies tick every 10ms (CONFIG_HZ=100) on these MIPS devices running OpenWrt this would result in a difference larger than ULONG_MAX/2 after 248 (= 2^32/100/60/60/24/2) days and time_is_before_jiffies() would then start to return false instead of true. Leading to multicast snooping not being applied to multicast packets anymore.
Fix this issue by using a proper timer_list object which won't have this ULONG_MAX/2 difference limitation.
Fixes: b00589af3b04 ("bridge: disable snooping if there is no querier") Signed-off-by: Linus Lüssing linus.luessing@c0d3.blue
Changelog v2:
- removed "inline" from br_multicast_query_delay_expired()
net/bridge/br_multicast.c | 20 +++++++++++++++----- net/bridge/br_private.h | 4 ++-- 2 files changed, 17 insertions(+), 7 deletions(-)
Acked-by: Nikolay Aleksandrov razor@blackwall.org