This patchset enables the usage of the batman-adv multicast optimizations for scenarios involving bridges on top of e.g. bat0, too.
Along come two more patches adding according debugging facilities to make it possible for the user to check why the multicast optimizations might not work ideally to give hints about what they might change about their topology.
The fourth one alters the forwarding behaviour for IGMP and MLD reports. For one thing this reduces overhead (especially for IGMPv3/MLDv2 reports). But more importantly this is necessary to make the optimizations work in bridged scenarios at all. For more details see: http://www.open-mesh.org/projects/batman-adv/wiki/Multicast-optimizations-li...
Also note, that with PATCH 4/4 multicast optimizations do not build for kernels < 2.6.35 (net/mld.h didn't exist back then): If there are actually brave souls out there running such ancient end-of-life kernels, it is suggested to build batman-adv with the multicast-compile-time option disabled. Multicast optimizations for bridges does not half any benefits for kernels < 3.17 anyways.
Cheers, Linus
Changes in v9: - PATCH 1/4: * fix: added compat code for pr_warn_once() * compat fix for bridge export stubs: fixes compile error with kernels < 3.16 without bridge (snooping) support - PATCH 2/4: * perform updates of variables within bat_priv->mcast.querier_ipv{4,6} individually (there's a new, third member in 4/4 which shouldn't be overriden) * PATCH 4/4: NEW
Changes in v8 (thanks to Simon's suggestions): - PATCH 2/3: * print shadowing status log of an appearing and shadowing querier, too (the bridge-querier-existence call has an additional 10s delay to ensure reports had their time to arrive - the bridge-querier-port call doesn't have that) - PATCH 3/3: * changing debugfs output from "+" and "-" to "U/4/6" and "." * fixing "no querier present" logic (introduced in [PATCHv7 3/3])
Changes in v7 (thanks to Simon's suggestions): - PATCH 2/3: * renaming old/new_querier to old/new_state * slightly extended kerneldoc of batadv_mcast_querier_log() * removing words "good" and "bad" from debug output * simplified batadv_mcast_flags_log() * assignment instead of memset in batadv_mcast_mla_tvlv_update() and batadv_softif_init_late() * simple struct member assignments instead of one complex struct assigment * removing unnecessary memcmp's * substituting return statement for an if-block in batadv_mcast_querier_log() and batadv_mcast_bridge_log() * print "Unsnoopables(U)-flag" instead of just "U-flag" - PATCH 3/3: * use bat_priv values instead of querying bridge ABI in batadv_mcast_flags_print_header()
Changes in v6: * New PATCH 2/3 inserted, moving logging to separate patch * More verbose logging added to PATCH 2/3: Bridge and querier state changes are logged too * upper case to lower case for kernel doc of batadv_mcast_flags_open (PATCH 2/3) * Adding note to kernel doc of batadv_mcast_get_bridge about increased refcount (PATCH 1/3) * Printing some lines about current bridge and querier state to debugfs too (PATCH 3/3)
Changes in v5 (PATCH 2/2 only): * s/dat_cache/mcast_flags/ in kerneldoc (copy&paste error)
Changes in v4 (PATCH 2/2 only): * initial {ad,e}dition of this patch
Changes in v3 (PATCH 1/2 only): * Removed "RFC" tag in title again: The stubs and new export are upstream in net-next and therefore going to be included in 3.17 * Added some debug output: * Two warning messages: -> Old kernel version or no bridge IGMP/MLD snooping compiled * New batman-adv log-level "mcast": -> Logging mcast flag changes (a third debugging facility, a new table for debugfs for a global mcast flag overview will be added in a separate patch later as discussed with Simon)
Changes in v2 (PATCH 1/2 only): * fetching local (= on this same kernel) multicast listeners from the bridge instead of the bat0 interface if a bridge is present - just like ip addresses and routes should be used from br0, the same goes for multicast listeners * beautification of batadv_mcast_mla_br_addr_cpy(), now using already present functions from the kernel instead of own, hackish approach * changed names of some goto-labels (not using "skip" anymore) * using new, third bridge multicast export (because this export is not upstream yet, I've added the "RFC" in the title): br_multicast_has_querier_anywhere() * adding compat stubs for two bridge multicast exports, to make batman-adv compile- and usable even if a 3.16 kernel was compiled without bridge code - the stubs are supposed to be upstream in the bridge code in 3.17 (therefore just 'compat') * updated kerneldocs for batadv_mcast_mla_bridge_get() and batadv_mcast_mla_softif_get() * The two sentences in the commit message starting with "Queriers: ..." were slightly modified to include the third bridge multicast export
So far, the recently added multicast optimizations did not support a configuration where a bridge is on top of the soft-interface (e.g. bat0).
Now the Linux bridge code is able to provide us with the missing bits:
Multicast Listeners: The bridge hands us a list of multicast listeners it has detected behind the bridge which we will announce through the mesh via the translation table, allowing other nodes to direct multicast packets to these clients behind the bridge even with enabled multicast optimizations.
Queriers: The bridge informs us whether there is a selected IGMP or MLD querier behind the bridge or if there is no querier at all. In these cases we need all according IPv4 or IPv6 multicast traffic directed to us.
These two parts together allow us to serve all multicast listeners with IPv6 link-local multicast packets even when bridges are involved and our multicast optimizations enabled.
Signed-off-by: Linus Lüssing linus.luessing@web.de --- compat.c | 30 +++++++++++ compat.h | 39 ++++++++++++++ main.h | 6 ++- multicast.c | 169 ++++++++++++++++++++++++++++++++++++++++++++++++++++------- 4 files changed, 225 insertions(+), 19 deletions(-)
diff --git a/compat.c b/compat.c index 3dbf9d2..c675e1d 100644 --- a/compat.c +++ b/compat.c @@ -109,3 +109,33 @@ void batadv_free_rcu_tvlv_handler(struct rcu_head *rcu) }
#endif /* < KERNEL_VERSION(3, 0, 0) */ + +#if LINUX_VERSION_CODE < KERNEL_VERSION(3, 16, 0) || \ + !IS_ENABLED(CONFIG_BRIDGE) || \ + !IS_ENABLED(CONFIG_BRIDGE_IGMP_SNOOPING) + +int br_multicast_list_adjacent(struct net_device *dev, + struct list_head *br_ip_list) +{ + return 0; +} + +bool br_multicast_has_querier_adjacent(struct net_device *dev, int proto) +{ + return false; +} + +#endif /* < KERNEL_VERSION(3, 16, 0) || + * !IS_ENABLED(CONFIG_BRIDGE) || \ + * !IS_ENABLED(CONFIG_BRIDGE_IGMP_SNOOPING) */ + +#if LINUX_VERSION_CODE < KERNEL_VERSION(3, 17, 0) + +bool br_multicast_has_querier_anywhere(struct net_device *dev, int proto) +{ + pr_warn_once("Old kernel detected (< 3.17) - multicast optimizations disabled\n"); + + return false; +} + +#endif /* < KERNEL_VERSION(3, 17, 0) */ diff --git a/compat.h b/compat.h index ed5b815..9f997b3 100644 --- a/compat.h +++ b/compat.h @@ -176,6 +176,13 @@ static inline int batadv_param_set_copystring(const char *val,
#endif /* < KERNEL_VERSION(2, 6, 37) */
+#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 38) + +#define pr_warn_once(fmt, ...) \ + printk_once(KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__) + +#endif /* < KERNEL_VERSION(2, 6, 38) */ + #if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 39)
#define kstrtou32(cp, base, v)\ @@ -240,6 +247,12 @@ static inline void skb_reset_mac_len(struct sk_buff *skb)
#endif /* < KERNEL_VERSION(3, 0, 0) */
+#if LINUX_VERSION_CODE < KERNEL_VERSION(3, 1, 0) + +#define IS_ENABLED(x) defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) + +#endif /* < KERNEL_VERSION(3, 1, 0) */ + #if LINUX_VERSION_CODE < KERNEL_VERSION(3, 3, 0)
#define batadv_interface_add_vid(x, y, z) \ @@ -444,6 +457,32 @@ static int __batadv_interface_kill_vid(struct net_device *dev, __be16 proto,\
#define pskb_copy_for_clone pskb_copy
+int br_multicast_list_adjacent(struct net_device *dev, + struct list_head *br_ip_list); +bool br_multicast_has_querier_adjacent(struct net_device *dev, int proto); + +struct br_ip { + union { + __be32 ip4; +#if IS_ENABLED(CONFIG_IPV6) + struct in6_addr ip6; +#endif + } u; + __be16 proto; + __u16 vid; +}; + +struct br_ip_list { + struct list_head list; + struct br_ip addr; +}; + #endif /* < KERNEL_VERSION(3, 16, 0) */
+#if LINUX_VERSION_CODE < KERNEL_VERSION(3, 17, 0) + +bool br_multicast_has_querier_anywhere(struct net_device *dev, int proto); + +#endif /* < KERNEL_VERSION(3, 17, 0) */ + #endif /* _NET_BATMAN_ADV_COMPAT_H_ */ diff --git a/main.h b/main.h index 4c557eb..cd873a7 100644 --- a/main.h +++ b/main.h @@ -168,6 +168,7 @@ enum batadv_uev_type { #include <linux/netdevice.h> /* netdevice */ #include <linux/etherdevice.h> /* ethernet address classification */ #include <linux/if_ether.h> /* ethernet header */ +#include <linux/if_bridge.h> /* bridge / multicast-snooping communication */ #include <linux/poll.h> /* poll_table */ #include <linux/kthread.h> /* kernel threads */ #include <linux/pkt_sched.h> /* schedule types */ @@ -176,6 +177,7 @@ enum batadv_uev_type { #include <linux/slab.h> #include <net/sock.h> /* struct sock */ #include <net/addrconf.h> /* ipv6 address stuff */ +#include <net/ip.h> #include <linux/ip.h> #include <net/rtnetlink.h> #include <linux/jiffies.h> @@ -222,6 +224,7 @@ __be32 batadv_skb_crc32(struct sk_buff *skb, u8 *payload_ptr); * @BATADV_DBG_BLA: bridge loop avoidance messages * @BATADV_DBG_DAT: ARP snooping and DAT related messages * @BATADV_DBG_NC: network coding related messages + * @BATADV_DBG_MCAST: multicast related messages * @BATADV_DBG_ALL: the union of all the above log levels */ enum batadv_dbg_level { @@ -231,7 +234,8 @@ enum batadv_dbg_level { BATADV_DBG_BLA = BIT(3), BATADV_DBG_DAT = BIT(4), BATADV_DBG_NC = BIT(5), - BATADV_DBG_ALL = 63, + BATADV_DBG_MCAST = BIT(6), + BATADV_DBG_ALL = 127, };
#ifdef CONFIG_BATMAN_ADV_DEBUG diff --git a/multicast.c b/multicast.c index ab6bb2a..cbe1b6c 100644 --- a/multicast.c +++ b/multicast.c @@ -22,12 +22,44 @@ #include "translation-table.h"
/** + * batadv_mcast_get_bridge - get the bridge on top of the softif if it exists + * @soft_iface: netdev struct of the mesh interface + * + * Returns either a bridge interface on top of our soft interface or + * NULL if no such bridge exists. + * + * If found then the refcount of the bridge interface is incremented. + */ +static struct net_device *batadv_mcast_get_bridge(struct net_device *soft_iface) +{ + struct net_device *upper = soft_iface; + + rcu_read_lock(); + do { + upper = netdev_master_upper_dev_get_rcu(upper); + } while (upper && !(upper->priv_flags & IFF_EBRIDGE)); + + if (upper) + dev_hold(upper); + rcu_read_unlock(); + + return upper; +} + +/** * batadv_mcast_mla_softif_get - get softif multicast listeners * @dev: the device to collect multicast addresses from * @mcast_list: a list to put found addresses into * - * Collect multicast addresses of the local multicast listeners - * on the given soft interface, dev, in the given mcast_list. + * Collects multicast addresses of multicast listeners residing + * on this kernel on the given soft interface, dev, in + * the given mcast_list. In general, multicast listeners provided by + * your multicast receiving applications run directly on this node. + * + * If there is a bridge interface on top of dev, collects from that one + * instead. Just like with IP addresses and routes, multicast listeners + * will(/should) register to the bridge interface instead of an + * enslaved bat0. * * Returns -ENOMEM on memory allocation error or the number of * items added to the mcast_list otherwise. @@ -35,12 +67,13 @@ static int batadv_mcast_mla_softif_get(struct net_device *dev, struct hlist_head *mcast_list) { + struct net_device *bridge = batadv_mcast_get_bridge(dev); struct netdev_hw_addr *mc_list_entry; struct batadv_hw_addr *new; int ret = 0;
- netif_addr_lock_bh(dev); - netdev_for_each_mc_addr(mc_list_entry, dev) { + netif_addr_lock_bh(bridge ? bridge : dev); + netdev_for_each_mc_addr(mc_list_entry, bridge ? bridge : dev) { new = kmalloc(sizeof(*new), GFP_ATOMIC); if (!new) { ret = -ENOMEM; @@ -51,7 +84,10 @@ static int batadv_mcast_mla_softif_get(struct net_device *dev, hlist_add_head(&new->list, mcast_list); ret++; } - netif_addr_unlock_bh(dev); + netif_addr_unlock_bh(bridge ? bridge : dev); + + if (bridge) + dev_put(bridge);
return ret; } @@ -77,6 +113,83 @@ static bool batadv_mcast_mla_is_duplicate(uint8_t *mcast_addr, }
/** + * batadv_mcast_mla_br_addr_cpy - copy a bridge multicast address + * @dst: destination to write to - a multicast MAC address + * @src: source to read from - a multicast IP address + * + * Converts a given multicast IPv4/IPv6 address from a bridge + * to its matching multicast MAC address and copies it into the given + * destination buffer. + * + * Caller needs to make sure the destination buffer can hold + * at least ETH_ALEN bytes. + */ +static void batadv_mcast_mla_br_addr_cpy(char *dst, const struct br_ip *src) +{ + if (src->proto == htons(ETH_P_IP)) + ip_eth_mc_map(src->u.ip4, dst); +#if IS_ENABLED(CONFIG_IPV6) + else if (src->proto == htons(ETH_P_IPV6)) + ipv6_eth_mc_map(&src->u.ip6, dst); +#endif + else + memset(dst, 0, ETH_ALEN); +} + +/** + * batadv_mcast_mla_bridge_get - get bridged-in multicast listeners + * @dev: a bridge slave whose bridge to collect multicast addresses from + * @mcast_list: a list to put found addresses into + * + * Collects multicast addresses of multicast listeners residing + * on foreign, non-mesh devices which we gave access to our mesh via + * a bridge on top of the given soft interface, dev, in the given + * mcast_list. + * + * Returns -ENOMEM on memory allocation error or the number of + * items added to the mcast_list otherwise. + */ +static int batadv_mcast_mla_bridge_get(struct net_device *dev, + struct hlist_head *mcast_list) +{ + struct list_head bridge_mcast_list = LIST_HEAD_INIT(bridge_mcast_list); + struct br_ip_list *br_ip_entry, *tmp; + struct batadv_hw_addr *new; + uint8_t mcast_addr[ETH_ALEN]; + int ret; + + /* we don't need to detect these devices/listeners, the IGMP/MLD + * snooping code of the Linux bridge already does that for us + */ + ret = br_multicast_list_adjacent(dev, &bridge_mcast_list); + if (ret < 0) + goto out; + + list_for_each_entry(br_ip_entry, &bridge_mcast_list, list) { + batadv_mcast_mla_br_addr_cpy(mcast_addr, &br_ip_entry->addr); + if (batadv_mcast_mla_is_duplicate(mcast_addr, mcast_list)) + continue; + + new = kmalloc(sizeof(*new), GFP_ATOMIC); + if (!new) { + ret = -ENOMEM; + break; + } + + ether_addr_copy(new->addr, mcast_addr); + hlist_add_head(&new->list, mcast_list); + } + +out: + list_for_each_entry_safe(br_ip_entry, tmp, &bridge_mcast_list, list) { + list_del(&br_ip_entry->list); + kfree(br_ip_entry); + } + + return ret; +} + +/** * batadv_mcast_mla_list_free - free a list of multicast addresses * @mcast_list: the list to free * @@ -184,29 +297,44 @@ static bool batadv_mcast_has_bridge(struct batadv_priv *bat_priv) * Updates the own multicast tvlv with our current multicast related settings, * capabilities and inabilities. * - * Returns true if the tvlv container is registered afterwards. Otherwise - * returns false. + * Returns false if we want all IPv4 && IPv6 multicast traffic and true + * otherwise. */ static bool batadv_mcast_mla_tvlv_update(struct batadv_priv *bat_priv) { struct batadv_tvlv_mcast_data mcast_data; + struct net_device *dev = bat_priv->soft_iface;
mcast_data.flags = BATADV_NO_FLAGS; memset(mcast_data.reserved, 0, sizeof(mcast_data.reserved));
- /* Avoid attaching MLAs, if there is a bridge on top of our soft - * interface, we don't support that yet (TODO) + if (!batadv_mcast_has_bridge(bat_priv)) + goto update; + +#if !IS_ENABLED(CONFIG_BRIDGE_IGMP_SNOOPING) + pr_warn_once("No bridge IGMP snooping compiled - multicast optimizations disabled\n"); +#endif + + mcast_data.flags |= BATADV_MCAST_WANT_ALL_UNSNOOPABLES; + + /* 1) If no querier exists at all, then multicast listeners on + * our local TT clients behind the bridge will keep silent. + * 2) If the selected querier is on one of our local TT clients, + * behind the bridge, then this querier might shadow multicast + * listeners on our local TT clients, behind this bridge. + * + * In both cases, we will signalize other batman nodes that + * we need all multicast traffic of the according protocol. */ - if (batadv_mcast_has_bridge(bat_priv)) { - if (bat_priv->mcast.enabled) { - batadv_tvlv_container_unregister(bat_priv, - BATADV_TVLV_MCAST, 1); - bat_priv->mcast.enabled = false; - } + if (!br_multicast_has_querier_anywhere(dev, ETH_P_IP) || + br_multicast_has_querier_adjacent(dev, ETH_P_IP)) + mcast_data.flags |= BATADV_MCAST_WANT_ALL_IPV4;
- return false; - } + if (!br_multicast_has_querier_anywhere(dev, ETH_P_IPV6) || + br_multicast_has_querier_adjacent(dev, ETH_P_IPV6)) + mcast_data.flags |= BATADV_MCAST_WANT_ALL_IPV6;
+update: if (!bat_priv->mcast.enabled || mcast_data.flags != bat_priv->mcast.flags) { batadv_tvlv_container_register(bat_priv, BATADV_TVLV_MCAST, 1, @@ -215,7 +343,8 @@ static bool batadv_mcast_mla_tvlv_update(struct batadv_priv *bat_priv) bat_priv->mcast.enabled = true; }
- return true; + return !(mcast_data.flags & + (BATADV_MCAST_WANT_ALL_IPV4 + BATADV_MCAST_WANT_ALL_IPV6)); }
/** @@ -238,6 +367,10 @@ void batadv_mcast_mla_update(struct batadv_priv *bat_priv) if (ret < 0) goto out;
+ ret = batadv_mcast_mla_bridge_get(soft_iface, &mcast_list); + if (ret < 0) + goto out; + update: batadv_mcast_mla_tt_retract(bat_priv, &mcast_list); batadv_mcast_mla_tt_add(bat_priv, &mcast_list);
This patch adds an 'mcast' log level. Currently, it will print changes relevant to a nodes own multicast flag changes.
Signed-off-by: Linus Lüssing linus.luessing@web.de --- multicast.c | 142 ++++++++++++++++++++++++++++++++++++++++++++++++++++-- soft-interface.c | 5 ++ types.h | 17 +++++++ 3 files changed, 159 insertions(+), 5 deletions(-)
diff --git a/multicast.c b/multicast.c index cbe1b6c..341559e 100644 --- a/multicast.c +++ b/multicast.c @@ -291,6 +291,120 @@ static bool batadv_mcast_has_bridge(struct batadv_priv *bat_priv) }
/** + * batadv_mcast_querier_log - debug output regarding the querier status on link + * @bat_priv: the bat priv with all the soft interface information + * @str_proto: a string for the querier protocol (e.g. "IGMP" or "MLD") + * @old_state: the previous querier state on our link + * @new_state: the new querier state on our link + * + * Outputs debug messages to the logging facility with log level 'mcast' + * regarding changes to the querier status on the link which are relevant + * to our multicast optimizations. + * + * Usually this is about whether a querier appeared or vanished in + * our mesh or whether the querier is in the suboptimal position of being + * behind our local bridge segment: Snooping switches will directly + * forward listener reports to the querier, therefore batman-adv and + * the bridge will potentially not see these listeners - the querier is + * potentially shadowing listeners from us then. + * + * This is only interesting for nodes with a bridge on top of their + * soft interface. + */ +static void +batadv_mcast_querier_log(struct batadv_priv *bat_priv, char *str_proto, + struct batadv_mcast_querier_state *old_state, + struct batadv_mcast_querier_state *new_state) +{ + if (!old_state->exists && new_state->exists) + batadv_dbg(BATADV_DBG_MCAST, bat_priv, + "%s Querier appeared\n", str_proto); + else if (old_state->exists && !new_state->exists) + batadv_dbg(BATADV_DBG_MCAST, bat_priv, + "%s Querier disappeared\n", str_proto); + else if (!bat_priv->mcast.bridged && !new_state->exists) + batadv_dbg(BATADV_DBG_MCAST, bat_priv, + "Note: No %s Querier present\n", str_proto); + + if (new_state->exists) { + if ((!old_state->shadowing && new_state->shadowing) || + (!old_state->exists && new_state->shadowing)) + batadv_dbg(BATADV_DBG_MCAST, bat_priv, + "%s Querier is behind our bridged segment: Might shadow listeners\n", + str_proto); + else if (old_state->shadowing && !new_state->shadowing) + batadv_dbg(BATADV_DBG_MCAST, bat_priv, + "%s Querier is not behind our bridged segment\n", + str_proto); + } +} + +/** + * batadv_mcast_bridge_log - debug output for topology changes in bridged setups + * @bat_priv: the bat priv with all the soft interface information + * @bridged: a flag about whether the soft interface is currently bridged or not + * @querier_ipv4: (maybe) new status of a potential, selected IGMP querier + * @querier_ipv6: (maybe) new status of a potential, selected MLD querier + * + * If no bridges are ever used on this node, then this function does nothing. + * + * Otherwise this function outputs debug information to the 'mcast' log level + * which might be relevant to our multicast optimizations. + * + * More precisely, it outputs information when a bridge interface is added or + * removed from a soft interface. And when a bridge is present, it further + * outputs information about the querier state which is relevant for the + * multicast flags this node is going to set. + */ +static void +batadv_mcast_bridge_log(struct batadv_priv *bat_priv, bool bridged, + struct batadv_mcast_querier_state *querier_ipv4, + struct batadv_mcast_querier_state *querier_ipv6) +{ + if (!bat_priv->mcast.bridged && bridged) + batadv_dbg(BATADV_DBG_MCAST, bat_priv, + "Bridge added: Setting Unsnoopables(U)-flag\n"); + else if (bat_priv->mcast.bridged && !bridged) + batadv_dbg(BATADV_DBG_MCAST, bat_priv, + "Bridge removed: Unsetting Unsnoopables(U)-flag\n"); + + if (bridged) { + batadv_mcast_querier_log(bat_priv, "IGMP", + &bat_priv->mcast.querier_ipv4, + querier_ipv4); + batadv_mcast_querier_log(bat_priv, "MLD", + &bat_priv->mcast.querier_ipv6, + querier_ipv6); + } +} + +/** + * batadv_mcast_flags_logs - output debug information about mcast flag changes + * @bat_priv: the bat priv with all the soft interface information + * @mcast_flags: flags indicating the new multicast state + * + * Whenever the multicast flags this nodes announces changes (@mcast_flags vs. + * bat_priv->mcast.flags), this notifies userspace via the 'mcast' log level. + */ +static void batadv_mcast_flags_log(struct batadv_priv *bat_priv, uint8_t flags) +{ + uint8_t old_flags = bat_priv->mcast.flags; + char str_old_flags[] = "[...]"; + + sprintf(str_old_flags, "[%c%c%c]", + old_flags & BATADV_MCAST_WANT_ALL_UNSNOOPABLES ? 'U' : '.', + old_flags & BATADV_MCAST_WANT_ALL_IPV4 ? '4' : '.', + old_flags & BATADV_MCAST_WANT_ALL_IPV6 ? '6' : '.'); + + batadv_dbg(BATADV_DBG_MCAST, bat_priv, + "Changing multicast flags from '%s' to '[%c%c%c]'\n", + bat_priv->mcast.enabled ? str_old_flags : "<undefined>", + flags & BATADV_MCAST_WANT_ALL_UNSNOOPABLES ? 'U' : '.', + flags & BATADV_MCAST_WANT_ALL_IPV4 ? '4' : '.', + flags & BATADV_MCAST_WANT_ALL_IPV6 ? '6' : '.'); +} + +/** * batadv_mcast_mla_tvlv_update - update multicast tvlv * @bat_priv: the bat priv with all the soft interface information * @@ -303,18 +417,28 @@ static bool batadv_mcast_has_bridge(struct batadv_priv *bat_priv) static bool batadv_mcast_mla_tvlv_update(struct batadv_priv *bat_priv) { struct batadv_tvlv_mcast_data mcast_data; + struct batadv_mcast_querier_state querier4 = {false, false}; + struct batadv_mcast_querier_state querier6 = {false, false}; struct net_device *dev = bat_priv->soft_iface; + bool bridged;
mcast_data.flags = BATADV_NO_FLAGS; memset(mcast_data.reserved, 0, sizeof(mcast_data.reserved));
- if (!batadv_mcast_has_bridge(bat_priv)) + bridged = batadv_mcast_has_bridge(bat_priv); + if (!bridged) goto update;
#if !IS_ENABLED(CONFIG_BRIDGE_IGMP_SNOOPING) pr_warn_once("No bridge IGMP snooping compiled - multicast optimizations disabled\n"); #endif
+ querier4.exists = br_multicast_has_querier_anywhere(dev, ETH_P_IP); + querier4.shadowing = br_multicast_has_querier_adjacent(dev, ETH_P_IP); + + querier6.exists = br_multicast_has_querier_anywhere(dev, ETH_P_IPV6); + querier6.shadowing = br_multicast_has_querier_adjacent(dev, ETH_P_IPV6); + mcast_data.flags |= BATADV_MCAST_WANT_ALL_UNSNOOPABLES;
/* 1) If no querier exists at all, then multicast listeners on @@ -326,21 +450,29 @@ static bool batadv_mcast_mla_tvlv_update(struct batadv_priv *bat_priv) * In both cases, we will signalize other batman nodes that * we need all multicast traffic of the according protocol. */ - if (!br_multicast_has_querier_anywhere(dev, ETH_P_IP) || - br_multicast_has_querier_adjacent(dev, ETH_P_IP)) + if (!querier4.exists || querier4.shadowing) mcast_data.flags |= BATADV_MCAST_WANT_ALL_IPV4;
- if (!br_multicast_has_querier_anywhere(dev, ETH_P_IPV6) || - br_multicast_has_querier_adjacent(dev, ETH_P_IPV6)) + if (!querier6.exists || querier6.shadowing) mcast_data.flags |= BATADV_MCAST_WANT_ALL_IPV6;
update: + batadv_mcast_bridge_log(bat_priv, bridged, &querier4, &querier6); + + bat_priv->mcast.querier_ipv4.exists = querier4.exists; + bat_priv->mcast.querier_ipv4.shadowing = querier4.shadowing; + + bat_priv->mcast.querier_ipv6.exists = querier6.exists; + bat_priv->mcast.querier_ipv6.shadowing = querier6.shadowing; + if (!bat_priv->mcast.enabled || mcast_data.flags != bat_priv->mcast.flags) { + batadv_mcast_flags_log(bat_priv, mcast_data.flags); batadv_tvlv_container_register(bat_priv, BATADV_TVLV_MCAST, 1, &mcast_data, sizeof(mcast_data)); bat_priv->mcast.flags = mcast_data.flags; bat_priv->mcast.enabled = true; + bat_priv->mcast.bridged = bridged; }
return !(mcast_data.flags & diff --git a/soft-interface.c b/soft-interface.c index 9bf382d..d4d892c 100644 --- a/soft-interface.c +++ b/soft-interface.c @@ -745,7 +745,12 @@ static int batadv_softif_init_late(struct net_device *dev) atomic_set(&bat_priv->distributed_arp_table, 1); #endif #ifdef CONFIG_BATMAN_ADV_MCAST + bat_priv->mcast.querier_ipv4.exists = false; + bat_priv->mcast.querier_ipv4.shadowing = false; + bat_priv->mcast.querier_ipv6.exists = false; + bat_priv->mcast.querier_ipv6.shadowing = false; bat_priv->mcast.flags = BATADV_NO_FLAGS; + bat_priv->mcast.bridged = false; atomic_set(&bat_priv->multicast_mode, 1); atomic_set(&bat_priv->mcast.num_disabled, 0); atomic_set(&bat_priv->mcast.num_want_all_unsnoopables, 0); diff --git a/types.h b/types.h index 462a70c..bd56fc1 100644 --- a/types.h +++ b/types.h @@ -623,14 +623,28 @@ struct batadv_priv_dat {
#ifdef CONFIG_BATMAN_ADV_MCAST /** + * struct batadv_mcast_querier_state - IGMP/MLD querier state when bridged + * @exists: whether a querier exists in the mesh + * @shadowing: if a querier exists, whether it is potentially shadowing + * multicast listeners (i.e. querier is behind our own bridge segment) + */ +struct batadv_mcast_querier_state { + bool exists; + bool shadowing; +}; + +/** * struct batadv_priv_mcast - per mesh interface mcast data * @mla_list: list of multicast addresses we are currently announcing via TT * @want_all_unsnoopables_list: a list of orig_nodes wanting all unsnoopable * multicast traffic * @want_all_ipv4_list: a list of orig_nodes wanting all IPv4 multicast traffic * @want_all_ipv6_list: a list of orig_nodes wanting all IPv6 multicast traffic + * @querier_ipv4: the current state of an IGMP querier in the mesh (if bridged) + * @querier_ipv6: the current state of an MLD querier in the mesh (if bridged) * @flags: the flags we have last sent in our mcast tvlv * @enabled: whether the multicast tvlv is currently enabled + * @bridged: whether the soft interface currently has a bridge on top * @num_disabled: number of nodes that have no mcast tvlv * @num_want_all_unsnoopables: number of nodes wanting unsnoopable IP traffic * @num_want_all_ipv4: counter for items in want_all_ipv4_list @@ -643,8 +657,11 @@ struct batadv_priv_mcast { struct hlist_head want_all_unsnoopables_list; struct hlist_head want_all_ipv4_list; struct hlist_head want_all_ipv6_list; + struct batadv_mcast_querier_state querier_ipv4; + struct batadv_mcast_querier_state querier_ipv6; uint8_t flags; bool enabled; + bool bridged; atomic_t num_disabled; atomic_t num_want_all_unsnoopables; atomic_t num_want_all_ipv4;
This patch adds a debugfs table with originators and their according multicast flags to help users figure out why multicast optimizations might be enabled or disabled for them.
Signed-off-by: Linus Lüssing linus.luessing@web.de --- debugfs.c | 21 +++++++++++++ multicast.c | 99 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ multicast.h | 2 ++ 3 files changed, 122 insertions(+)
diff --git a/debugfs.c b/debugfs.c index a12e25e..cd7c282 100644 --- a/debugfs.c +++ b/debugfs.c @@ -30,6 +30,7 @@ #include "bridge_loop_avoidance.h" #include "distributed-arp-table.h" #include "network-coding.h" +#include "multicast.h"
static struct dentry *batadv_debugfs;
@@ -332,6 +333,20 @@ static int batadv_nc_nodes_open(struct inode *inode, struct file *file) } #endif
+#ifdef CONFIG_BATMAN_ADV_MCAST +/** + * batadv_mcast_flags_open - prepare file handler for reads from mcast_flags + * @inode: inode which was opened + * @file: file handle to be initialized + */ +static int batadv_mcast_flags_open(struct inode *inode, struct file *file) +{ + struct net_device *net_dev = (struct net_device *)inode->i_private; + + return single_open(file, batadv_mcast_flags_seq_print_text, net_dev); +} +#endif + #define BATADV_DEBUGINFO(_name, _mode, _open) \ struct batadv_debuginfo batadv_debuginfo_##_name = { \ .attr = { .name = __stringify(_name), \ @@ -372,6 +387,9 @@ static BATADV_DEBUGINFO(transtable_local, S_IRUGO, #ifdef CONFIG_BATMAN_ADV_NC static BATADV_DEBUGINFO(nc_nodes, S_IRUGO, batadv_nc_nodes_open); #endif +#ifdef CONFIG_BATMAN_ADV_MCAST +static BATADV_DEBUGINFO(mcast_flags, S_IRUGO, batadv_mcast_flags_open); +#endif
static struct batadv_debuginfo *batadv_mesh_debuginfos[] = { &batadv_debuginfo_originators, @@ -388,6 +406,9 @@ static struct batadv_debuginfo *batadv_mesh_debuginfos[] = { #ifdef CONFIG_BATMAN_ADV_NC &batadv_debuginfo_nc_nodes, #endif +#ifdef CONFIG_BATMAN_ADV_MCAST + &batadv_debuginfo_mcast_flags, +#endif NULL, };
diff --git a/multicast.c b/multicast.c index 341559e..09b3e6f 100644 --- a/multicast.c +++ b/multicast.c @@ -984,6 +984,105 @@ void batadv_mcast_init(struct batadv_priv *bat_priv) }
/** + * batadv_mcast_flags_print_header - print own mcast flags to debugfs table + * @bat_priv: the bat priv with all the soft interface information + * @seq: debugfs table seq_file struct + * + * Prints our own multicast flags including a more specific reason why + * they are set, that is prints the bridge and querier state too, to + * the debugfs table specified via @seq. + */ +static void batadv_mcast_flags_print_header(struct batadv_priv *bat_priv, + struct seq_file *seq) +{ + uint8_t flags = bat_priv->mcast.flags; + char querier4, querier6, shadowing4, shadowing6; + bool bridged = bat_priv->mcast.bridged; + + if (bridged) { + querier4 = bat_priv->mcast.querier_ipv4.exists ? '.' : '4'; + querier6 = bat_priv->mcast.querier_ipv6.exists ? '.' : '6'; + shadowing4 = bat_priv->mcast.querier_ipv4.shadowing ? '4' : '.'; + shadowing6 = bat_priv->mcast.querier_ipv6.shadowing ? '6' : '.'; + } else { + querier4 = '?'; + querier6 = '?'; + shadowing4 = '?'; + shadowing6 = '?'; + } + + seq_printf(seq, "Multicast flags (own flags: [%c%c%c])\n", + flags & BATADV_MCAST_WANT_ALL_UNSNOOPABLES ? 'U' : '.', + flags & BATADV_MCAST_WANT_ALL_IPV4 ? '4' : '.', + flags & BATADV_MCAST_WANT_ALL_IPV6 ? '6' : '.'); + seq_printf(seq, "* Bridged [U]\t\t\t\t%c\n", bridged ? 'U' : '.'); + seq_printf(seq, "* No IGMP/MLD Querier [4/6]:\t\t%c/%c\n", + querier4, querier6); + seq_printf(seq, "* Shadowing IGMP/MLD Querier [4/6]:\t%c/%c\n", + shadowing4, shadowing6); + seq_puts(seq, "-------------------------------------------\n"); + seq_printf(seq, " %-10s %s\n", "Originator", "Flags"); +} + +/** + * batadv_mcast_flags_seq_print_text - print the mcast flags of other nodes + * @seq: seq file to print on + * @offset: not used + * + * This prints a table of (primary) originators and their according + * multicast flags, including (in the header) our own. + */ +int batadv_mcast_flags_seq_print_text(struct seq_file *seq, void *offset) +{ + struct net_device *net_dev = (struct net_device *)seq->private; + struct batadv_priv *bat_priv = netdev_priv(net_dev); + struct batadv_hard_iface *primary_if; + struct batadv_hashtable *hash = bat_priv->orig_hash; + struct batadv_orig_node *orig_node; + struct hlist_head *head; + uint8_t flags; + uint32_t i; + + primary_if = batadv_seq_print_text_primary_if_get(seq); + if (!primary_if) + return 0; + + batadv_mcast_flags_print_header(bat_priv, seq); + + for (i = 0; i < hash->size; i++) { + head = &hash->table[i]; + + rcu_read_lock(); + hlist_for_each_entry_rcu(orig_node, head, hash_entry) { + if (!(orig_node->capa_initialized & + BATADV_ORIG_CAPA_HAS_MCAST)) + continue; + + flags = orig_node->mcast_flags; + + if (!(orig_node->capabilities & + BATADV_ORIG_CAPA_HAS_MCAST)) { + seq_printf(seq, "%pM -\n", orig_node->orig); + continue; + } + + seq_printf(seq, "%pM [%c%c%c]\n", orig_node->orig, + (flags & BATADV_MCAST_WANT_ALL_UNSNOOPABLES ? + 'U' : '.'), + (flags & BATADV_MCAST_WANT_ALL_IPV4 ? + '4' : '.'), + (flags & BATADV_MCAST_WANT_ALL_IPV6 ? + '6' : '.')); + } + rcu_read_unlock(); + } + + batadv_hardif_free_ref(primary_if); + + return 0; +} + +/** * batadv_mcast_free - free the multicast optimizations structures * @bat_priv: the bat priv with all the soft interface information */ diff --git a/multicast.h b/multicast.h index 3a44ebd..e7bc9bf 100644 --- a/multicast.h +++ b/multicast.h @@ -42,6 +42,8 @@ batadv_mcast_forw_mode(struct batadv_priv *bat_priv, struct sk_buff *skb,
void batadv_mcast_init(struct batadv_priv *bat_priv);
+int batadv_mcast_flags_seq_print_text(struct seq_file *seq, void *offset); + void batadv_mcast_free(struct batadv_priv *bat_priv);
void batadv_mcast_purge_orig(struct batadv_orig_node *orig_node);
With this patch IGMP or MLD reports are only forwarded to the selected IGMP/MLD querier as RFC4541 suggests.
This is necessary to avoid multicast packet loss in bridged scenarios: An IGMPv2/MLDv1 querier does not actively join the multicast group the reports are sent to. Because of this, this leads to snooping bridges/switches not being able to learn of multicast listeners in the mesh and wrongly shutting down ports for multicast traffic to the mesh.
Signed-off-by: Linus Lüssing linus.luessing@web.de --- compat.h | 7 + multicast.c | 375 +++++++++++++++++++++++++++++++++++++++++++++++++++++- multicast.h | 8 ++ soft-interface.c | 11 ++ types.h | 4 + 5 files changed, 399 insertions(+), 6 deletions(-)
diff --git a/compat.h b/compat.h index 9f997b3..f3e7032 100644 --- a/compat.h +++ b/compat.h @@ -255,6 +255,13 @@ static inline void skb_reset_mac_len(struct sk_buff *skb)
#if LINUX_VERSION_CODE < KERNEL_VERSION(3, 3, 0)
+static inline int batadv_ipv6_skip_exthdr(const struct sk_buff *skb, int start, + u8 *nexthdrp, __be16 *frag_offp) +{ + return ipv6_skip_exthdr(skb, start, nexthdrp); +} +#define ipv6_skip_exthdr batadv_ipv6_skip_exthdr + #define batadv_interface_add_vid(x, y, z) \ __batadv_interface_add_vid(struct net_device *dev, __be16 proto,\ unsigned short vid);\ diff --git a/multicast.c b/multicast.c index 09b3e6f..24cbf07 100644 --- a/multicast.c +++ b/multicast.c @@ -21,6 +21,9 @@ #include "hard-interface.h" #include "translation-table.h"
+#include <linux/igmp.h> +#include <net/mld.h> + /** * batadv_mcast_get_bridge - get the bridge on top of the softif if it exists * @soft_iface: netdev struct of the mesh interface @@ -512,10 +515,247 @@ out: }
/** + * batadv_mcast_is_valid_igmp - check for an IGMP packet + * @skb: the IPv4 packet to check + * + * Checks whether a given IPv4 packet is a valid IGMP packet and if so + * sets the skb transport header accordingly. + * + * The caller needs to ensure the correct setup of the skb network header. + * This call might reallocate skb data. + */ +static bool batadv_mcast_is_valid_igmp(struct sk_buff *skb) +{ + struct iphdr *iphdr; + unsigned int len = skb_network_offset(skb) + sizeof(*iphdr); + + if (!pskb_may_pull(skb, len)) + return false; + + iphdr = ip_hdr(skb); + + if (iphdr->ihl < 5 || iphdr->version != 4) + return false; + + if (ntohs(iphdr->tot_len) < ip_hdrlen(skb)) + return false; + + len += ip_hdrlen(skb) - sizeof(*iphdr); + skb_set_transport_header(skb, len); + + if (iphdr->protocol != IPPROTO_IGMP) + return false; + + /* TODO: verify checksums */ + + if (!pskb_may_pull(skb, len + sizeof(struct igmphdr))) + return false; + + /* RFC2236+RFC3376 (IGMPv2+IGMPv3) require the multicast link layer + * all-systems destination addresses (224.0.0.1) for general queries + */ + if (igmp_hdr(skb)->type == IGMP_HOST_MEMBERSHIP_QUERY && + !igmp_hdr(skb)->group && + ip_hdr(skb)->daddr != htonl(INADDR_ALLHOSTS_GROUP)) + return false; + + return true; +} + +/** + * batadv_mcast_is_report_ipv4 - check for IGMP reports (and queries) + * @bat_priv: the bat priv with all the soft interface information + * @skb: the ethernet frame destined for the mesh + * @orig: if IGMP report: to be set to the querier to forward the skb to + * + * Checks whether the given frame is an IGMP report and if so sets the + * orig pointer to the originator of the selected IGMP querier if one exists + * and returns true. Otherwise returns false. + * + * If the packet is a general IGMP query instead then we delete the memorized, + * foreign IGMP querier (if one exists): We became the selected querier and + * therefore do not need to forward reports into the mesh. + * + * This call might reallocate skb data. + */ +static bool batadv_mcast_is_report_ipv4(struct batadv_priv *bat_priv, + struct sk_buff *skb, + struct batadv_orig_node **orig) +{ + struct batadv_mcast_querier_state *querier; + struct batadv_orig_node *orig_node; + + if (!batadv_mcast_is_valid_igmp(skb)) + return false; + + querier = &bat_priv->mcast.querier_ipv4; + + switch (igmp_hdr(skb)->type) { + case IGMP_HOST_MEMBERSHIP_REPORT: + case IGMPV2_HOST_MEMBERSHIP_REPORT: + case IGMPV3_HOST_MEMBERSHIP_REPORT: + rcu_read_lock(); + orig_node = rcu_dereference(querier->orig); + if (orig_node && atomic_inc_not_zero(&orig_node->refcount)) { + /* TODO: include multicast routers via MRD (RFC4286) */ + batadv_dbg(BATADV_DBG_MCAST, bat_priv, + "Redirecting IGMP Report to %pM\n", + orig_node->orig); + *orig = orig_node; + } + rcu_read_unlock(); + + return true; + case IGMP_HOST_MEMBERSHIP_QUERY: + /* RFC4541, section 2.1.1.1.b) says: + * ignore general queries from 0.0.0.0 + */ + if (!ip_hdr(skb)->saddr || igmp_hdr(skb)->group) + break; + + spin_lock_bh(&querier->orig_lock); + orig_node = rcu_dereference(querier->orig); + if (orig_node) + rcu_assign_pointer(querier->orig, NULL); + spin_unlock_bh(&querier->orig_lock); + + batadv_dbg(BATADV_DBG_MCAST, bat_priv, + "Snooped own IGMP Query\n"); + break; + } + + return false; +} + +/** + * batadv_mcast_is_valid_mld - check for an MLD packet + * @skb: the IPv6 packet to check + * + * Checks whether a given IPv6 packet is a valid MLD packet and if so + * sets the skb transport header accordingly. + * + * The caller needs to ensure the correct setup of the skb network header. + * This call might reallocate skb data. + */ +static bool batadv_mcast_is_valid_mld(struct sk_buff *skb) +{ + const struct ipv6hdr *ip6hdr; + struct mld_msg *mld; + __be16 frag_off; + u8 nexthdr; + unsigned int len = skb_network_offset(skb) + sizeof(*ip6hdr); + + if (!pskb_may_pull(skb, len)) + return false; + + ip6hdr = ipv6_hdr(skb); + + if (ip6hdr->version != 6 || ip6hdr->payload_len == 0) + return false; + + if (skb->len < len + ntohs(ip6hdr->payload_len)) + return false; + + nexthdr = ip6hdr->nexthdr; + len = ipv6_skip_exthdr(skb, len, &nexthdr, &frag_off); + + if (len < 0) + return false; + + skb_set_transport_header(skb, len); + + /* TODO: verify checksums */ + + if (nexthdr != IPPROTO_ICMPV6) + return false; + + if (!pskb_may_pull(skb, len + sizeof(*mld))) + return false; + + mld = (struct mld_msg *)icmp6_hdr(skb); + + if (mld->mld_type == ICMPV6_MGM_QUERY) { + /* RFC2710+RFC3810 (MLDv1+MLDv2): link-local source addresses */ + if (!(ipv6_addr_type(&ip6hdr->saddr) & IPV6_ADDR_LINKLOCAL)) + return false; + + /* RFC2710+RFC3810 (MLDv1+MLDv2): multicast link layer all-nodes + * destination address (ff02::1) for general queries + */ + if (ipv6_addr_any(&mld->mld_mca) && + !ipv6_addr_is_ll_all_nodes(&ip6hdr->daddr)) + return false; + } + + return true; +} + +/** + * batadv_mcast_is_report_ipv6 - check for MLD reports (and queries) + * @bat_priv: the bat priv with all the soft interface information + * @skb: the ethernet frame destined for the mesh + * @orig: if MLD report: to be set to the querier to forward the skb to + * + * Checks whether the given frame is an MLD report and if so sets the + * orig pointer to the originator of the selected MLD querier if one exists + * and returns true. Otherwise returns false. + * + * If the packet is a general MLD query instead then we delete the memorized, + * foreign MLD querier (if one exists): We became the selected querier and + * therefore do not need to forward reports into the mesh. + * + * This call might reallocate skb data. + */ +static bool batadv_mcast_is_report_ipv6(struct batadv_priv *bat_priv, + struct sk_buff *skb, + struct batadv_orig_node **orig) +{ + struct mld_msg *mld; + struct batadv_mcast_querier_state *querier; + struct batadv_orig_node *orig_node; + + if (!batadv_mcast_is_valid_mld(skb)) + return false; + + querier = &bat_priv->mcast.querier_ipv6; + mld = (struct mld_msg *)icmp6_hdr(skb); + + switch (mld->mld_type) { + case ICMPV6_MGM_REPORT: + case ICMPV6_MLD2_REPORT: + rcu_read_lock(); + orig_node = rcu_dereference(querier->orig); + if (orig_node && atomic_inc_not_zero(&orig_node->refcount)) { + /* TODO: include multicast routers via MRD (RFC4286) */ + batadv_dbg(BATADV_DBG_MCAST, bat_priv, + "Redirecting MLD Report to %pM\n", + orig_node->orig); + *orig = orig_node; + } + rcu_read_unlock(); + + return true; + case ICMPV6_MGM_QUERY: + spin_lock_bh(&querier->orig_lock); + orig_node = rcu_dereference(querier->orig); + if (orig_node) + rcu_assign_pointer(querier->orig, NULL); + spin_unlock_bh(&querier->orig_lock); + + batadv_dbg(BATADV_DBG_MCAST, bat_priv, + "Snooped own MLD Query\n"); + break; + } + + return false; +} + +/** * batadv_mcast_forw_mode_check_ipv4 - check for optimized forwarding potential * @bat_priv: the bat priv with all the soft interface information * @skb: the IPv4 packet to check * @is_unsnoopable: stores whether the destination is snoopable + * @orig: for IGMP reports: to be set to the querier to forward the skb to * * Checks whether the given IPv4 packet has the potential to be forwarded with a * mode more optimal than classic flooding. @@ -525,7 +765,8 @@ out: */ static int batadv_mcast_forw_mode_check_ipv4(struct batadv_priv *bat_priv, struct sk_buff *skb, - bool *is_unsnoopable) + bool *is_unsnoopable, + struct batadv_orig_node **orig) { struct iphdr *iphdr;
@@ -533,6 +774,9 @@ static int batadv_mcast_forw_mode_check_ipv4(struct batadv_priv *bat_priv, if (!pskb_may_pull(skb, sizeof(struct ethhdr) + sizeof(*iphdr))) return -ENOMEM;
+ if (batadv_mcast_is_report_ipv4(bat_priv, skb, orig)) + return 0; + iphdr = ip_hdr(skb);
/* TODO: Implement Multicast Router Discovery (RFC4286), @@ -554,6 +798,7 @@ static int batadv_mcast_forw_mode_check_ipv4(struct batadv_priv *bat_priv, * @bat_priv: the bat priv with all the soft interface information * @skb: the IPv6 packet to check * @is_unsnoopable: stores whether the destination is snoopable + * @orig: for MLD reports: to be set to the querier to forward the skb to * * Checks whether the given IPv6 packet has the potential to be forwarded with a * mode more optimal than classic flooding. @@ -563,7 +808,8 @@ static int batadv_mcast_forw_mode_check_ipv4(struct batadv_priv *bat_priv, */ static int batadv_mcast_forw_mode_check_ipv6(struct batadv_priv *bat_priv, struct sk_buff *skb, - bool *is_unsnoopable) + bool *is_unsnoopable, + struct batadv_orig_node **orig) { struct ipv6hdr *ip6hdr;
@@ -571,6 +817,9 @@ static int batadv_mcast_forw_mode_check_ipv6(struct batadv_priv *bat_priv, if (!pskb_may_pull(skb, sizeof(struct ethhdr) + sizeof(*ip6hdr))) return -ENOMEM;
+ if (batadv_mcast_is_report_ipv6(bat_priv, skb, orig)) + return 0; + ip6hdr = ipv6_hdr(skb);
/* TODO: Implement Multicast Router Discovery (RFC4286), @@ -593,6 +842,7 @@ static int batadv_mcast_forw_mode_check_ipv6(struct batadv_priv *bat_priv, * @bat_priv: the bat priv with all the soft interface information * @skb: the multicast frame to check * @is_unsnoopable: stores whether the destination is snoopable + * @orig: for IGMP/MLD reports: to be set to the querier to forward the skb to * * Checks whether the given multicast ethernet frame has the potential to be * forwarded with a mode more optimal than classic flooding. @@ -602,7 +852,8 @@ static int batadv_mcast_forw_mode_check_ipv6(struct batadv_priv *bat_priv, */ static int batadv_mcast_forw_mode_check(struct batadv_priv *bat_priv, struct sk_buff *skb, - bool *is_unsnoopable) + bool *is_unsnoopable, + struct batadv_orig_node **orig) { struct ethhdr *ethhdr = eth_hdr(skb);
@@ -615,10 +866,10 @@ static int batadv_mcast_forw_mode_check(struct batadv_priv *bat_priv, switch (ntohs(ethhdr->h_proto)) { case ETH_P_IP: return batadv_mcast_forw_mode_check_ipv4(bat_priv, skb, - is_unsnoopable); + is_unsnoopable, orig); case ETH_P_IPV6: return batadv_mcast_forw_mode_check_ipv6(bat_priv, skb, - is_unsnoopable); + is_unsnoopable, orig); default: return -EINVAL; } @@ -786,12 +1037,16 @@ batadv_mcast_forw_mode(struct batadv_priv *bat_priv, struct sk_buff *skb, bool is_unsnoopable = false; struct ethhdr *ethhdr;
- ret = batadv_mcast_forw_mode_check(bat_priv, skb, &is_unsnoopable); + ret = batadv_mcast_forw_mode_check(bat_priv, skb, &is_unsnoopable, + orig); if (ret == -ENOMEM) return BATADV_FORW_NONE; else if (ret < 0) return BATADV_FORW_ALL;
+ if (*orig) + return BATADV_FORW_SINGLE; + ethhdr = eth_hdr(skb);
tt_count = batadv_tt_global_hash_count(bat_priv, ethhdr->h_dest, @@ -823,6 +1078,101 @@ batadv_mcast_forw_mode(struct batadv_priv *bat_priv, struct sk_buff *skb, }
/** + * batadv_mcast_snoop_query_ipv4 - snoop for the selected MLD querier + * @skb: the unencapsulated ethernet frame coming from the mesh + * @orig_node: the originator this frame came from + * + * Checks whether the given frame is a general IGMP query from the selected + * querier and if so memorizes the originator this frame came from. + */ +static void batadv_mcast_snoop_query_ipv4(struct sk_buff *skb, + struct batadv_orig_node *orig_node) +{ + struct batadv_mcast_querier_state *querier; + + if (!batadv_mcast_is_valid_igmp(skb)) + return; + + /* we are only interested in general queries (group == 0.0.0.0) */ + if (igmp_hdr(skb)->type != IGMP_HOST_MEMBERSHIP_QUERY || + igmp_hdr(skb)->group) + return; + + /* RFC4541, section 2.1.1.1.b) says: ignore queries from 0.0.0.0 */ + if (!ip_hdr(skb)->saddr) + return; + + querier = &orig_node->bat_priv->mcast.querier_ipv4; + + spin_lock_bh(&querier->orig_lock); + rcu_assign_pointer(querier->orig, orig_node); + spin_unlock_bh(&querier->orig_lock); + + batadv_dbg(BATADV_DBG_MCAST, orig_node->bat_priv, + "Snooped IGMP Query from originator %pM\n", orig_node->orig); +} + +/** + * batadv_mcast_snoop_query_ipv6 - snoop for the selected MLD querier + * @skb: the unencapsulated ethernet frame coming from the mesh + * @orig_node: the originator this frame came from + * + * Checks whether the given frame is a general MLD query from the selected + * querier and if so memorizes the originator this frame came from. + */ +static void batadv_mcast_snoop_query_ipv6(struct sk_buff *skb, + struct batadv_orig_node *orig_node) +{ + struct mld_msg *mld; + struct batadv_mcast_querier_state *querier; + + if (!batadv_mcast_is_valid_mld(skb)) + return; + + mld = (struct mld_msg *)icmp6_hdr(skb); + + /* we are only interested in general queries (mca == ::) */ + if (mld->mld_type != ICMPV6_MGM_QUERY || + !ipv6_addr_any(&mld->mld_mca)) + return; + + querier = &orig_node->bat_priv->mcast.querier_ipv6; + + spin_lock_bh(&querier->orig_lock); + rcu_assign_pointer(querier->orig, orig_node); + spin_unlock_bh(&querier->orig_lock); + + batadv_dbg(BATADV_DBG_MCAST, orig_node->bat_priv, + "Snooped MLD Query from originator %pM\n", orig_node->orig); +} + +/** + * batadv_mcast_snoop_query - snoop the selected IGMP/MLD querier + * @skb: the unencapsulated ethernet frame coming from the mesh + * @orig_node: the originator this frame came from + * + * Checks whether the given frame is a general IGMP or MLD query + * from the selected querier and if so memorizes the originator + * this frame came from. + * + * This call might reallocate skb data. + */ +void batadv_mcast_snoop_query(struct sk_buff *skb, + struct batadv_orig_node *orig_node) +{ + struct ethhdr *ethhdr = eth_hdr(skb); + + switch (ntohs(ethhdr->h_proto)) { + case ETH_P_IP: + batadv_mcast_snoop_query_ipv4(skb, orig_node); + break; + case ETH_P_IPV6: + batadv_mcast_snoop_query_ipv6(skb, orig_node); + break; + } +} + +/** * batadv_mcast_want_unsnoop_update - update unsnoop counter and list * @bat_priv: the bat priv with all the soft interface information * @orig: the orig_node which multicast state might have changed of @@ -1101,6 +1451,7 @@ void batadv_mcast_free(struct batadv_priv *bat_priv) void batadv_mcast_purge_orig(struct batadv_orig_node *orig) { struct batadv_priv *bat_priv = orig->bat_priv; + struct batadv_mcast_querier_state *querier;
if (!(orig->capabilities & BATADV_ORIG_CAPA_HAS_MCAST)) atomic_dec(&bat_priv->mcast.num_disabled); @@ -1108,4 +1459,16 @@ void batadv_mcast_purge_orig(struct batadv_orig_node *orig) batadv_mcast_want_unsnoop_update(bat_priv, orig, BATADV_NO_FLAGS); batadv_mcast_want_ipv4_update(bat_priv, orig, BATADV_NO_FLAGS); batadv_mcast_want_ipv6_update(bat_priv, orig, BATADV_NO_FLAGS); + + querier = &bat_priv->mcast.querier_ipv4; + spin_lock_bh(&querier->orig_lock); + if (rcu_dereference(querier->orig) == orig) + rcu_assign_pointer(querier->orig, NULL); + spin_unlock_bh(&querier->orig_lock); + + querier = &bat_priv->mcast.querier_ipv6; + spin_lock_bh(&querier->orig_lock); + if (rcu_dereference(querier->orig) == orig) + rcu_assign_pointer(querier->orig, NULL); + spin_unlock_bh(&querier->orig_lock); } diff --git a/multicast.h b/multicast.h index e7bc9bf..fa8e7be 100644 --- a/multicast.h +++ b/multicast.h @@ -40,6 +40,9 @@ enum batadv_forw_mode batadv_mcast_forw_mode(struct batadv_priv *bat_priv, struct sk_buff *skb, struct batadv_orig_node **mcast_single_orig);
+void batadv_mcast_snoop_query(struct sk_buff *skb, + struct batadv_orig_node *orig_node); + void batadv_mcast_init(struct batadv_priv *bat_priv);
int batadv_mcast_flags_seq_print_text(struct seq_file *seq, void *offset); @@ -61,6 +64,11 @@ batadv_mcast_forw_mode(struct batadv_priv *bat_priv, struct sk_buff *skb, return BATADV_FORW_ALL; }
+static inline void batadv_mcast_snoop_query(struct sk_buff *skb, + struct batadv_orig_node *orig_node) +{ +} + static inline int batadv_mcast_init(struct batadv_priv *bat_priv) { return 0; diff --git a/soft-interface.c b/soft-interface.c index d4d892c..0f5fb30 100644 --- a/soft-interface.c +++ b/soft-interface.c @@ -183,6 +183,7 @@ static int batadv_interface_tx(struct sk_buff *skb, if (atomic_read(&bat_priv->mesh_state) != BATADV_MESH_ACTIVE) goto dropped;
+ skb_set_network_header(skb, ETH_HLEN); soft_iface->trans_start = jiffies; vid = batadv_get_vid(skb, 0); ethhdr = eth_hdr(skb); @@ -397,7 +398,9 @@ void batadv_interface_rx(struct net_device *soft_iface, /* skb->dev & skb->pkt_type are set here */ if (unlikely(!pskb_may_pull(skb, ETH_HLEN))) goto dropped; + skb->protocol = eth_type_trans(skb, soft_iface); + skb_reset_network_header(skb);
/* should not be necessary anymore as we use skb_pull_rcsum() * TODO: please verify this and remove this TODO @@ -435,6 +438,9 @@ void batadv_interface_rx(struct net_device *soft_iface, skb->mark &= ~bat_priv->isolation_mark_mask; skb->mark |= bat_priv->isolation_mark; } + + if (orig_node) + batadv_mcast_snoop_query(skb, orig_node); } else if (batadv_is_ap_isolated(bat_priv, ethhdr->h_source, ethhdr->h_dest, vid)) { goto dropped; @@ -747,8 +753,13 @@ static int batadv_softif_init_late(struct net_device *dev) #ifdef CONFIG_BATMAN_ADV_MCAST bat_priv->mcast.querier_ipv4.exists = false; bat_priv->mcast.querier_ipv4.shadowing = false; + rcu_assign_pointer(bat_priv->mcast.querier_ipv4.orig, NULL); + spin_lock_init(&bat_priv->mcast.querier_ipv4.orig_lock); bat_priv->mcast.querier_ipv6.exists = false; bat_priv->mcast.querier_ipv6.shadowing = false; + rcu_assign_pointer(bat_priv->mcast.querier_ipv6.orig, NULL); + spin_lock_init(&bat_priv->mcast.querier_ipv6.orig_lock); + bat_priv->mcast.flags = BATADV_NO_FLAGS; bat_priv->mcast.bridged = false; atomic_set(&bat_priv->multicast_mode, 1); diff --git a/types.h b/types.h index bd56fc1..0a26fe9 100644 --- a/types.h +++ b/types.h @@ -627,10 +627,14 @@ struct batadv_priv_dat { * @exists: whether a querier exists in the mesh * @shadowing: if a querier exists, whether it is potentially shadowing * multicast listeners (i.e. querier is behind our own bridge segment) + * @orig: node on which the selected querier resides + * @orig_lock: protects updates of the selected querier in 'orig' */ struct batadv_mcast_querier_state { bool exists; bool shadowing; + struct batadv_orig_node __rcu *orig; /* rcu protected pointer */ + spinlock_t orig_lock; /* protects updates of orig */ };
/**
Hi Linus,
On Sunday 07 September 2014 07:42:40 Linus Lüssing wrote:
With this patch IGMP or MLD reports are only forwarded to the selected IGMP/MLD querier as RFC4541 suggests.
This is necessary to avoid multicast packet loss in bridged scenarios: An IGMPv2/MLDv1 querier does not actively join the multicast group the reports are sent to. Because of this, this leads to snooping bridges/switches not being able to learn of multicast listeners in the mesh and wrongly shutting down ports for multicast traffic to the mesh.
Signed-off-by: Linus Lüssing linus.luessing@web.de
compat.h | 7 + multicast.c | 375 +++++++++++++++++++++++++++++++++++++++++++++++++++++- multicast.h | 8 ++ soft-interface.c | 11 ++ types.h | 4 + 5 files changed, 399 insertions(+), 6 deletions(-)
As a general note: This patch adds ~400 lines for mainly processing IGMP/MLD stuff, right? Shouldn't this kind of IP/IGMP/MLD report detection and memorizing the Querier rather go in the bridge code instead of batman-adv, or is there any functionality there already? IMHO the preferred position would be the bridge code since its more general/used by more people on one hand, and does not bloat batman-adv on the other hand. :)
Could you give us some comments regarding who should do IGMP/MLD processing in the kernel so we are prepared for these kind of upstream questions?
Just some small notes regarding the code below:
[...] --- a/multicast.c +++ b/multicast.c [...] @@ -512,10 +515,247 @@ out: }
/**
- batadv_mcast_is_valid_igmp - check for an IGMP packet
- @skb: the IPv4 packet to check
- Checks whether a given IPv4 packet is a valid IGMP packet and if so
- sets the skb transport header accordingly.
- The caller needs to ensure the correct setup of the skb network header.
- This call might reallocate skb data.
- */
+static bool batadv_mcast_is_valid_igmp(struct sk_buff *skb) +{
- struct iphdr *iphdr;
- unsigned int len = skb_network_offset(skb) + sizeof(*iphdr);
- if (!pskb_may_pull(skb, len))
return false;
- iphdr = ip_hdr(skb);
- if (iphdr->ihl < 5 || iphdr->version != 4)
return false;
- if (ntohs(iphdr->tot_len) < ip_hdrlen(skb))
return false;
- len += ip_hdrlen(skb) - sizeof(*iphdr);
- skb_set_transport_header(skb, len);
- if (iphdr->protocol != IPPROTO_IGMP)
return false;
- /* TODO: verify checksums */
- if (!pskb_may_pull(skb, len + sizeof(struct igmphdr)))
return false;
- /* RFC2236+RFC3376 (IGMPv2+IGMPv3) require the multicast link layer
* all-systems destination addresses (224.0.0.1) for general queries
*/
- if (igmp_hdr(skb)->type == IGMP_HOST_MEMBERSHIP_QUERY &&
!igmp_hdr(skb)->group &&
ip_hdr(skb)->daddr != htonl(INADDR_ALLHOSTS_GROUP))
return false;
- return true;
+}
+/**
- batadv_mcast_is_report_ipv4 - check for IGMP reports (and queries)
- @bat_priv: the bat priv with all the soft interface information
- @skb: the ethernet frame destined for the mesh
- @orig: if IGMP report: to be set to the querier to forward the skb to
- Checks whether the given frame is an IGMP report and if so sets the
- orig pointer to the originator of the selected IGMP querier if one
exists + * and returns true. Otherwise returns false.
- If the packet is a general IGMP query instead then we delete the
memorized, + * foreign IGMP querier (if one exists): We became the selected querier and + * therefore do not need to forward reports into the mesh.
- This call might reallocate skb data.
- */
+static bool batadv_mcast_is_report_ipv4(struct batadv_priv *bat_priv,
struct sk_buff *skb,
struct batadv_orig_node **orig)
+{
- struct batadv_mcast_querier_state *querier;
- struct batadv_orig_node *orig_node;
- if (!batadv_mcast_is_valid_igmp(skb))
return false;
- querier = &bat_priv->mcast.querier_ipv4;
- switch (igmp_hdr(skb)->type) {
- case IGMP_HOST_MEMBERSHIP_REPORT:
- case IGMPV2_HOST_MEMBERSHIP_REPORT:
- case IGMPV3_HOST_MEMBERSHIP_REPORT:
rcu_read_lock();
orig_node = rcu_dereference(querier->orig);
if (orig_node && atomic_inc_not_zero(&orig_node->refcount)) {
/* TODO: include multicast routers via MRD (RFC4286) */
batadv_dbg(BATADV_DBG_MCAST, bat_priv,
"Redirecting IGMP Report to %pM\n",
orig_node->orig);
*orig = orig_node;
}
rcu_read_unlock();
return true;
- case IGMP_HOST_MEMBERSHIP_QUERY:
/* RFC4541, section 2.1.1.1.b) says:
* ignore general queries from 0.0.0.0
*/
if (!ip_hdr(skb)->saddr || igmp_hdr(skb)->group)
break;
Why do we break here if group != 0? This looks wrong, maybe you meant && or group != 0?
[...] diff --git a/soft-interface.c b/soft-interface.c index d4d892c..0f5fb30 100644 --- a/soft-interface.c +++ b/soft-interface.c @@ -183,6 +183,7 @@ static int batadv_interface_tx(struct sk_buff *skb, if (atomic_read(&bat_priv->mesh_state) != BATADV_MESH_ACTIVE) goto dropped;
- skb_set_network_header(skb, ETH_HLEN); soft_iface->trans_start = jiffies; vid = batadv_get_vid(skb, 0); ethhdr = eth_hdr(skb);
@@ -397,7 +398,9 @@ void batadv_interface_rx(struct net_device *soft_iface, /* skb->dev & skb->pkt_type are set here */ if (unlikely(!pskb_may_pull(skb, ETH_HLEN))) goto dropped;
- skb->protocol = eth_type_trans(skb, soft_iface);
- skb_reset_network_header(skb);
Why do we need that header stuff here? And there is an extra newline too ... If you need that for the multicast stuff better move it there, otherwise it is set for every packet ...
Cheers, Simon
On Mon, Nov 24, 2014 at 03:56:06PM +0100, Simon Wunderlich wrote:
As a general note: This patch adds ~400 lines for mainly processing IGMP/MLD stuff, right? Shouldn't this kind of IP/IGMP/MLD report detection and memorizing the Querier rather go in the bridge code instead of batman-adv, or is there any functionality there already? IMHO the preferred position would be the bridge code since its more general/used by more people on one hand, and does not bloat batman-adv on the other hand. :)
Could you give us some comments regarding who should do IGMP/MLD processing in the kernel so we are prepared for these kind of upstream questions?
Currently, with this patch, the decisions about where to forward IGMP/MLD reports in batman-adv is done on the multicast listener / report sender side.
With this approach, as seen in the picture, even a node without the bridge (the blue node 'L') needs to snoop reports. It can't rely on a bridge.
On the other hand, the current upstream implementation without bridge support does not (necessarily) need it (will add an explanation to the wiki page).
Though there is no (does not seem to be) any conceptual issue with either the current upstream implementation or this proposition as is, you just made me realize that there are issues with mixed setups... backwards compatibility issues... For instance if node 'L' were of a post-multicast but pre-multicast-bridge optimizations version.
On Sunday 07 September 2014 07:42:36 Linus Lüssing wrote:
This patchset enables the usage of the batman-adv multicast optimizations for scenarios involving bridges on top of e.g. bat0, too.
Along come two more patches adding according debugging facilities to make it possible for the user to check why the multicast optimizations might not work ideally to give hints about what they might change about their topology.
The fourth one alters the forwarding behaviour for IGMP and MLD reports. For one thing this reduces overhead (especially for IGMPv3/MLDv2 reports). But more importantly this is necessary to make the optimizations work in bridged scenarios at all. For more details see: http://www.open-mesh.org/projects/batman-adv/wiki/Multicast-optimizations-li stener-reports
As discussed it would be great to clarify the text on that wiki page a little. It is not really clear (at least to me) why this is necessary. Simple english and not too many conditions and integrated remarks help. :)
Also note, that with PATCH 4/4 multicast optimizations do not build for kernels < 2.6.35 (net/mld.h didn't exist back then): If there are actually brave souls out there running such ancient end-of-life kernels, it is suggested to build batman-adv with the multicast-compile-time option disabled. Multicast optimizations for bridges does not half any benefits for kernels < 3.17 anyways.
How about you add some kind of #error in compat.h if multicast is enabled and the kernel is too old, and ask to disable the multicast feature? that should be better then just failing with "mld.h not found". Otherwise we could add a stub mld.h and stub compat functions as we do for newer kernels which do not yet support all the multicast fun.
Thanks Simon
b.a.t.m.a.n@lists.open-mesh.org