The last round of multicast patches send to the batman-adv mailinglist to add support for the multicast optimizations in bridged scenarios, too, unfortunately had one major conceptual flaw: It could lead to packet loss. It's not sufficient to have the unicasting of reports implemented on bridge-nodes only. Nodes without bridges need to treat reports the same way.
The issue is described in detail here:
https://www.open-mesh.org/projects/batman-adv/wiki/Multicast-optimizations-l...
Cheers, Linus
-----
Changelog: v6: * compat: copied copyright headers from original upstream c files * compat: unified ordering in compat c files: -> copyright header, then includes, then kernel specific functions v5: * Removed RFC tag: Needed exports got merged to net-next and are going to be available with Linux 4.2 * Redid compat solution - now fully backwards compatible down to 2.6.33 v4: * excluded bridge part from this patchset, they should hopefully be added to net-next soon * Added a compat solution (PATCH 3/3) * Removed Kconfig-depends as by David's suggestion the needed parsing functions for MLD are going to be forced built-ins even if IPv6 is going to be built as a module * Removed unused variable 'int ret' in batadv_mcast_is_report_ipv6() * Adjusted to new folder structure v3: * Adding Kconfig-depends and #if's (so basically adding similar dependancy constraints as the bridge code has, except that there are no depends if batman-adv gets compiled without multicast optimizations) -> the case of IPv6=M and batman-adv=y is still impossible if multicast optimizations are enabled; but I don't see the practical demand for that either - people who use IPv6 as a module will probably also want to use batman-adv as a module v2: * various bugfixes (now runtime tested, too - should(tm) work) * added netdev+bridge mailinglists
With this patch IGMP or MLD reports are only forwarded to the selected IGMP/MLD querier as RFC4541 suggests. This is necessary to avoid multicast packet loss in bridged scenarios later:
An IGMPv2/MLDv1 querier does not actively join the multicast group the reports are sent to. Because of this, this leads to snooping bridges/switches not being able to learn of multicast listeners in the mesh and wrongly shutting down ports for multicast traffic to the mesh.
Signed-off-by: Linus Lüssing linus.luessing@c0d3.blue --- net/batman-adv/main.h | 4 +- net/batman-adv/multicast.c | 257 ++++++++++++++++++++++++++++++++++++++- net/batman-adv/multicast.h | 8 ++ net/batman-adv/soft-interface.c | 11 ++ net/batman-adv/types.h | 14 +++ 5 files changed, 287 insertions(+), 7 deletions(-)
diff --git a/net/batman-adv/main.h b/net/batman-adv/main.h index 41d27c7..2832cf9 100644 --- a/net/batman-adv/main.h +++ b/net/batman-adv/main.h @@ -219,6 +219,7 @@ __be32 batadv_skb_crc32(struct sk_buff *skb, u8 *payload_ptr); * @BATADV_DBG_BLA: bridge loop avoidance messages * @BATADV_DBG_DAT: ARP snooping and DAT related messages * @BATADV_DBG_NC: network coding related messages + * @BATADV_DBG_MCAST: multicast related messages * @BATADV_DBG_ALL: the union of all the above log levels */ enum batadv_dbg_level { @@ -228,7 +229,8 @@ enum batadv_dbg_level { BATADV_DBG_BLA = BIT(3), BATADV_DBG_DAT = BIT(4), BATADV_DBG_NC = BIT(5), - BATADV_DBG_ALL = 63, + BATADV_DBG_MCAST = BIT(6), + BATADV_DBG_ALL = 127, };
#ifdef CONFIG_BATMAN_ADV_DEBUG diff --git a/net/batman-adv/multicast.c b/net/batman-adv/multicast.c index 7aa480b..72d561c 100644 --- a/net/batman-adv/multicast.c +++ b/net/batman-adv/multicast.c @@ -24,6 +24,7 @@ #include <linux/etherdevice.h> #include <linux/fs.h> #include <linux/if_ether.h> +#include <linux/igmp.h> #include <linux/in6.h> #include <linux/in.h> #include <linux/ip.h> @@ -40,6 +41,7 @@ #include <linux/types.h> #include <net/addrconf.h> #include <net/ipv6.h> +#include <net/mld.h>
#include "packet.h" #include "translation-table.h" @@ -270,10 +272,76 @@ out: }
/** + * batadv_mcast_is_report_ipv4 - check for IGMP reports (and queries) + * @bat_priv: the bat priv with all the soft interface information + * @skb: the ethernet frame destined for the mesh + * @orig: if IGMP report: to be set to the querier to forward the skb to + * + * Checks whether the given frame is an IGMP report and if so sets the + * orig pointer to the originator of the selected IGMP querier if one exists + * and returns true. Otherwise returns false. + * + * If the packet is a general IGMP query instead then we delete the memorized, + * foreign IGMP querier (if one exists): We became the selected querier and + * therefore do not need to forward reports into the mesh. + * + * This call might reallocate skb data. + */ +static bool batadv_mcast_is_report_ipv4(struct batadv_priv *bat_priv, + struct sk_buff *skb, + struct batadv_orig_node **orig) +{ + struct batadv_mcast_querier_state *querier; + struct batadv_orig_node *orig_node; + + if (ip_mc_check_igmp(skb, NULL) < 0) + return false; + + querier = &bat_priv->mcast.querier_ipv4; + + switch (igmp_hdr(skb)->type) { + case IGMP_HOST_MEMBERSHIP_REPORT: + case IGMPV2_HOST_MEMBERSHIP_REPORT: + case IGMPV3_HOST_MEMBERSHIP_REPORT: + rcu_read_lock(); + orig_node = rcu_dereference(querier->orig); + if (orig_node && atomic_inc_not_zero(&orig_node->refcount)) { + /* TODO: include multicast routers via MRD (RFC4286) */ + batadv_dbg(BATADV_DBG_MCAST, bat_priv, + "Redirecting IGMP Report to %pM\n", + orig_node->orig); + *orig = orig_node; + } + rcu_read_unlock(); + + return true; + case IGMP_HOST_MEMBERSHIP_QUERY: + /* RFC4541, section 2.1.1.1.b) says: + * ignore general queries from 0.0.0.0 + */ + if (!ip_hdr(skb)->saddr || igmp_hdr(skb)->group) + break; + + spin_lock_bh(&querier->orig_lock); + orig_node = rcu_dereference(querier->orig); + if (orig_node) + rcu_assign_pointer(querier->orig, NULL); + spin_unlock_bh(&querier->orig_lock); + + batadv_dbg(BATADV_DBG_MCAST, bat_priv, + "Snooped own IGMP Query\n"); + break; + } + + return false; +} + +/** * batadv_mcast_forw_mode_check_ipv4 - check for optimized forwarding potential * @bat_priv: the bat priv with all the soft interface information * @skb: the IPv4 packet to check * @is_unsnoopable: stores whether the destination is snoopable + * @orig: for IGMP reports: to be set to the querier to forward the skb to * * Checks whether the given IPv4 packet has the potential to be forwarded with a * mode more optimal than classic flooding. @@ -283,7 +351,8 @@ out: */ static int batadv_mcast_forw_mode_check_ipv4(struct batadv_priv *bat_priv, struct sk_buff *skb, - bool *is_unsnoopable) + bool *is_unsnoopable, + struct batadv_orig_node **orig) { struct iphdr *iphdr;
@@ -291,6 +360,9 @@ static int batadv_mcast_forw_mode_check_ipv4(struct batadv_priv *bat_priv, if (!pskb_may_pull(skb, sizeof(struct ethhdr) + sizeof(*iphdr))) return -ENOMEM;
+ if (batadv_mcast_is_report_ipv4(bat_priv, skb, orig)) + return 0; + iphdr = ip_hdr(skb);
/* TODO: Implement Multicast Router Discovery (RFC4286), @@ -307,11 +379,73 @@ static int batadv_mcast_forw_mode_check_ipv4(struct batadv_priv *bat_priv, return 0; }
+#if IS_ENABLED(CONFIG_IPV6) +/** + * batadv_mcast_is_report_ipv6 - check for MLD reports (and queries) + * @bat_priv: the bat priv with all the soft interface information + * @skb: the ethernet frame destined for the mesh + * @orig: if MLD report: to be set to the querier to forward the skb to + * + * Checks whether the given frame is an MLD report and if so sets the + * orig pointer to the originator of the selected MLD querier if one exists + * and returns true. Otherwise returns false. + * + * If the packet is a general MLD query instead then we delete the memorized, + * foreign MLD querier (if one exists): We became the selected querier and + * therefore do not need to forward reports into the mesh. + * + * This call might reallocate skb data. + */ +static bool batadv_mcast_is_report_ipv6(struct batadv_priv *bat_priv, + struct sk_buff *skb, + struct batadv_orig_node **orig) +{ + struct mld_msg *mld; + struct batadv_mcast_querier_state *querier; + struct batadv_orig_node *orig_node; + + if (ipv6_mc_check_mld(skb, NULL) < 0) + return false; + + querier = &bat_priv->mcast.querier_ipv6; + mld = (struct mld_msg *)icmp6_hdr(skb); + + switch (mld->mld_type) { + case ICMPV6_MGM_REPORT: + case ICMPV6_MLD2_REPORT: + rcu_read_lock(); + orig_node = rcu_dereference(querier->orig); + if (orig_node && atomic_inc_not_zero(&orig_node->refcount)) { + /* TODO: include multicast routers via MRD (RFC4286) */ + batadv_dbg(BATADV_DBG_MCAST, bat_priv, + "Redirecting MLD Report to %pM\n", + orig_node->orig); + *orig = orig_node; + } + rcu_read_unlock(); + + return true; + case ICMPV6_MGM_QUERY: + spin_lock_bh(&querier->orig_lock); + orig_node = rcu_dereference(querier->orig); + if (orig_node) + rcu_assign_pointer(querier->orig, NULL); + spin_unlock_bh(&querier->orig_lock); + + batadv_dbg(BATADV_DBG_MCAST, bat_priv, + "Snooped own MLD Query\n"); + break; + } + + return false; +} + /** * batadv_mcast_forw_mode_check_ipv6 - check for optimized forwarding potential * @bat_priv: the bat priv with all the soft interface information * @skb: the IPv6 packet to check * @is_unsnoopable: stores whether the destination is snoopable + * @orig: for MLD reports: to be set to the querier to forward the skb to * * Checks whether the given IPv6 packet has the potential to be forwarded with a * mode more optimal than classic flooding. @@ -321,7 +455,8 @@ static int batadv_mcast_forw_mode_check_ipv4(struct batadv_priv *bat_priv, */ static int batadv_mcast_forw_mode_check_ipv6(struct batadv_priv *bat_priv, struct sk_buff *skb, - bool *is_unsnoopable) + bool *is_unsnoopable, + struct batadv_orig_node **orig) { struct ipv6hdr *ip6hdr;
@@ -329,6 +464,9 @@ static int batadv_mcast_forw_mode_check_ipv6(struct batadv_priv *bat_priv, if (!pskb_may_pull(skb, sizeof(struct ethhdr) + sizeof(*ip6hdr))) return -ENOMEM;
+ if (batadv_mcast_is_report_ipv6(bat_priv, skb, orig)) + return 0; + ip6hdr = ipv6_hdr(skb);
/* TODO: Implement Multicast Router Discovery (RFC4286), @@ -345,6 +483,7 @@ static int batadv_mcast_forw_mode_check_ipv6(struct batadv_priv *bat_priv,
return 0; } +#endif
/** * batadv_mcast_forw_mode_check - check for optimized forwarding potential @@ -360,7 +499,8 @@ static int batadv_mcast_forw_mode_check_ipv6(struct batadv_priv *bat_priv, */ static int batadv_mcast_forw_mode_check(struct batadv_priv *bat_priv, struct sk_buff *skb, - bool *is_unsnoopable) + bool *is_unsnoopable, + struct batadv_orig_node **orig) { struct ethhdr *ethhdr = eth_hdr(skb);
@@ -373,10 +513,12 @@ static int batadv_mcast_forw_mode_check(struct batadv_priv *bat_priv, switch (ntohs(ethhdr->h_proto)) { case ETH_P_IP: return batadv_mcast_forw_mode_check_ipv4(bat_priv, skb, - is_unsnoopable); + is_unsnoopable, orig); +#if IS_ENABLED(CONFIG_IPV6) case ETH_P_IPV6: return batadv_mcast_forw_mode_check_ipv6(bat_priv, skb, - is_unsnoopable); + is_unsnoopable, orig); +#endif default: return -EINVAL; } @@ -544,12 +686,16 @@ batadv_mcast_forw_mode(struct batadv_priv *bat_priv, struct sk_buff *skb, bool is_unsnoopable = false; struct ethhdr *ethhdr;
- ret = batadv_mcast_forw_mode_check(bat_priv, skb, &is_unsnoopable); + ret = batadv_mcast_forw_mode_check(bat_priv, skb, &is_unsnoopable, + orig); if (ret == -ENOMEM) return BATADV_FORW_NONE; else if (ret < 0) return BATADV_FORW_ALL;
+ if (*orig) + return BATADV_FORW_SINGLE; + ethhdr = eth_hdr(skb);
tt_count = batadv_tt_global_hash_count(bat_priv, ethhdr->h_dest, @@ -581,6 +727,105 @@ batadv_mcast_forw_mode(struct batadv_priv *bat_priv, struct sk_buff *skb, }
/** + * batadv_mcast_snoop_query_ipv4 - snoop for the selected MLD querier + * @skb: the unencapsulated ethernet frame coming from the mesh + * @orig_node: the originator this frame came from + * + * Checks whether the given frame is a general IGMP query from the selected + * querier and if so memorizes the originator this frame came from. + */ +static void batadv_mcast_snoop_query_ipv4(struct sk_buff *skb, + struct batadv_orig_node *orig_node) +{ + struct batadv_mcast_querier_state *querier; + + if (ip_mc_check_igmp(skb, NULL) < 0) + return; + + /* we are only interested in general queries (group == 0.0.0.0) */ + if (igmp_hdr(skb)->type != IGMP_HOST_MEMBERSHIP_QUERY || + igmp_hdr(skb)->group) + return; + + /* RFC4541, section 2.1.1.1.b) says: ignore queries from 0.0.0.0 */ + if (!ip_hdr(skb)->saddr) + return; + + querier = &orig_node->bat_priv->mcast.querier_ipv4; + + spin_lock_bh(&querier->orig_lock); + rcu_assign_pointer(querier->orig, orig_node); + spin_unlock_bh(&querier->orig_lock); + + batadv_dbg(BATADV_DBG_MCAST, orig_node->bat_priv, + "Snooped IGMP Query from originator %pM\n", orig_node->orig); +} + +#if IS_ENABLED(CONFIG_IPV6) +/** + * batadv_mcast_snoop_query_ipv6 - snoop for the selected MLD querier + * @skb: the unencapsulated ethernet frame coming from the mesh + * @orig_node: the originator this frame came from + * + * Checks whether the given frame is a general MLD query from the selected + * querier and if so memorizes the originator this frame came from. + */ +static void batadv_mcast_snoop_query_ipv6(struct sk_buff *skb, + struct batadv_orig_node *orig_node) +{ + struct mld_msg *mld; + struct batadv_mcast_querier_state *querier; + + if (ipv6_mc_check_mld(skb, NULL) < 0) + return; + + mld = (struct mld_msg *)icmp6_hdr(skb); + + /* we are only interested in general queries (mca == ::) */ + if (mld->mld_type != ICMPV6_MGM_QUERY || + !ipv6_addr_any(&mld->mld_mca)) + return; + + querier = &orig_node->bat_priv->mcast.querier_ipv6; + + spin_lock_bh(&querier->orig_lock); + rcu_assign_pointer(querier->orig, orig_node); + spin_unlock_bh(&querier->orig_lock); + + batadv_dbg(BATADV_DBG_MCAST, orig_node->bat_priv, + "Snooped MLD Query from originator %pM\n", orig_node->orig); +} +#endif + +/** + * batadv_mcast_snoop_query - snoop the selected IGMP/MLD querier + * @skb: the unencapsulated ethernet frame coming from the mesh + * @orig_node: the originator this frame came from + * + * Checks whether the given frame is a general IGMP or MLD query + * from the selected querier and if so memorizes the originator + * this frame came from. + * + * This call might reallocate skb data. + */ +void batadv_mcast_snoop_query(struct sk_buff *skb, + struct batadv_orig_node *orig_node) +{ + struct ethhdr *ethhdr = eth_hdr(skb); + + switch (ntohs(ethhdr->h_proto)) { + case ETH_P_IP: + batadv_mcast_snoop_query_ipv4(skb, orig_node); + break; +#if IS_ENABLED(CONFIG_IPV6) + case ETH_P_IPV6: + batadv_mcast_snoop_query_ipv6(skb, orig_node); + break; +#endif + } +} + +/** * batadv_mcast_want_unsnoop_update - update unsnoop counter and list * @bat_priv: the bat priv with all the soft interface information * @orig: the orig_node which multicast state might have changed of diff --git a/net/batman-adv/multicast.h b/net/batman-adv/multicast.h index beb6e56..fb61db9 100644 --- a/net/batman-adv/multicast.h +++ b/net/batman-adv/multicast.h @@ -46,6 +46,9 @@ enum batadv_forw_mode batadv_mcast_forw_mode(struct batadv_priv *bat_priv, struct sk_buff *skb, struct batadv_orig_node **mcast_single_orig);
+void batadv_mcast_snoop_query(struct sk_buff *skb, + struct batadv_orig_node *orig_node); + void batadv_mcast_init(struct batadv_priv *bat_priv);
void batadv_mcast_free(struct batadv_priv *bat_priv); @@ -65,6 +68,11 @@ batadv_mcast_forw_mode(struct batadv_priv *bat_priv, struct sk_buff *skb, return BATADV_FORW_ALL; }
+static inline void batadv_mcast_snoop_query(struct sk_buff *skb, + struct batadv_orig_node *orig_node) +{ +} + static inline int batadv_mcast_init(struct batadv_priv *bat_priv) { return 0; diff --git a/net/batman-adv/soft-interface.c b/net/batman-adv/soft-interface.c index c002961..0a8c2ae 100644 --- a/net/batman-adv/soft-interface.c +++ b/net/batman-adv/soft-interface.c @@ -206,6 +206,8 @@ static int batadv_interface_tx(struct sk_buff *skb, if (atomic_read(&bat_priv->mesh_state) != BATADV_MESH_ACTIVE) goto dropped;
+ skb_set_network_header(skb, ETH_HLEN); + soft_iface->trans_start = jiffies; vid = batadv_get_vid(skb, 0); ethhdr = eth_hdr(skb); @@ -396,6 +398,7 @@ void batadv_interface_rx(struct net_device *soft_iface,
skb_pull_rcsum(skb, hdr_size); skb_reset_mac_header(skb); + skb_set_network_header(skb, ETH_HLEN);
/* clean the netfilter state now that the batman-adv header has been * removed @@ -458,6 +461,9 @@ void batadv_interface_rx(struct net_device *soft_iface, skb->mark &= ~bat_priv->isolation_mark_mask; skb->mark |= bat_priv->isolation_mark; } + + if (orig_node) + batadv_mcast_snoop_query(skb, orig_node); } else if (batadv_is_ap_isolated(bat_priv, ethhdr->h_source, ethhdr->h_dest, vid)) { goto dropped; @@ -768,6 +774,11 @@ static int batadv_softif_init_late(struct net_device *dev) atomic_set(&bat_priv->distributed_arp_table, 1); #endif #ifdef CONFIG_BATMAN_ADV_MCAST + rcu_assign_pointer(bat_priv->mcast.querier_ipv4.orig, NULL); + spin_lock_init(&bat_priv->mcast.querier_ipv4.orig_lock); + rcu_assign_pointer(bat_priv->mcast.querier_ipv6.orig, NULL); + spin_lock_init(&bat_priv->mcast.querier_ipv6.orig_lock); + bat_priv->mcast.flags = BATADV_NO_FLAGS; atomic_set(&bat_priv->multicast_mode, 1); atomic_set(&bat_priv->mcast.num_disabled, 0); diff --git a/net/batman-adv/types.h b/net/batman-adv/types.h index 67d6348..3a5365f 100644 --- a/net/batman-adv/types.h +++ b/net/batman-adv/types.h @@ -640,12 +640,24 @@ struct batadv_priv_dat {
#ifdef CONFIG_BATMAN_ADV_MCAST /** + * struct batadv_mcast_querier_state - IGMP/MLD querier state when bridged + * @orig: node on which the selected querier resides + * @orig_lock: protects updates of the selected querier in 'orig' + */ +struct batadv_mcast_querier_state { + struct batadv_orig_node __rcu *orig; /* rcu protected pointer */ + spinlock_t orig_lock; /* protects updates of orig */ +}; + +/** * struct batadv_priv_mcast - per mesh interface mcast data * @mla_list: list of multicast addresses we are currently announcing via TT * @want_all_unsnoopables_list: a list of orig_nodes wanting all unsnoopable * multicast traffic * @want_all_ipv4_list: a list of orig_nodes wanting all IPv4 multicast traffic * @want_all_ipv6_list: a list of orig_nodes wanting all IPv6 multicast traffic + * @querier_ipv4: the current state of an IGMP querier in the mesh + * @querier_ipv6: the current state of an MLD querier in the mesh * @flags: the flags we have last sent in our mcast tvlv * @enabled: whether the multicast tvlv is currently enabled * @num_disabled: number of nodes that have no mcast tvlv @@ -660,6 +672,8 @@ struct batadv_priv_mcast { struct hlist_head want_all_unsnoopables_list; struct hlist_head want_all_ipv4_list; struct hlist_head want_all_ipv6_list; + struct batadv_mcast_querier_state querier_ipv4; + struct batadv_mcast_querier_state querier_ipv6; uint8_t flags; bool enabled; atomic_t num_disabled;
The multicast optimizations bridge integration will require the just implemented IGMP/MLD report handling later. Therefore bumping the version number.
Signed-off-by: Linus Lüssing linus.luessing@c0d3.blue --- net/batman-adv/multicast.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/net/batman-adv/multicast.c b/net/batman-adv/multicast.c index 72d561c..87a5942 100644 --- a/net/batman-adv/multicast.c +++ b/net/batman-adv/multicast.c @@ -225,7 +225,7 @@ static bool batadv_mcast_mla_tvlv_update(struct batadv_priv *bat_priv) if (batadv_mcast_has_bridge(bat_priv)) { if (bat_priv->mcast.enabled) { batadv_tvlv_container_unregister(bat_priv, - BATADV_TVLV_MCAST, 1); + BATADV_TVLV_MCAST, 2); bat_priv->mcast.enabled = false; }
@@ -234,7 +234,7 @@ static bool batadv_mcast_mla_tvlv_update(struct batadv_priv *bat_priv)
if (!bat_priv->mcast.enabled || mcast_data.flags != bat_priv->mcast.flags) { - batadv_tvlv_container_register(bat_priv, BATADV_TVLV_MCAST, 1, + batadv_tvlv_container_register(bat_priv, BATADV_TVLV_MCAST, 2, &mcast_data, sizeof(mcast_data)); bat_priv->mcast.flags = mcast_data.flags; bat_priv->mcast.enabled = true; @@ -984,7 +984,7 @@ static void batadv_mcast_tvlv_ogm_handler_v1(struct batadv_priv *bat_priv, void batadv_mcast_init(struct batadv_priv *bat_priv) { batadv_tvlv_handler_register(bat_priv, batadv_mcast_tvlv_ogm_handler_v1, - NULL, BATADV_TVLV_MCAST, 1, + NULL, BATADV_TVLV_MCAST, 2, BATADV_TVLV_HANDLER_OGM_CIFNOTFND); }
@@ -994,8 +994,8 @@ void batadv_mcast_init(struct batadv_priv *bat_priv) */ void batadv_mcast_free(struct batadv_priv *bat_priv) { - batadv_tvlv_container_unregister(bat_priv, BATADV_TVLV_MCAST, 1); - batadv_tvlv_handler_unregister(bat_priv, BATADV_TVLV_MCAST, 1); + batadv_tvlv_container_unregister(bat_priv, BATADV_TVLV_MCAST, 2); + batadv_tvlv_handler_unregister(bat_priv, BATADV_TVLV_MCAST, 2);
batadv_mcast_mla_tt_retract(bat_priv, NULL); }
Signed-off-by: Linus Lüssing linus.luessing@c0d3.blue --- Makefile | 4 + compat-include/linux/igmp.h | 13 ++ compat-include/linux/skbuff.h | 14 ++ compat-include/net/addrconf.h | 13 ++ compat-include/net/ip6_checksum.h | 18 +++ compat-include/net/ipv6.h | 17 +++ compat-include/net/mld.h | 52 +++++++ compat-sources/Makefile | 3 + compat-sources/net/core/skbuff.c | 179 ++++++++++++++++++++++++ compat-sources/net/ipv4/igmp.c | 241 +++++++++++++++++++++++++++++++++ compat-sources/net/ipv6/mcast_snoop.c | 216 +++++++++++++++++++++++++++++ 11 files changed, 770 insertions(+) create mode 100644 compat-include/linux/igmp.h create mode 100644 compat-include/net/addrconf.h create mode 100644 compat-include/net/ip6_checksum.h create mode 100644 compat-include/net/ipv6.h create mode 100644 compat-include/net/mld.h create mode 100644 compat-sources/Makefile create mode 100644 compat-sources/net/core/skbuff.c create mode 100644 compat-sources/net/ipv4/igmp.c create mode 100644 compat-sources/net/ipv6/mcast_snoop.c
diff --git a/Makefile b/Makefile index ee3be1d..69056d7 100644 --- a/Makefile +++ b/Makefile @@ -50,6 +50,10 @@ ifneq ($(REVISION),) NOSTDINC_FLAGS += -DBATADV_SOURCE_VERSION="$(REVISION)" endif
+include $(PWD)/compat-sources/Makefile + +export batman-adv-y + BUILD_FLAGS := \ M=$(PWD)/net/batman-adv \ CONFIG_BATMAN_ADV=m \ diff --git a/compat-include/linux/igmp.h b/compat-include/linux/igmp.h new file mode 100644 index 0000000..f61ab79 --- /dev/null +++ b/compat-include/linux/igmp.h @@ -0,0 +1,13 @@ +#ifndef _NET_BATMAN_ADV_COMPAT_LINUX_IGMP_H_ +#define _NET_BATMAN_ADV_COMPAT_LINUX_IGMP_H_ + +#include <linux/version.h> +#include_next <linux/igmp.h> + +#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 2, 0) + +int ip_mc_check_igmp(struct sk_buff *skb, struct sk_buff **skb_trimmed); + +#endif /* < KERNEL_VERSION(4, 2, 0) */ + +#endif /* _NET_BATMAN_ADV_COMPAT_LINUX_IGMP_H_ */ diff --git a/compat-include/linux/skbuff.h b/compat-include/linux/skbuff.h index d363cc0..1d4f569 100644 --- a/compat-include/linux/skbuff.h +++ b/compat-include/linux/skbuff.h @@ -89,6 +89,20 @@ static inline void skb_reset_mac_len(struct sk_buff *skb)
#define pskb_copy_for_clone pskb_copy
+__sum16 skb_checksum_simple_validate(struct sk_buff *skb); + +__sum16 +skb_checksum_validate(struct sk_buff *skb, int proto, + __wsum (*compute_pseudo)(struct sk_buff *skb, int proto)); + #endif /* < KERNEL_VERSION(3, 16, 0) */
+#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 2, 0) + +struct sk_buff *skb_checksum_trimmed(struct sk_buff *skb, + unsigned int transport_len, + __sum16(*skb_chkf)(struct sk_buff *skb)); + +#endif /* < KERNEL_VERSION(4, 2, 0) */ + #endif /* _NET_BATMAN_ADV_COMPAT_LINUX_SKBUFF_H_ */ diff --git a/compat-include/net/addrconf.h b/compat-include/net/addrconf.h new file mode 100644 index 0000000..69c45d0 --- /dev/null +++ b/compat-include/net/addrconf.h @@ -0,0 +1,13 @@ +#ifndef _NET_BATMAN_ADV_COMPAT_NET_ADDRCONF_H_ +#define _NET_BATMAN_ADV_COMPAT_NET_ADDRCONF_H_ + +#include <linux/version.h> +#include_next <net/addrconf.h> + +#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 2, 0) + +int ipv6_mc_check_mld(struct sk_buff *skb, struct sk_buff **skb_trimmed); + +#endif /* < KERNEL_VERSION(4, 2, 0) */ + +#endif /* _NET_BATMAN_ADV_COMPAT_NET_ADDRCONF_H_ */ diff --git a/compat-include/net/ip6_checksum.h b/compat-include/net/ip6_checksum.h new file mode 100644 index 0000000..fda0c07 --- /dev/null +++ b/compat-include/net/ip6_checksum.h @@ -0,0 +1,18 @@ +#ifndef _NET_BATMAN_ADV_COMPAT_NET_IP6_CHECKSUM_H_ +#define _NET_BATMAN_ADV_COMPAT_NET_IP6_CHECKSUM_H_ + +#include <linux/version.h> +#include_next <net/ip6_checksum.h> + +#if LINUX_VERSION_CODE < KERNEL_VERSION(3, 16, 0) + +static inline __wsum ip6_compute_pseudo(struct sk_buff *skb, int proto) +{ + return ~csum_unfold(csum_ipv6_magic(&ipv6_hdr(skb)->saddr, + &ipv6_hdr(skb)->daddr, + skb->len, proto, 0)); +} + +#endif /* < KERNEL_VERSION(3, 16, 0) */ + +#endif /* _NET_BATMAN_ADV_COMPAT_NET_IP6_CHECKSUM_H_ */ diff --git a/compat-include/net/ipv6.h b/compat-include/net/ipv6.h new file mode 100644 index 0000000..1e190d8 --- /dev/null +++ b/compat-include/net/ipv6.h @@ -0,0 +1,17 @@ +#ifndef _NET_BATMAN_ADV_COMPAT_NET_IPV6_H_ +#define _NET_BATMAN_ADV_COMPAT_NET_IPV6_H_ + +#include <linux/version.h> +#include_next <net/ipv6.h> + +#if LINUX_VERSION_CODE < KERNEL_VERSION(3, 3, 0) + +#define ipv6_skip_exthdr(skb, start, nexthdrp, frag_offp) \ + ({ \ + (void)frag_offp; \ + ipv6_skip_exthdr(skb, start, nexthdrp); \ + }) + +#endif /* < KERNEL_VERSION(3, 3, 0) */ + +#endif /* _NET_BATMAN_ADV_COMPAT_NET_IPV6_H_ */ diff --git a/compat-include/net/mld.h b/compat-include/net/mld.h new file mode 100644 index 0000000..e041eb6 --- /dev/null +++ b/compat-include/net/mld.h @@ -0,0 +1,52 @@ +#ifndef _NET_BATMAN_ADV_COMPAT_NET_MLD_H_ +#define _NET_BATMAN_ADV_COMPAT_NET_MLD_H_ + +#include <linux/version.h> + +#if LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 35) +#include_next <net/mld.h> +#endif /* >= KERNEL_VERSION(2, 6, 35) */ + +#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 35) +struct mld_msg { + struct icmp6hdr mld_hdr; + struct in6_addr mld_mca; +}; + +#define mld_type mld_hdr.icmp6_type + +struct mld2_grec { + __u8 grec_type; + __u8 grec_auxwords; + __be16 grec_nsrcs; + struct in6_addr grec_mca; + struct in6_addr grec_src[0]; +}; + +struct mld2_report { + struct icmp6hdr mld2r_hdr; + struct mld2_grec mld2r_grec[0]; +}; + +struct mld2_query { + struct icmp6hdr mld2q_hdr; + struct in6_addr mld2q_mca; +#if defined(__LITTLE_ENDIAN_BITFIELD) + __u8 mld2q_qrv:3, + mld2q_suppress:1, + mld2q_resv2:4; +#elif defined(__BIG_ENDIAN_BITFIELD) + __u8 mld2q_resv2:4, + mld2q_suppress:1, + mld2q_qrv:3; +#else +#error "Please fix <asm/byteorder.h>" +#endif + __u8 mld2q_qqic; + __be16 mld2q_nsrcs; + struct in6_addr mld2q_srcs[0]; +}; + +#endif /* < KERNEL_VERSION(2, 6, 35) */ + +#endif /* _NET_BATMAN_ADV_COMPAT_NET_MLD_H_ */ diff --git a/compat-sources/Makefile b/compat-sources/Makefile new file mode 100644 index 0000000..c364ded --- /dev/null +++ b/compat-sources/Makefile @@ -0,0 +1,3 @@ +batman-adv-y += ../../compat-sources/net/core/skbuff.o +batman-adv-y += ../../compat-sources/net/ipv4/igmp.o +batman-adv-y += ../../compat-sources/net/ipv6/mcast_snoop.o diff --git a/compat-sources/net/core/skbuff.c b/compat-sources/net/core/skbuff.c new file mode 100644 index 0000000..7ad23fe --- /dev/null +++ b/compat-sources/net/core/skbuff.c @@ -0,0 +1,179 @@ +/* + * Routines having to do with the 'struct sk_buff' memory handlers. + * + * Authors: Alan Cox alan@lxorguk.ukuu.org.uk + * Florian La Roche rzsfl@rz.uni-sb.de + * + * Fixes: + * Alan Cox : Fixed the worst of the load + * balancer bugs. + * Dave Platt : Interrupt stacking fix. + * Richard Kooijman : Timestamp fixes. + * Alan Cox : Changed buffer format. + * Alan Cox : destructor hook for AF_UNIX etc. + * Linus Torvalds : Better skb_clone. + * Alan Cox : Added skb_copy. + * Alan Cox : Added all the changed routines Linus + * only put in the headers + * Ray VanTassle : Fixed --skb->lock in free + * Alan Cox : skb_copy copy arp field + * Andi Kleen : slabified it. + * Robert Olsson : Removed skb_head_pool + * + * NOTE: + * The __skb_ routines should be called with interrupts + * disabled, or you better be *real* sure that the operation is atomic + * with respect to whatever list is being frobbed (e.g. via lock_sock() + * or via disabling bottom half handlers, etc). + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#include <linux/ipv6.h> +#include <linux/skbuff.h> + +#if LINUX_VERSION_CODE < KERNEL_VERSION(3, 16, 0) + +/* Compare with: + * "bridge: multicast: call skb_checksum_{simple_, }validate" + */ +__sum16 skb_checksum_simple_validate(struct sk_buff *skb) +{ + switch (skb->ip_summed) { + case CHECKSUM_COMPLETE: + if (!csum_fold(skb->csum)) + break; + /* fall through */ + case CHECKSUM_NONE: + skb->csum = 0; + if (skb_checksum_complete(skb)) + return -EINVAL; + } + + return 0; +} + +/* Watch out: Not as generic as upstream + * - redefines this method to only fit with ICMPV6 + * + * Compare with: + * "bridge: multicast: call skb_checksum_{simple_, }validate" + */ +__sum16 +skb_checksum_validate(struct sk_buff *skb, int proto, + __wsum (*compute_pseudo)(struct sk_buff *skb, int proto)) +{ + const struct ipv6hdr *ip6h = ipv6_hdr(skb); + + switch (skb->ip_summed) { + case CHECKSUM_COMPLETE: + if (!csum_ipv6_magic(&ip6h->saddr, &ip6h->daddr, skb->len, + IPPROTO_ICMPV6, skb->csum)) + break; + /*FALLTHROUGH*/ + case CHECKSUM_NONE: + skb->csum = ~csum_unfold(csum_ipv6_magic(&ip6h->saddr, + &ip6h->daddr, + skb->len, + IPPROTO_ICMPV6, 0)); + if (__skb_checksum_complete(skb)) + return -EINVAL; + } + + return 0; +} + +#endif /* < KERNEL_VERSION(3, 16, 0) */ + +#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 2, 0) + +/** + * skb_checksum_maybe_trim - maybe trims the given skb + * @skb: the skb to check + * @transport_len: the data length beyond the network header + * + * Checks whether the given skb has data beyond the given transport length. + * If so, returns a cloned skb trimmed to this transport length. + * Otherwise returns the provided skb. Returns NULL in error cases + * (e.g. transport_len exceeds skb length or out-of-memory). + * + * Caller needs to set the skb transport header and release the returned skb. + * Provided skb is consumed. + */ +static struct sk_buff *skb_checksum_maybe_trim(struct sk_buff *skb, + unsigned int transport_len) +{ + struct sk_buff *skb_chk; + unsigned int len = skb_transport_offset(skb) + transport_len; + int ret; + + if (skb->len < len) { + kfree_skb(skb); + return NULL; + } else if (skb->len == len) { + return skb; + } + + skb_chk = skb_clone(skb, GFP_ATOMIC); + kfree_skb(skb); + + if (!skb_chk) + return NULL; + + ret = pskb_trim_rcsum(skb_chk, len); + if (ret) { + kfree_skb(skb_chk); + return NULL; + } + + return skb_chk; +} + +/** + * skb_checksum_trimmed - validate checksum of an skb + * @skb: the skb to check + * @transport_len: the data length beyond the network header + * @skb_chkf: checksum function to use + * + * Applies the given checksum function skb_chkf to the provided skb. + * Returns a checked and maybe trimmed skb. Returns NULL on error. + * + * If the skb has data beyond the given transport length, then a + * trimmed & cloned skb is checked and returned. + * + * Caller needs to set the skb transport header and release the returned skb. + * Provided skb is consumed. + */ +struct sk_buff *skb_checksum_trimmed(struct sk_buff *skb, + unsigned int transport_len, + __sum16(*skb_chkf)(struct sk_buff *skb)) +{ + struct sk_buff *skb_chk; + unsigned int offset = skb_transport_offset(skb); + __sum16 ret; + + skb_chk = skb_checksum_maybe_trim(skb, transport_len); + if (!skb_chk) + return NULL; + + if (!pskb_may_pull(skb_chk, offset)) { + kfree_skb(skb_chk); + return NULL; + } + + __skb_pull(skb_chk, offset); + ret = skb_chkf(skb_chk); + __skb_push(skb_chk, offset); + + if (ret) { + kfree_skb(skb_chk); + return NULL; + } + + return skb_chk; +} + +#endif /* < KERNEL_VERSION(4, 2, 0) */ diff --git a/compat-sources/net/ipv4/igmp.c b/compat-sources/net/ipv4/igmp.c new file mode 100644 index 0000000..457a05e --- /dev/null +++ b/compat-sources/net/ipv4/igmp.c @@ -0,0 +1,241 @@ +/* + * Linux NET3: Internet Group Management Protocol [IGMP] + * + * This code implements the IGMP protocol as defined in RFC1112. There has + * been a further revision of this protocol since which is now supported. + * + * If you have trouble with this module be careful what gcc you have used, + * the older version didn't come out right using gcc 2.5.8, the newer one + * seems to fall out with gcc 2.6.2. + * + * Authors: + * Alan Cox alan@lxorguk.ukuu.org.uk + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + * + * Fixes: + * + * Alan Cox : Added lots of __inline__ to optimise + * the memory usage of all the tiny little + * functions. + * Alan Cox : Dumped the header building experiment. + * Alan Cox : Minor tweaks ready for multicast routing + * and extended IGMP protocol. + * Alan Cox : Removed a load of inline directives. Gcc 2.5.8 + * writes utterly bogus code otherwise (sigh) + * fixed IGMP loopback to behave in the manner + * desired by mrouted, fixed the fact it has been + * broken since 1.3.6 and cleaned up a few minor + * points. + * + * Chih-Jen Chang : Tried to revise IGMP to Version 2 + * Tsu-Sheng Tsao E-mail: chihjenc@scf.usc.edu and tsusheng@scf.usc.edu + * The enhancements are mainly based on Steve Deering's + * ipmulti-3.5 source code. + * Chih-Jen Chang : Added the igmp_get_mrouter_info and + * Tsu-Sheng Tsao igmp_set_mrouter_info to keep track of + * the mrouted version on that device. + * Chih-Jen Chang : Added the max_resp_time parameter to + * Tsu-Sheng Tsao igmp_heard_query(). Using this parameter + * to identify the multicast router version + * and do what the IGMP version 2 specified. + * Chih-Jen Chang : Added a timer to revert to IGMP V2 router + * Tsu-Sheng Tsao if the specified time expired. + * Alan Cox : Stop IGMP from 0.0.0.0 being accepted. + * Alan Cox : Use GFP_ATOMIC in the right places. + * Christian Daudt : igmp timer wasn't set for local group + * memberships but was being deleted, + * which caused a "del_timer() called + * from %p with timer not initialized\n" + * message (960131). + * Christian Daudt : removed del_timer from + * igmp_timer_expire function (960205). + * Christian Daudt : igmp_heard_report now only calls + * igmp_timer_expire if tm->running is + * true (960216). + * Malcolm Beattie : ttl comparison wrong in igmp_rcv made + * igmp_heard_query never trigger. Expiry + * miscalculation fixed in igmp_heard_query + * and random() made to return unsigned to + * prevent negative expiry times. + * Alexey Kuznetsov: Wrong group leaving behaviour, backport + * fix from pending 2.1.x patches. + * Alan Cox: Forget to enable FDDI support earlier. + * Alexey Kuznetsov: Fixed leaving groups on device down. + * Alexey Kuznetsov: Accordance to igmp-v2-06 draft. + * David L Stevens: IGMPv3 support, with help from + * Vinay Kulkarni + */ + +#include <linux/igmp.h> +#include <linux/ip.h> +#include <linux/skbuff.h> +#include <net/ip.h> + +#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 2, 0) + +static int ip_mc_check_iphdr(struct sk_buff *skb) +{ + const struct iphdr *iph; + unsigned int len; + unsigned int offset = skb_network_offset(skb) + sizeof(*iph); + + if (!pskb_may_pull(skb, offset)) + return -EINVAL; + + iph = ip_hdr(skb); + + if (iph->version != 4 || ip_hdrlen(skb) < sizeof(*iph)) + return -EINVAL; + + offset += ip_hdrlen(skb) - sizeof(*iph); + + if (!pskb_may_pull(skb, offset)) + return -EINVAL; + + iph = ip_hdr(skb); + + if (unlikely(ip_fast_csum((u8 *)iph, iph->ihl))) + return -EINVAL; + + len = skb_network_offset(skb) + ntohs(iph->tot_len); + if (skb->len < len || len < offset) + return -EINVAL; + + skb_set_transport_header(skb, offset); + + return 0; +} + +static int ip_mc_check_igmp_reportv3(struct sk_buff *skb) +{ + unsigned int len = skb_transport_offset(skb); + + len += sizeof(struct igmpv3_report); + + return pskb_may_pull(skb, len) ? 0 : -EINVAL; +} + +static int ip_mc_check_igmp_query(struct sk_buff *skb) +{ + unsigned int len = skb_transport_offset(skb); + + len += sizeof(struct igmphdr); + if (skb->len < len) + return -EINVAL; + + /* IGMPv{1,2}? */ + if (skb->len != len) { + /* or IGMPv3? */ + len += sizeof(struct igmpv3_query) - sizeof(struct igmphdr); + if (skb->len < len || !pskb_may_pull(skb, len)) + return -EINVAL; + } + + /* RFC2236+RFC3376 (IGMPv2+IGMPv3) require the multicast link layer + * all-systems destination addresses (224.0.0.1) for general queries + */ + if (!igmp_hdr(skb)->group && + ip_hdr(skb)->daddr != htonl(INADDR_ALLHOSTS_GROUP)) + return -EINVAL; + + return 0; +} + +static int ip_mc_check_igmp_msg(struct sk_buff *skb) +{ + switch (igmp_hdr(skb)->type) { + case IGMP_HOST_LEAVE_MESSAGE: + case IGMP_HOST_MEMBERSHIP_REPORT: + case IGMPV2_HOST_MEMBERSHIP_REPORT: + /* fall through */ + return 0; + case IGMPV3_HOST_MEMBERSHIP_REPORT: + return ip_mc_check_igmp_reportv3(skb); + case IGMP_HOST_MEMBERSHIP_QUERY: + return ip_mc_check_igmp_query(skb); + default: + return -ENOMSG; + } +} + +static inline __sum16 ip_mc_validate_checksum(struct sk_buff *skb) +{ + return skb_checksum_simple_validate(skb); +} + +static int __ip_mc_check_igmp(struct sk_buff *skb, struct sk_buff **skb_trimmed) + +{ + struct sk_buff *skb_chk; + unsigned int transport_len; + unsigned int len = skb_transport_offset(skb) + sizeof(struct igmphdr); + int ret; + + transport_len = ntohs(ip_hdr(skb)->tot_len) - ip_hdrlen(skb); + + skb_get(skb); + skb_chk = skb_checksum_trimmed(skb, transport_len, + ip_mc_validate_checksum); + if (!skb_chk) + return -EINVAL; + + if (!pskb_may_pull(skb_chk, len)) { + kfree_skb(skb_chk); + return -EINVAL; + } + + ret = ip_mc_check_igmp_msg(skb_chk); + if (ret) { + kfree_skb(skb_chk); + return ret; + } + + if (skb_trimmed) + *skb_trimmed = skb_chk; + else + kfree_skb(skb_chk); + + return 0; +} + +/** + * ip_mc_check_igmp - checks whether this is a sane IGMP packet + * @skb: the skb to validate + * @skb_trimmed: to store an skb pointer trimmed to IPv4 packet tail (optional) + * + * Checks whether an IPv4 packet is a valid IGMP packet. If so sets + * skb network and transport headers accordingly and returns zero. + * + * -EINVAL: A broken packet was detected, i.e. it violates some internet + * standard + * -ENOMSG: IP header validation succeeded but it is not an IGMP packet. + * -ENOMEM: A memory allocation failure happened. + * + * Optionally, an skb pointer might be provided via skb_trimmed (or set it + * to NULL): After parsing an IGMP packet successfully it will point to + * an skb which has its tail aligned to the IP packet end. This might + * either be the originally provided skb or a trimmed, cloned version if + * the skb frame had data beyond the IP packet. A cloned skb allows us + * to leave the original skb and its full frame unchanged (which might be + * desirable for layer 2 frame jugglers). + * + * The caller needs to release a reference count from any returned skb_trimmed. + */ +int ip_mc_check_igmp(struct sk_buff *skb, struct sk_buff **skb_trimmed) +{ + int ret = ip_mc_check_iphdr(skb); + + if (ret < 0) + return ret; + + if (ip_hdr(skb)->protocol != IPPROTO_IGMP) + return -ENOMSG; + + return __ip_mc_check_igmp(skb, skb_trimmed); +} + +#endif /* < KERNEL_VERSION(4, 2, 0) */ diff --git a/compat-sources/net/ipv6/mcast_snoop.c b/compat-sources/net/ipv6/mcast_snoop.c new file mode 100644 index 0000000..3f46ed3 --- /dev/null +++ b/compat-sources/net/ipv6/mcast_snoop.c @@ -0,0 +1,216 @@ +/* Copyright (C) 2010: YOSHIFUJI Hideaki yoshfuji@linux-ipv6.org + * Copyright (C) 2015: Linus Lüssing linus.luessing@c0d3.blue + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of version 2 of the GNU General Public + * License as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see http://www.gnu.org/licenses/. + * + * + * Based on the MLD support added to br_multicast.c by YOSHIFUJI Hideaki. + */ + +#include <linux/skbuff.h> +#include <net/ipv6.h> +#include <net/mld.h> +#include <net/addrconf.h> +#include <net/ip6_checksum.h> + +#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 2, 0) + +static int ipv6_mc_check_ip6hdr(struct sk_buff *skb) +{ + const struct ipv6hdr *ip6h; + unsigned int len; + unsigned int offset = skb_network_offset(skb) + sizeof(*ip6h); + + if (!pskb_may_pull(skb, offset)) + return -EINVAL; + + ip6h = ipv6_hdr(skb); + + if (ip6h->version != 6) + return -EINVAL; + + len = offset + ntohs(ip6h->payload_len); + if (skb->len < len || len <= offset) + return -EINVAL; + + return 0; +} + +static int ipv6_mc_check_exthdrs(struct sk_buff *skb) +{ + const struct ipv6hdr *ip6h; + int offset; + u8 nexthdr; + __be16 frag_off; + + ip6h = ipv6_hdr(skb); + + if (ip6h->nexthdr != IPPROTO_HOPOPTS) + return -ENOMSG; + + nexthdr = ip6h->nexthdr; + offset = skb_network_offset(skb) + sizeof(*ip6h); + offset = ipv6_skip_exthdr(skb, offset, &nexthdr, &frag_off); + + if (offset < 0) + return -EINVAL; + + if (nexthdr != IPPROTO_ICMPV6) + return -ENOMSG; + + skb_set_transport_header(skb, offset); + + return 0; +} + +static int ipv6_mc_check_mld_reportv2(struct sk_buff *skb) +{ + unsigned int len = skb_transport_offset(skb); + + len += sizeof(struct mld2_report); + + return pskb_may_pull(skb, len) ? 0 : -EINVAL; +} + +static int ipv6_mc_check_mld_query(struct sk_buff *skb) +{ + struct mld_msg *mld; + unsigned int len = skb_transport_offset(skb); + + /* RFC2710+RFC3810 (MLDv1+MLDv2) require link-local source addresses */ + if (!(ipv6_addr_type(&ipv6_hdr(skb)->saddr) & IPV6_ADDR_LINKLOCAL)) + return -EINVAL; + + len += sizeof(struct mld_msg); + if (skb->len < len) + return -EINVAL; + + /* MLDv1? */ + if (skb->len != len) { + /* or MLDv2? */ + len += sizeof(struct mld2_query) - sizeof(struct mld_msg); + if (skb->len < len || !pskb_may_pull(skb, len)) + return -EINVAL; + } + + mld = (struct mld_msg *)skb_transport_header(skb); + + /* RFC2710+RFC3810 (MLDv1+MLDv2) require the multicast link layer + * all-nodes destination address (ff02::1) for general queries + */ + if (ipv6_addr_any(&mld->mld_mca) && + !ipv6_addr_is_ll_all_nodes(&ipv6_hdr(skb)->daddr)) + return -EINVAL; + + return 0; +} + +static int ipv6_mc_check_mld_msg(struct sk_buff *skb) +{ + struct mld_msg *mld = (struct mld_msg *)skb_transport_header(skb); + + switch (mld->mld_type) { + case ICMPV6_MGM_REDUCTION: + case ICMPV6_MGM_REPORT: + /* fall through */ + return 0; + case ICMPV6_MLD2_REPORT: + return ipv6_mc_check_mld_reportv2(skb); + case ICMPV6_MGM_QUERY: + return ipv6_mc_check_mld_query(skb); + default: + return -ENOMSG; + } +} + +static inline __sum16 ipv6_mc_validate_checksum(struct sk_buff *skb) +{ + return skb_checksum_validate(skb, IPPROTO_ICMPV6, ip6_compute_pseudo); +} + +static int __ipv6_mc_check_mld(struct sk_buff *skb, + struct sk_buff **skb_trimmed) + +{ + struct sk_buff *skb_chk = NULL; + unsigned int transport_len; + unsigned int len = skb_transport_offset(skb) + sizeof(struct mld_msg); + int ret; + + transport_len = ntohs(ipv6_hdr(skb)->payload_len); + transport_len -= skb_transport_offset(skb) - sizeof(struct ipv6hdr); + + skb_get(skb); + skb_chk = skb_checksum_trimmed(skb, transport_len, + ipv6_mc_validate_checksum); + if (!skb_chk) + return -EINVAL; + + if (!pskb_may_pull(skb_chk, len)) { + kfree_skb(skb_chk); + return -EINVAL; + } + + ret = ipv6_mc_check_mld_msg(skb_chk); + if (ret) { + kfree_skb(skb_chk); + return ret; + } + + if (skb_trimmed) + *skb_trimmed = skb_chk; + else + kfree_skb(skb_chk); + + return 0; +} + +/** + * ipv6_mc_check_mld - checks whether this is a sane MLD packet + * @skb: the skb to validate + * @skb_trimmed: to store an skb pointer trimmed to IPv6 packet tail (optional) + * + * Checks whether an IPv6 packet is a valid MLD packet. If so sets + * skb network and transport headers accordingly and returns zero. + * + * -EINVAL: A broken packet was detected, i.e. it violates some internet + * standard + * -ENOMSG: IP header validation succeeded but it is not an MLD packet. + * -ENOMEM: A memory allocation failure happened. + * + * Optionally, an skb pointer might be provided via skb_trimmed (or set it + * to NULL): After parsing an MLD packet successfully it will point to + * an skb which has its tail aligned to the IP packet end. This might + * either be the originally provided skb or a trimmed, cloned version if + * the skb frame had data beyond the IP packet. A cloned skb allows us + * to leave the original skb and its full frame unchanged (which might be + * desirable for layer 2 frame jugglers). + * + * The caller needs to release a reference count from any returned skb_trimmed. + */ +int ipv6_mc_check_mld(struct sk_buff *skb, struct sk_buff **skb_trimmed) +{ + int ret; + + ret = ipv6_mc_check_ip6hdr(skb); + if (ret < 0) + return ret; + + ret = ipv6_mc_check_exthdrs(skb); + if (ret < 0) + return ret; + + return __ipv6_mc_check_mld(skb, skb_trimmed); +} + +#endif /* < KERNEL_VERSION(4, 2, 0) */
b.a.t.m.a.n@lists.open-mesh.org