This is yet another rebased version of Andreas' patchset. I've used Antonios branch, but reverted his changes to the first patch, since there were unexplained semantic changes (e.g. when no gateway is claiming the client) which I don't think are correct.
Antonio, if you want, please explain your changes or let us discuss.
Otherwise, I'd suggest to keep this patchset around for a few days (!) for a final review, and then merge it.
Thanks, Simon
Andreas Pape (5): batman-adv: prevent multiple ARP replies sent by gateways if dat enabled batman-adv: prevent duplication of ARP replies when DAT is used batman-adv: drop unicast packets from other backbone gw batman-adv: changed debug messages for easier bla debugging batman-adv: handle race condition for claims between gateways
net/batman-adv/bridge_loop_avoidance.c | 87 ++++++++++++++++++++++++++++++---- net/batman-adv/bridge_loop_avoidance.h | 11 +++++ net/batman-adv/distributed-arp-table.c | 47 ++++++++++++++++++ net/batman-adv/routing.c | 25 ++++++++-- 4 files changed, 159 insertions(+), 11 deletions(-)
From: Andreas Pape APape@phoenixcontact.com
If dat is enabled it must be made sure that only the backbone gw which has claimed the remote destination for the ARP request answers the ARP request directly if the MAC address is known due to the local dat table. This prevents multiple ARP replies in a common backbone if more than one gateway already knows the remote mac searched for in the ARP request.
Signed-off-by: Andreas Pape apape@phoenixcontact.com Acked-by: Simon Wunderlich sw@simonwunderlich.de [sven@narfation.org: fix conflicts with current version] Signed-off-by: Sven Eckelmann sven@narfation.org Signed-off-by: Simon Wunderlich sw@simonwunderlich.de --- net/batman-adv/bridge_loop_avoidance.c | 49 ++++++++++++++++++++++++++++++++++ net/batman-adv/bridge_loop_avoidance.h | 11 ++++++++ net/batman-adv/distributed-arp-table.c | 15 +++++++++++ 3 files changed, 75 insertions(+)
diff --git a/net/batman-adv/bridge_loop_avoidance.c b/net/batman-adv/bridge_loop_avoidance.c index 7332d284..546e66ec 100644 --- a/net/batman-adv/bridge_loop_avoidance.c +++ b/net/batman-adv/bridge_loop_avoidance.c @@ -2449,3 +2449,52 @@ int batadv_bla_backbone_dump(struct sk_buff *msg, struct netlink_callback *cb)
return ret; } + +#ifdef CONFIG_BATMAN_ADV_DAT +/** + * batadv_bla_check_claim - check if address is claimed + * + * @bat_priv: the bat priv with all the soft interface information + * @addr: mac address of which the claim status is checked + * @vid: the VLAN ID + * + * addr is checked if this address is claimed by the local device itself. + * + * Return: true if bla is disabled or the mac is claimed by the device, + * false if the device addr is already claimed by another gateway + */ +bool batadv_bla_check_claim(struct batadv_priv *bat_priv, + u8 *addr, unsigned short vid) +{ + struct batadv_bla_claim search_claim; + struct batadv_bla_claim *claim = NULL; + struct batadv_hard_iface *primary_if = NULL; + bool ret = true; + + if (!atomic_read(&bat_priv->bridge_loop_avoidance)) + return ret; + + primary_if = batadv_primary_if_get_selected(bat_priv); + if (!primary_if) + return ret; + + /* First look if the mac address is claimed */ + ether_addr_copy(search_claim.addr, addr); + search_claim.vid = vid; + + claim = batadv_claim_hash_find(bat_priv, &search_claim); + + /* If there is a claim and we are not owner of the claim, + * return false. + */ + if (claim) { + if (!batadv_compare_eth(claim->backbone_gw->orig, + primary_if->net_dev->dev_addr)) + ret = false; + batadv_claim_put(claim); + } + + batadv_hardif_put(primary_if); + return ret; +} +#endif diff --git a/net/batman-adv/bridge_loop_avoidance.h b/net/batman-adv/bridge_loop_avoidance.h index e157986b..23477574 100644 --- a/net/batman-adv/bridge_loop_avoidance.h +++ b/net/batman-adv/bridge_loop_avoidance.h @@ -69,6 +69,10 @@ void batadv_bla_status_update(struct net_device *net_dev); int batadv_bla_init(struct batadv_priv *bat_priv); void batadv_bla_free(struct batadv_priv *bat_priv); int batadv_bla_claim_dump(struct sk_buff *msg, struct netlink_callback *cb); +#ifdef CONFIG_BATMAN_ADV_DAT +bool batadv_bla_check_claim(struct batadv_priv *bat_priv, u8 *addr, + unsigned short vid); +#endif #define BATADV_BLA_CRC_INIT 0 #else /* ifdef CONFIG_BATMAN_ADV_BLA */
@@ -145,6 +149,13 @@ static inline int batadv_bla_backbone_dump(struct sk_buff *msg, return -EOPNOTSUPP; }
+static inline +bool batadv_bla_check_claim(struct batadv_priv *bat_priv, u8 *addr, + unsigned short vid) +{ + return true; +} + #endif /* ifdef CONFIG_BATMAN_ADV_BLA */
#endif /* ifndef _NET_BATMAN_ADV_BLA_H_ */ diff --git a/net/batman-adv/distributed-arp-table.c b/net/batman-adv/distributed-arp-table.c index 4f643fdb..28cfa538 100644 --- a/net/batman-adv/distributed-arp-table.c +++ b/net/batman-adv/distributed-arp-table.c @@ -43,6 +43,7 @@ #include <linux/workqueue.h> #include <net/arp.h>
+#include "bridge_loop_avoidance.h" #include "hard-interface.h" #include "hash.h" #include "log.h" @@ -1040,6 +1041,20 @@ bool batadv_dat_snoop_outgoing_arp_request(struct batadv_priv *bat_priv, goto out; }
+ /* If BLA is enabled, only send ARP replies if we have claimed + * the destination for the ARP request or if no one else of + * the backbone gws belonging to our backbone has claimed the + * destination. + */ + if (!batadv_bla_check_claim(bat_priv, + dat_entry->mac_addr, vid)) { + batadv_dbg(BATADV_DBG_DAT, bat_priv, + "Device %pM claimed by another backbone gw. Don't send ARP reply!", + dat_entry->mac_addr); + ret = true; + goto out; + } + skb_new = batadv_dat_arp_create_reply(bat_priv, ip_dst, ip_src, dat_entry->mac_addr, hw_src, vid);
From: Andreas Pape APape@phoenixcontact.com
If none of the backbone gateways in a bla setup has already knowledge of the mac address searched for in an incoming ARP request from the backbone an address resolution via the DHT of DAT is started. The gateway can send several ARP requests to different DHT nodes and therefore can get several replies. This patch assures that not all of the possible ARP replies are returned to the backbone by checking the local DAT cache of the gateway. If there is an entry in the local cache the gateway has already learned the requested address and there is no need to forward the additional reply to the backbone. Furthermore it is checked if this gateway has claimed the source of the ARP reply and only forwards it to the backbone if it has claimed the source or if there is no claim at all.
Signed-off-by: Andreas Pape apape@phoenixcontact.com Acked-by: Simon Wunderlich sw@simonwunderlich.de [sven@narfation.org: fix conflicts with current version] Signed-off-by: Sven Eckelmann sven@narfation.org Signed-off-by: Simon Wunderlich sw@simonwunderlich.de --- net/batman-adv/distributed-arp-table.c | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+)
diff --git a/net/batman-adv/distributed-arp-table.c b/net/batman-adv/distributed-arp-table.c index 28cfa538..0639f85b 100644 --- a/net/batman-adv/distributed-arp-table.c +++ b/net/batman-adv/distributed-arp-table.c @@ -1203,6 +1203,7 @@ void batadv_dat_snoop_outgoing_arp_reply(struct batadv_priv *bat_priv, bool batadv_dat_snoop_incoming_arp_reply(struct batadv_priv *bat_priv, struct sk_buff *skb, int hdr_size) { + struct batadv_dat_entry *dat_entry = NULL; u16 type; __be32 ip_src, ip_dst; u8 *hw_src, *hw_dst; @@ -1225,12 +1226,41 @@ bool batadv_dat_snoop_incoming_arp_reply(struct batadv_priv *bat_priv, hw_dst = batadv_arp_hw_dst(skb, hdr_size); ip_dst = batadv_arp_ip_dst(skb, hdr_size);
+ /* If ip_dst is already in cache and has the right mac address, + * drop this frame if this ARP reply is destined for us because it's + * most probably an ARP reply generated by another node of the DHT. + * We have most probably received already a reply earlier. Delivering + * this frame would lead to doubled receive of an ARP reply. + */ + dat_entry = batadv_dat_entry_hash_find(bat_priv, ip_src, vid); + if (dat_entry && batadv_compare_eth(hw_src, dat_entry->mac_addr)) { + batadv_dbg(BATADV_DBG_DAT, bat_priv, "Doubled ARP reply removed: ARP MSG = [src: %pM-%pI4 dst: %pM-%pI4]; dat_entry: %pM-%pI4\n", + hw_src, &ip_src, hw_dst, &ip_dst, + dat_entry->mac_addr, &dat_entry->ip); + dropped = true; + goto out; + } + /* Update our internal cache with both the IP addresses the node got * within the ARP reply */ batadv_dat_entry_add(bat_priv, ip_src, hw_src, vid); batadv_dat_entry_add(bat_priv, ip_dst, hw_dst, vid);
+ /* If BLA is enabled, only forward ARP replies if we have claimed the + * source of the ARP reply or if no one else of the same backbone has + * already claimed that client. This prevents that different gateways + * to the same backbone all forward the ARP reply leading to multiple + * replies in the backbone. + */ + if (!batadv_bla_is_my_claim(bat_priv, hw_src, vid)) { + batadv_dbg(BATADV_DBG_DAT, bat_priv, + "Device %pM claimed by another backbone gw. Drop ARP reply.\n", + hw_src); + dropped = true; + goto out; + } + /* if this REPLY is directed to a client of mine, let's deliver the * packet to the interface */ @@ -1243,6 +1273,8 @@ bool batadv_dat_snoop_incoming_arp_reply(struct batadv_priv *bat_priv, out: if (dropped) kfree_skb(skb); + if (dat_entry) + batadv_dat_entry_put(dat_entry); /* if dropped == false -> deliver to the interface */ return dropped; }
On Donnerstag, 16. März 2017 14:34:26 CET Simon Wunderlich wrote:
/* If BLA is enabled, only forward ARP replies if we have claimed the
* source of the ARP reply or if no one else of the same backbone has
* already claimed that client. This prevents that different gateways
* to the same backbone all forward the ARP reply leading to multiple
* replies in the backbone.
*/
if (!batadv_bla_is_my_claim(bat_priv, hw_src, vid)) {
batadv_dbg(BATADV_DBG_DAT, bat_priv,
"Device %pM claimed by another backbone gw. Drop ARP reply.\n",
hw_src);
dropped = true;
goto out;
}
Thanks for the rebasing. But there is no function "batadv_bla_is_my_claim". It was called batadv_bla_check_claim in patch 1.
And please also add a space between "PATCH" and "v8". This is how "git format-patch -v8" would have created it. And patchwork can handle it a little bit better when "PATCH" is a well separated word.
Kind regards, Sven
From: Andreas Pape APape@phoenixcontact.com
Additional dropping of unicast packets received from another backbone gw if the same backbone network before being forwarded to the same backbone again is necessary. It was observed in a test setup that in rare cases these frames lead to looping unicast traffic backbone->mesh->backbone.
Signed-off-by: Andreas Pape apape@phoenixcontact.com Acked-by: Simon Wunderlich sw@simonwunderlich.de [sven@narfation.org: fix conflicts with current version] Signed-off-by: Sven Eckelmann sven@narfation.org Signed-off-by: Simon Wunderlich sw@simonwunderlich.de --- net/batman-adv/routing.c | 25 ++++++++++++++++++++++--- 1 file changed, 22 insertions(+), 3 deletions(-)
diff --git a/net/batman-adv/routing.c b/net/batman-adv/routing.c index 7fd740b6..c85dc310 100644 --- a/net/batman-adv/routing.c +++ b/net/batman-adv/routing.c @@ -941,15 +941,17 @@ int batadv_recv_unicast_packet(struct sk_buff *skb, struct batadv_priv *bat_priv = netdev_priv(recv_if->soft_iface); struct batadv_unicast_packet *unicast_packet; struct batadv_unicast_4addr_packet *unicast_4addr_packet; - u8 *orig_addr; - struct batadv_orig_node *orig_node = NULL; + u8 *orig_addr, *orig_addr_gw; + struct batadv_orig_node *orig_node = NULL, *orig_node_gw = NULL; int check, hdr_size = sizeof(*unicast_packet); enum batadv_subtype subtype; - bool is4addr; + struct ethhdr *ethhdr; int ret = NET_RX_DROP; + bool is4addr, is_gw;
unicast_packet = (struct batadv_unicast_packet *)skb->data; unicast_4addr_packet = (struct batadv_unicast_4addr_packet *)skb->data; + ethhdr = eth_hdr(skb);
is4addr = unicast_packet->packet_type == BATADV_UNICAST_4ADDR; /* the caller function should have already pulled 2 bytes */ @@ -972,6 +974,23 @@ int batadv_recv_unicast_packet(struct sk_buff *skb,
/* packet for me */ if (batadv_is_my_mac(bat_priv, unicast_packet->dest)) { + /* If this is a unicast packet from another backgone gw, + * drop it. + */ + orig_addr_gw = ethhdr->h_source; + orig_node_gw = batadv_orig_hash_find(bat_priv, orig_addr_gw); + if (orig_node_gw) { + is_gw = batadv_bla_is_backbone_gw(skb, orig_node_gw, + hdr_size); + batadv_orig_node_put(orig_node_gw); + if (is_gw) { + batadv_dbg(BATADV_DBG_BLA, bat_priv, + "Dropped unicast pkt received from another backbone gw %pM.\n", + orig_addr_gw); + return NET_RX_DROP; + } + } + if (is4addr) { subtype = unicast_4addr_packet->subtype; batadv_dat_inc_counter(bat_priv, subtype);
From: Andreas Pape APape@phoenixcontact.com
Some of the bla debug messages are extended and additional messages are added for easier bla debugging. Some debug messages introduced with the dat changes in prior patches of this patch series have been changed to be more compliant to other existing debug messages.
Acked-by: Simon Wunderlich sw@simonwunderlich.de Signed-off-by: Andreas Pape apape@phoenixcontact.com [sven@narfation.org: fix conflicts with current version] Signed-off-by: Sven Eckelmann sven@narfation.org Signed-off-by: Simon Wunderlich sw@simonwunderlich.de --- net/batman-adv/bridge_loop_avoidance.c | 18 ++++++++++++++---- net/batman-adv/routing.c | 2 +- 2 files changed, 15 insertions(+), 5 deletions(-)
diff --git a/net/batman-adv/bridge_loop_avoidance.c b/net/batman-adv/bridge_loop_avoidance.c index 546e66ec..852a2be1 100644 --- a/net/batman-adv/bridge_loop_avoidance.c +++ b/net/batman-adv/bridge_loop_avoidance.c @@ -739,8 +739,8 @@ static void batadv_bla_add_claim(struct batadv_priv *bat_priv, goto claim_free_ref;
batadv_dbg(BATADV_DBG_BLA, bat_priv, - "bla_add_claim(): changing ownership for %pM, vid %d\n", - mac, batadv_print_vid(vid)); + "bla_add_claim(): changing ownership for %pM, vid %d to gw %pM\n", + mac, batadv_print_vid(vid), backbone_gw->orig);
remove_crc = true; } @@ -1295,10 +1295,13 @@ static void batadv_bla_purge_claims(struct batadv_priv *bat_priv, goto skip;
batadv_dbg(BATADV_DBG_BLA, bat_priv, - "bla_purge_claims(): %pM, vid %d, time out\n", + "bla_purge_claims(): timed out.\n"); + +purge_now: + batadv_dbg(BATADV_DBG_BLA, bat_priv, + "bla_purge_claims(): %pM, vid %d\n", claim->addr, claim->vid);
-purge_now: batadv_handle_unclaim(bat_priv, primary_if, backbone_gw->orig, claim->addr, claim->vid); @@ -1846,6 +1849,13 @@ bool batadv_bla_rx(struct batadv_priv *bat_priv, struct sk_buff *skb, /* possible optimization: race for a claim */ /* No claim exists yet, claim it for us! */ + + batadv_dbg(BATADV_DBG_BLA, bat_priv, + "bla_rx(): Unclaimed MAC %pM found. Claim it. Local: %s\n", + ethhdr->h_source, + batadv_is_my_client(bat_priv, + ethhdr->h_source, vid) ? + "yes" : "no"); batadv_handle_claim(bat_priv, primary_if, primary_if->net_dev->dev_addr, ethhdr->h_source, vid); diff --git a/net/batman-adv/routing.c b/net/batman-adv/routing.c index c85dc310..e1ebe14e 100644 --- a/net/batman-adv/routing.c +++ b/net/batman-adv/routing.c @@ -985,7 +985,7 @@ int batadv_recv_unicast_packet(struct sk_buff *skb, batadv_orig_node_put(orig_node_gw); if (is_gw) { batadv_dbg(BATADV_DBG_BLA, bat_priv, - "Dropped unicast pkt received from another backbone gw %pM.\n", + "recv_unicast_packet(): Dropped unicast pkt received from another backbone gw %pM.\n", orig_addr_gw); return NET_RX_DROP; }
From: Andreas Pape APape@phoenixcontact.com
Consider the following situation which has been found in a test setup: Gateway B has claimed client C and gateway A has the same backbone network as B. C sends a broad- or multicast to B and directly after this packet decides to send another packet to A due to a better TQ value. B will forward the broad-/multicast into the backbone as it is the responsible gw and after that A will claim C as it has been chosen by C as the best gateway. If it now happens that A claims C before it has received the broad-/multicast forwarded by B (due to backbone topology or due to some delay in B when forwarding the packet) we get a critical situation: in the current code A will immediately unclaim C when receiving the multicast due to the roaming client scenario although the position of C has not changed in the mesh. If this happens the multi-/broadcast forwarded by B will be sent back into the mesh by A and we have looping packets until one of the gateways claims C again. In order to prevent this, unclaiming of a client due to the roaming client scenario is only done after a certain time is expired after the last claim of the client. 100 ms are used here, which should be slow enough for big backbones and slow gateways but fast enough not to break the roaming client use case.
Acked-by: Simon Wunderlich sw@simonwunderlich.de Signed-off-by: Andreas Pape apape@phoenixcontact.com [sven@narfation.org: fix conflicts with current version] Signed-off-by: Sven Eckelmann sven@narfation.org Signed-off-by: Simon Wunderlich sw@simonwunderlich.de --- net/batman-adv/bridge_loop_avoidance.c | 20 ++++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-)
diff --git a/net/batman-adv/bridge_loop_avoidance.c b/net/batman-adv/bridge_loop_avoidance.c index 852a2be1..d07e89ec 100644 --- a/net/batman-adv/bridge_loop_avoidance.c +++ b/net/batman-adv/bridge_loop_avoidance.c @@ -1973,10 +1973,22 @@ bool batadv_bla_tx(struct batadv_priv *bat_priv, struct sk_buff *skb, /* if yes, the client has roamed and we have * to unclaim it. */ - batadv_handle_unclaim(bat_priv, primary_if, - primary_if->net_dev->dev_addr, - ethhdr->h_source, vid); - goto allow; + if (batadv_has_timed_out(claim->lasttime, 100)) { + /* only unclaim if the last claim entry is + * older than 100 ms to make sure we really + * have a roaming client here. + */ + batadv_dbg(BATADV_DBG_BLA, bat_priv, "bla_tx(): Roaming client %pM detected. Unclaim it.\n", + ethhdr->h_source); + batadv_handle_unclaim(bat_priv, primary_if, + primary_if->net_dev->dev_addr, + ethhdr->h_source, vid); + goto allow; + } else { + batadv_dbg(BATADV_DBG_BLA, bat_priv, "bla_tx(): Race for claim %pM detected. Drop packet.\n", + ethhdr->h_source); + goto handled; + } }
/* check if it is a multicast/broadcast frame */
On Thursday, March 16, 2017 2:34:24 PM CET Simon Wunderlich wrote:
This is yet another rebased version of Andreas' patchset. I've used Antonios branch, but reverted his changes to the first patch, since there were unexplained semantic changes (e.g. when no gateway is claiming the client) which I don't think are correct.
Antonio, if you want, please explain your changes or let us discuss.
Otherwise, I'd suggest to keep this patchset around for a few days (!) for a final review, and then merge it.
Just applied this series into 0c794961..cbb2ccc1
I've fixed the 3rd patch on the way according to Svens comment.
Thanks again for the contribution! Simon
b.a.t.m.a.n@lists.open-mesh.org