Consider the following situation which has been found in a test setup: Gateway B has claimed client C and gateway A has the same backbone network as B. C sends a broad- or multicast to B and directly after this packet decides to send another packet to A due to a better TQ value. B will forward the broad-/multicast into the backbone as it is the responsible gw and after that A will claim C as it has been chosen by C as the new gateway. If it now happens that A claims C before it has received the broad-/multicast forwarded by B (due to backbone topology or due to some delay in B when forwarding the packet) we get a critical situation: in the current code A will immediately unclaim C when receiving the multicast due to the roaming client scenario although the position of C has not changed in the mesh. If this happens the multi-/broadcast forwarded by B will be sent back into the mesh by A and we have looping packets until one of the gateways claims C again. In order to prevent this, unclaiming of a client due to the roaming client scenario is only done after a certain time is expired after the last claim of the client. 100 ms are used here, which should be slow enough for big backbones and slow gateways but fast enough not to break the roaming client use case.
Signed-off-by: Andreas Pape apape@phoenixcontact.com --- net/batman-adv/bridge_loop_avoidance.c | 20 ++++++++++++++++---- 1 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/net/batman-adv/bridge_loop_avoidance.c b/net/batman-adv/bridge_loop_avoidance.c index 32a6168..98f0dd9 100644 --- a/net/batman-adv/bridge_loop_avoidance.c +++ b/net/batman-adv/bridge_loop_avoidance.c @@ -1764,10 +1764,22 @@ int batadv_bla_tx(struct batadv_priv *bat_priv, struct sk_buff *skb, /* if yes, the client has roamed and we have * to unclaim it. */ - batadv_handle_unclaim(bat_priv, primary_if, - primary_if->net_dev->dev_addr, - ethhdr->h_source, vid); - goto allow; + if (batadv_has_timed_out(claim->lasttime, 100)) { + /* only unclaim if the last claim entry is + * older than 100 ms to make sure we really + * have a roaming client here. + */ + batadv_dbg(BATADV_DBG_BLA, bat_priv, "bla_tx(): Roaming client %pM detected. Unclaim it.\n", + ethhdr->h_source); + batadv_handle_unclaim(bat_priv, primary_if, + primary_if->net_dev->dev_addr, + ethhdr->h_source, vid); + goto allow; + } else { + batadv_dbg(BATADV_DBG_BLA, bat_priv, "bla_tx(): Race for claim %pM detected. Drop packet.\n", + ethhdr->h_source); + goto handled; + } }
/* check if it is a multicast/broadcast frame */ -- 1.7.0.4
.................................................................. PHOENIX CONTACT ELECTRONICS GmbH
Sitz der Gesellschaft / registered office of the company: 31812 Bad Pyrmont USt-Id-Nr.: DE811742156 Amtsgericht Hannover HRB 100528 / district court Hannover HRB 100528 Geschäftsführer / Executive Board: Roland Bent, Dr. Martin Heubeck ___________________________________________________________________ Diese E-Mail enthält vertrauliche und/oder rechtlich geschützte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese E-Mail irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Mail. Das unerlaubte Kopieren, jegliche anderweitige Verwendung sowie die unbefugte Weitergabe dieser Mail ist nicht gestattet. ---------------------------------------------------------------------------------------------------- This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and destroy this e-mail. Any unauthorized copying, disclosure, distribution or other use of the material or parts thereof is strictly forbidden. ___________________________________________________________________
Hi Andreas,
On Friday 26 February 2016 14:19:50 Andreas Pape wrote:
Consider the following situation which has been found in a test setup: Gateway B has claimed client C and gateway A has the same backbone network as B. C sends a broad- or multicast to B and directly after this packet decides to send another packet to A due to a better TQ value. B will forward the broad-/multicast into the backbone as it is the responsible gw and after that A will claim C as it has been chosen by C as the new gateway. If it now happens that A claims C before it has received the broad-/multicast forwarded by B (due to backbone topology or due to some delay in B when forwarding the packet) we get a critical situation: in the current code A will immediately unclaim C when receiving the multicast due to the roaming client scenario although the position of C has not changed in the mesh. If this happens the multi-/broadcast forwarded by B will be sent back into the mesh by A and we have looping packets until one of the gateways claims C again. In order to prevent this, unclaiming of a client due to the roaming client scenario is only done after a certain time is expired after the last claim of the client. 100 ms are used here, which should be slow enough for big backbones and slow gateways but fast enough not to break the roaming client use case.
That's an interesting solution. My original idea was to make clients "race" for clients, which wasn't ever implemented because in the scenarios I tested this was not a problem (Check batadv_bla_rx(), there is a note on a possible optimization).
I believe your solutions looks valid, let's implement that.
Acked-by: Simon Wunderlich sw@simonwunderlich.de
Thanks, Simon
b.a.t.m.a.n@lists.open-mesh.org