Hello David,
here you have some small fixes for your 'net' tree.
Patch 1 fixes a regression in the "bonding" code introduced while implementing the multi-interface optimization feature, by Simon Wunderlich.
Patch 2 ensures that the "last-seen" timestamp for a newly created originator object is properly initialised in order to avoid a non-critical race condition, by Linus Lüssing.
Patch 3 avoids false positive splats when lockdep is enabled by assigning the proper lock class to locks used by the network coding feature, by Martin Hundebøll.
Patches 4 and 5 fix the code counting the amount of multicast-disabled nodes in the network (used to avoid to enable the multicast optimisation when not possible), by Linus Lüssing.
Patch 6 fixes a memory leak in the Translation Table code that can be triggered by doubling the current originator interval, by Linus Lüssing.
Please pull or let me know of any problem!
Thanks a lot, Antonio
The following changes since commit 7ce67a38f799d1fb332f672b117efbadedaa5352:
net: ethernet: cpsw: fix hangs with interrupts (2015-01-04 22:18:34 -0500)
are available in the git repository at:
git://git.open-mesh.org/linux-merge.git tags/batman-adv-fix-for-davem
for you to fetch changes up to 9d31b3ce81683ce3c9fd10afa70892e373b21067:
batman-adv: fix potential TT client + orig-node memory leak (2015-01-06 11:07:01 +0100)
---------------------------------------------------------------- Included changes: - ensure bonding is used (if enabled) for packets coming in the soft interface - fix race condition to avoid orig_nodes to be deleted right after being added - avoid false positive lockdep splats by assigning lockclass to the proper hashtable lock objects - avoid miscounting of multicast 'disabled' nodes in the network - fix memory leak in the Global Translation Table in case of originator interval change
---------------------------------------------------------------- Linus Lüssing (4): batman-adv: fix delayed foreign originator recognition batman-adv: fix counter for multicast supporting nodes batman-adv: fix multicast counter when purging originators batman-adv: fix potential TT client + orig-node memory leak
Martin Hundebøll (1): batman-adv: fix lock class for decoding hash in network-coding.c
Simon Wunderlich (1): batman-adv: fix and simplify condition when bonding should be used
net/batman-adv/multicast.c | 11 +++++++---- net/batman-adv/network-coding.c | 2 +- net/batman-adv/originator.c | 7 ++++--- net/batman-adv/routing.c | 6 ++++-- 4 files changed, 16 insertions(+), 10 deletions(-)
From: Simon Wunderlich sw@simonwunderlich.de
The current condition actually does NOT consider bonding when the interface the packet came in from is the soft interface, which is the opposite of what it should do (and the comment describes). Fix that and slightly simplify the condition.
Reported-by: Ray Gibson booray@gmail.com Signed-off-by: Simon Wunderlich sw@simonwunderlich.de Signed-off-by: Marek Lindner mareklindner@neomailbox.ch Signed-off-by: Antonio Quartulli antonio@meshcoding.com --- net/batman-adv/routing.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/batman-adv/routing.c b/net/batman-adv/routing.c index 35f76f2..6648f32 100644 --- a/net/batman-adv/routing.c +++ b/net/batman-adv/routing.c @@ -443,11 +443,13 @@ batadv_find_router(struct batadv_priv *bat_priv,
router = batadv_orig_router_get(orig_node, recv_if);
+ if (!router) + return router; + /* only consider bonding for recv_if == BATADV_IF_DEFAULT (first hop) * and if activated. */ - if (recv_if == BATADV_IF_DEFAULT || !atomic_read(&bat_priv->bonding) || - !router) + if (!(recv_if == BATADV_IF_DEFAULT && atomic_read(&bat_priv->bonding))) return router;
/* bonding: loop through the list of possible routers found
From: Linus Lüssing linus.luessing@c0d3.blue
Currently it can happen that the reception of an OGM from a new originator is not being accepted. More precisely it can happen that an originator struct gets allocated and initialized (batadv_orig_node_new()), even the TQ gets calculated and set correctly (batadv_iv_ogm_calc_tq()) but still the periodic orig_node purging thread will decide to delete it if it has a chance to jump between these two function calls.
This is because batadv_orig_node_new() initializes the last_seen value to zero and its caller (batadv_iv_ogm_orig_get()) makes it visible to other threads by adding it to the hash table already. batadv_iv_ogm_calc_tq() will set the last_seen variable to the correct, current time a few lines later but if the purging thread jumps in between that it will think that the orig_node timed out and will wrongly schedule it for deletion already.
If the purging interval is the same as the originator interval (which is the default: 1 second), then this game can continue for several rounds until the random OGM jitter added enough difference between these two (in tests, two to about four rounds seemed common).
Fixing this by initializing the last_seen variable of an orig_node to the current time before adding it to the hash table.
Signed-off-by: Linus Lüssing linus.luessing@c0d3.blue Signed-off-by: Marek Lindner mareklindner@neomailbox.ch Signed-off-by: Antonio Quartulli antonio@meshcoding.com --- net/batman-adv/originator.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/batman-adv/originator.c b/net/batman-adv/originator.c index 6a48451..648bdba 100644 --- a/net/batman-adv/originator.c +++ b/net/batman-adv/originator.c @@ -678,6 +678,7 @@ struct batadv_orig_node *batadv_orig_node_new(struct batadv_priv *bat_priv, atomic_set(&orig_node->last_ttvn, 0); orig_node->tt_buff = NULL; orig_node->tt_buff_len = 0; + orig_node->last_seen = jiffies; reset_time = jiffies - 1 - msecs_to_jiffies(BATADV_RESET_PROTECTION_MS); orig_node->bcast_seqno_reset = reset_time; #ifdef CONFIG_BATMAN_ADV_MCAST
From: Martin Hundebøll martin@hundeboll.net
batadv_has_set_lock_class() is called with the wrong hash table as first argument (probably due to a copy-paste error), which leads to false positives when running with lockdep.
Introduced-by: 612d2b4fe0a1ff2f8389462a6f8be34e54124c05 ("batman-adv: network coding - save overheard and tx packets for decoding")
Signed-off-by: Martin Hundebøll martin@hundeboll.net Signed-off-by: Marek Lindner mareklindner@neomailbox.ch Signed-off-by: Antonio Quartulli antonio@meshcoding.com --- net/batman-adv/network-coding.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/batman-adv/network-coding.c b/net/batman-adv/network-coding.c index 8d04d17..fab47f1 100644 --- a/net/batman-adv/network-coding.c +++ b/net/batman-adv/network-coding.c @@ -133,7 +133,7 @@ int batadv_nc_mesh_init(struct batadv_priv *bat_priv) if (!bat_priv->nc.decoding_hash) goto err;
- batadv_hash_set_lock_class(bat_priv->nc.coding_hash, + batadv_hash_set_lock_class(bat_priv->nc.decoding_hash, &batadv_nc_decoding_hash_lock_class_key);
INIT_DELAYED_WORK(&bat_priv->nc.work, batadv_nc_worker);
From: Linus Lüssing linus.luessing@c0d3.blue
A miscounting of nodes having multicast optimizations enabled can lead to multicast packet loss in the following scenario:
If the first OGM a node receives from another one has no multicast optimizations support (no multicast tvlv) then we are missing to increase the counter. This potentially leads to the wrong assumption that we could safely use multicast optimizations.
Fixings this by increasing the counter if the initial OGM has the multicast TVLV unset, too.
Introduced by 60432d756cf06e597ef9da511402dd059b112447 ("batman-adv: Announce new capability via multicast TVLV")
Reported-by: Tobias Hachmer tobias@hachmer.de Signed-off-by: Linus Lüssing linus.luessing@c0d3.blue Signed-off-by: Marek Lindner mareklindner@neomailbox.ch Signed-off-by: Antonio Quartulli antonio@meshcoding.com --- net/batman-adv/multicast.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/net/batman-adv/multicast.c b/net/batman-adv/multicast.c index ab6bb2a..d3503fb 100644 --- a/net/batman-adv/multicast.c +++ b/net/batman-adv/multicast.c @@ -685,11 +685,13 @@ static void batadv_mcast_tvlv_ogm_handler_v1(struct batadv_priv *bat_priv, if (orig_initialized) atomic_dec(&bat_priv->mcast.num_disabled); orig->capabilities |= BATADV_ORIG_CAPA_HAS_MCAST; - /* If mcast support is being switched off increase the disabled - * mcast node counter. + /* If mcast support is being switched off or if this is an initial + * OGM without mcast support then increase the disabled mcast + * node counter. */ } else if (!orig_mcast_enabled && - orig->capabilities & BATADV_ORIG_CAPA_HAS_MCAST) { + (orig->capabilities & BATADV_ORIG_CAPA_HAS_MCAST || + !orig_initialized)) { atomic_inc(&bat_priv->mcast.num_disabled); orig->capabilities &= ~BATADV_ORIG_CAPA_HAS_MCAST; }
From: Linus Lüssing linus.luessing@c0d3.blue
When purging an orig_node we should only decrease counter tracking the number of nodes without multicast optimizations support if it was increased through this orig_node before.
A not yet quite initialized orig_node (meaning it did not have its turn in the mcast-tvlv handler so far) which gets purged would not adhere to this and will lead to a counter imbalance.
Fixing this by adding a check whether the orig_node is mcast-initalized before decreasing the counter in the mcast-orig_node-purging routine.
Introduced by 60432d756cf06e597ef9da511402dd059b112447 ("batman-adv: Announce new capability via multicast TVLV")
Reported-by: Tobias Hachmer tobias@hachmer.de Signed-off-by: Linus Lüssing linus.luessing@c0d3.blue Signed-off-by: Marek Lindner mareklindner@neomailbox.ch Signed-off-by: Antonio Quartulli antonio@meshcoding.com --- net/batman-adv/multicast.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/batman-adv/multicast.c b/net/batman-adv/multicast.c index d3503fb..b24e4bb 100644 --- a/net/batman-adv/multicast.c +++ b/net/batman-adv/multicast.c @@ -740,7 +740,8 @@ void batadv_mcast_purge_orig(struct batadv_orig_node *orig) { struct batadv_priv *bat_priv = orig->bat_priv;
- if (!(orig->capabilities & BATADV_ORIG_CAPA_HAS_MCAST)) + if (!(orig->capabilities & BATADV_ORIG_CAPA_HAS_MCAST) && + orig->capa_initialized & BATADV_ORIG_CAPA_HAS_MCAST) atomic_dec(&bat_priv->mcast.num_disabled);
batadv_mcast_want_unsnoop_update(bat_priv, orig, BATADV_NO_FLAGS);
From: Linus Lüssing linus.luessing@c0d3.blue
This patch fixes a potential memory leak which can occur once an originator times out. On timeout the according global translation table entry might not get purged correctly. Furthermore, the non purged TT entry will cause its orig-node to leak, too. Which additionally can lead to the new multicast optimization feature not kicking in because of a therefore bogus counter.
In detail: The batadv_tt_global_entry->orig_list holds the reference to the orig-node. Usually this reference is released after BATADV_PURGE_TIMEOUT through: _batadv_purge_orig()-> batadv_purge_orig_node()->batadv_update_route()->_batadv_update_route()-> batadv_tt_global_del_orig() which purges this global tt entry and releases the reference to the orig-node.
However, if between two batadv_purge_orig_node() calls the orig-node timeout grew to 2*BATADV_PURGE_TIMEOUT then this call path isn't reached. Instead the according orig-node is removed from the originator hash in _batadv_purge_orig(), the batadv_update_route() part is skipped and won't be reached anymore.
Fixing the issue by moving batadv_tt_global_del_orig() out of the rcu callback.
Signed-off-by: Linus Lüssing linus.luessing@c0d3.blue Acked-by: Antonio Quartulli antonio@meshcoding.com Signed-off-by: Marek Lindner mareklindner@neomailbox.ch Signed-off-by: Antonio Quartulli antonio@meshcoding.com --- net/batman-adv/originator.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/batman-adv/originator.c b/net/batman-adv/originator.c index 648bdba..bea8198 100644 --- a/net/batman-adv/originator.c +++ b/net/batman-adv/originator.c @@ -570,9 +570,6 @@ static void batadv_orig_node_free_rcu(struct rcu_head *rcu)
batadv_frag_purge_orig(orig_node, NULL);
- batadv_tt_global_del_orig(orig_node->bat_priv, orig_node, -1, - "originator timed out"); - if (orig_node->bat_priv->bat_algo_ops->bat_orig_free) orig_node->bat_priv->bat_algo_ops->bat_orig_free(orig_node);
@@ -978,6 +975,9 @@ static void _batadv_purge_orig(struct batadv_priv *bat_priv) if (batadv_purge_orig_node(bat_priv, orig_node)) { batadv_gw_node_delete(bat_priv, orig_node); hlist_del_rcu(&orig_node->hash_entry); + batadv_tt_global_del_orig(orig_node->bat_priv, + orig_node, -1, + "originator timed out"); batadv_orig_node_free_ref(orig_node); continue; }
From: Antonio Quartulli antonio@meshcoding.com Date: Tue, 6 Jan 2015 12:09:59 +0100
here you have some small fixes for your 'net' tree.
Patch 1 fixes a regression in the "bonding" code introduced while implementing the multi-interface optimization feature, by Simon Wunderlich.
Patch 2 ensures that the "last-seen" timestamp for a newly created originator object is properly initialised in order to avoid a non-critical race condition, by Linus Lüssing.
Patch 3 avoids false positive splats when lockdep is enabled by assigning the proper lock class to locks used by the network coding feature, by Martin Hundebøll.
Patches 4 and 5 fix the code counting the amount of multicast-disabled nodes in the network (used to avoid to enable the multicast optimisation when not possible), by Linus Lüssing.
Patch 6 fixes a memory leak in the Translation Table code that can be triggered by doubling the current originator interval, by Linus Lüssing.
Pulled, thanks Antonio.
b.a.t.m.a.n@lists.open-mesh.org