On shutdown a race condition where we access a just freed global TT hash might occure:
batadv_mesh_free()->batadv_originator_free() schedules the batadv_orig_node_free_rcu().
Before batadv_orig_node_free_rcu() is executed (which happens on the rcu_barrier() call in batadv_exit() the latest), batadv_mesh_free()->batadv_tt_free()->batadv_tt_global_table_free()-> batadv_hash_destroy(hash)->kfree(hash) is called, freeing the global tt hash.
When batadv_orig_node_free_rcu()->batadv_tt_global_del_orig() now gets executed it tries to access this just freed global tt hash, causing a kernel panic.
This patch tries to fix this by waiting for any just scheduled batadv_orig_node_free_rcu() to finish via an extra rcu_barrier() call before freeing the global TT hash.
Signed-off-by: Linus Lüssing linus.luessing@web.de --- Ref: #169
main.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/main.c b/main.c index 62b1f89..0afc171 100644 --- a/main.c +++ b/main.c @@ -164,6 +164,11 @@ void batadv_mesh_free(struct net_device *soft_iface)
batadv_gw_node_purge(bat_priv); batadv_originator_free(bat_priv); + + /* Wait for any batadv_orig_node_free_rcu() to finish, + * they access the soon to be freed global TT hash */ + rcu_barrier(); + batadv_nc_free(bat_priv);
batadv_tt_free(bat_priv);
It'd probably be nicer to refactor the way we are freeing the TT things instead of this extra rcu_barrier() which isn't that self explanatory, even with a comment.
However this simple extra rcu_barrier() call should be an easy way to fix this issue for now, it's probably better to refactor afterwards.
Cheers, Linus
Hrm, forget it, this patch isn't sufficient...
batadv_nc_free() is scheduling batadv_orig_node_free_rcu()s, too. And actually, batadv_tt_free() itself does so, too, before it frees its global TT hash... which can backfire after the global TT hash freeing...
Ok, a more invasive patch to properly fix this issue is needed.
Gesendet: Sonntag, 17. März 2013 um 03:30 Uhr Von: "Linus Lüssing" linus.luessing@web.de An: b.a.t.m.a.n@lists.open-mesh.org Cc: "Linus Lüssing" linus.luessing@web.de Betreff: [PATCH] batman-adv: Fix general protection fault in batadv_tt_global_del_orig()
On shutdown a race condition where we access a just freed global TT hash might occure:
batadv_mesh_free()->batadv_originator_free() schedules the batadv_orig_node_free_rcu().
Before batadv_orig_node_free_rcu() is executed (which happens on the rcu_barrier() call in batadv_exit() the latest), batadv_mesh_free()->batadv_tt_free()->batadv_tt_global_table_free()-> batadv_hash_destroy(hash)->kfree(hash) is called, freeing the global tt hash.
When batadv_orig_node_free_rcu()->batadv_tt_global_del_orig() now gets executed it tries to access this just freed global tt hash, causing a kernel panic.
This patch tries to fix this by waiting for any just scheduled batadv_orig_node_free_rcu() to finish via an extra rcu_barrier() call before freeing the global TT hash.
Signed-off-by: Linus Lüssing linus.luessing@web.de
Ref: #169
main.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/main.c b/main.c index 62b1f89..0afc171 100644 --- a/main.c +++ b/main.c @@ -164,6 +164,11 @@ void batadv_mesh_free(struct net_device *soft_iface)
batadv_gw_node_purge(bat_priv); batadv_originator_free(bat_priv);
/* Wait for any batadv_orig_node_free_rcu() to finish,
* they access the soon to be freed global TT hash */
rcu_barrier();
batadv_nc_free(bat_priv);
batadv_tt_free(bat_priv);
-- 1.7.10.4
rcu_barrier() only waits for the currently scheduled rcu functions to finish - it won't wait for any function scheduled via another call_rcu() within an rcu scheduled function.
Unfortunately our batadv_tt_orig_list_entry_free_ref() does just that, via a batadv_orig_node_free_ref() call, leading to our rcu_barrier() call potentially missing such a batadv_orig_node_free_ref().
This patch fixes this issue by calling the batadv_orig_node_free_rcu() directly from the rcu callback, removing the unnecessary, additional call_rcu() layer here.
Signed-off-by: Linus Lüssing linus.luessing@web.de
diff --git a/originator.c b/originator.c index 585e684..013c7d0 100644 --- a/originator.c +++ b/originator.c @@ -117,7 +117,7 @@ out: return neigh_node; }
-static void batadv_orig_node_free_rcu(struct rcu_head *rcu) +void batadv_orig_node_free_rcu(struct rcu_head *rcu) { struct hlist_node *node_tmp; struct batadv_neigh_node *neigh_node, *tmp_neigh_node; diff --git a/originator.h b/originator.h index 7df48fa..4f9f88b 100644 --- a/originator.h +++ b/originator.h @@ -25,6 +25,7 @@ int batadv_originator_init(struct batadv_priv *bat_priv); void batadv_originator_free(struct batadv_priv *bat_priv); void batadv_purge_orig_ref(struct batadv_priv *bat_priv); +void batadv_orig_node_free_rcu(struct rcu_head *rcu); void batadv_orig_node_free_ref(struct batadv_orig_node *orig_node); struct batadv_orig_node *batadv_get_orig_node(struct batadv_priv *bat_priv, const uint8_t *addr); diff --git a/translation-table.c b/translation-table.c index 9322320..ee91cc1 100644 --- a/translation-table.c +++ b/translation-table.c @@ -144,7 +144,10 @@ static void batadv_tt_orig_list_entry_free_rcu(struct rcu_head *rcu) struct batadv_tt_orig_list_entry *orig_entry;
orig_entry = container_of(rcu, struct batadv_tt_orig_list_entry, rcu); - batadv_orig_node_free_ref(orig_entry->orig_node); + + if (atomic_dec_and_test(&orig_entry->orig_node->refcount)) + batadv_orig_node_free_rcu(&orig_entry->orig_node->rcu); + kfree(orig_entry); }
On shutdown a race condition where we access a just freed global TT hash might occure. batadv_orig_node_free_rcu() callbacks might have been scheduled (especially during the shutdown procedure) and unfortunately batadv_tt_global_table_free() does not wait for them to finish first before freeing the global TT hash.
This potentially results in a general protection fault in batadv_tt_global_del_orig(), called via a batadv_orig_node_free_rcu() callback, which tries to access the just freed global TT hash.
This patch tries to fix this by waiting for any just scheduled batadv_orig_node_free_rcu() to finish via an extra rcu_barrier() call before freeing the global TT hash. And by moving the TT freeing call to the end of the batman cleanup routines.
Signed-off-by: Linus Lüssing linus.luessing@web.de
diff --git a/main.c b/main.c index 62b1f89..8663d97 100644 --- a/main.c +++ b/main.c @@ -166,12 +166,13 @@ void batadv_mesh_free(struct net_device *soft_iface) batadv_originator_free(bat_priv); batadv_nc_free(bat_priv);
- batadv_tt_free(bat_priv); - batadv_bla_free(bat_priv);
batadv_dat_free(bat_priv);
+ /* Don't call any batadv_orig_node_free_ref() after me */ + batadv_tt_free(bat_priv); + free_percpu(bat_priv->bat_counters);
atomic_set(&bat_priv->mesh_state, BATADV_MESH_INACTIVE); diff --git a/translation-table.c b/translation-table.c index ee91cc1..279f0fd 100644 --- a/translation-table.c +++ b/translation-table.c @@ -1315,6 +1315,10 @@ static void batadv_tt_global_table_free(struct batadv_priv *bat_priv) spin_unlock_bh(list_lock); }
+ /* Wait for any batadv_orig_node_free_rcu() to finish, + * they access the to be freed global TT hash */ + rcu_barrier(); + batadv_hash_destroy(hash);
bat_priv->tt.global_hash = NULL;
On Sun, Mar 17, 2013 at 05:44:58AM +0100, Linus Lüssing wrote:
On shutdown a race condition where we access a just freed global TT hash might occure. batadv_orig_node_free_rcu() callbacks might have been scheduled (especially during the shutdown procedure) and unfortunately batadv_tt_global_table_free() does not wait for them to finish first before freeing the global TT hash.
This potentially results in a general protection fault in batadv_tt_global_del_orig(), called via a batadv_orig_node_free_rcu() callback, which tries to access the just freed global TT hash.
This patch tries to fix this by waiting for any just scheduled batadv_orig_node_free_rcu() to finish via an extra rcu_barrier() call before freeing the global TT hash. And by moving the TT freeing call to the end of the batman cleanup routines.
Signed-off-by: Linus Lüssing linus.luessing@web.de
Acked-by: Antonio Quartulli ordex@autistici.org
@Marek: when you will merge this commit, can you please reword "tries to fix" in "fixes" ? :) Actually this patch is fixing the problem :)
However, as I discussed with Linus on IRC, this is only a temporary fix, which aims to remove the problem, but still we will need a redesign of the TT clean up routine in order to cleanly get rid of this race condition.
Cheers,
On Sat, Mar 30, 2013 at 02:16:02PM +0100, Antonio Quartulli wrote:
On Sun, Mar 17, 2013 at 05:44:58AM +0100, Linus Lüssing wrote:
On shutdown a race condition where we access a just freed global TT hash might occure. batadv_orig_node_free_rcu() callbacks might have been scheduled (especially during the shutdown procedure) and unfortunately batadv_tt_global_table_free() does not wait for them to finish first before freeing the global TT hash.
This potentially results in a general protection fault in batadv_tt_global_del_orig(), called via a batadv_orig_node_free_rcu() callback, which tries to access the just freed global TT hash.
This patch tries to fix this by waiting for any just scheduled batadv_orig_node_free_rcu() to finish via an extra rcu_barrier() call before freeing the global TT hash. And by moving the TT freeing call to the end of the batman cleanup routines.
Signed-off-by: Linus Lüssing linus.luessing@web.de
Acked-by: Antonio Quartulli ordex@autistici.org
NACK.
This patch is solving one problem but creating a new one: by using rcu_barrier we avoid the crash but we will leak memory, because batadv_orig_node_free_rcu()->batadv_tt_global_del_orig() will access an empty global table and so will not be able to free the global entries.
Patch ("batman-adv: avoid race conditions on TT global table by counting references") is fixing the problem by redesigning the TT clean up routine.
Cheers,
On Sun, Mar 17, 2013 at 05:44:57AM +0100, Linus Lüssing wrote:
rcu_barrier() only waits for the currently scheduled rcu functions to finish - it won't wait for any function scheduled via another call_rcu() within an rcu scheduled function.
Unfortunately our batadv_tt_orig_list_entry_free_ref() does just that, via a batadv_orig_node_free_ref() call, leading to our rcu_barrier() call potentially missing such a batadv_orig_node_free_ref().
This patch fixes this issue by calling the batadv_orig_node_free_rcu() directly from the rcu callback, removing the unnecessary, additional call_rcu() layer here.
Signed-off-by: Linus Lüssing linus.luessing@web.de
diff --git a/originator.c b/originator.c index 585e684..013c7d0 100644 --- a/originator.c +++ b/originator.c @@ -117,7 +117,7 @@ out: return neigh_node; }
-static void batadv_orig_node_free_rcu(struct rcu_head *rcu) +void batadv_orig_node_free_rcu(struct rcu_head *rcu) { struct hlist_node *node_tmp; struct batadv_neigh_node *neigh_node, *tmp_neigh_node; diff --git a/originator.h b/originator.h index 7df48fa..4f9f88b 100644 --- a/originator.h +++ b/originator.h @@ -25,6 +25,7 @@ int batadv_originator_init(struct batadv_priv *bat_priv); void batadv_originator_free(struct batadv_priv *bat_priv); void batadv_purge_orig_ref(struct batadv_priv *bat_priv); +void batadv_orig_node_free_rcu(struct rcu_head *rcu); void batadv_orig_node_free_ref(struct batadv_orig_node *orig_node); struct batadv_orig_node *batadv_get_orig_node(struct batadv_priv *bat_priv, const uint8_t *addr); diff --git a/translation-table.c b/translation-table.c index 9322320..ee91cc1 100644 --- a/translation-table.c +++ b/translation-table.c @@ -144,7 +144,10 @@ static void batadv_tt_orig_list_entry_free_rcu(struct rcu_head *rcu) struct batadv_tt_orig_list_entry *orig_entry;
orig_entry = container_of(rcu, struct batadv_tt_orig_list_entry, rcu);
- batadv_orig_node_free_ref(orig_entry->orig_node);
- if (atomic_dec_and_test(&orig_entry->orig_node->refcount))
batadv_orig_node_free_rcu(&orig_entry->orig_node->rcu);
- kfree(orig_entry);
}
Hi Linus,
I was just re-reading this patch: the code can be beautified later (I think it would be worth defining a new function, e.g. batadv_orig_node_free(orig_node), which can then be invoked both in batadv_orig_node_free_rcu() and here - in this way we avoid to export batadv_orig_node_free_rcu() which should remain private to the originator.c module and we avoid this forced fake rcu invocation), but I think that putting a comment here to explain why we don't invoke call_rcu is definitely needed. In the future somebody else may ask why we don't use it and will try to re-add it.
The rest looks good.
Thanks!
rcu_barrier() only waits for the currently scheduled rcu functions to finish - it won't wait for any function scheduled via another call_rcu() within an rcu scheduled function.
Unfortunately our batadv_tt_orig_list_entry_free_ref() does just that, via a batadv_orig_node_free_ref() call, leading to our rcu_barrier() call potentially missing such a batadv_orig_node_free_ref().
This patch fixes this issue by calling the batadv_orig_node_free_rcu() directly from the rcu callback, removing the unnecessary, additional call_rcu() layer here.
Signed-off-by: Linus Lüssing linus.luessing@web.de --- * v2: Added a code comment as discussed on IRC: To avoid forgetting about it, to avoid accidentally changing things back in the future.
originator.c | 2 +- originator.h | 1 + translation-table.c | 8 +++++++- 3 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/originator.c b/originator.c index 2f34525..1f01e93 100644 --- a/originator.c +++ b/originator.c @@ -117,7 +117,7 @@ out: return neigh_node; }
-static void batadv_orig_node_free_rcu(struct rcu_head *rcu) +void batadv_orig_node_free_rcu(struct rcu_head *rcu) { struct hlist_node *node_tmp; struct batadv_neigh_node *neigh_node, *tmp_neigh_node; diff --git a/originator.h b/originator.h index 7df48fa..4f9f88b 100644 --- a/originator.h +++ b/originator.h @@ -25,6 +25,7 @@ int batadv_originator_init(struct batadv_priv *bat_priv); void batadv_originator_free(struct batadv_priv *bat_priv); void batadv_purge_orig_ref(struct batadv_priv *bat_priv); +void batadv_orig_node_free_rcu(struct rcu_head *rcu); void batadv_orig_node_free_ref(struct batadv_orig_node *orig_node); struct batadv_orig_node *batadv_get_orig_node(struct batadv_priv *bat_priv, const uint8_t *addr); diff --git a/translation-table.c b/translation-table.c index 9322320..4fe07cf 100644 --- a/translation-table.c +++ b/translation-table.c @@ -144,7 +144,13 @@ static void batadv_tt_orig_list_entry_free_rcu(struct rcu_head *rcu) struct batadv_tt_orig_list_entry *orig_entry;
orig_entry = container_of(rcu, struct batadv_tt_orig_list_entry, rcu); - batadv_orig_node_free_ref(orig_entry->orig_node); + + /* We are in an rcu callback here, therefore we cannot use + * batadv_orig_node_free_ref() and its call_rcu(): + * An rcu_barrier() wouldn't wait for that to finish */ + if (atomic_dec_and_test(&orig_entry->orig_node->refcount)) + batadv_orig_node_free_rcu(&orig_entry->orig_node->rcu); + kfree(orig_entry); }
On Wed, Apr 03, 2013 at 03:25:13AM +0200, Linus Lüssing wrote:
rcu_barrier() only waits for the currently scheduled rcu functions to finish - it won't wait for any function scheduled via another call_rcu() within an rcu scheduled function.
Unfortunately our batadv_tt_orig_list_entry_free_ref() does just that, via a batadv_orig_node_free_ref() call, leading to our rcu_barrier() call potentially missing such a batadv_orig_node_free_ref().
This patch fixes this issue by calling the batadv_orig_node_free_rcu() directly from the rcu callback, removing the unnecessary, additional call_rcu() layer here.
Signed-off-by: Linus Lüssing linus.luessing@web.de
Acked-by: Antonio Quartulli ordex@autistici.org
Thanks a lot
On Wednesday, April 03, 2013 16:11:45 Antonio Quartulli wrote:
This patch fixes this issue by calling the batadv_orig_node_free_rcu() directly from the rcu callback, removing the unnecessary, additional call_rcu() layer here.
Signed-off-by: Linus Lüssing linus.luessing@web.de
Acked-by: Antonio Quartulli ordex@autistici.org
Applied in revision c8bda21.
Cheers, Marek
PS: I did some style adjustments to the patch for the sake of clarity.
b.a.t.m.a.n@lists.open-mesh.org