I would prefer that you call cancel_work_sync when the metric work should be stopped. I was expecting to see this somewhere around batadv_v_elp_iface_disable, after the cancel_work_sync, but it seems to be missing there (or in a similar place).
I tried this:

diff --git a/net/batman-adv/bat_v_elp.c b/net/batman-adv/bat_v_elp.c
index 1d704574..b35ded79 100644
--- a/net/batman-adv/bat_v_elp.c
+++ b/net/batman-adv/bat_v_elp.c
@@ -387,8 +387,20 @@ int batadv_v_elp_iface_enable(struct batadv_hard_iface *hard_iface)
  */
 void batadv_v_elp_iface_disable(struct batadv_hard_iface *hard_iface)
 {
+	struct batadv_hardif_neigh_node *hardif_neigh;
+
 	cancel_delayed_work_sync(&hard_iface->bat_v.elp_wq);
 
+	rcu_read_lock();
+	hlist_for_each_entry_rcu(hardif_neigh,
+				 &hard_iface->neigh_list, list) {
+		if (!kref_get_unless_zero(&hardif_neigh->refcount))
+			continue;
+		cancel_work_sync(&hardif_neigh->bat_v.metric_work);
+		batadv_hardif_neigh_put(hardif_neigh);
+	}
+	rcu_read_unlock();
+
 	dev_kfree_skb(hard_iface->bat_v.elp_skb);
 	hard_iface->bat_v.elp_skb = NULL;
 }

But it seems to cause a hang on reboot every once in a while. When the hang happens, I'm not able to trigger sysrq over serial.
I can try kgdb, but that requires sysrq to work, so I'm not sure how I can gain control after the machine becomes unresponsive.
I'm not sure why there isn't a watchdog bite when this happens.
Andy