On Monday 07 December 2015 23:12:42 Linus Lüssing wrote:
On Sat, Nov 28, 2015 at 09:21:02AM +0100, Sven Eckelmann wrote:
mcast.mla_list is protected by tt.commit_lock (see batadv_mcast_mla_tt_add, batadv_mcast_mla_list_free and batadv_mcast_mla_tt_retract).
mcast.mla_list changes should be protected by the non-parallel code flow: During runtime, batadv_mcast_mla_tt_update() is only called from the self-rearming OGM scheduler thread - batadv_mcast_mla_tt_update() will never run more than once at the same time.
The second place for mcast.mla_list changes, batadv_mcast_free(), is called only on shutdown after the OGM scheduling thread was stopped.
The two functions with the lockdep assert are
* batadv_mcast_mla_list_free
* batadv_mcast_mla_tt_retract
(batadv_mcast_mla_tt_add looks basically like batadv_mcast_mla_list_free)
The call graphs are attached. These graphs have (pure) starting nodes other than batadv_exit and batadv_iv_ogm_schedule, and some of the resulting paths look like they are protected only by tt.commit_lock.
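To make clear what such an assert buys us, here is a rough userspace sketch. It is not batman-adv code: it assumes the asserts check tt.commit_lock as described above, it uses a pthread mutex plus a thread-local marker as a crude stand-in for lockdep_assert_held(), and all names in it (mesh_state, mla_add, mla_list_free, ...) are made up for illustration. Instead of warning, the helpers simply refuse to touch the list when the caller does not hold the lock.

```c
/* Userspace illustration only; all names are hypothetical, not batman-adv API. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

struct mla_entry {
	struct mla_entry *next;
	int addr;			/* placeholder for a multicast address */
};

struct mesh_state {
	pthread_mutex_t commit_lock;	/* stands in for tt.commit_lock */
	struct mla_entry *mla_list;	/* stands in for mcast.mla_list */
};

/* Which mesh_state (if any) the calling thread currently has locked. */
static _Thread_local struct mesh_state *held_by_me;

static void mesh_init(struct mesh_state *m)
{
	pthread_mutex_init(&m->commit_lock, NULL);
	m->mla_list = NULL;
}

static void commit_lock(struct mesh_state *m)
{
	pthread_mutex_lock(&m->commit_lock);
	held_by_me = m;
}

static void commit_unlock(struct mesh_state *m)
{
	held_by_me = NULL;
	pthread_mutex_unlock(&m->commit_lock);
}

/* Crude analogue of lockdep_assert_held(): does this thread hold the lock? */
static bool commit_lock_held(const struct mesh_state *m)
{
	return held_by_me == m;
}

/* Returns 0 on success, -1 if the lock is not held, -2 on allocation failure. */
static int mla_add(struct mesh_state *m, int addr)
{
	struct mla_entry *e;

	if (!commit_lock_held(m))
		return -1;	/* the kernel would emit a lockdep warning here */

	e = malloc(sizeof(*e));
	if (!e)
		return -2;

	e->addr = addr;
	e->next = m->mla_list;
	m->mla_list = e;
	return 0;
}

/* Returns the number of freed entries, or -1 if the lock is not held. */
static int mla_list_free(struct mesh_state *m)
{
	struct mla_entry *e;
	int freed = 0;

	if (!commit_lock_held(m))
		return -1;

	while (m->mla_list) {
		e = m->mla_list;
		m->mla_list = e->next;
		free(e);
		freed++;
	}
	return freed;
}
```

Any caller that reaches mla_add() or mla_list_free() without going through commit_lock() first is rejected, no matter which entry point of the call graph it came from. That is the property the asserts are meant to document and check.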
I don't think there should be such races regarding mcast.mla_list - was something like that observed in the wild which led to inserting the lockdep asserts?
We had multiple races and crashes, which resulted in these asserts and in the locks and checks around list_del. The corresponding list_add modifications are still missing, and other problems remain open [1].
Kind regards,
	Sven
[1] e.g. https://www.open-mesh.org/issues/223