Linus Lüssing wrote:
Hi Sven,
synchronize_net already contains a synchronize_rcu at its end, so the synchronize_rcu in the batman code there has always been redundant.
I've removed the synchronize_rcu instead of the synchronize_net to be on the safe side. I guess usually no more packets should arrive anyway as the batman packet type is not registered anymore. But I wasn't sure if the might_sleep() of synchronize_net() might be needed for something, so I didn't dare to remove synchronize_net.
If someone says it'd be ok to remove synchronize_net() instead, I could make a new patch, no problem.
Ok, it would have been nice to state such things in the commit message (otherwise stable@kernel.org will drop such a patch quite easily). Marek and I have worked out why it happens only in 1765 and 1766, so it will not be a patch for stable.
And the might_sleep is only for debugging purposes. But yes, it makes sense to use synchronize_net here (for example because dev_remove_pack is called before it).
That means the patch seems technically ok, but I didn't like the explanation, because we might have to justify the patch to the stable@kernel.org guys on that basis.
So I would ack the patch with a minor change in the commit message. Instead of
During the module shutdown procedure in batman_exit(), a rcu callback is being scheduled (batman_exit -> hardif_remove_interfaces -> hardif_remove_interface -> call_rcu). However, when the kernel unloads the module, the rcu callback might not have been executed yet, resulting in an "unable to handle kernel paging request" in __rcu_process_callback afterwards, causing the kernel to freeze. Therefore, we should always flush all rcu callback functions scheduled during the shutdown procedure.
something like
During the module shutdown procedure in batman_exit(), a rcu callback is being scheduled (batman_exit -> hardif_remove_interfaces -> hardif_remove_interface -> call_rcu). However, when the kernel unloads the module, the rcu callback might not have been executed yet, resulting in an "unable to handle kernel paging request" in __rcu_process_callback afterwards, causing the kernel to freeze.
The synchronize_net and synchronize_rcu in mesh_free are currently called before the call_rcu in hardif_remove_interface and have no real effect on it.
Therefore, we should always flush all rcu callback functions scheduled during the shutdown procedure using synchronize_net. The call to synchronize_rcu can be omitted because synchronize_net already calls it.
thanks, Sven