On Tue, Sep 07, 2010 at 07:56:53PM +0200, Sven Eckelmann wrote:
Thanks for your comment. I removed the parts you don't refer to (makes it a lot easier to find the actual comment).
I guess I can always refer to the original to see the related code. ;-)
Paul E. McKenney wrote:
+#include <linux/if_arp.h>
+#define MIN(x, y) ((x) < (y) ? (x) : (y))
+struct batman_if *get_batman_if_by_netdev(struct net_device *net_dev)
+{
+	struct batman_if *batman_if;
+	rcu_read_lock();
+	list_for_each_entry_rcu(batman_if, &if_list, list) {
+		if (batman_if->net_dev == net_dev)
+			goto out;
+	}
+	batman_if = NULL;
+out:
+	rcu_read_unlock();
Here we are leaking an RCU-protected pointer outside of the RCU read-side critical section. Why is this safe?
First thing: There is another RCU-related problem involving a call_rcu() and a missing explicit synchronize_rcu() (that is, one not done implicitly by another function) before the shutdown. This was fixed right after this patch was sent for review... bad timing, but OK.
Fair enough!
Here is the sequence of events that I am concerned about:
CPU 0 executes the code above, obtains a pointer, and is about ready to return.
CPU 1 executes hardif_remove_interface(), and calls hardif_disable_interface(), which calls hardif_deactivate_interface(), which sets ->if_status to IF_INACTIVE. Then hardif_disable_interface() sets ->if_status to IF_NOT_IN_USE. Then hardif_remove_interface() frees the interface via call_rcu().
Of course, call_rcu() waits for an RCU grace period to elapse, but we are no longer in an RCU read-side critical section, so there is nothing stopping the grace period from completing before we are done with the batman_if pointer.
Or am I missing some other interlock that prevents hardif_remove_interface() from freeing this structure?
I have similar concerns with your other RCU read-side critical sections.
Looks to me like a valid point. I have to think a little bit about how to solve it correctly. Feel free to add more comments about other RCU cruelties in it.
One approach would be to extend the RCU read-side critical section to cover all uses of the RCU-protected pointer. Another approach would be to take a reference count (or something similar) before the pointer leaves the RCU read-side critical section.
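Both approaches can be sketched in user-space C. This is only an illustration under stated assumptions, not the batman-adv code: rcu_read_lock() and rcu_read_unlock() are stubbed out, the list is a plain singly linked list rather than a list_head, and the refcount field plus the helpers batman_if_tryget(), batman_if_put() and batman_if_is_registered() are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Stubs standing in for the kernel's RCU read-side primitives. */
static int rcu_depth;
static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { rcu_depth--; }

struct net_device { int ifindex; };

struct batman_if {
	struct batman_if *next;      /* simplified list, not a list_head */
	struct net_device *net_dev;
	atomic_int refcount;         /* hypothetical field for this sketch */
};

static struct batman_if *if_list;

/* Approach 1: every use of the pointer stays inside the critical section. */
static bool batman_if_is_registered(struct net_device *net_dev)
{
	struct batman_if *b;
	bool found = false;

	rcu_read_lock();
	for (b = if_list; b; b = b->next) {
		if (b->net_dev == net_dev) {
			found = true;  /* only the answer leaves, not the pointer */
			break;
		}
	}
	rcu_read_unlock();
	return found;
}

/* Take a reference unless the count has already dropped to zero. */
static bool batman_if_tryget(struct batman_if *b)
{
	int old = atomic_load(&b->refcount);

	while (old > 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&b->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if this was the last one. */
static bool batman_if_put(struct batman_if *b)
{
	int old = atomic_fetch_sub(&b->refcount, 1);

	assert(old > 0);
	return old == 1;
}

/* Approach 2: pin the object before the pointer leaves the section. */
static struct batman_if *get_batman_if_by_netdev(struct net_device *net_dev)
{
	struct batman_if *b, *found = NULL;

	rcu_read_lock();
	for (b = if_list; b; b = b->next) {
		if (b->net_dev == net_dev && batman_if_tryget(b)) {
			found = b;
			break;
		}
	}
	rcu_read_unlock();
	return found;  /* caller must call batman_if_put() when done */
}
```

In this sketch the caller of get_batman_if_by_netdev() owns a reference and has to drop it with batman_if_put(). In the kernel, the removal path would drop the list's own reference, and whoever drops the last one would presumably hand the structure to call_rcu() for freeing.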
Could you please take a look at Documentation/RCU/checklist.txt?
Because I am not familiar with the BATMAN device, it is all too easy for me to miss subtleties in the code.
Thanx, Paul