On Sat, Dec 01, 2012 at 09:01:53PM +0100, Simon Wunderlich wrote:
On Sat, Dec 01, 2012 at 11:44:34AM -0800, Eric Dumazet wrote:
On Sat, 2012-12-01 at 20:04 +0100, Sven Eckelmann wrote:
On Saturday 01 December 2012 10:36:12 Eric Dumazet wrote:
diff --git a/net/bridge/br_sysfs_br.c b/net/bridge/br_sysfs_br.c index c5c0593..c122782 100644 --- a/net/bridge/br_sysfs_br.c +++ b/net/bridge/br_sysfs_br.c @@ -142,7 +142,7 @@ static ssize_t store_stp_state(struct device *d,
if (!rtnl_trylock())
return restart_syscall();
br_stp_set_enabled(br, val);
- rtnl_unlock();
__rtnl_unlock();
return len;
}
I have no idea of why you believe there is a problem here.
Could you explain how net_todo_list could be not empty ?
As long as no device is unregistered between rtnl_trylock()/rtnl_unlock(), there is no possible deadlock.
I am not sure what "here" means for your. At least batman-adv tries to unregister a device -> problem [1]. I will not make any judgements about the other uses in the kernel/other parts patched by Simon.
I was reacting to the change in net/bridge/br_sysfs_br.c
rtnl_trylock() could set a boolean flag to explicitly WARN_ON() in case we try to unregister a device.
Well, I'm not sure if this can happen in the bridge code, but from looking at the code it doesn't appear to be impossible. It would be better to be sure that it can't deadlock IMHO.
(Although doing unlock/lock/unlock in an unlock function is also a little "uncommon").
But still we have the problem in batman-adv (as Sven pointed out in a previous email) that tries to unregister a device in that "critical window".
Exporting __rtnl_unlock() would solve the issue in this case.
If you think the bridge code should not end up in such situation, what if Simon resends the patch with only the __rtnl_unlock() exportation and the change in batman-adv?
Cheers,