batman-adv checks in different situation if a new device is already on top of a different batman-adv device. This is done by getting the iflink of a device and all its parent. It assumes that this iflink is always a parent device in an acyclic graph. But this assumption is broken by devices like veth which are actually a pair of two devices linked to each other. The recursive check would therefore get veth0 when calling dev_get_iflink on veth1. And it gets veth0 when calling dev_get_iflink with veth1.
Creating a veth pair and loading batman-adv freezes parts of the system
ip link add veth0 type veth peer name veth1 modprobe batman-adv
An RCU stall will be detected on the system which cannot be fixed.
INFO: rcu_sched self-detected stall on CPU 1: (5264 ticks this GP) idle=3e9/140000000000001/0 softirq=144683/144686 fqs=5249 (t=5250 jiffies g=46 c=45 q=43) Task dump for CPU 1: insmod R running task 0 247 245 0x00000008 ffffffff8151f140 ffffffff8107888e ffff88000fd141c0 ffffffff8151f140 0000000000000000 ffffffff81552df0 ffffffff8107b420 0000000000000001 ffff88000e3fa700 ffffffff81540b00 ffffffff8107d667 0000000000000001 Call Trace: <IRQ> [<ffffffff8107888e>] ? rcu_dump_cpu_stacks+0x7e/0xd0 [<ffffffff8107b420>] ? rcu_check_callbacks+0x3f0/0x6b0 [<ffffffff8107d667>] ? hrtimer_run_queues+0x47/0x180 [<ffffffff8107cf9d>] ? update_process_times+0x2d/0x50 [<ffffffff810873fb>] ? tick_handle_periodic+0x1b/0x60 [<ffffffff810290ae>] ? smp_trace_apic_timer_interrupt+0x5e/0x90 [<ffffffff813bbae2>] ? apic_timer_interrupt+0x82/0x90 <EOI> [<ffffffff812c3fd7>] ? __dev_get_by_index+0x37/0x40 [<ffffffffa0031f3e>] ? batadv_hard_if_event+0xee/0x3a0 [batman_adv] [<ffffffff812c5801>] ? register_netdevice_notifier+0x81/0x1a0 [...]
Signed-off-by: Sven Eckelmann sven@narfation.org Cc: Jetmir Gigollai jetmir_gigollaj@web.de --- compat-include/linux/netdevice.h | 6 ------ net/batman-adv/hard-interface.c | 42 ---------------------------------------- 2 files changed, 48 deletions(-)
diff --git a/compat-include/linux/netdevice.h b/compat-include/linux/netdevice.h index f19f624..0e4fcc8 100644 --- a/compat-include/linux/netdevice.h +++ b/compat-include/linux/netdevice.h @@ -111,10 +111,4 @@ static inline int batadv_netdev_set_master(struct net_device *slave,
#endif /* < KERNEL_VERSION(3, 17, 0) */
-#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 1, 0) - -#define dev_get_iflink(_net_dev) ((_net_dev)->iflink) - -#endif /* < KERNEL_VERSION(3, 19, 0) */ - #endif /* _NET_BATMAN_ADV_COMPAT_LINUX_NETDEVICE_H_ */ diff --git a/net/batman-adv/hard-interface.c b/net/batman-adv/hard-interface.c index bef9147..0c2643d 100644 --- a/net/batman-adv/hard-interface.c +++ b/net/batman-adv/hard-interface.c @@ -75,44 +75,6 @@ out: return hard_iface; }
-/** - * batadv_is_on_batman_iface - check if a device is a batman iface descendant - * @net_dev: the device to check - * - * If the user creates any virtual device on top of a batman-adv interface, it - * is important to prevent this new interface to be used to create a new mesh - * network (this behaviour would lead to a batman-over-batman configuration). - * This function recursively checks all the fathers of the device passed as - * argument looking for a batman-adv soft interface. - * - * Return: true if the device is descendant of a batman-adv mesh interface (or - * if it is a batman-adv interface itself), false otherwise - */ -static bool batadv_is_on_batman_iface(const struct net_device *net_dev) -{ - struct net_device *parent_dev; - bool ret; - - /* check if this is a batman-adv mesh interface */ - if (batadv_softif_is_valid(net_dev)) - return true; - - /* no more parents..stop recursion */ - if (dev_get_iflink(net_dev) == 0 || - dev_get_iflink(net_dev) == net_dev->ifindex) - return false; - - /* recurse over the parent device */ - parent_dev = __dev_get_by_index(&init_net, dev_get_iflink(net_dev)); - /* if we got a NULL parent_dev there is something broken.. */ - if (WARN(!parent_dev, "Cannot find parent device")) - return false; - - ret = batadv_is_on_batman_iface(parent_dev); - - return ret; -} - static int batadv_is_valid_iface(const struct net_device *net_dev) { if (net_dev->flags & IFF_LOOPBACK) @@ -124,10 +86,6 @@ static int batadv_is_valid_iface(const struct net_device *net_dev) if (net_dev->addr_len != ETH_ALEN) return 0;
- /* no batman over batman */ - if (batadv_is_on_batman_iface(net_dev)) - return 0; - return 1; }
Hi Sven and thanks for this RFC. Sorry for taking soo long to comment.
On 14/11/15 03:46, Sven Eckelmann wrote:
batman-adv checks in different situation if a new device is already on top of a different batman-adv device. This is done by getting the iflink of a device and all its parent. It assumes that this iflink is always a parent device in an acyclic graph. But this assumption is broken by devices like veth which are actually a pair of two devices linked to each other. The recursive check would therefore get veth0 when calling dev_get_iflink on veth1. And it gets veth0 when calling dev_get_iflink with veth1.
I agree that this check implemented this way represents a problem. However I also believe that we should still have some kind of prevention against this particular scenario (chain of batman interfaces), because if ignored it could lead to troubles.
Unfortunately I don't know how to implement this check in an elegant and extendible manner.
First of all we should add a check that follows the master interface, but at the same time we should still follow the iflink, otherwise we can't check relationships like VLAN_DEV->REAL_DEV. But how to implement the latter without hitting the cyclic case is not clear to me ..
An option is to add a specific check for veth and break the recursion, but this is not really nice, because the next time a cyclic interface type will be introduced our check will become troublesome again.
Suggestions?
Cheers,
On Tue, Jan 19, 2016 at 08:39:07AM +0800, Antonio Quartulli wrote:
Hi Sven and thanks for this RFC. Sorry for taking soo long to comment.
On 14/11/15 03:46, Sven Eckelmann wrote:
batman-adv checks in different situation if a new device is already on top of a different batman-adv device. This is done by getting the iflink of a device and all its parent. It assumes that this iflink is always a parent device in an acyclic graph. But this assumption is broken by devices like veth which are actually a pair of two devices linked to each other. The recursive check would therefore get veth0 when calling dev_get_iflink on veth1. And it gets veth0 when calling dev_get_iflink with veth1.
I agree that this check implemented this way represents a problem. However I also believe that we should still have some kind of prevention against this particular scenario (chain of batman interfaces), because if ignored it could lead to troubles.
Unfortunately I don't know how to implement this check in an elegant and extendible manner.
First of all we should add a check that follows the master interface, but at the same time we should still follow the iflink, otherwise we can't check relationships like VLAN_DEV->REAL_DEV. But how to implement the latter without hitting the cyclic case is not clear to me ..
An option is to add a specific check for veth and break the recursion, but this is not really nice, because the next time a cyclic interface type will be introduced our check will become troublesome again.
Suggestions?
Hi Antonio
I've got a set of patches i went to send out as soon, once net-next reopens. They add support for network name spaces.
One of the issues i had is in this piece of code and with veth. The other end of the veth can be in a different name space. Hence you cannot look it up using the ifindex in the current name space. So i made the code just exit with an O.K. if the parent device cannot be found.
Something to think about when redesigning this code. The parent could be in a different network stack!
You probably can handle veth and both ends in the same namespace. It should not be too difficult to detect two interfaces which are the parent of each other. You just need to keep a bit more state when doing the recursion. It does however get more complex when there are more than two devices in a parent loop. But can that happen?
Andrew
On 19/01/16 09:22, Andrew Lunn wrote:
On Tue, Jan 19, 2016 at 08:39:07AM +0800, Antonio Quartulli wrote:
Hi Sven and thanks for this RFC. Sorry for taking soo long to comment.
On 14/11/15 03:46, Sven Eckelmann wrote:
batman-adv checks in different situation if a new device is already on top of a different batman-adv device. This is done by getting the iflink of a device and all its parent. It assumes that this iflink is always a parent device in an acyclic graph. But this assumption is broken by devices like veth which are actually a pair of two devices linked to each other. The recursive check would therefore get veth0 when calling dev_get_iflink on veth1. And it gets veth0 when calling dev_get_iflink with veth1.
I agree that this check implemented this way represents a problem. However I also believe that we should still have some kind of prevention against this particular scenario (chain of batman interfaces), because if ignored it could lead to troubles.
Unfortunately I don't know how to implement this check in an elegant and extendible manner.
First of all we should add a check that follows the master interface, but at the same time we should still follow the iflink, otherwise we can't check relationships like VLAN_DEV->REAL_DEV. But how to implement the latter without hitting the cyclic case is not clear to me ..
An option is to add a specific check for veth and break the recursion, but this is not really nice, because the next time a cyclic interface type will be introduced our check will become troublesome again.
Suggestions?
Hi Antonio
I've got a set of patches i went to send out as soon, once net-next reopens. They add support for network name spaces.
One of the issues i had is in this piece of code and with veth. The other end of the veth can be in a different name space. Hence you cannot look it up using the ifindex in the current name space. So i made the code just exit with an O.K. if the parent device cannot be found.
Is this really ok ? Isn't it possible to create "loops" by jumping from one namespace to the other ?
Something to think about when redesigning this code. The parent could be in a different network stack!
exactly my point above! :) So this means that an interface A might have A.iflink pointing to a device in another namespace ? Is this what you are saying ? will there be any way to "retrieve" the namespace of such device?
You probably can handle veth and both ends in the same namespace. It should not be too difficult to detect two interfaces which are the parent of each other. You just need to keep a bit more state when doing the recursion. It does however get more complex when there are more than two devices in a parent loop. But can that happen?
first of all I'd like to answer the question: which device type can create a loop? only veth? or there are other possibilities?
But having more than one interface in the parent loop should probably never happen..
This said, what's your plan about this code? How are you going to change it? :)
Thank you very much for your feedback Andrew!
Cheers,
One of the issues i had is in this piece of code and with veth. The other end of the veth can be in a different name space. Hence you cannot look it up using the ifindex in the current name space. So i made the code just exit with an O.K. if the parent device cannot be found.
Is this really ok ? Isn't it possible to create "loops" by jumping from one namespace to the other ?
Quite possible. I took the easy option:
diff --git a/net/batman-adv/hard-interface.c b/net/batman-adv/hard-interface.c index a5ce58a..a7a92ae 100644 --- a/net/batman-adv/hard-interface.c +++ b/net/batman-adv/hard-interface.c @@ -104,8 +104,12 @@ static bool batadv_is_on_batman_iface(const struct net_device *net_dev)
/* recurse over the parent device */ parent_dev = __dev_get_by_index(net, dev_get_iflink(net_dev)); - /* if we got a NULL parent_dev there is something broken.. */ - if (WARN(!parent_dev, "Cannot find parent device")) + /* if we got a NULL parent_dev, it might mean the parent is in + * a different network namespace. Things then get really + * complex, so lets just assume the user knows what she is + * doing. + */ + if (!parent_dev) return false;
ret = batadv_is_on_batman_iface(parent_dev);
I didn't spend the time to think all the possible loop situations though. It is not too bad in that the software interface cannot be moved into a different netns. Also, all the hard interfaces have to be in the same netns as the soft interface. If you move a hard interface into a different netns, it will automatically be removed from the soft interface using the notification mechanism.
There are some other restrictions which help avoid loops. You cannot in general move 'soft interfaces' between netns. You cannot move bridge interfaces, tunnel interfaces, team/bonding interfaces, etc.
exactly my point above! :) So this means that an interface A might have A.iflink pointing to a device in another namespace ?
Partly correct. The current code for getting the parent is:
parent_dev = __dev_get_by_index(net, dev_get_iflink(net_dev));
This is actually totally broken, when used with veth. It possible can return a completely different interface than the parent, since the ifindex returned for a veth could be in a different namespace. ifindex is only unique within a namespace, not across name spaces.
I need to look deeper into the code and see if there is a different parenting mechanism that can be used.
Andrew
b.a.t.m.a.n@lists.open-mesh.org