Whenever a new VLAN is created on top of batman virtual interfaces the batman-adv kernel module creates internal structures to track the status of said VLAN. Amongst other things, the MAC address of the VLAN interface itself has to be stored.
Without this change a VLAN and its infrastructure could be created while the interface MAC address is not stored without triggering any error, thus creating issues in other parts of the code.
Prevent the VLAN from being created if the MAC address can not be stored.
Signed-off-by: Marek Lindner mareklindner@neomailbox.ch --- net/batman-adv/soft-interface.c | 24 ++++++++++++++++++------ 1 file changed, 18 insertions(+), 6 deletions(-)
diff --git a/net/batman-adv/soft-interface.c b/net/batman-adv/soft-interface.c index edeffcb9..f7c3a313 100644 --- a/net/batman-adv/soft-interface.c +++ b/net/batman-adv/soft-interface.c @@ -572,6 +572,7 @@ struct batadv_softif_vlan *batadv_softif_vlan_get(struct batadv_priv *bat_priv, int batadv_softif_create_vlan(struct batadv_priv *bat_priv, unsigned short vid) { struct batadv_softif_vlan *vlan; + bool client_added; int err;
vlan = batadv_softif_vlan_get(bat_priv, vid); @@ -604,13 +605,18 @@ int batadv_softif_create_vlan(struct batadv_priv *bat_priv, unsigned short vid) /* add a new TT local entry. This one will be marked with the NOPURGE * flag */ - batadv_tt_local_add(bat_priv->soft_iface, - bat_priv->soft_iface->dev_addr, vid, - BATADV_NULL_IFINDEX, BATADV_NO_MARK); + client_added = batadv_tt_local_add(bat_priv->soft_iface, + bat_priv->soft_iface->dev_addr, vid, + BATADV_NULL_IFINDEX, BATADV_NO_MARK);
/* don't return reference to new softif_vlan */ batadv_softif_vlan_put(vlan);
+ if (!client_added) { + batadv_softif_destroy_vlan(bat_priv, vlan); + return -ENOENT; + } + return 0; }
@@ -648,6 +654,7 @@ static int batadv_interface_add_vid(struct net_device *dev, __be16 proto, { struct batadv_priv *bat_priv = netdev_priv(dev); struct batadv_softif_vlan *vlan; + bool client_added; int ret;
/* only 802.1Q vlans are supported. @@ -683,9 +690,14 @@ static int batadv_interface_add_vid(struct net_device *dev, __be16 proto, * flag. This must be added again, even if the vlan object already * exists, because the entry was deleted by kill_vid() */ - batadv_tt_local_add(bat_priv->soft_iface, - bat_priv->soft_iface->dev_addr, vid, - BATADV_NULL_IFINDEX, BATADV_NO_MARK); + client_added = batadv_tt_local_add(bat_priv->soft_iface, + bat_priv->soft_iface->dev_addr, vid, + BATADV_NULL_IFINDEX, BATADV_NO_MARK); + + if (!client_added) { + batadv_softif_destroy_vlan(bat_priv, vlan); + return -ENOENT; + }
return 0; }
On Donnerstag, 10. Mai 2018 17:27:52 CEST Marek Lindner wrote:
Whenever a new VLAN is created on top of batman virtual interfaces the batman-adv kernel module creates internal structures to track the status of said VLAN. Amongst other things, the MAC address of the VLAN interface itself has to be stored.
Without this change a VLAN and its infrastructure could be created while the interface MAC address is not stored without triggering any error, thus creating issues in other parts of the code.
Prevent the VLAN from being created if the MAC address can not be stored.
Signed-off-by: Marek Lindner mareklindner@neomailbox.ch
Fixes: 952cebb57518 ("batman-adv: add per VLAN interface attribute framework")
??
[....]
@@ -604,13 +605,18 @@ int batadv_softif_create_vlan(struct batadv_priv
*bat_priv, unsigned short vid)
/* add a new TT local entry. This one will be marked with the NOPURGE * flag */
- batadv_tt_local_add(bat_priv->soft_iface,
bat_priv->soft_iface->dev_addr, vid,
BATADV_NULL_IFINDEX, BATADV_NO_MARK);
client_added = batadv_tt_local_add(bat_priv->soft_iface,
bat_priv->soft_iface->dev_addr, vid,
BATADV_NULL_IFINDEX, BATADV_NO_MARK);
/* don't return reference to new softif_vlan */ batadv_softif_vlan_put(vlan);
if (!client_added) {
batadv_softif_destroy_vlan(bat_priv, vlan);
return -ENOENT;
}
This function (batadv_softif_destroy_vlan) is static and cannot be reached from here. The definition+declaration of this function follows the function were you've added these changes.
[...]
@@ -683,9 +690,14 @@ static int batadv_interface_add_vid(struct net_device
*dev, __be16 proto,
* flag. This must be added again, even if the vlan object already * exists, because the entry was deleted by kill_vid() */
- batadv_tt_local_add(bat_priv->soft_iface,
bat_priv->soft_iface->dev_addr, vid,
BATADV_NULL_IFINDEX, BATADV_NO_MARK);
- client_added = batadv_tt_local_add(bat_priv->soft_iface,
bat_priv->soft_iface->dev_addr, vid,
BATADV_NULL_IFINDEX, BATADV_NO_MARK);
- if (!client_added) {
batadv_softif_destroy_vlan(bat_priv, vlan);
return -ENOENT;
- }
This looks quite fragile to me. Wouldn't that mean that the vlan would also not be created later? When I remember it correctly, then batadv_interface_add_vid is the only one which would create the internal vlan entries in the first place (beside this odd call in batadv_hard_if_event). The VLAN functionality would therefore be completely broken when such a problem happens.
1. batadv_interface_add_vid is called 2. vlan is added 3. local entry is not added for this vlan 3.1. vlan is removed again 4. node tries to communicate over this vlan
Btw. it is really odd to see code shared between batadv_interface_add_vid and batadv_softif_create_vlan. Especially because batadv_interface_add_vid calls batadv_softif_create_vlan.
Kind regards, Sven
On 11/05/18 00:11, Sven Eckelmann wrote: [...]
@@ -683,9 +690,14 @@ static int batadv_interface_add_vid(struct net_device
*dev, __be16 proto,
* flag. This must be added again, even if the vlan object already * exists, because the entry was deleted by kill_vid() */
- batadv_tt_local_add(bat_priv->soft_iface,
bat_priv->soft_iface->dev_addr, vid,
BATADV_NULL_IFINDEX, BATADV_NO_MARK);
- client_added = batadv_tt_local_add(bat_priv->soft_iface,
bat_priv->soft_iface->dev_addr, vid,
BATADV_NULL_IFINDEX, BATADV_NO_MARK);
- if (!client_added) {
batadv_softif_destroy_vlan(bat_priv, vlan);
return -ENOENT;
- }
This looks quite fragile to me. Wouldn't that mean that the vlan would also not be created later? When I remember it correctly, then batadv_interface_add_vid is the only one which would create the internal vlan entries in the first place (beside this odd call in batadv_hard_if_event). The VLAN functionality would therefore be completely broken when such a problem happens.
- batadv_interface_add_vid is called
- vlan is added
- local entry is not added for this vlan
3.1. vlan is removed again 4. node tries to communicate over this vlan
A failure of this function will cause a failure of the ndo_vlan_rx_add_vid() call made by the kernel/netlink API which in turns will reach the userspace.
Ideally any userspace application that attempted to create the VLAN should be in charge of re-attempting the creation again (I believe).
About your point 4, this should not be possible, because in case of error the bat0.X interface won't be created by the kernel at all. no?
Btw. it is really odd to see code shared between batadv_interface_add_vid and batadv_softif_create_vlan. Especially because batadv_interface_add_vid calls batadv_softif_create_vlan.
This can probably be re-arranged a bit in the master branch.
Cheers,
On Donnerstag, 10. Mai 2018 17:27:52 CEST Marek Lindner wrote:
Amongst other things, the MAC address of the VLAN interface itself has to be stored.
Without this change a VLAN and its infrastructure could be created while the interface MAC address is not stored without triggering any error, thus creating issues in other parts of the code.
Actually, the call to batadv_tt_local_add is using the address from the softif and not from the vlan interface. Just some remark because it let me think about the way the "-1" no-VLAN is added.
I would really like to know what exactly failed here. Does Leonardo's system really try to add the VLAN 0 before it is storing the info for the no-vlan (-1/BATADV_NO_FLAGS) and the mac address is "invalid" for VLAN 0? Or maybe both failed and he is just lucky that he got some non-permanent entries for this device on the non-vlan "vlan" - I am not sure about that.
On Donnerstag, 10. Mai 2018 18:24:50 CEST Antonio Quartulli wrote:
A failure of this function will cause a failure of the ndo_vlan_rx_add_vid() call made by the kernel/netlink API which in turns will reach the userspace.
Ideally any userspace application that attempted to create the VLAN should be in charge of re-attempting the creation again (I believe).
About your point 4, this should not be possible, because in case of error the bat0.X interface won't be created by the kernel at all. no?
Yes, this seems to be true for batadv_interface_add_vid. Still an open question for the batadv_hard_if_event (which cannot return an error to the caller). Would be interesting to know whether Leonardo's node fails for both calls or only for one (see above).
Kind regards, Sven
b.a.t.m.a.n@lists.open-mesh.org