On Thursday 08 May 2014 17:13:15 Antonio Quartulli wrote:
From: Antonio Quartulli <antonio@open-mesh.com>
When a VLAN interface (on top of batX) is removed and re-added within a short timeframe, TT does not have enough time to clean up properly. This creates an internal TT state mismatch, because the newly created softif_vlan is initialized from scratch with a TT client count of zero even if TT entries for this VLAN still exist. The resulting TT messages are bogus due to the mismatch between the counter and the TT client listing, which creates inconsistencies on every node in the network.
To fix this issue, destroy_vlan() must not free the VLAN object immediately; the object has to be kept alive until all the TT entries for this VLAN have been removed. destroy_vlan() still removes the sysfs folder, so from the user's point of view the removal completes as usual.
If the same VLAN is re-added before the old object is freed, the old object is resurrected and re-used.
Implement this behaviour by increasing the reference counter of a softif_vlan object every time a new local TT entry for that VLAN is created, and by removing the object from the list only when all of its TT entries have been destroyed.
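The lifetime rule described above can be modelled in a few lines of userspace C. This is only a rough sketch under stated assumptions, not the code from the patch: struct vlan, vlan_create(), vlan_destroy(), tt_local_add() and tt_local_remove() are hypothetical names, the sysfs folder is reduced to a boolean flag, and all locking is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model, NOT the batman-adv structures. */
struct vlan {
	unsigned short vid;
	int refcount;        /* one ref for the interface, one per local TT entry */
	bool sysfs_present;  /* stands in for the sysfs folder */
	struct vlan *next;
};

static struct vlan *vlan_list;

static struct vlan *vlan_find(unsigned short vid)
{
	for (struct vlan *v = vlan_list; v; v = v->next)
		if (v->vid == vid)
			return v;
	return NULL;
}

/* Drop one reference; unlink and free the object on the last one. */
static void vlan_put(struct vlan *v)
{
	assert(v->refcount > 0);
	if (--v->refcount > 0)
		return;
	for (struct vlan **p = &vlan_list; *p; p = &(*p)->next) {
		if (*p == v) {
			*p = v->next;
			break;
		}
	}
	free(v);
}

/* Create the VLAN, or resurrect an object that TT entries still keep alive. */
static struct vlan *vlan_create(unsigned short vid)
{
	struct vlan *v = vlan_find(vid);

	if (v) {
		if (v->sysfs_present)
			return NULL;         /* VLAN is already active */
		v->refcount++;               /* the interface holds a ref again */
		v->sysfs_present = true;
		return v;
	}
	v = calloc(1, sizeof(*v));
	if (!v)
		return NULL;
	v->vid = vid;
	v->refcount = 1;
	v->sysfs_present = true;
	v->next = vlan_list;
	vlan_list = v;
	return v;
}

/* The user-visible part goes away at once; the object may live on. */
static void vlan_destroy(struct vlan *v)
{
	v->sysfs_present = false;
	vlan_put(v);
}

/* Each local TT entry pins the VLAN object it belongs to. */
static void tt_local_add(struct vlan *v)    { v->refcount++; }
static void tt_local_remove(struct vlan *v) { vlan_put(v); }
```

In this model, destroying a VLAN that still has one TT entry leaves the object on the list with a reference count of one, and a following vlan_create() for the same VID returns that same object instead of a fresh one with a zeroed state.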
Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Changes since v4:
- improved code in add_vid()
- don't leak TT entries in case of vlan re-add failure
Changes since v3:
- always re-add NO_PURGE local entry on add_vid()
Changes since v2:
- remove cleanup_work member that is not needed anymore in this approach
- reword commit subject
- reword commit message (Thanks Marek!)
Changes since v1:
- destroy and create vlan sysfs folder within softif_vlan_destroy/create()
  to avoid lock troubles with soft-interface destruction and delayed jobs.
 soft-interface.c    | 60 ++++++++++++++++++++++++++++++++++++++++-------------
 translation-table.c | 26 +++++++++++++++++++++++
 types.h             |  2 ++
 3 files changed, 74 insertions(+), 14 deletions(-)
Applied in revision 9729d20.
Thanks, Marek