The first two patches are actual fixes.
The first one tries to avoid sending uninitialized data that could be interpreted as invalid TT change events in both TT request and OGM. Following invalid entries could be seen when that happen with batctl o:
* 00:00:00:00:00:00 -1 [....] ( 0) 88:12:4e:ad:7e:ba (179) (0x45845380) * 00:00:00:00:78:79 4092 [.W..] ( 0) 88:12:4e:ad:7e:3c (145) (0x8ebadb8b)
The second one fixes an issue that happened when a TT change event list is too big for the MTU, the list was never actually sent nor free and continued to grow indefinitely from this point. That also caused the OGM TTVN to increase at each OGM interval without any changes being ever visible to other nodes. This ever growing TT change event list could be observed by looking at /sys/kernel/slab/batadv_tt_change_cache/objects that sometimes showed unusal high value even after issuing a memcache shrink.
The next two patches are potential slight improvements. While patch 3 is mainly cosmetic (having negative tt.local_changes value is not exactly an issue), patch 4 is here to keep the TT changes list as short as possible.
Remi Pommarel (4): batman-adv: Do not send uninitialized TT changes batman-adv: Do not let TT changes list grows indefinitely batman-adv: Remove atomic usage for tt.local_changes batman-adv: Don't keep redundant TT change events
net/batman-adv/soft-interface.c | 2 +- net/batman-adv/translation-table.c | 48 +++++++++++++++++++----------- net/batman-adv/types.h | 4 +-- 3 files changed, 34 insertions(+), 20 deletions(-)
The number of TT changes can be less than initially expected in batadv_tt_tvlv_container_update() (changes can be removed by batadv_tt_local_event() in ADD+DEL sequence between reading tt_diff_entries_num and actually iterating the change list under lock).
Thus tt_diff_len could be bigger than the actual changes size that need to be sent. Because batadv_send_my_tt_response sends the whole packet, uninitialized data can be interpreted as TT changes on other nodes leading to weird TT global entries on those nodes such as:
* 00:00:00:00:00:00 -1 [....] ( 0) 88:12:4e:ad:7e:ba (179) (0x45845380) * 00:00:00:00:78:79 4092 [.W..] ( 0) 88:12:4e:ad:7e:3c (145) (0x8ebadb8b)
All of the above also applies to OGM tvlv container buffer's tvlv_len.
Remove the extra allocated space to avoid sending uninitialized TT changes in batadv_send_my_tt_response() and batadv_v_ogm_send_softif().
Fixes: e1bf0c14096f ("batman-adv: tvlv - convert tt data sent within OGMs") Signed-off-by: Remi Pommarel repk@triplefau.lt --- net/batman-adv/translation-table.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/net/batman-adv/translation-table.c b/net/batman-adv/translation-table.c index 2243cec18ecc..f0590f9bc2b1 100644 --- a/net/batman-adv/translation-table.c +++ b/net/batman-adv/translation-table.c @@ -990,6 +990,7 @@ static void batadv_tt_tvlv_container_update(struct batadv_priv *bat_priv) int tt_diff_len, tt_change_len = 0; int tt_diff_entries_num = 0; int tt_diff_entries_count = 0; + size_t tt_extra_len = 0; u16 tvlv_len;
tt_diff_entries_num = atomic_read(&bat_priv->tt.local_changes); @@ -1027,6 +1028,9 @@ static void batadv_tt_tvlv_container_update(struct batadv_priv *bat_priv) } spin_unlock_bh(&bat_priv->tt.changes_list_lock);
+ tt_extra_len = batadv_tt_len(tt_diff_entries_num - + tt_diff_entries_count); + /* Keep the buffer for possible tt_request */ spin_lock_bh(&bat_priv->tt.last_changeset_lock); kfree(bat_priv->tt.last_changeset); @@ -1035,6 +1039,7 @@ static void batadv_tt_tvlv_container_update(struct batadv_priv *bat_priv) tt_change_len = batadv_tt_len(tt_diff_entries_count); /* check whether this new OGM has no changes due to size problems */ if (tt_diff_entries_count > 0) { + tt_diff_len -= tt_extra_len; /* if kmalloc() fails we will reply with the full table * instead of providing the diff */ @@ -1047,6 +1052,8 @@ static void batadv_tt_tvlv_container_update(struct batadv_priv *bat_priv) } spin_unlock_bh(&bat_priv->tt.last_changeset_lock);
+ /* Remove extra packet space for OGM */ + tvlv_len -= tt_extra_len; container_register: batadv_tvlv_container_register(bat_priv, BATADV_TVLV_TT, 1, tt_data, tvlv_len);
When TT changes list is too big to fit in packet due to MTU size, an empty OGM is sent expected other node to send TT request to get the changes. The issue is that tt.last_changeset was not built thus the originator was responding with previous changes to those TT requests (see batadv_send_my_tt_response). Also the changes list was never cleaned up effectively never ending growing from this point onwards, repeatedly sending the same TT response changes over and over, and a creatind a new empty OGM every OGM interval expecting for the local changes to be purged.
When there is more TT changes that can fit in packet, drop all changes, send emtpy OGM and wait for TT request so we can respond with a full table instead.
Fixes: e1bf0c14096f ("batman-adv: tvlv - convert tt data sent within OGMs") Signed-off-by: Remi Pommarel repk@triplefau.lt --- net/batman-adv/translation-table.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/net/batman-adv/translation-table.c b/net/batman-adv/translation-table.c index f0590f9bc2b1..50af82e7b50a 100644 --- a/net/batman-adv/translation-table.c +++ b/net/batman-adv/translation-table.c @@ -990,6 +990,7 @@ static void batadv_tt_tvlv_container_update(struct batadv_priv *bat_priv) int tt_diff_len, tt_change_len = 0; int tt_diff_entries_num = 0; int tt_diff_entries_count = 0; + bool drop_changes = false; size_t tt_extra_len = 0; u16 tvlv_len;
@@ -997,21 +998,29 @@ static void batadv_tt_tvlv_container_update(struct batadv_priv *bat_priv) tt_diff_len = batadv_tt_len(tt_diff_entries_num);
/* if we have too many changes for one packet don't send any - * and wait for the tt table request which will be fragmented + * and wait for the tt table request so we can reply with the full + * table. + * + * The local change history should still be cleaned up or it will only + * grow from this point onwards. Also tt.last_changeset should be set + * to NULL so tt response could send the full table instead of diff. */ - if (tt_diff_len > bat_priv->soft_iface->mtu) + if (tt_diff_len > bat_priv->soft_iface->mtu) { tt_diff_len = 0; + tt_diff_entries_num = 0; + drop_changes = true; + }
tvlv_len = batadv_tt_prepare_tvlv_local_data(bat_priv, &tt_data, &tt_change, &tt_diff_len); if (!tvlv_len) return;
- tt_data->flags = BATADV_TT_OGM_DIFF; - - if (tt_diff_len == 0) + if (!drop_changes && tt_diff_len == 0) goto container_register;
+ tt_data->flags = BATADV_TT_OGM_DIFF; + spin_lock_bh(&bat_priv->tt.changes_list_lock); atomic_set(&bat_priv->tt.local_changes, 0);
The tt.local_changes atomic is either written with tt.changes_list_lock or close to it (see batadv_tt_local_event()). Thus the performance gain using an atomic was limited (or because of atomic_read() impact even negative). Using atomic also comes with the need to be wary of potential negative tt.local_changes value.
Simplify the tt.local_changes usage by removing the atomic property and modifying it only with tt.changes_list_lock held.
Signed-off-by: Remi Pommarel repk@triplefau.lt --- net/batman-adv/soft-interface.c | 2 +- net/batman-adv/translation-table.c | 17 ++++++----------- net/batman-adv/types.h | 4 ++-- 3 files changed, 9 insertions(+), 14 deletions(-)
diff --git a/net/batman-adv/soft-interface.c b/net/batman-adv/soft-interface.c index 2758aba47a2f..2575f13992d2 100644 --- a/net/batman-adv/soft-interface.c +++ b/net/batman-adv/soft-interface.c @@ -783,13 +783,13 @@ static int batadv_softif_init_late(struct net_device *dev) atomic_set(&bat_priv->mesh_state, BATADV_MESH_INACTIVE); atomic_set(&bat_priv->bcast_seqno, 1); atomic_set(&bat_priv->tt.vn, 0); - atomic_set(&bat_priv->tt.local_changes, 0); atomic_set(&bat_priv->tt.ogm_append_cnt, 0); #ifdef CONFIG_BATMAN_ADV_BLA atomic_set(&bat_priv->bla.num_requests, 0); #endif atomic_set(&bat_priv->tp_num, 0);
+ bat_priv->tt.local_changes = 0; bat_priv->tt.last_changeset = NULL; bat_priv->tt.last_changeset_len = 0; bat_priv->isolation_mark = 0; diff --git a/net/batman-adv/translation-table.c b/net/batman-adv/translation-table.c index 50af82e7b50a..a22029511eb2 100644 --- a/net/batman-adv/translation-table.c +++ b/net/batman-adv/translation-table.c @@ -463,7 +463,6 @@ static void batadv_tt_local_event(struct batadv_priv *bat_priv, struct batadv_tt_change_node *tt_change_node, *entry, *safe; struct batadv_tt_common_entry *common = &tt_local_entry->common; u8 flags = common->flags | event_flags; - bool event_removed = false; bool del_op_requested, del_op_entry;
tt_change_node = kmem_cache_alloc(batadv_tt_change_cache, GFP_ATOMIC); @@ -508,21 +507,17 @@ static void batadv_tt_local_event(struct batadv_priv *bat_priv, del: list_del(&entry->list); kmem_cache_free(batadv_tt_change_cache, entry); + bat_priv->tt.local_changes--; kmem_cache_free(batadv_tt_change_cache, tt_change_node); - event_removed = true; goto unlock; }
/* track the change in the OGMinterval list */ list_add_tail(&tt_change_node->list, &bat_priv->tt.changes_list); + bat_priv->tt.local_changes++;
unlock: spin_unlock_bh(&bat_priv->tt.changes_list_lock); - - if (event_removed) - atomic_dec(&bat_priv->tt.local_changes); - else - atomic_inc(&bat_priv->tt.local_changes); }
/** @@ -994,7 +989,7 @@ static void batadv_tt_tvlv_container_update(struct batadv_priv *bat_priv) size_t tt_extra_len = 0; u16 tvlv_len;
- tt_diff_entries_num = atomic_read(&bat_priv->tt.local_changes); + tt_diff_entries_num = READ_ONCE(bat_priv->tt.local_changes); tt_diff_len = batadv_tt_len(tt_diff_entries_num);
/* if we have too many changes for one packet don't send any @@ -1022,7 +1017,7 @@ static void batadv_tt_tvlv_container_update(struct batadv_priv *bat_priv) tt_data->flags = BATADV_TT_OGM_DIFF;
spin_lock_bh(&bat_priv->tt.changes_list_lock); - atomic_set(&bat_priv->tt.local_changes, 0); + bat_priv->tt.local_changes = 0;
list_for_each_entry_safe(entry, safe, &bat_priv->tt.changes_list, list) { @@ -1438,7 +1433,7 @@ static void batadv_tt_changes_list_free(struct batadv_priv *bat_priv) kmem_cache_free(batadv_tt_change_cache, entry); }
- atomic_set(&bat_priv->tt.local_changes, 0); + bat_priv->tt.local_changes = 0; spin_unlock_bh(&bat_priv->tt.changes_list_lock); }
@@ -3701,7 +3696,7 @@ static void batadv_tt_local_commit_changes_nolock(struct batadv_priv *bat_priv) { lockdep_assert_held(&bat_priv->tt.commit_lock);
- if (atomic_read(&bat_priv->tt.local_changes) < 1) { + if (READ_ONCE(bat_priv->tt.local_changes) < 1) { if (!batadv_atomic_dec_not_zero(&bat_priv->tt.ogm_append_cnt)) batadv_tt_tvlv_container_update(bat_priv); return; diff --git a/net/batman-adv/types.h b/net/batman-adv/types.h index 04f6398b3a40..f491bff8c51b 100644 --- a/net/batman-adv/types.h +++ b/net/batman-adv/types.h @@ -1022,7 +1022,7 @@ struct batadv_priv_tt { atomic_t ogm_append_cnt;
/** @local_changes: changes registered in an originator interval */ - atomic_t local_changes; + size_t local_changes;
/** * @changes_list: tracks tt local changes within an originator interval @@ -1044,7 +1044,7 @@ struct batadv_priv_tt { */ struct list_head roam_list;
- /** @changes_list_lock: lock protecting changes_list */ + /** @changes_list_lock: lock protecting changes_list & local_changes */ spinlock_t changes_list_lock;
/** @req_list_lock: lock protecting req_list */
When adding a local TT twice within the same OGM interval (e.g. happens when flag get updated), the flags of the first TT change entry is updated with the second one and both change events is added to the change list. This leads to having the same ADD change entry twice. Similarly a DEL+DEL scenario is also creating twice the same event.
Deduplicate ADD+ADD or DEL+DEL scenarios to reduce the TT change events that need to be sent in both OGM and TT response.
Signed-off-by: Remi Pommarel repk@triplefau.lt --- net/batman-adv/translation-table.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/batman-adv/translation-table.c b/net/batman-adv/translation-table.c index a22029511eb2..01fc19803304 100644 --- a/net/batman-adv/translation-table.c +++ b/net/batman-adv/translation-table.c @@ -500,14 +500,17 @@ static void batadv_tt_local_event(struct batadv_priv *bat_priv, /* this is a second add in the same originator interval. It * means that flags have been changed: update them! */ - if (!del_op_requested && !del_op_entry) + if (del_op_requested == del_op_entry) { entry->change.flags = flags; + goto discard; + }
continue; del: list_del(&entry->list); kmem_cache_free(batadv_tt_change_cache, entry); bat_priv->tt.local_changes--; +discard: kmem_cache_free(batadv_tt_change_cache, tt_change_node); goto unlock; }
b.a.t.m.a.n@lists.open-mesh.org