On Montag, 7. Mai 2018 10:06:20 CEST me wrote:
On 05/07/2018 08:32 AM, Sven Eckelmann wrote:
Can you attach a (small) pcap of actual traffic which show a packet which triggers this behavior?
For sure. See for the originator 5e:25:f7:13:2d:eb and vid=0x8000 in the attached pcap. I selected the interested packets (one ogm, one full table request and one full table response).
Thanks,
These are always from intermediate nodes. I was hoping to see the initial packet which "introduced" the incorrect vid=0x8000 without entry in the first place to better understand the problem which causes this.
At least I would have hoped that this might help the people at WBM to have a better discussion about it. At least to understand whether this is from an intermediate node or from the sender.
On Sonntag, 6. Mai 2018 21:55:59 CEST Leonardo Mörlein wrote: [...]
- The receiving batman-adv can not handle incoming TT announcements with empty vlans. (The crc check routine batadv_tt_global_check_crc() fails.)
This part is not 100% clear in the commit message.
Right now it just sounds like softif_vlan_list (see batadv_tt_prepare_tvlv_local_data) has an "empty" vlan entry (an entry without mac addresses). Which brings me to the point that batadv_softif_create_vlan initializes new vlan's with the CRC 0x0 before it adds it to the list (see tt/ batadv_vlan_tt in batadv_softif_vlan). We know now that the function batadv_tt_local_update_crc is responsible for updating the crc (holding the tt.commit_lock) on the sender site (for non-intermediate nodes). And this crc is calculated over the bat_priv->tt.local_hash - which (according to the crc 0x0) doesn't have a single (non-NEW) entry for this VID. The return (new crc) is therefore 0x0 and it will be submitted this way.
The same should be true for the receiver. The batadv_tt_global_update_crc goes over the list of entries for this vid. And we just received a full table which only contains non-vlan entries and no vlan 0 entries. My assumption would therefore be that the crc on the receiver should also be 0x0 when batadv_tt_global_update_crc was done. But if it is not then it would mean that there was still some entry on the receiver which batadv_tt_global_crc used to calculate the crc for this vid or the crc was just not calculated for it (because it doesn't exist).
The latter would match the content of the full table reply in your pcap, the comment in the patch and the info info I got from Antonio: The vid entry will only be created in the originator vlan_list when an entry was received (I haven't checked this). This would sound to me like the vlan should just be created instead of adding this workaround. But maybe you should discuss the details here with Antonio because at the end it is just about the different ways of handling "!vlan" case from your patch.
Btw. Antonio, wasn't the VLAN 0 always created on each device in the local table by the VLAN driver? See the message
[ 29.866038] 8021q: adding VLAN 0 to HW filter on device bat0
when batman-adv attaches itself to the vlan monitoring. But usually, there is at least one entry in it (the bat0 mac address). Not sure why this isn't the case here. At least batadv_interface_add_vid should add it at the end. Maybe there was an error before the entry was created? Would be interesting to know where exactly.
Kind regards, Sven