In the year 2018, Matthias Schiffer wrote:
On 01/22/2018 10:18 PM, Matthias Schiffer wrote:
> On 01/22/2018 09:52 PM, Sven Eckelmann wrote:
>> On Monday, 22 January 2018 20:24:50 CET Matthias Schiffer wrote:
>>> skb_postpull_rcsum() is necessary after eth_type_trans() to adjust the
>>> skb checksum, otherwise log spam of the form "bat0: hw csum failure" will
>>> result when packets with CHECKSUM_COMPLETE are received (at least in some
>>> setups, e.g. when stacking batman-adv on top of VXLAN).
>> Would be nice to have a better explanation here.
>> The comment previously assumed that skb_pull_rcsum would be enough. But the
>> problem here is that the skb_pull_rcsum only pulls the batman-adv headers. The
>> actual pull of the ethernet header (with skb_pull_inline) happens inside
>> eth_type_trans. Or did I miss anything?
> This is correct, eth_type_trans() contains a simple skb_pull(), so the csum
> must be adjusted afterwards (grepping the kernel for eth_type_trans will
> find a lot of this). I can send a v2 with a better commit message later.
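To make the point above concrete, here is a rough userspace sketch (not kernel code) of why the stored checksum has to be adjusted once a header has been pulled. It assumes a CHECKSUM_COMPLETE-style value that covers the whole frame, and the helper names csum(), csum_sub() and same() as well as the sample bytes are made up for this illustration.

```python
def csum(data: bytes) -> int:
    """Folded 16-bit ones'-complement sum over big-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad a trailing odd byte with zero
    total = sum((data[i] << 8) | data[i + 1] for i in range(0, len(data), 2))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def csum_sub(total: int, part: int) -> int:
    """Remove `part` from `total` by adding its ones' complement."""
    result = total + (~part & 0xFFFF)
    while result >> 16:
        result = (result & 0xFFFF) + (result >> 16)
    return result


def same(a: int, b: int) -> bool:
    """Compare two sums, treating 0x0000 and 0xFFFF as the same value."""
    return a % 0xFFFF == b % 0xFFFF


eth_header = bytes(range(14))     # stand-in for a 14-byte Ethernet header
payload = b"batman-adv payload!"  # stand-in for the rest of the frame
frame = eth_header + payload

full = csum(frame)                        # sum over the whole frame
stale = full                              # header pulled, checksum left untouched
fixed = csum_sub(full, csum(eth_header))  # header pulled, checksum adjusted

print(same(stale, csum(payload)))  # False: the sum still includes the header
print(same(fixed, csum(payload)))  # True: the sum matches the remaining data
```

The stale value is what a later checksum validation would trip over; subtracting the pulled bytes is the kind of adjustment skb_postpull_rcsum() is there for.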
>>> I don't know what the exact circumstances are that trigger the log spam,
>>> but it seems this was broken forever (I could also reproduce the issue with
>>> our compat-14 legacy branch)... so please ask David to queue this up for
>>> stable :)
>> Yes, this has been broken since the earliest commits. The most relevant commit in
>> batman-adv is:
>> Fixes: fe28a94c01e1 ("batman-adv: receive packets directly using
>> But I would propose to use the following in the kernel tree:
>> Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol")
>> The 4.15 release will be soon(tm) and Simon is currently on vacation. So we
>> will most likely postpone the submission to David until Simon found a way out
>> of the snow and after 4.15 is released...
>> But it would be nice if some people could test the patch (together with
>> vxlan?) on batman-adv or batman-adv-legacy. And please provide a
>> "Tested-by: Full Name <email(a)example.org>"  reply when it
>> Thanks,
>> Sven
> I've tested this on Kernel 4.14.14 (everything working correctly now) and
> 4.4.110 (here, there are still checksum errors; it seems on older kernels,
> the checksum handling in VXLAN is broken too? Still debugging this...)
I've found the cause of this other checksum problem: the batman-adv
fragmentation code doesn't handle the checksum on reassembly at all. I
think the best option here is to simply set ip_summed to CHECKSUM_NONE on
reassembly; I will send another patch for that.
The IP fragmentation code does more fancy things when all fragments have
CHECKSUM_COMPLETE, adding up the checksums of the fragments under certain
circumstances. This only works because IP fragments are guaranteed to be
split at even byte offsets (multiples of 8, actually); as far as I can
tell, batman-adv allows odd fragment sizes, making it impossible to add up
the 16-bit checksums in the general case.
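A small userspace sketch of the alignment issue described above, under the assumption that each fragment carries its own 16-bit ones'-complement sum and the sums are combined by plain addition. The csum() helper and the sample bytes are invented for this illustration and are not taken from batman-adv or the IP stack.

```python
def csum(data: bytes) -> int:
    """Folded 16-bit ones'-complement sum over big-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad a trailing odd byte with zero
    total = sum((data[i] << 8) | data[i + 1] for i in range(0, len(data), 2))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def naive_sum_of_fragments(data: bytes, split: int) -> int:
    """Split `data` at `split` and simply add the two fragment sums."""
    total = csum(data[:split]) + csum(data[split:])
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


packet = bytes(range(1, 21))  # 20 sample bytes: 0x01 .. 0x14
whole = csum(packet)

even_split = naive_sum_of_fragments(packet, 8)  # fragment starts on a word boundary
odd_split = naive_sum_of_fragments(packet, 7)   # fragment starts mid-word

print(whole == even_split)  # True
print(whole == odd_split)   # False
```

With the odd split, every byte of the second fragment lands in the other half of its 16-bit word, so the plain sum of the fragment checksums no longer describes the reassembled packet.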
Tested-By: Maximilian Wilhelm <max(a)sdn.clinic>
to the fix for fragmentation.c, too.
Disclaimer: As MTUs are calculated accordingly in our backbone,
fragmentation of VXLAN packets isn't an issue and we did not see these
messages before. I can confirm that I still don't see any now,
meaning the log spam addressed by the previous fix is still gone and no new
issues have arisen as of now.
Thanks a lot! <3
"Does it bother me that people hurt others because they are too weak to face the
truth? Yeah. Sorry 'bout that."
-- Thirteen, House M.D.