eth_type_trans() internally calls skb_pull(), which does not adjust the skb checksum; skb_postpull_rcsum() is necessary to avoid log spam of the form "bat0: hw csum failure" when packets with CHECKSUM_COMPLETE are received.
Note that in usual setups, packets don't reach batman-adv with CHECKSUM_COMPLETE (I assume NICs bail out of checksumming when they see batadv's ethtype?), which is why the log messages do not occur on every system using batman-adv. I could reproduce this issue by stacking batman-adv on top of a VXLAN interface.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
---
 net/batman-adv/soft-interface.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/net/batman-adv/soft-interface.c b/net/batman-adv/soft-interface.c
index 9f673cdf..6f7ce7a6 100644
--- a/net/batman-adv/soft-interface.c
+++ b/net/batman-adv/soft-interface.c
@@ -451,13 +451,7 @@ void batadv_interface_rx(struct net_device *soft_iface,
 
 	/* skb->dev & skb->pkt_type are set here */
 	skb->protocol = eth_type_trans(skb, soft_iface);
-
-	/* should not be necessary anymore as we use skb_pull_rcsum()
-	 * TODO: please verify this and remove this TODO
-	 * -- Dec 21st 2009, Simon Wunderlich
-	 */
-
-	/* skb->ip_summed = CHECKSUM_UNNECESSARY; */
+	skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
 
 	batadv_inc_counter(bat_priv, BATADV_CNT_RX);
 	batadv_add_counter(bat_priv, BATADV_CNT_RX_BYTES,
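
To illustrate why the adjustment after eth_type_trans() matters, here is a minimal userspace sketch (not kernel code) of the ones' complement arithmetic behind CHECKSUM_COMPLETE. It assumes the stored checksum covers the 14-byte Ethernet header plus the payload. All helper names (csum_partial_demo(), csum_fold_demo(), csum_sub_demo(), postpull_demo()) and the sample frame are made up for this example; they only mimic the effect that skb_postpull_rcsum() is meant to have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add the 16-bit big-endian words of buf to sum.
 * A trailing odd byte is treated as if padded with a zero byte.
 */
static uint32_t csum_partial_demo(const uint8_t *buf, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8;
	return sum;
}

/* Hypothetical helper: fold the carries back in until 16 bits remain. */
static uint16_t csum_fold_demo(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Ones' complement subtraction: csum - part == csum + ~part. */
static uint16_t csum_sub_demo(uint16_t csum, uint16_t part)
{
	return csum_fold_demo((uint32_t)csum + (uint16_t)~part);
}

/* Sample frame (made up): 14-byte Ethernet header followed by 8 payload bytes. */
static const uint8_t demo_frame[22] = {
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff,	/* destination MAC */
	0x02, 0x00, 0x00, 0x00, 0x00, 0x01,	/* source MAC */
	0x08, 0x00,				/* ethertype */
	0x45, 0x00, 0x00, 0x1c, 0x12, 0x34, 0x40, 0x00,
};

/* Take the checksum over the whole frame (what the device would report in
 * this sketch) and remove the contribution of the first hdr_len bytes.
 * hdr_len must be even so that the 16-bit words stay aligned.
 */
static uint16_t postpull_demo(const uint8_t *frame, size_t len, size_t hdr_len)
{
	uint16_t complete = csum_fold_demo(csum_partial_demo(frame, len, 0));
	uint16_t pulled = csum_fold_demo(csum_partial_demo(frame, hdr_len, 0));

	return csum_sub_demo(complete, pulled);
}

/* The adjusted value must match a checksum computed over the payload only. */
static void postpull_selfcheck_demo(void)
{
	uint16_t payload = csum_fold_demo(
		csum_partial_demo(demo_frame + 14, sizeof(demo_frame) - 14, 0));

	assert(postpull_demo(demo_frame, sizeof(demo_frame), 14) == payload);
}
```

Calling postpull_selfcheck_demo() passes for the sample frame, while the unadjusted whole-frame checksum differs from the payload-only one. A stale checksum of this kind is the sort of mismatch that would lead to the "hw csum failure" message.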
A more sophisticated implementation could try to combine fragment checksums when all fragments have CHECKSUM_COMPLETE and are split at even offsets. For now, we just set ip_summed to CHECKSUM_NONE to avoid "hw csum failure" warnings in the kernel log when fragmented frames are received. In consequence, skb_pull_rcsum() can be replaced with skb_pull().
Note that in usual setups, packets don't reach batman-adv with CHECKSUM_COMPLETE (I assume NICs bail out of checksumming when they see batadv's ethtype?), which is why the log messages do not occur on every system using batman-adv. I could reproduce this issue by stacking batman-adv on top of a VXLAN interface.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
---
 net/batman-adv/fragmentation.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/batman-adv/fragmentation.c b/net/batman-adv/fragmentation.c
index ebe6e389..1bb2b43f 100644
--- a/net/batman-adv/fragmentation.c
+++ b/net/batman-adv/fragmentation.c
@@ -287,7 +287,8 @@ batadv_frag_merge_packets(struct hlist_head *chain)
 	/* Move the existing MAC header to just before the payload. (Override
 	 * the fragment header.)
 	 */
-	skb_pull_rcsum(skb_out, hdr_size);
+	skb_pull(skb_out, hdr_size);
+	skb_out->ip_summed = CHECKSUM_NONE;
 	memmove(skb_out->data - ETH_HLEN, skb_mac_header(skb_out), ETH_HLEN);
 	skb_set_mac_header(skb_out, -ETH_HLEN);
 	skb_reset_network_header(skb_out);
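
As a rough illustration of the "more sophisticated implementation" mentioned in the commit message (this is not what the patch does; the patch simply falls back to CHECKSUM_NONE), the following userspace sketch shows why the split offset has to be even for fragment checksums to be combined by plain addition. The helper names and the sample payload are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add the 16-bit big-endian words of buf to sum.
 * A trailing odd byte is treated as if padded with a zero byte.
 */
static uint32_t csum_partial_demo(const uint8_t *buf, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8;
	return sum;
}

/* Hypothetical helper: fold the carries back in until 16 bits remain. */
static uint16_t csum_fold_demo(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Naive combination of two per-fragment checksums by ones' complement
 * addition. This is only correct if the second fragment starts at an
 * even offset, so that its 16-bit words line up with those of the
 * reassembled buffer.
 */
static uint16_t csum_combine_even_demo(uint16_t a, uint16_t b)
{
	return csum_fold_demo((uint32_t)a + b);
}

/* Sample payload (made up), to be split into two fragments at offset off. */
static const uint8_t demo_payload[8] = {
	0x45, 0x00, 0x00, 0x1c, 0x12, 0x34, 0x40, 0x00,
};

/* Checksum each fragment separately, then combine the two results. */
static uint16_t split_and_combine_demo(const uint8_t *buf, size_t len, size_t off)
{
	uint16_t a = csum_fold_demo(csum_partial_demo(buf, off, 0));
	uint16_t b = csum_fold_demo(csum_partial_demo(buf + off, len - off, 0));

	return csum_combine_even_demo(a, b);
}

/* An even split reproduces the checksum of the whole buffer; an odd split
 * does not, because the bytes of the second fragment change lanes.
 */
static void combine_selfcheck_demo(void)
{
	uint16_t whole = csum_fold_demo(
		csum_partial_demo(demo_payload, sizeof(demo_payload), 0));

	assert(split_and_combine_demo(demo_payload, sizeof(demo_payload), 4) == whole);
	assert(split_and_combine_demo(demo_payload, sizeof(demo_payload), 3) != whole);
}
```

Handling odd offsets would additionally require swapping the byte lanes of the second checksum before adding it, which is left out of this sketch.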
On Tue, Jan 23, 2018 at 10:59:49AM +0100, Matthias Schiffer wrote:
> eth_type_trans() internally calls skb_pull(), which does not adjust the skb checksum; skb_postpull_rcsum() is necessary to avoid log spam of the form "bat0: hw csum failure" when packets with CHECKSUM_COMPLETE are received.
>
> Note that in usual setups, packets don't reach batman-adv with CHECKSUM_COMPLETE (I assume NICs bail out of checksumming when they see batadv's ethtype?), which is why the log messages do not occur on every system using batman-adv. I could reproduce this issue by stacking batman-adv on top of a VXLAN interface.
>
> Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Seems reasonable, this change.
I'm just a little confused though: Two years ago someone had reported checksumming errors with a Raspberry Pi and batman-adv:
https://www.open-mesh.org/issues/224
And they seemed to be gone in newer kernel versions, while hardware checksumming was supposedly still enabled.
Are the issues reproducible without using VXLANs (for instance on a Pi1 or Pi2)? Are they reproducible on a recent kernel version? (I guess you tested with a 4.4 kernel?)
Regards, Linus
On Tuesday, 23 January 2018 10:59:49 CET Matthias Schiffer wrote:
> eth_type_trans() internally calls skb_pull(), which does not adjust the skb checksum; skb_postpull_rcsum() is necessary to avoid log spam of the form "bat0: hw csum failure" when packets with CHECKSUM_COMPLETE are received.
>
> Note that in usual setups, packets don't reach batman-adv with CHECKSUM_COMPLETE (I assume NICs bail out of checksumming when they see batadv's ethtype?), which is why the log messages do not occur on every system using batman-adv. I could reproduce this issue by stacking batman-adv on top of a VXLAN interface.
>
> Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Applied both as 798174b15153 [1] and 2c1bce065baa [2]. They are also queued up in linux-merge [3] but will most likely only be forwarded when Simon is back from his vacation. Tested-by's are still welcome :)
Thanks, Sven
[1] https://git.open-mesh.org/batman-adv.git/commit/798174b15153afd88268f2f87811...
[2] https://git.open-mesh.org/batman-adv.git/commit/2c1bce065baa688bc1eca4116f83...
[3] https://git.open-mesh.org/linux-merge.git/shortlog/refs/heads/batadv/net
b.a.t.m.a.n@lists.open-mesh.org