Anyone done any performance testing over wired links? I.e., if two gigabit ethernet ports are added to bat0, what kind of throughput can be expected across these interfaces, and how much CPU is needed to get up near wire speed?
I'm only able to test in VirtualBox at the moment.
I have a 3.1 GHz i5 and I'm able to run iperf across a batman-adv network (nodes A-B-C, with A and C not directly connected).
iperf between A<>B is ~1 Gbps, A<>C is ~560 Mbps, and B has one core pinned at 100% by ksoftirqd.
So it looks like I'm stuck in a single thread.
I'm going to add a 4th node and do a double-hop A<>D iperf and see what happens. These are full-duplex interfaces, so I don't think I'll see a big drop in the next test.
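For reference, these are plain single-stream TCP tests; a typical invocation with iperf2 looks like this (the address is whatever the far node's bat0 interface has):

    iperf -s                            # on node C
    iperf -c <C's bat0 address> -t 30   # on node A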
2 hops on full-duplex links gets me 464 Mbps. There is some loss at each hop and I'm not sure why. It could be VirtualBox and resource contention that isn't showing up in the host's task manager and/or top on the VMs.
Is this to be expected?
On Wednesday, 16 August 2017 09:39:00 CEST, dan wrote:
> Anyone done any performance testing over wired links? [...]
I can easily reach gigabit here between an i5-3230M (receiver; r8169) and an i7-6700K (sender; igb). I was able to reach 926 Mbit/s in a single-stream, 30-second test with iperf 2.0.9+dfsg1-1. Both ethernet devices were configured with an MTU of 1560 to leave room for the batman-adv encapsulation overhead. Both systems were idle most of the time (>90%, while also running things like xapian, akonadi, ...).
Some ethernet chips have offloading features that only work for IP or similar layer 3 protocols. This layer 3 specific offloading will not work when batman-adv is inserted as a layer between the ethernet header and the IP header. I have even seen chips that became extremely slow in this situation and only got fast again after the offloading features were disabled in hardware (either using ethtool or by patching the driver).
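ethtool can list and toggle these features; a minimal sketch, assuming the hard interface is eth0 (the features shown are just the common segmentation offloads, not a chip-specific list):

    ethtool -k eth0                          # list current offload settings
    ethtool -K eth0 tso off gso off gro off  # disable segmentation offloads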
And there is currently no special batman-adv support for the flow dissector [1] in the kernel. This could also be a reason why multiple flows are not distributed well to different cores when you enable RPS/XPS. It is not yet known whether this will actually be helpful, but at least someone interested could do some research and implement proof-of-concept patches for further testing.
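RPS itself is configured per receive queue via sysfs; a minimal sketch, assuming bat0 and a mask covering four CPUs:

    echo f > /sys/class/net/bat0/queues/rx-0/rps_cpus

Without batman-adv support in the flow dissector, though, all of the encapsulated traffic may end up with the same hash and therefore on the same core.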
Kind regards,
Sven
[1] https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/net/...
Hi Sven
DSA just added a patch to help with flow dissectors. Maybe it can be generalized and made to work for BATMAN as well?
commit 43e665287f931a167cd2eea3387efda901bff0ce
Author: John Crispin <john@phrozen.org>
Date:   Wed Aug 9 14:41:19 2017 +0200

    net-next: dsa: fix flow dissection
RPS and probably other kernel features are currently broken on some if not all DSA devices. The root cause of this is that skb_hash will call the flow_dissector. At this point the skb still contains the magic switch header and the skb->protocol field is not set up to the correct 802.3 value yet. By the time the tag specific code is called, removing the header and properly setting the protocol an invalid hash is already set. In the case of the mt7530 this will result in all flows always having the same hash.
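The equivalent for batman-adv would be to teach the dissector to recognize ETH_P_BATMAN frames and skip the batadv unicast header plus the inner ethernet header before hashing on the encapsulated flow. A minimal userspace sketch of just that offset arithmetic, assuming the standard batadv unicast packet format (this is only a proof of concept, not a kernel patch):

    #include <stdint.h>
    #include <stdio.h>

    #define ETH_ALEN       6
    #define ETH_HLEN       14
    #define ETH_P_BATMAN   0x4305  /* batman-adv ethertype */
    #define BATADV_UNICAST 0x40    /* unicast packet type */
    #define BATADV_COMPAT  15      /* batman-adv compat version */

    /* 10 bytes on the wire, matching batadv_unicast_packet */
    struct batadv_unicast_packet {
        uint8_t packet_type;
        uint8_t version;
        uint8_t ttl;
        uint8_t ttvn;
        uint8_t dest[ETH_ALEN];
    };

    /* Return the offset of the encapsulated IP header, or -1 if the
     * frame is not a batadv unicast frame. *inner_proto receives the
     * inner ethertype. */
    static int batadv_inner_offset(const uint8_t *frame, size_t len,
                                   uint16_t *inner_proto)
    {
        const struct batadv_unicast_packet *hdr;
        size_t off = ETH_HLEN;              /* outer ethernet header */

        if (len < off + sizeof(*hdr) + ETH_HLEN)
            return -1;
        if (((frame[12] << 8) | frame[13]) != ETH_P_BATMAN)
            return -1;

        hdr = (const struct batadv_unicast_packet *)(frame + off);
        if (hdr->version != BATADV_COMPAT ||
            hdr->packet_type != BATADV_UNICAST)
            return -1;

        off += sizeof(*hdr);                /* skip batadv header */
        *inner_proto = (frame[off + 12] << 8) | frame[off + 13];
        return (int)(off + ETH_HLEN);       /* skip inner ethernet */
    }

    int main(void)
    {
        /* outer eth + batadv unicast header + inner eth carrying IPv4 */
        uint8_t frame[64] = { 0 };
        uint16_t proto = 0;

        frame[12] = 0x43; frame[13] = 0x05; /* ETH_P_BATMAN */
        frame[14] = BATADV_UNICAST;
        frame[15] = BATADV_COMPAT;
        frame[36] = 0x08; frame[37] = 0x00; /* inner ETH_P_IP */

        printf("inner offset=%d proto=0x%04x\n",
               batadv_inner_offset(frame, sizeof(frame), &proto), proto);
        return 0;
    }

In the kernel, the same offsets would instead adjust the dissector's nhoff/proto state, analogous to what the DSA patch does with its tag-specific flow_dissect callback.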
On Sun, Aug 20, 2017 at 8:05 AM, Andrew Lunn <andrew@lunn.ch> wrote:
> DSA just added a patch to help with flow dissectors. Maybe it can be
> generalized and made to work for BATMAN as well? [...]
Ok, maybe it's VirtualBox-specific slowness then. I'm running batman-adv 2015.2 on Ubuntu kernel 4.4.0, so a little out of date.
I'm actually wanting to target something embedded, like a PC Engines APU2C4. It has 3 onboard ethernet ports and I can add up to 8 more via PCIe. But that has a 1 GHz CPU (quad core), and being stuck on 1 core quarters the potential.
The test network I'm designing handles wireless PtP backhauls, with batman-adv making all tower sites look like a single switch, almost like a replacement for shortest path bridging. 1 Gbps would be the real baseline performance here.
I have another question on the list about running on top of VLANs, specifically to do something like router-on-a-stick with a WISP-specific PoE switch: running a VLAN across the backhaul links and putting those VLANs into batman-adv. That way I don't have issues accessing the PtP link interfaces themselves.
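A minimal sketch of that setup, assuming eth0 faces a backhaul link and VLAN 100 carries the mesh (names are illustrative):

    ip link add link eth0 name eth0.100 type vlan id 100
    ip link set eth0.100 up
    batctl if add eth0.100    # bat0 then runs on top of the VLAN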