Hi, I'm trying to build a mesh on Raspberry Pi 3 boards using the latest (master) OpenWrt and batman-adv 2018.2.
I only use IPv6 link-local addresses (no IPv4, DHCP, etc.) over adhoc WiFi. The batman-adv gateway mode is off. Alfred is used to advertise some services to the mesh. A single node in the mesh runs Alfred in master mode; the others run it in slave mode. The node with Alfred in master mode also hosts an MQTT server. The other nodes connect to this server once they get the information from Alfred.
Everything works fine with 31 hosts. When I add more (34 max), everything stops working: hosts start to drop and I see a lot of ICMPv6 neighbor advertisement/solicitation traffic. I increased elp_interval and orig_interval from their default values to 5 and 10 seconds. Once I power down the extra host(s), everything goes back to normal. The actual traffic in the mesh besides the ICMPv6 is low: a few KB sent once per minute.
What am I doing wrong to hit this limit so soon?
I'm attaching output from batctl originators and neighbors with 31 hosts.
Thanks, Nicu
On Thursday, 26 July 2018 03:50:10 HKT Nicu Pavel wrote:
Everything works fine with 31 hosts. When I add more (34 max), everything stops working: hosts start to drop and I see a lot of ICMPv6 neighbor advertisement/solicitation traffic. I increased elp_interval and orig_interval from their default values to 5 and 10 seconds. Once I power down the extra host(s), everything goes back to normal. The actual traffic in the mesh besides the ICMPv6 is low: a few KB sent once per minute.
There is no hard-coded host limit in batman-adv. It more likely has to do with your WiFi driver. Which WiFi chip and driver have you deployed?
If you check the mailing list archives you will see the wifi driver issue coming up on a regular basis. This example discussion is merely a few days old:
https://lists.open-mesh.org/pipermail/b.a.t.m.a.n/2018-July/017952.html
Cheers, Marek
Hi,
On Thu, Jul 26, 2018 at 4:12 AM, Marek Lindner mareklindner@neomailbox.ch wrote:
On Thursday, 26 July 2018 03:50:10 HKT Nicu Pavel wrote:
Everything works fine with 31 hosts. When I add more (34 max), everything stops working: hosts start to drop and I see a lot of ICMPv6 neighbor advertisement/solicitation traffic. I increased elp_interval and orig_interval from their default values to 5 and 10 seconds. Once I power down the extra host(s), everything goes back to normal. The actual traffic in the mesh besides the ICMPv6 is low: a few KB sent once per minute.
There is no hard-coded host limit in batman-adv. It more likely has to do with your WiFi driver. Which WiFi chip and driver have you deployed?
I'm using kernel 4.9.111 with the latest brcmfmac driver, which loads the brcmfmac43430-sdio firmware for this board. The strange part is that once I get over 31 hosts, the ICMPv6 traffic "explodes" for no apparent reason.
I'm also not sure about the originators list: if you look at the attached file, every originator's TQ is pretty low, and each originator lists all other hosts as next hop. I'm not sure whether this is correct or whether it can affect things. All these 30 hosts are pretty close to one another.
If you check the mailing list archives you will see the wifi driver issue coming up on a regular basis. This example discussion is merely a few days old:
https://lists.open-mesh.org/pipermail/b.a.t.m.a.n/2018-July/017952.html
Yes, I found that thread, but the problem there is quite different: with Realtek chipsets it seems like nothing really works.
Thanks, Nicu
On Thursday, 26 July 2018 14:14:14 HKT Nicu Pavel wrote:
I'm also not sure about the originators list: if you look at the attached file, every originator's TQ is pretty low, and each originator lists all other hosts as next hop. I'm not sure whether this is correct or whether it can affect things. All these 30 hosts are pretty close to one another.
It might be (as Linus has suggested) related to poor connectivity / airtime overcrowding. Though for that to be the case, I'd expect more gradual packet loss symptoms, not a complete collapse when you go from 31 to 32 devices.
If you check the mailing list archives you will see the wifi driver issue coming up on a regular basis. This example discussion is merely a few days old:
https://lists.open-mesh.org/pipermail/b.a.t.m.a.n/2018-July/017952.html
Yes, I found that thread, but the problem there is quite different: with Realtek chipsets it seems like nothing really works.
My point was rather to consider that the WiFi driver has its fingers in the pot too. It is somewhat unlikely that your issues are exactly the same as mentioned in this randomly selected thread.
By the way, searching for 'brcmfmac adhoc' returns quite a number of hits mentioning complaints concerning Raspberry Pis. Have you tried running your setup without enabling batman-adv at all? Since all of the devices are so close to each other, batman-adv is not needed. Pure adhoc mode will work just fine. This test will help you figure out whether the ICMPv6 misbehavior is related to batman-adv or not.
Cheers, Marek
On Wed, Jul 25, 2018 at 10:50:10PM +0300, Nicu Pavel wrote:
root@gateway-b827ebecbfa7:~# batctl o
[B.A.T.M.A.N. adv 2018.2, MainIF/MAC: wlan0/b8:27:eb:ec:bf:a7 (bat0/4e:06:5f:54:e5:d5 BATMAN_IV)]
Originator         last-seen  (#/255)  Nexthop            [outgoingIF]
b8:27:eb:a7:ed:ff     0.330s  ( 24)    b8:27:eb:ad:4b:7e  [ wlan0]
b8:27:eb:a7:ed:ff     0.330s  (  0)    b8:27:eb:ec:e5:9f  [ wlan0]
b8:27:eb:a7:ed:ff     0.330s  ( 43)    b8:27:eb:97:55:b6  [ wlan0]
b8:27:eb:a7:ed:ff     0.330s  ( 38)    b8:27:eb:1b:61:f0  [ wlan0]
b8:27:eb:a7:ed:ff     0.330s  ( 15)    b8:27:eb:91:5c:41  [ wlan0]
[...]
Overall, the TQ values seem rather low.
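To make that observation easier to check across a long originator table, here is a minimal Python sketch that extracts the best TQ per originator from `batctl o` text. It assumes the row layout shown in the excerpt above (originator, last-seen, TQ in parentheses, next hop, interface in brackets); the function name `best_tq_per_originator` and the regular expression are illustrative, not part of batctl.

```python
import re

# One row of the originator table as it appears in this thread:
# originator MAC, last-seen, (TQ), next-hop MAC, [interface].
ROW = re.compile(
    r"((?:[0-9a-f]{2}:){5}[0-9a-f]{2})\s+"  # originator MAC
    r"([\d.]+)s\s+"                         # last-seen in seconds
    r"\(\s*(\d+)\)\s+"                      # TQ value out of 255
    r"((?:[0-9a-f]{2}:){5}[0-9a-f]{2})\s+"  # next-hop MAC
    r"\[\s*(\S+)\]"                         # outgoing interface
)

def best_tq_per_originator(text):
    """Return {originator MAC: highest TQ seen over any next hop}."""
    best = {}
    for orig, _seen, tq, _hop, _iface in ROW.findall(text):
        best[orig] = max(best.get(orig, 0), int(tq))
    return best

# The five rows quoted in this thread.
SAMPLE = """
b8:27:eb:a7:ed:ff 0.330s ( 24) b8:27:eb:ad:4b:7e [ wlan0]
b8:27:eb:a7:ed:ff 0.330s ( 0) b8:27:eb:ec:e5:9f [ wlan0]
b8:27:eb:a7:ed:ff 0.330s ( 43) b8:27:eb:97:55:b6 [ wlan0]
b8:27:eb:a7:ed:ff 0.330s ( 38) b8:27:eb:1b:61:f0 [ wlan0]
b8:27:eb:a7:ed:ff 0.330s ( 15) b8:27:eb:91:5c:41 [ wlan0]
"""

if __name__ == "__main__":
    for mac, tq in best_tq_per_originator(SAMPLE).items():
        print(f"{mac}: best TQ {tq}/255")
```

For the excerpt, the best path to b8:27:eb:a7:ed:ff has a TQ of 43 out of 255.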
root@gateway-b827ebecbfa7:~# batctl n
[B.A.T.M.A.N. adv 2018.2, MainIF/MAC: wlan0/b8:27:eb:ec:bf:a7 (bat0/4e:06:5f:54:e5:d5 BATMAN_IV)]
IF     Neighbor           last-seen
wlan0  b8:27:eb:ad:df:c0     1.000s
wlan0  b8:27:eb:30:15:c9     0.390s
wlan0  b8:27:eb:a8:22:f6     1.560s
wlan0  b8:27:eb:92:eb:b8     0.650s
[...]
wlan0  b8:27:eb:bb:a1:94    11.710s
wlan0  b8:27:eb:91:5c:41     1.480s
wlan0  b8:27:eb:5f:eb:e8     7.600s
Those are quite a lot of neighbor nodes. Do you actually have any airtime left? I can recommend H.O.R.S.T. to check that:
https://github.com/br101/horst
As a rule of thumb, BATMAN IV overhead increases linearly with the overall number of mesh nodes, but quadratically with the number of neighbor nodes a particular node has.
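To give a feel for that rule of thumb, here is a toy Python model of OGM transmissions in a single collision domain. It assumes every node is a direct neighbor of every other node (as with the closely spaced hosts described in this thread) and that each neighbor rebroadcasts each originator's OGM exactly once per originator interval. That is a deliberate simplification: it ignores aggregation, packet loss and batman-adv's actual forwarding rules, and the function name is made up for illustration.

```python
def ogm_frames_per_interval(nodes):
    """Toy estimate of OGM frames on air per originator interval.

    Assumption (not batman-adv's real algorithm): all nodes hear each
    other, each node sends its own OGM once, and each of its
    (nodes - 1) neighbors rebroadcasts that OGM once.
    """
    own = nodes                         # one originated OGM per node
    rebroadcasts = nodes * (nodes - 1)  # every neighbor echoes every OGM
    return own + rebroadcasts

if __name__ == "__main__":
    for n in (10, 31, 34):
        print(f"{n} nodes -> {ogm_frames_per_interval(n)} OGM frames per interval")
```

Under these assumptions the count is simply the square of the node count, so going from 31 to 34 fully connected nodes raises it from 961 to 1156 frames per interval, about 20% more airtime for roughly 10% more nodes.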
Also, just to be on the safe side, check that there is no MAC address conflict / duplicate MAC address somewhere.
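One way to run that check is to collect each host's wlan0 MAC address into a table and look for repeats. The Python sketch below assumes you already have such a host-to-MAC mapping; the hostnames and addresses in the example are hypothetical placeholders, not values from this mesh.

```python
from collections import defaultdict

def find_duplicate_macs(host_to_mac):
    """Return {mac: [hosts...]} for every MAC claimed by more than one host."""
    by_mac = defaultdict(list)
    for host, mac in host_to_mac.items():
        # Normalize case and whitespace so formatting differences do not hide a clash.
        by_mac[mac.strip().lower()].append(host)
    return {mac: sorted(hosts) for mac, hosts in by_mac.items() if len(hosts) > 1}

if __name__ == "__main__":
    # Hypothetical inventory, for illustration only.
    inventory = {
        "node-a": "b8:27:eb:00:00:01",
        "node-b": "b8:27:eb:00:00:02",
        "node-c": "B8:27:EB:00:00:01",  # same address as node-a, different case
    }
    for mac, hosts in find_duplicate_macs(inventory).items():
        print(f"duplicate {mac}: {', '.join(hosts)}")
```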
Regards, Linus
b.a.t.m.a.n@lists.open-mesh.org