Hi,
Over the last few days we upgraded nearly all (~500) routers of our Freifunk mesh from 2013.4 to 2014.3. Most things run fine again, but some routers have connectivity issues as follows:
- gateway gets alfreddata from router
- router pingable via batctl
- ping via linklocal, ULA and public-IPv6 from gateway to router not possible
- other routers in the same mesh can ping it via ULA etc (connected via the same fastd-VPN)
Public-IPv6 is announced from a gateway via radvd into the mesh.
Unfortunately we are out of ideas where we can search for the reason.
Regards bjo
Hi,
On 2015-06-11 14:19, Bjoern Franke wrote:
- gateway gets alfreddata from router
Continuously? So that direction seems fine then.
- router pingable via batctl
- ping via linklocal, ULA and public-IPv6 from gateway to router not
possible
What does "not possible" mean? There's no reply?
- other routers in the same mesh can ping it via ULA etc (connected via
the same fastd-VPN)
So it's just the gateway that can't reach the "router"?
Public-IPv6 is announced from a gateway via radvd into the mesh.
So even in that direction, NDP is fine. Seems only ICMPv6 echo requests are somewhat different then?
Unfortunately we are out of ideas where we can search for the reason.
Look what's actually on the wire on both ends? Tcpdump and batctl tcpdump?
-hwh
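A minimal sketch of the capture suggested above, assuming the mesh interface is called bat0 and the bridge br-mesh (both names are taken from later log lines in this thread and may differ elsewhere). It only builds a tcpdump command line restricted to ICMPv6 echo and neighbor discovery packets; it does not run anything. The helper names are hypothetical.

```python
import shlex

# ICMPv6 type numbers from RFC 4443 (echo) and RFC 4861 (neighbor discovery).
ICMPV6_TYPES = {
    "echo-request": 128,
    "echo-reply": 129,
    "neighbor-solicitation": 135,
    "neighbor-advertisement": 136,
}


def icmpv6_filter(types=ICMPV6_TYPES):
    """Build a pcap filter expression matching the given ICMPv6 types.

    ip6[40] is the first byte after the fixed 40-byte IPv6 header, which is
    the ICMPv6 type as long as no extension headers are present.
    """
    checks = " or ".join(f"ip6[40] == {t}" for t in sorted(types.values()))
    return f"icmp6 and ({checks})"


def tcpdump_command(interface):
    """Return the argv for a tcpdump run on one interface (not executed here)."""
    # -n: no name resolution, -e: show link-level (MAC) headers.
    return ["tcpdump", "-n", "-e", "-i", interface, icmpv6_filter()]


if __name__ == "__main__":
    for iface in ("bat0", "br-mesh"):  # interface names are assumptions
        print(shlex.join(tcpdump_command(iface)))
```

Running the same capture on the gateway and on the affected router shows which of solicitation, advertisement, echo request and echo reply actually arrives at each end.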
Hi hwh,
Continuously? So that direction seems fine then.
Yep.
What does "not possible" mean? There's no reply?
No reply, yes.
So it's just the gateway that can't reach the "router"?
Correct.
Public-IPv6 is announced from a gateway via radvd into the mesh.
So even in that direction, NDP is fine. Seems only ICMPv6 echo requests are somewhat different then?
ip -6 neigh show says "FAILED".
Unfortunately we are out of ideas where we can search for the reason.
Look what's actually on the wire on both ends? Tcpdump and batctl tcpdump?
On the gateway, there are neighbor solicitations from the "router" and advertisements from the gateway.
At the moment, it's not possible to access the "router" to take a look what's going wrong there.
Regards bjo
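To spot all affected routers at once, the FAILED entries can be filtered out of the neighbor table. This is only a sketch: it parses text in the usual "ip -6 neigh show" layout, the function name is hypothetical, and the sample addresses are made up for illustration, not taken from the mesh in question.

```python
def failed_neighbors(ip_neigh_output):
    """Return (address, device) pairs whose neighbor state is FAILED."""
    failed = []
    for line in ip_neigh_output.splitlines():
        fields = line.split()
        # The neighbor state is the last field of each line.
        if len(fields) < 3 or fields[-1] != "FAILED":
            continue
        # Lines look like "<address> dev <device> [lladdr <mac>] [router] <STATE>".
        device = fields[fields.index("dev") + 1] if "dev" in fields else None
        failed.append((fields[0], device))
    return failed


# Illustrative sample only; addresses and MACs are invented.
SAMPLE = """\
fe80::1 dev br-mesh lladdr 02:00:00:00:00:01 router REACHABLE
fe80::2 dev br-mesh  FAILED
fd00::2 dev br-mesh  FAILED
"""

if __name__ == "__main__":
    for address, device in failed_neighbors(SAMPLE):
        print(address, device)
```

On a live gateway the input could come from subprocess.run(["ip", "-6", "neigh", "show"], capture_output=True, text=True).stdout.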
Hi,
On 2015-06-12 17:20, Bjoern Franke wrote:
Public-IPv6 is announced from a gateway via radvd into the mesh.
So even in that direction, NDP is fine. Seems only ICMPv6 echo requests are somewhat different then?
ip -6 neigh show says "FAILED".
My bad: I planned to write "RA" instead of NDP. In fact, NDP seems to be failing, as you have determined.
Look what's actually on the wire on both ends? Tcpdump and batctl tcpdump?
On the gateway, there are neighbor solicitations from the "router" and advertisements from the gateway.
The advertisements are fine then, good.
I've seen the behaviour you're describing with the Linux bridge code and IGMP/MLD snooping. Mind you: with *activated* snooping. NDP does not (does it?) register its ICMPv6 multicast groups with listener reports (or does it do so once, too early?), and the bridge code then does not consider the port that should receive the NDP traffic a destination for that multicast traffic.
Of course, the question that arises is whether the solicitation is actually received, then handled (answered) by the gateway.
Did any network bridges come into play on the gateway(s) that hadn't been there before?
-hwh
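For background on why multicast snooping can affect NDP at all: a neighbor solicitation is sent to the target's solicited-node multicast address (RFC 4291), not to the unicast address, so a bridge that filters multicast by group membership has to know that the target listens on that group. The sketch below derives that group and the matching Ethernet multicast MAC (RFC 2464) for a given address. The function names are hypothetical and the example address is made up.

```python
import ipaddress


def solicited_node_multicast(address):
    """Return the solicited-node multicast group for an IPv6 unicast address.

    The group is ff02::1:ff00:0/104 with the low 24 bits of the address appended.
    """
    low24 = int(ipaddress.IPv6Address(address)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)


def multicast_mac(group):
    """Return the Ethernet MAC for an IPv6 multicast group (33:33 + low 32 bits)."""
    low32 = int(ipaddress.IPv6Address(group)) & 0xFFFFFFFF
    return "33:33:" + ":".join(f"{b:02x}" for b in low32.to_bytes(4, "big"))


if __name__ == "__main__":
    target = "fe80::211:22ff:fe33:4455"  # made-up router address
    group = solicited_node_multicast(target)
    print(group, multicast_mac(group))
```

In a capture, solicitations for the example address would therefore show up with destination ff02::1:ff33:4455 and destination MAC 33:33:ff:33:44:55.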
On Thursday, June 11, 2015 14:19:23 Bjoern Franke wrote:
- ping via linklocal, ULA and public-IPv6 from gateway to router not
possible
- other routers in the same mesh can ping it via ULA etc (connected via
the same fastd-VPN)
Public-IPv6 is announced from a gateway via radvd into the mesh.
Unfortunately we are out of ideas where we can search for the reason.
Do you have the multicast optimizations enabled ? 2014.3.0 still has a known bug causing these optimizations to harm multicast traffic. Either disable this feature or upgrade to something newer than 2014.3.0.
Cheers, Marek
Hi Marek,
Do you have the multicast optimizations enabled ? 2014.3.0 still has a known bug causing these optimizations to harm multicast traffic. Either disable this feature or upgrade to something newer than 2014.3.0.
Thanks for your reply. Multicast optimizations are disabled, but we have multicast-related errors in the logs:
br-mesh: Multicast hash table chain limit reached: bat0
br-mesh: Cannot rehash multicast hash table, disabling snooping: bat0, 201, -22
Regards
bjo
--
xmpp bjo@schafweide.org
On Thursday, June 11, 2015 16:44:14 Bjoern Franke wrote:
Thanks for your reply. Multicast optimizations are disabled, but we have multicast-related errors in the logs:
br-mesh: Multicast hash table chain limit reached: bat0
br-mesh: Cannot rehash multicast hash table, disabling snooping: bat0, 201, -22
That message is not coming from batman-adv but the bridge code itself. To me it looks like the bridge code disables multicast once the hash table limit is reached. Use the search engine of your choice to get better info. Sounds definitely related to your issue.
Cheers, Marek
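A small sketch for pulling these bridge warnings out of a kernel log, for example to see on which gateways snooping was disabled. The function name is hypothetical. Reading the two trailing numbers as the current table size and a negative errno is an assumption based on the message layout; the thread itself does not confirm it.

```python
import errno
import re

# Matches e.g. "br-mesh: Cannot rehash multicast hash table, disabling snooping: bat0, 201, -22"
REHASH_RE = re.compile(
    r"(?P<bridge>\S+): Cannot rehash multicast hash table, disabling snooping: "
    r"(?P<port>[^,\s]+), (?P<size>\d+), (?P<err>-?\d+)"
)


def parse_rehash_failures(log_text):
    """Return one dict per 'Cannot rehash' warning found in the log text."""
    results = []
    for match in REHASH_RE.finditer(log_text):
        err = int(match.group("err"))
        results.append({
            "bridge": match.group("bridge"),
            "port": match.group("port"),
            "size": int(match.group("size")),  # assumed: number of table entries
            "errno": errno.errorcode.get(-err, "unknown"),  # assumed: negative errno
        })
    return results


if __name__ == "__main__":
    log = (
        "br-mesh: Multicast hash table chain limit reached: bat0\n"
        "br-mesh: Cannot rehash multicast hash table, disabling snooping: bat0, 201, -22\n"
    )
    print(parse_rehash_failures(log))
```

Under that assumption, the -22 in the log above corresponds to EINVAL.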
On Thu, Jun 11, 2015 at 04:44:14PM +0200, Bjoern Franke wrote:
Hi Marek,
Do you have the multicast optimizations enabled ? 2014.3.0 still has a known bug causing these optimizations to harm multicast traffic. Either disable this feature or upgrade to something newer than 2014.3.0.
Thanks for your reply. Multicast optimizations are disabled, but we have multicast-related errors in the logs:
br-mesh: Multicast hash table chain limit reached: bat0
br-mesh: Cannot rehash multicast hash table, disabling snooping: bat0, 201, -22
Regards
bjo
There's a hash_max value for the bridge (/sys/class/net/<br>/bridge/hash_max). By default it's rather small, just 512 entries / multicast listeners, so it's expected that with 500 nodes the bridge multicast snooping will shut down.
Nevertheless, even if the bridge deactivates its multicast snooping that shouldn't cause trouble for ICMPv6.
Which firmware are you using, do you see ICMPv6 packets entering bat0 on the gateway and leaving bat0 on the router?
Cheers, Linus
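The sysfs values mentioned above can be checked with a few lines of Python. This is a sketch under assumptions: the bridge is called br-mesh as in the log lines, the helper names are hypothetical, and counting one listener per node is a simplification (a node may join several groups). The multicast_snooping attribute sits next to hash_max in the same directory.

```python
from pathlib import Path


def read_bridge_multicast_settings(bridge, sysfs_root="/sys/class/net"):
    """Read hash_max and multicast_snooping for a bridge; None if unavailable."""
    bridge_dir = Path(sysfs_root) / bridge / "bridge"
    settings = {}
    for name in ("hash_max", "multicast_snooping"):
        path = bridge_dir / name
        settings[name] = int(path.read_text().strip()) if path.exists() else None
    return settings


def hash_max_headroom(hash_max, expected_listeners):
    """Return how many table entries remain once all expected listeners are in."""
    return hash_max - expected_listeners


if __name__ == "__main__":
    settings = read_bridge_multicast_settings("br-mesh")  # bridge name is an assumption
    print(settings)
    if settings["hash_max"] is not None:
        # Simplification: one listener per node, ~500 nodes as in this mesh.
        print("headroom:", hash_max_headroom(settings["hash_max"], 500))
```

With the default of 512 and roughly 500 nodes this leaves almost no headroom, which matches the expectation that snooping shuts down.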
b.a.t.m.a.n@lists.open-mesh.org