Hi,
At first everything seemed to work. A node on one end could ping a
node on the other end over the mesh network. The ping was hopping from
node to node as expected.
But sometimes some paths stop working.
Some nodes can then only reach their direct neighbors with a "normal ping". A
ping to a node via one hop does not work, but a "batctl ping" does work!
This only affects parts of the network and is not permanent. If I
wait it recovers, but then the problem appears at another node.
Since "batctl ping" works I'd say your mesh works fine - you have a problem
in your higher layers. Maybe a MAC address collision or an ARP timeout?
Can you provide specific examples we can go through? For instance, provide
the batctl ping output to the neighbor in question, the ping error message
(does it say timeout / host could not be found / etc.), a batctl traceroute to
the neighbor in question and the output of the global translation table.
Are you trying to ping a 'fixed' node or a node that is roaming?
Regards,
Marek
Hello Marek,
thanks for your response. I'll try to give you an example - I'll cut out
the parts that are not relevant (I hope).
First I have to correct the version - it seems to be 2011.3, not 2011.2
as the subject says.
root@fon-58:~# dmesg | grep "batman_adv"
batman_adv: B.A.T.M.A.N. advanced 2011.3.0 (compatibility version 14) loaded
The route from bat49 to bat58 is not working. It should hop via bat59.
root@1043-49:~# batctl o
[B.A.T.M.A.N. adv 2011.3.0, MainIF/MAC: wlan2/9e:0c:6d:ee:7c:ba (bat0)]
Originator  last-seen (#/255)  Nexthop [outgoingIF]:  Potential nexthops ...
bat58  3.080s (168)  bat59 [wlan2]:
    bat59 (168)  bat51 (  0)  bat60 (127)
bat59  3.130s (202)  bat59 [wlan2]:
    bat60 (155)  bat51 (134)  bat59 (202)
root@fon-59:~# batctl o
[B.A.T.M.A.N. adv 2011.3.0, MainIF/MAC: wlan2/0a:18:84:80:87:9d (bat0)]
Originator  last-seen (#/255)  Nexthop [outgoingIF]:  Potential nexthops ...
bat58  4.740s (210)  bat58 [wlan2]:
    bat55 (  0)  bat51 (  0)  bat49 (  0)  bat52 (  0)  bat67 (148)
    bat53 (120)  bat54 (191)  bat60 (170)  bat58 (210)
bat49  0.040s (192)  bat49 [wlan2]:
    bat52 ( 36)  bat55 (  0)  bat67 (106)  bat58 (148)  bat54 (129)
    bat53 ( 80)  bat60 (152)  bat49 (192)  bat51 (112)
root@fon-58:~# batctl o
[B.A.T.M.A.N. adv 2011.3.0, MainIF/MAC: wlan2/0a:18:84:81:a1:0d (bat0)]
Originator  last-seen (#/255)  Nexthop [outgoingIF]:  Potential nexthops ...
bat49  0.570s (174)  bat59 [wlan2]:
    bat51 (  4)  bat52 (  9)  bat55 (  5)  bat54 (149)  bat53 (140)
    bat67 (156)  bat60 ( 95)  bat59 (174)  bat49 (  0)
bat59  0.990s (245)  bat59 [wlan2]:
    bat55 (  8)  bat51 (  3)  bat52 (  8)  bat60 (137)  bat53 (186)
    bat67 (217)  bat54 (206)  bat59 (245)
ifconfigs:
root@1043-49:~# ifconfig bat0
bat0 Link encap:Ethernet HWaddr 9E:90:FC:DC:99:09
inet addr:192.168.111.49 Bcast:192.168.111.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:9410 errors:0 dropped:0 overruns:0 frame:0
TX packets:64693 errors:0 dropped:2560 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1622914 (1.5 MiB) TX bytes:13322553 (12.7 MiB)
root@1043-49:~# ifconfig wlan2
wlan2 Link encap:Ethernet HWaddr 9E:0C:6D:EE:7C:BA
UP BROADCAST RUNNING MULTICAST MTU:1528 Metric:1
RX packets:84071 errors:0 dropped:78 overruns:0 frame:0
TX packets:112446 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:6693056 (6.3 MiB) TX bytes:20633620 (19.6 MiB)
root@fon-59:~# ifconfig bat0
bat0 Link encap:Ethernet HWaddr D6:0F:24:F1:43:3C
inet addr:192.168.111.59 Bcast:192.168.111.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:23493 errors:0 dropped:0 overruns:0 frame:0
TX packets:5078 errors:0 dropped:8 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:4006301 (3.8 MiB) TX bytes:726585 (709.5 KiB)
root@fon-59:~# ifconfig wlan2
wlan2 Link encap:Ethernet HWaddr 0A:18:84:80:87:9D
UP BROADCAST RUNNING MULTICAST MTU:1528 Metric:1
RX packets:298487 errors:0 dropped:748 overruns:0 frame:0
TX packets:176654 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:18665398 (17.7 MiB) TX bytes:14335354 (13.6 MiB)
root@fon-58:~# ifconfig bat0
bat0 Link encap:Ethernet HWaddr C2:90:A3:3B:4E:C9
inet addr:192.168.111.58 Bcast:192.168.111.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:23159 errors:0 dropped:0 overruns:0 frame:0
TX packets:7759 errors:0 dropped:2298 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1115758 (1.0 MiB) TX bytes:737874 (720.5 KiB)
root@fon-58:~# ifconfig wlan2
wlan2 Link encap:Ethernet HWaddr 0A:18:84:81:A1:0D
UP BROADCAST RUNNING MULTICAST MTU:1528 Metric:1
RX packets:3475063 errors:0 dropped:1422 overruns:0 frame:0
TX packets:1601622 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:154462355 (147.3 MiB) TX bytes:100565468 (95.9 MiB)
this is working:
root@1043-49:~# batctl p bat59
PING bat59 (0a:18:84:80:87:9d) 20(48) bytes of data
20 bytes from bat59 icmp_seq=1 ttl=49 time=6.12 ms
and this:
root@1043-49:~# batctl p bat58
PING bat58 (0a:18:84:81:a1:0d) 20(48) bytes of data
20 bytes from bat58 icmp_seq=1 ttl=48 time=17.61 ms
and this too:
root@1043-49:~# ping 192.168.111.59
PING 192.168.111.59 (192.168.111.59): 56 data bytes
64 bytes from 192.168.111.59: seq=0 ttl=64 time=7.621 ms
this does NOT:
root@1043-49:~# ping 192.168.111.58
PING 192.168.111.58 (192.168.111.58): 56 data bytes
the route seems ok:
root@1043-49:~# batctl tr bat58
traceroute to bat58 (0a:18:84:81:a1:0d), 50 hops max, 20 byte packets
1: bat59 (0a:18:84:80:87:9d) 4.297 ms 31.777 ms 0.938 ms
2: bat58 (0a:18:84:81:a1:0d) 7.868 ms 4.153 ms 3.352 ms
I see the pings going out on bat49
root@1043-49:~# batctl td wlan2 | grep "ICMP"
13:16:39.026715 BAT bat49 > bat58: UCAST, ttvn 1, ttl 50, IP 192.168.111.49 > 192.168.111.58: ICMP echo request, id 9467, seq 16, length 64
I even see the packet arrive at bat58:
root@fon-58:~# batctl td wlan2 | grep "ICMP"
13:18:39.715935 BAT bat59 > bat58: UCAST, ttvn 1, ttl 48, IP 192.168.111.49 > 192.168.111.58: ICMP echo request, id 9467, seq 159, length 64
But there is no reply on wlan2.
On the bat0 interface I can see the reply:
root@fon-58:~# batctl td bat0 | grep "ICMP"
13:19:15.730081 IP 192.168.111.49 > 192.168.111.58: ICMP echo request, id 9467, seq 195, length 64
13:19:15.732864 IP 192.168.111.58 > 192.168.111.49: ICMP echo reply, id 9467, seq 195, length 64
the arp-table of bat58 looks good:
root@fon-58:~# arp -a
IP address       HW type   Flags   HW address          Mask   Device
192.168.111.49   0x1       0x2     9e:90:fc:dc:99:09   *      bat0
the other direction does not work either:
root@fon-58:~# ping 192.168.111.49
PING 192.168.111.49 (192.168.111.49): 56 data bytes
The packet goes out on the bat0 interface of bat58:
root@fon-58:~# batctl td bat0 | grep "ICMP"
13:54:15.727222 IP 192.168.111.58 > 192.168.111.49: ICMP echo request, id 1961, seq 112, length 64
but it is *NOT* visible on the wlan interface:
root@fon-58:~# batctl td wlan2 | grep "ICMP"
A ping from bat58 to bat59 works:
root@fon-58:~# ping 192.168.111.59
PING 192.168.111.59 (192.168.111.59): 56 data bytes
64 bytes from 192.168.111.59: seq=0 ttl=64 time=15.729 ms
and appears in both dumps:
root@fon-58:~# batctl td wlan2 | grep "ICMP"
14:00:50.522992 BAT bat58 > bat59: UCAST, ttvn 1, ttl 50, IP 192.168.111.58 > 192.168.111.59: ICMP echo request, id 1997, seq 3, length 64
14:00:50.530158 BAT bat59 > bat58: UCAST, ttvn 1, ttl 50, IP 192.168.111.59 > 192.168.111.58: ICMP echo reply, id 1997, seq 3, length 64
root@fon-58:~# batctl td bat0 | grep "ICMP"
14:01:05.563243 IP 192.168.111.58 > 192.168.111.59: ICMP echo request, id 1997, seq 18, length 64
14:01:05.567195 IP 192.168.111.59 > 192.168.111.58: ICMP echo reply, id 1997, seq 18, length 64
Why is the ICMP ping from 58 to 49 not sent on the wlan?
Does the "TX dropped" count in ifconfig mean anything?
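As a rough sketch for keeping an eye on that counter: the snippet below pulls the TX packet and TX dropped counters out of ifconfig output like the one pasted above, so two snapshots can be compared. It is written in Python purely for illustration; the function name tx_counters is made up, and the regular expression assumes the "TX packets:... dropped:..." line format shown in this mail.

```python
import re

# Assumes the "TX packets:N errors:N dropped:N ..." line format shown in
# the ifconfig output in this mail; other ifconfig builds print it differently.
TX_LINE = re.compile(r"TX packets:(\d+)\s+errors:\d+\s+dropped:(\d+)")


def tx_counters(ifconfig_text):
    """Return (tx_packets, tx_dropped) from one interface's ifconfig output."""
    match = TX_LINE.search(ifconfig_text)
    if match is None:
        raise ValueError("no 'TX packets:' line found")
    return int(match.group(1)), int(match.group(2))


# bat0 on fon-58, copied from the ifconfig output above.
sample = """bat0 Link encap:Ethernet HWaddr C2:90:A3:3B:4E:C9
RX packets:23159 errors:0 dropped:0 overruns:0 frame:0
TX packets:7759 errors:0 dropped:2298 overruns:0 carrier:0
"""
packets, dropped = tx_counters(sample)
print(f"TX packets: {packets}, TX dropped: {dropped}")
```

Running it on two snapshots taken around a failing ping would show whether the dropped counter on bat0 grows together with the lost pings.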
I don't understand the "batctl tg" output. If I repeat the command it gives me
different results:
root@fon-58:~# batctl tg |grep "49"
* 04:11:80:f4:40:c8 ( 1) via bat49 ( 1)
root@fon-58:~# batctl tg |grep "49"
* 0c:6d:ee:7c:ba:01 ( 1) via bat49 ( 1)
root@fon-58:~# batctl tg |grep "49"
* 04:11:80:f4:40:c8 ( 1) via bat49 ( 1)
root@fon-58:~# batctl tg |grep "49"
* 18:84:80:34:51:01 ( 1) via bat49 ( 1)
root@fon-58:~# batctl tg |grep "49"
* 04:30:48:60:6c:dd ( 1) via bat49 ( 1)
root@fon-58:~# batctl tg |grep "49"
* 04:11:80:f4:40:c8 ( 1) via bat49 ( 1)
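Since grep "49" only shows whichever lines happen to contain "49", a more targeted check might be to look for one exact client MAC in the table, for example the bat0 HWaddr of bat49 (9e:90:fc:dc:99:09 in the ifconfig output above). The following is only a sketch in Python: find_tt_originators is a made-up helper, and the line format is assumed from the "batctl tg" lines quoted here.

```python
import re

# Assumed line format, taken from the "batctl tg" lines in this mail:
#  * 04:11:80:f4:40:c8 (  1) via bat49 (  1)
TG_LINE = re.compile(
    r"^\s*\*\s+(?P<client>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})\s+"
    r"\(\s*\d+\)\s+via\s+(?P<orig>\S+)",
    re.IGNORECASE,
)


def parse_tg(text):
    """Return a list of (client_mac, originator) pairs from 'batctl tg' output."""
    entries = []
    for line in text.splitlines():
        match = TG_LINE.match(line)
        if match:
            entries.append((match.group("client").lower(), match.group("orig")))
    return entries


def find_tt_originators(text, client_mac):
    """Return the originators that announce exactly this client MAC."""
    wanted = client_mac.lower()
    return [orig for client, orig in parse_tg(text) if client == wanted]


# Only the grep-filtered lines from this mail, not a complete table.
sample = """ * 04:11:80:f4:40:c8 (  1) via bat49 (  1)
 * 0c:6d:ee:7c:ba:01 (  1) via bat49 (  1)
 * 18:84:80:34:51:01 (  1) via bat49 (  1)
 * 04:30:48:60:6c:dd (  1) via bat49 (  1)
"""
print(find_tt_originators(sample, "9E:90:FC:DC:99:09"))
```

Fed with the complete "batctl tg" output of fon-58, this would tell whether the bat0 MAC of bat49 is announced at all, and by which originator.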
I have not yet found a device with the MAC "04:11:80:f4:40:c8".
If I look in the logs on bat49, it keeps creating and deleting an entry
with this address:
root@1043-49:~# batctl l | grep "40:c8"
[ 9726] Creating new global tt entry: 04:11:80:f4:40:c8 (via 0a:18:84:1e:f6:05)
[ 9726] Deleting global tt entry 04:11:80:f4:40:c8 (via 0a:18:84:1e:f6:05): originator time out
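To see how often entries are created and deleted like this, the log could be summarised per client MAC. This is a minimal sketch in Python, assuming the two message formats shown in the log excerpt above; count_tt_events is a hypothetical helper, not part of batctl.

```python
import re
from collections import Counter

MAC = r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}"
# Message formats assumed from the "batctl l" excerpt in this mail.
CREATE = re.compile(
    rf"Creating new global tt entry:\s+({MAC})\s+\(via\s+({MAC})\)", re.IGNORECASE
)
DELETE = re.compile(
    rf"Deleting global tt entry\s+({MAC})\s+\(via\s+({MAC})\)", re.IGNORECASE
)


def count_tt_events(log_text):
    """Map (client_mac, via_mac) to (times created, times deleted)."""
    # Collapse all whitespace so entries wrapped over two lines still match.
    flat = " ".join(log_text.split())
    created = Counter((c.lower(), v.lower()) for c, v in CREATE.findall(flat))
    deleted = Counter((c.lower(), v.lower()) for c, v in DELETE.findall(flat))
    return {key: (created[key], deleted[key]) for key in created.keys() | deleted.keys()}


# The two log entries quoted above.
sample = """[ 9726] Creating new global tt entry: 04:11:80:f4:40:c8 (via 0a:18:84:1e:f6:05)
[ 9726] Deleting global tt entry 04:11:80:f4:40:c8 (via 0a:18:84:1e:f6:05): originator time out
"""
for (client, via), (n_created, n_deleted) in count_tt_events(sample).items():
    print(f"{client} via {via}: created {n_created}x, deleted {n_deleted}x")
```

An entry with high and roughly equal create and delete counts would be one that keeps appearing and timing out.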
The nodes are fixed and not moving. Do I have to specify them as
non-roaming somehow?
We have problems keeping the correct / same time on all devices. Is that a
problem for batman?
Tobias