On Sat, Feb 20, 2010 at 07:04:11PM +0100, Andrew Lunn wrote:
On Fri, Feb 19, 2010 at 06:19:05PM +0100, Linus Lüssing wrote:
Hi Andrew,
Sorry, didn't have the time to try your patch any earlier, I'm right in the middle of my exams :).
Hi Linus
Marek told me. No problems. I remember what it's like studying for exams. However, it is nice to sometimes take a break and do something totally different.
Your patch already looks quite good, I couldn't reproduce any memory leaks or crashes here (tried that with three routers and 1 or 2 vis servers activated, also activating/deactivating them a lot, no problems with that). And yes, the slow-path warning has gone with your patch.
Great. So we are on the right track.
However, I'm having some weird output when connecting two routers over wifi _and_ over ethernet cable. The setup:
Before plugging in the cable: r1-ath1 <-- wifi --> r2-ath1
root@OpenWrt:~# batctl vd dot
digraph {
    "r1-ath1" -> "r2-ath1" [label="1.32"]
    "r1-ath1" -> "r1-hna" [label="HNA"]
    "r1-ath1" -> "5a:2e:1e:1f:64:6b" [label="HNA"]
    subgraph "cluster_r1-ath1" {
        "r1-ath1" [peripheries=2]
    }
    "r2-ath1" -> "r1-ath1" [label="1.11"]
    "r2-ath1" -> "r2-hna" [label="HNA"]
    "r2-ath1" -> "82:31:95:f9:14:6f" [label="HNA"]
    subgraph "cluster_r2-ath1" {
        "r2-ath1" [peripheries=2]
    }
}
After plugging in the cable: r1-ath1 <-- wifi --> r2-ath1 + r1-eth0.3 <-- cable --> r2-eth0.3
root@OpenWrt:~# batctl vd dot
digraph {
    "r1-ath1" -> "r2-ath1" [label="1.0"]
    "r1-ath1" -> "r2-eth0.3" [label="1.66"]
    "r1-ath1" -> "r1-hna" [label="HNA"]
    "r1-ath1" -> "5a:2e:1e:1f:64:6b" [label="HNA"]
    subgraph "cluster_r1-ath1" {
        "r1-ath1" [peripheries=2]
        "r1-eth0.3"
    }
    subgraph "cluster_r1-ath1" {
        "r1-ath1" [peripheries=2]
    }
    "r2-ath1" -> "r1-ath1" [label="1.0"]
    "r2-ath1" -> "r1-eth0.3" [label="1.15"]
    "r2-ath1" -> "r2-hna" [label="HNA"]
    "r2-ath1" -> "82:31:95:f9:14:6f" [label="HNA"]
    subgraph "cluster_r2-ath1" {
        "r2-ath1" [peripheries=2]
        "r2-eth0.3"
    }
    subgraph "cluster_r2-ath1" {
        "r2-ath1" [peripheries=2]
    }
}

root@OpenWrt:~# cat /proc/net/batman-adv/vis_data
06:22:b0:98:87:dd,TQ 04:22:b0:98:87:fa 251, HNA 00:22:b0:98:87:dd, HNA 5a:2e:1e:1f:64:6b, PRIMARY, SEC 04:22:b0:98:87:de,
06:22:b0:98:87:f9,TQ 06:22:b0:98:87:dd 255, TQ 04:22:b0:98:87:de 251, HNA 00:22:b0:98:87:f9, HNA 82:31:95:f9:14:6f, SEC 04:22:b0:98:87:fa, PRIMARY,
Actually, this vis_data does not map to the dot above! There is the wrong number of HNA entries, the wrong order, etc.
Hmm, just noticed, the output also seems to be flapping between those two from time to time:

------------------
root@OpenWrt:~# cat /proc/net/batman-adv/vis
06:22:b0:98:87:dd,TQ 04:22:b0:98:87:fa 251, HNA 00:22:b0:98:87:dd, HNA f6:ae:97:b3:9a:5c, PRIMARY, SEC 04:22:b0:98:87:de,
06:22:b0:98:87:f9,TQ 04:22:b0:98:87:de 251, HNA da:3e:79:2c:d3:3e, HNA 00:22:b0:98:87:f9, PRIMARY, SEC 04:22:b0:98:87:fa,
root@OpenWrt:~# cat /proc/net/batman-adv/vis
06:22:b0:98:87:dd,TQ 04:22:b0:98:87:fa 251, HNA 00:22:b0:98:87:dd, HNA f6:ae:97:b3:9a:5c, PRIMARY, SEC 04:22:b0:98:87:de,
06:22:b0:98:87:f9,TQ 06:22:b0:98:87:dd 255, TQ 04:22:b0:98:87:de 251, HNA da:3e:79:2c:d3:3e, HNA 00:22:b0:98:87:f9, SEC 04:22:b0:98:87:fa, PRIMARY,
------------------
Here is what i think your bat-host file contains:

06:22:b0:98:87:dd r1-ath1
06:22:b0:98:87:f9 r2-ath1
00:22:b0:98:87:dd r1-hna
04:22:b0:98:87:de r1-eth0.3
00:22:b0:98:87:f9 r2-hna
04:22:b0:98:87:fa r2-eth0.3
and this is what i get, assuming i got the MAC->name mapping correct:
Yes, correct mapping :).
digraph {
    "r1-ath1" -> "r2-eth0.3" [label="1.15"]
    "r1-ath1" -> "r1-hna" [label="HNA"]
    "r1-ath1" -> "5a:2e:1e:1f:64:6b" [label="HNA"]
    subgraph "cluster_r1-ath1" {
        "r1-ath1" [peripheries=2]
    }
    subgraph "cluster_r1-ath1" {
        "r1-ath1" [peripheries=2]
        "r1-eth0.3"
    }
    "r2-ath1" -> "r1-ath1" [label="1.0"]
    "r2-ath1" -> "r1-eth0.3" [label="1.15"]
    "r2-ath1" -> "r2-hna" [label="HNA"]
    "r2-ath1" -> "82:31:95:f9:14:6f" [label="HNA"]
    subgraph "cluster_r2-ath1" {
        "r2-ath1" [peripheries=2]
        "r2-eth0.3"
    }
    subgraph "cluster_r2-ath1" {
        "r2-ath1" [peripheries=2]
    }
}
batctl parses top-to-bottom, left-to-right. It does not consolidate the PRIMARY and the SECONDARY into one cluster. It leaves DOT to do that. Hence there are two cluster statements for each cluster actually drawn.
So the second 'subgraph "cluster_r1-ath1"' is obviously unnecessary.
Yes, unnecessary, but makes the batctl code easier.
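To make the single-pass behaviour concrete, here is a rough sketch in Python of how such a translation could work. It is not the batctl source. The function name vis_to_dot, the field handling, and in particular the label formula (255 divided by the TQ value, printed with two decimals) are assumptions made for illustration only. The input and the MAC-to-name table are the ones quoted in this thread.

```python
# Illustrative sketch only: NOT the batctl implementation.
# Assumption: a TQ entry becomes an edge labelled 255 / TQ.

# vis_data as quoted in this thread (one line per originator).
VIS_DATA = """\
06:22:b0:98:87:dd,TQ 04:22:b0:98:87:fa 251, HNA 00:22:b0:98:87:dd, HNA 5a:2e:1e:1f:64:6b, PRIMARY, SEC 04:22:b0:98:87:de,
06:22:b0:98:87:f9,TQ 06:22:b0:98:87:dd 255, TQ 04:22:b0:98:87:de 251, HNA 00:22:b0:98:87:f9, HNA 82:31:95:f9:14:6f, SEC 04:22:b0:98:87:fa, PRIMARY,
"""

# MAC -> name mapping as guessed (and confirmed) in this thread.
BAT_HOSTS = {
    "06:22:b0:98:87:dd": "r1-ath1",
    "06:22:b0:98:87:f9": "r2-ath1",
    "00:22:b0:98:87:dd": "r1-hna",
    "04:22:b0:98:87:de": "r1-eth0.3",
    "00:22:b0:98:87:f9": "r2-hna",
    "04:22:b0:98:87:fa": "r2-eth0.3",
}


def vis_to_dot(vis_text, names=None):
    """Translate vis_data text to DOT in one pass, top to bottom, left to right."""
    names = names or {}

    def name(mac):
        # Fall back to the raw MAC when no name is known.
        return names.get(mac, mac)

    out = ["digraph {"]
    for line in vis_text.splitlines():
        fields = [f.strip() for f in line.split(",") if f.strip()]
        if not fields:
            continue
        origin = name(fields[0])
        for field in fields[1:]:
            parts = field.split()
            kind = parts[0]
            if kind == "TQ" and int(parts[2]) > 0:
                label = 255.0 / int(parts[2])  # assumed conversion
                out.append(f'\t"{origin}" -> "{name(parts[1])}" [label="{label:.2f}"]')
            elif kind == "HNA":
                out.append(f'\t"{origin}" -> "{name(parts[1])}" [label="HNA"]')
            elif kind == "PRIMARY":
                # One cluster statement per PRIMARY entry ...
                out.append(f'\tsubgraph "cluster_{origin}" {{')
                out.append(f'\t\t"{origin}" [peripheries=2]')
                out.append("\t}")
            elif kind == "SEC":
                # ... and another per SEC entry; DOT merges same-named clusters.
                out.append(f'\tsubgraph "cluster_{origin}" {{')
                out.append(f'\t\t"{origin}" [peripheries=2]')
                out.append(f'\t\t"{name(parts[1])}"')
                out.append("\t}")
    out.append("}")
    return "\n".join(out)


if __name__ == "__main__":
    print(vis_to_dot(VIS_DATA, BAT_HOSTS))
```

Because no state is kept between entries, every PRIMARY and every SEC field emits its own cluster statement, which is why each cluster shows up twice. It also shows why the TQ edges start at the primary interface: the first MAC on the line is the only origin the parser ever sees.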
Also "r1-ath1" -> "r2-eth0.3" looks wrong, should be
"r1-eth0.3" -> "r2-eth0.3" instead (and the same with r2 a few lines later).
These comments i agree with. A wireless and a wired device should not be neighbours. We don't have any records which originate from the secondary MAC address. That, i guess, is the major problem here.
So, did my/Marek's patch break it, or was it broken before?
First i suggest you go back to just before Simon's patch which introduced receiving using skbufs:
http://open-mesh.org/changeset/1517
That will tell us if we need to go back further, or our patch broke it.
If you need to go back further, i would suggest just before:
Okay, just checked, this got introduced with 1510 already, yes. I might have a closer look at this next week.
Cheers, Linus