On Thursday, January 03, 2013 22:16:32 Jan Lühr wrote:
We noticed a few unusual things:
-> Regardless of the link tested (kif or fastd3) and regardless of the node (node1, node2), small interruptions (loss peaks) appear.
-> These peaks appear almost synchronously; a little "noise" comes from different VPN links and node WAN uplinks.
-> Since all links are wired, radio noise won't have an impact.
-> The losses appear in batman-adv 2011.2.0 as well as in batman-adv 2012.4.0.
Thus I suspect that batman-adv is triggering these interruptions.
If you believe batman-adv is the culprit, please remove batman-adv from your test setup and repeat the exact same test. Otherwise we move into the realm of speculation.
Haven't heard of any "peak loss" due to batman-adv.
Cheers, Marek
PS: Forgot to click "reply to all", sorry for that.
Hello,
On 03.01.2013 at 16:00, Marek Lindner wrote:
If you believe batman-adv is the culprit, please remove batman-adv from your test setup and repeat the exact same test. Otherwise we move into the realm of speculation.
Well, I started echoing on the underlying VPN connection as well and changed the chart titles for clarity:
In chart 4 and chart 5, RTT / loss from kif.kbu to node-1 is measured. In chart 4, kif sends ICMPv6 to the link-local address of the bat0 interface; in chart 5, the link-local address of the underlying VPN interface is used.
The loss peaks appear in chart 4 (bat0) only.
Thanks for your help,
Keep smiling yanosz
PS: Forgot to click "reply to all", sorry for that.
Well, I set the Followup-To to this list, since you're not subscribed. Sorry for not mentioning it.
On Thursday, January 03, 2013 23:52:42 Jan Lühr wrote:
Well, I started echoing on the underlying VPN connection as well and changed the chart titles for clarity:
In chart 4 and chart 5, RTT / loss from kif.kbu to node-1 is measured. In chart 4, kif sends ICMPv6 to the link-local address of the bat0 interface; in chart 5, the link-local address of the underlying VPN interface is used.
Are all tests from the same node to the same node, using exactly the same means (protocols, etc.), with the exception of batman-adv? Sorry for my dumb question, but your topology might be obvious to you but isn't to me (us?).
Do the spikes on your graphs show the packet loss? How can we have an RTT < 50 ms while experiencing packet loss?
Cheers, Marek
Hello,
On 03.01.2013 at 17:05, Marek Lindner wrote:
Are all tests from the same node to the same node, using exactly the same means (protocols, etc.), with the exception of batman-adv? Sorry for my dumb question, but your topology might be obvious to you but isn't to me (us?).
yes.
Do the spikes on your graphs show the packet loss?
yes.
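How can we have an RTT < 50 ms while experiencing packet loss?
Collectd pings every 10 seconds here, and a reply with RTT >= 0.9 s (900 ms) counts as "loss". The RTT value only reflects the replies that actually arrived: if 1/3 of the pings are answered in < 50 ms and 2/3 are lost, the RTT is < 50 ms while the loss is 2/3.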
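To make the arithmetic concrete, here is a minimal Python sketch of that kind of summary. It is not collectd's actual code; the function name `summarize_pings` and the assumption that the RTT is averaged only over replies arriving within the 900 ms limit are mine.

```python
def summarize_pings(rtts_ms, timeout_ms=900.0):
    """Summarize one measurement interval.

    rtts_ms: one entry per ping sent; None means no reply arrived.
    timeout_ms: replies at or above this RTT are counted as lost
                (900 ms, as described in this thread).
    Returns (mean RTT over received replies or None, loss fraction).
    """
    if not rtts_ms:
        raise ValueError("need at least one ping")
    # Only replies that arrived in time contribute to the RTT.
    received = [r for r in rtts_ms if r is not None and r < timeout_ms]
    loss = 1.0 - len(received) / len(rtts_ms)
    mean_rtt = sum(received) / len(received) if received else None
    return mean_rtt, loss
```

With one reply at 40 ms and two pings unanswered, the mean RTT is 40 ms even though two thirds of the pings were lost, which is the situation in the charts.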
Btw., if you want to see our setup, we can take a tour via ssh / tmux; I can provide the RRD files, too. You can contact me by Jabber (yanosz@jabber.ccc.de) as well.
Thanks, Keep smiling yanosz
What were the exact batman-adv versions at the time you made those graphs for the shown nodes as well as any intermediate nodes (although if I got your setup right then the nodes in those graphs do not have any intermediate batman-adv hops involved, do they?)?
Does it make a difference if you make the entries in the neighbor solicitation table permanent (e.g. with 'ip -6 neigh')?
Do the same spikes appear with a 'batctl ping' (or if adding such graphs is quite a hassle then at least, is the overall 'batctl ping' packet loss similar to the one from the ip ping)?
Do you see any "Changing route towards" events in the batman-adv logs between these supposedly always direct neighbors - if yes, maybe with a similar frequency? Or even better, check whether the same packet loss appears when isolating any of such pairs of nodes so that no batman-adv route changes could happen at all.
Do these packet loss events correlate with periods of high OGM loss (e.g. check for high last-seen values in the originator table)?
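One way to check such a correlation, once timestamps for the loss peaks and for the events of interest (route changes, high last-seen values) have been collected, is sketched below in Python. The function name `peaks_near_events` and the 10-second default window (chosen to match the ping interval mentioned earlier) are assumptions for illustration; how the timestamps are extracted from the logs or RRD files is left open.

```python
from bisect import bisect_left


def peaks_near_events(peak_times, event_times, window=10.0):
    """Return the loss-peak timestamps that have at least one event
    within +/- `window` seconds. All timestamps are in seconds."""
    events = sorted(event_times)
    matched = []
    for t in peak_times:
        # Index of the first event at or after the start of the window.
        i = bisect_left(events, t - window)
        if i < len(events) and events[i] <= t + window:
            matched.append(t)
    return matched
```

If most loss peaks have a matching event and few events occur outside the peaks, that would point towards route changes or OGM loss; if hardly any match, those causes become less likely.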
Sent: Thursday, 03 January 2013 at 17:17
From: "Jan Lühr" ff@stephan.homeunix.net
To: "The list for a Better Approach To Mobile Ad-hoc Networking" b.a.t.m.a.n@lists.open-mesh.org
Subject: Re: [B.A.T.M.A.N.] Packet-Loss-Peaks in a Freifunk-Network
Hi Linus,
nice to hear from you again.
On 04.01.2013 at 06:34, Linus Lüssing wrote:
What were the exact batman-adv versions at the time you made those graphs for the shown nodes as well as any intermediate nodes (although if I got your setup right then the nodes in those graphs do not have any intermediate batman-adv hops involved, do they?)?
Does it make a difference if you make the entries in the neighbor solicitation table permanent (e.g. with 'ip -6 neigh')?
Do the same spikes appear with a 'batctl ping' (or if adding such graphs is quite a hassle then at least, is the overall 'batctl ping' packet loss similar to the one from the ip ping)?
Do you see any "Changing route towards" events in the batman-adv logs between these supposedly always direct neighbors - if yes, maybe with a similar frequency? Or even better, check whether the same packet loss appears when isolating any of such pairs of nodes so that no batman-adv route changes could happen at all.
For some reason, the spikes disappeared around midnight, and I don't know why. Maybe batman-adv converged to a different plan (or converged at all), or the Flying Spaghetti Monster wants our network to work nowadays.
I'll go on inspecting this issue, but I cannot answer any of your questions. :-(
Keep smiling yanosz