On Mon, Jul 23, 2012 at 2:28 PM, Antonio Quartulli
> On Sun, Jul 22, 2012 at 08:20:21AM -0300, Guido Iribarren wrote:
>> On Sun, Jul 22, 2012 at 7:57 AM, Guido Iribarren
>> <guidoiribarren(a)buenosaireslibre.org> wrote:
>> > This time it solved itself after some brief time (a minute) but the
>> > symptoms were the same.
>> > So I could catch some logs,
>> > http://pastebin.com/MEENj94i
>> > sadly, i wasn't fast enough to get a live log from the node involved
>> > in the inconsistency as you suggested, so the report might be pretty
>> > useless.
>> From the node where I ran the previous report (colmena-casa), which
>> was rebooted recently, L3 ping to the whole network had the same
>> issue (no replies for a minute or so), so I had the chance to
>> "recreate" the situation several times.
>> Turns out, a "batctl ll tt ; batctl l" on the nodes mentioned in the
>> inconsistencies gave no output at all, so the previous pastebin report
>> is in fact complete :P
>> Looks like the inconsistency is being resolved locally between
>> neighbours, without the need to contact the far end of the network
>> (which is coherent with what's described in the wiki)
> Exactly! If the neighbour has the needed information, the node can directly get
> answered without bothering the real destination ;)
>> In any case, AFAIR previous occurrences of the bug didn't resolve by
>> themselves (in a reasonable amount of time) so what I'm looking at now
>> might be perfectly normal behaviour? (tt tables take some time to
> Well, the log you posted is perfectly correct. You missed some OGMs, therefore
> the node is asking for an update that it missed.
> it would be interesting to run batctl ll tt; batctl l all the time on the node
> that usually experiences the "problem". The log should be not so big,
> bug happens.
I admit I haven't left this running as instructed, but on the other
hand, so far I haven't come across the original bug again, and a few
days ago I asked Nico Echaniz, who confirmed that he is no longer
suffering from it as before.
From time to time he does run into [a few moments | a few minutes] of
"nodes majaretas" (nodes gone crazy, at first sight), but it resolves by
itself quickly[*], which indicates normal behaviour: missed OGMs and
consequently a delay in TT table updates, as you explained.
[*] "quickly" means under 15 minutes at most. Previously, the problem
would never resolve by itself: nodes stayed L3-unreachable for hours or
days until a manual reboot was done.
In conclusion, so far so good. I think we can close this as fixed for
lack of evidence to the contrary, heh.
I hope gioacchino managed to recompile the ninux images and is seeing
the same stability as we are :)
Hello Guido and thank you for reporting back your results :) However, even if
the "behaviour" is good (table gets recovered and everything starts working
again) it is a bit strange that it takes 15 minutes to do so.
If you happen to see the bug again, it would be interesting to get the log of
the "non-working" node and see why it is taking so long.
Thank you very much!
..each of us alone is worth nothing..
Ernesto "Che" Guevara