Hi,
The strange thing is that the debug-level-4 output stops in the middle of an action. Can you also check for the number of batmand processes before and after the stopped batmand process?
The number of tasks is the same. But I have seen that when the -d4 output stops and I keep this batmand running while accessing a different log level from another terminal, I see the socket-connection logs in the -d4 output.
Also, I can still call "batmand -c" to see the parameters and current gateway settings. I can also change the gateway settings.
The batmand seems to stop processing any OGMs.
The messages you see are logged from another thread (not the thread which is doing the OGM processing). That's also the reason why some of the dynamically changeable parameters _seem_ to be processed. I guess, for example, a "batmand -c -a 1.2.3.4/32" won't be processed completely. In this case a simultaneously running "batmand -cd3" _should_ report:

[ 162940] Unix socket: got connection
[ 162946] got request: 10
[ 162947] Unix socket: Requesting adding of HNA 1.2.3.4/32 - put this on todo list...
[ 162951] got request: 10
[ 162952] Unix client closed connection
...
[ 163157] found todo item, adding HNA 1.2.3.4/32 atype 1

I guess everything except the last line will be shown. The last line is generated from the OGM-processing thread which seems to be blocked.
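
To check this without reading the log by eye, something like the following could be used. It is only a rough sketch: the function name hna_todo_status is made up for illustration, and it assumes the debug output has been captured to a file and contains the two message texts quoted above verbatim.

```shell
# Hypothetical helper (not part of batmand): reads a captured debug log on
# stdin and reports what happened to the HNA request given as first argument.
#   not-requested  the socket thread never logged the request
#   processed      the request was queued and later taken from the todo list
#   stuck          the request was queued but never picked up
hna_todo_status() {
    hna="$1"
    awk -v hna="$hna" '
        # Line written by the socket thread when the request is queued.
        index($0, "Requesting adding of HNA " hna) { queued = 1 }
        # Line written by the OGM-processing thread when it handles the item.
        queued && index($0, "found todo item, adding HNA " hna) { processed = 1 }
        END {
            if (!queued)        print "not-requested"
            else if (processed) print "processed"
            else                print "stuck"
        }'
}
```

For example, with the -cd3 output saved to a file (the file name is arbitrary): hna_todo_status 1.2.3.4/32 < d3.log. A result of "stuck" would match the guess that only the OGM-processing thread is blocked.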
Perhaps, if you can find a way to reliably reproduce this kind of problem, it would be much easier to fix. Just an idea: what happens with batmand (bound to the tap interface) when stopping the running tincd like this:
kill -STOP $(pidof tincd)
and later on:
kill -CONT $(pidof tincd)
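
If the test should be repeatable, the two kill calls could be wrapped in a small function. This is just a sketch: the name pause_process and the pause length are arbitrary choices, and it assumes a single PID is passed in.

```shell
# Hypothetical helper: freeze a process for a number of seconds, then let it
# continue. Returns non-zero if the process could not be signalled.
pause_process() {
    pid="$1"
    seconds="$2"
    kill -STOP "$pid" || return 1   # suspend the process
    sleep "$seconds"                # keep it frozen for the given time
    kill -CONT "$pid"               # resume it
}
```

Called for example as: pause_process "$(pidof tincd)" 30, while watching the -d4 output of batmand in another terminal. If more than one tincd is running, pidof prints several PIDs and the call would have to be repeated for each of them.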
ciao, axel
Have you ever tried what happens if you connect the tap interface to a bridge and bind batmand to the bridge device instead?
I haven't tried it yet, but this also came to my mind. I will check this after finishing the "no-tap-dev-test".
Last but not least: have you also observed (or explicitly not observed) this phenomenon with previous revisions in the same scenario?
I cannot say, because implementing tinc and updating the batmand version happened at the same time.
I have never seen this problem with the WRT54GS, only with the GL.
Is the batmand on the WRT54GS also bound to a tinc interface?
Yes, the GL is running standalone with a stupid tinc setup, and the GS was also running standalone with the same parameters (no network cable).
Perhaps it is more random and depends on the speed of the router when the event occurs.
bye Stephan
B.A.T.M.A.N mailing list B.A.T.M.A.N@open-mesh.net https://list.open-mesh.net/mm/listinfo/b.a.t.m.a.n