Hi,
I currently use batman-experimental rev 1105. Yesterday a node was not reachable. top showed that batmand was running at almost 100% load.
I could not attach to batmand to watch any debug info or states. When calling batmand -cd8 (or other debug levels), the call simply hangs without printing anything.
The only call I could make was batmand -c, which displayed:
/sbin/batmand [not-all-options-displayed] -r 1 -a 10.12.10.16/28 -a 172.16.10.17/32 eth1 tbb /t
logread did not show any batman messages. The memory consumption was also OK. After hard-killing and restarting the daemon, batmand ran normally again.
Creating a core dump on wrt was not possible.
Have you seen this before?
/Stephan
---------------------------------------
Dipl.Informatiker(FH) Stephan Enderlein
Freifunk Dresden
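A hanging client call like the batmand -cd8 attempt described above can be told apart from a failing one by running it with a timeout. The following is only a minimal sketch, assuming a Python interpreter is available on the node or on a host that can run the command (not a given on a WRT router). The helper name probe and the 5-second default are hypothetical; only the batmand -c and batmand -cd8 invocations come from the report.

```python
import subprocess


def probe(cmd, timeout_s=5.0):
    """Run cmd and classify the result.

    Returns (status, output) where status is one of:
      "ok"      - command exited with code 0
      "failed"  - command exited with a non-zero code
      "hung"    - command did not finish within timeout_s and was killed
      "missing" - the executable could not be found
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing lingers.
        return "hung", ""
    except FileNotFoundError:
        return "missing", ""
    status = "ok" if result.returncode == 0 else "failed"
    return status, result.stdout


# Hypothetical usage on the affected node:
#   status, out = probe(["batmand", "-cd8"])
#   status == "hung" would match the behaviour described in the report.
```

A "hung" result for -cd8 together with an "ok" result for -c would reproduce the observation that only the plain -c call still answers.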
Hello,
I forgot to add the process list of batmand. The hanging thread (PID 13821) was created by PID 1115. This may give you a hint as to why it hangs.
 1114 root  1216 S  /sbin/batmand -s 10.12.0.1 -a 10.12.10.16/28 -r 1 --t
 1115 root  1216 S  /sbin/batmand -s 10.12.0.1 -a 10.12.10.16/28 -r 1 --t
 1116 root  1216 S  /sbin/batmand -s 10.12.0.1 -a 10.12.10.16/28 -r 1 --t
13821 root  1216 R  /sbin/batmand -s 10.12.0.1 -a 10.12.10.16/28 -r 1 --t

Mem: 16424K used, 14200K free, 0K shrd, 1588K buff, 7104K cached
CPU: 6.6% usr 93.3% sys 0.0% nice 0.0% idle 0.0% io 0.0% irq 0.0% softirq
Load average: 0.89 1.05 1.00
  PID  PPID USER  STAT  VSZ %MEM %CPU COMMAND
13821  1115 root  R    1216  3.9 94.3 /sbin/batmand -s 10.12.0.1 -a 10.12.10.16/28 -r 1 --t 63 --no-unreachable-rule --no-throw-rul
Regards and thanks
Stephan
---------------------------------------
Dipl.Informatiker(FH) Stephan Enderlein
Freifunk Dresden
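The step of reading the busiest process and its parent out of the top output above can also be done mechanically. This is a rough sketch, assuming busybox-style rows with the column order PID PPID USER STAT VSZ %MEM %CPU COMMAND as shown in the report. The names Proc, parse_top_rows and busiest are made up for illustration, and a STAT value containing spaces would need extra handling.

```python
from typing import List, NamedTuple, Optional


class Proc(NamedTuple):
    pid: int
    ppid: int
    stat: str
    cpu: float
    command: str


def parse_top_rows(text: str) -> List[Proc]:
    """Extract process rows from top output, skipping headers and summaries."""
    procs = []
    for line in text.splitlines():
        # Assumed columns: PID PPID USER STAT VSZ %MEM %CPU COMMAND
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue
        if not (fields[0].isdigit() and fields[1].isdigit()):
            continue  # header or summary line such as "Mem: ..."
        try:
            cpu = float(fields[6])
        except ValueError:
            continue
        procs.append(
            Proc(int(fields[0]), int(fields[1]), fields[3], cpu, fields[7])
        )
    return procs


def busiest(procs: List[Proc]) -> Optional[Proc]:
    """Return the process with the highest %CPU, or None if there is none."""
    return max(procs, key=lambda p: p.cpu, default=None)


# The row below is copied from the report, including its truncated tail.
SAMPLE = """\
Load average: 0.89 1.05 1.00
  PID  PPID USER  STAT  VSZ %MEM %CPU COMMAND
13821  1115 root  R    1216  3.9 94.3 /sbin/batmand -s 10.12.0.1 -a 10.12.10.16/28 -r 1 --t 63 --no-unreachable-rule --no-throw-rul
"""

top_proc = busiest(parse_top_rows(SAMPLE))
```

For the sample row, top_proc.pid is 13821 and top_proc.ppid is 1115, which is the parent/child relationship pointed out in the message above.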
Hi Stephan,
On Wednesday 24 September 2008, Stephan Enderlein (Freifunk Dresden) wrote:
Hi,
I currently use batman-experimental rev 1105. Yesterday a node was not reachable. top showed that batmand was running at almost 100% load.
...
Have you seen this before?
No, not recently. I've seen something similar (see: https://list.open-mesh.net/pipermail/b.a.t.m.a.n/2008-January/000550.html), but I don't remember which PID caused the trouble. Do you have an idea how long the daemon had been running without interruption?
The PID which referenced the hanging thread seems to be the debug thread. Perhaps the problem was related to the bug which Sven Eckelmann identified recently. Revision 1123 for bmx has no dedicated thread for debugging anymore. So if this was the problem it should be fixed now.
greetings /axel
Hi Axel,
but I don't remember which PID caused the trouble. Do you have an idea how long the daemon had been running without interruption?
I cannot say exactly how long batman was running, but I think about two days. Other routers using the same firmware have been running for weeks, though.
The PID which referenced the hanging thread seems to be the debug thread. Perhaps the problem was related to the bug which Sven Eckelmann identified recently. Revision 1123 for bmx has no dedicated thread for debugging anymore. So if this was the problem it should be fixed now.
I will try it, but I don't think I will be able to confirm that it is working now, because the hanging batmand issue happened only once.
bye Stephan
b.a.t.m.a.n@lists.open-mesh.org