hi!
as openwrt 8.09.2 still ships with an old batman-adv 0.1 module, i tried to compile a batman-adv 0.2 module. the compile worked, the module loads, originators see each other, but on the openwrt box on bat0 tx packets stays 0 while tx dropped obviously increases with each packet to be transmitted.
the setup:
- laptop: debian squeeze amd64, 2.6.31.12, batman-adv 0.2
- laptop: debian sid x86, 2.6.32, batman-adv 0.2
- ap: openwrt 8.09.2 ixp4xx/armeb (cambria), 2.6.26.8, batman-adv 0.2
the facts:
- all bridges and iptables switched off.
- with plain ip on the wlan interfaces, pinging between all nodes works fine (when within reach).
- all three nodes have the respective two other nodes listed as originators, and if all are within reach of each other, with originator=nexthop.
- pinging via bat0 works between the two laptops.
- pinging the laptops via bat0 from the ap results in no packets seen on the laptops' bat0.
- pinging the ap via bat0 from a laptop results in incoming arp-requests and outgoing arp-replies seen on the ap's bat0 - but again, the arp-replies aren't seen on the laptops' bat0 (nor on the laptops' wlan interfaces).
- on the ap's bat0, the tx packets counter stays at 0, while the tx dropped counter seems to increase with each packet that should be sent over it.
i enabled all logging (15) on the ap and the laptops, but found no hint in there...
the only interesting messages seem to be in dmesg, saying: protocol 4305 is buggy, dev ath1
so to me it seems like all tx packets on bat0 on the ap are dropped, while everything else seems to work as it's supposed to.
i then tried to compile the current (r1568) version from svn for the ap. again, the compile worked, but the ap just freezes immediately when i try to load it.
i thought about trying a newer kernel for the ap, but from openwrt there's a special cambria kernel and i haven't found its config and also don't know what patches might have been applied, so i haven't had much hope for any helpful result along this path...
regards,
Chris
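For anyone who wants to watch the two counters described above without parsing ifconfig output, here is a minimal C sketch. It assumes the kernel exposes the usual sysfs statistics files under /sys/class/net/<interface>/statistics/ (this may not hold on every OpenWrt build); the helper names read_counter_file and read_if_counter are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Reads one decimal counter from a file such as
 * /sys/class/net/bat0/statistics/tx_dropped.
 * Returns 0 on success, -1 if the file is missing or unparsable. */
static int read_counter_file(const char *path, unsigned long long *value)
{
    assert(path != NULL && value != NULL);

    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    int ok = (fscanf(f, "%llu", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Builds the sysfs path for an interface counter (assumed layout, see
 * lead-in) and reads it. Example: read_if_counter("bat0", "tx_dropped", &v) */
static int read_if_counter(const char *ifname, const char *counter,
                           unsigned long long *value)
{
    char path[256];
    int n = snprintf(path, sizeof path,
                     "/sys/class/net/%s/statistics/%s", ifname, counter);

    if (n < 0 || (size_t)n >= sizeof path)
        return -1;
    return read_counter_file(path, value);
}
```

Calling read_if_counter() for "tx_packets" and "tx_dropped" on bat0 before and after a ping would show whether every outgoing packet is counted as dropped, as reported above.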
Hi Chris,
I hope it's okay that I'm attaching our chat log here: http://pastebin.org/89225 (it will be stored for a month). And just to point out the two captures on your router:
http://filebin.ca/hzoxmj (athX)
http://filebin.ca/xtwoa (bat0)
They seem to show quite well that batman-adv and/or the kernel drop the arp replies which the router wants to put into the bat0 interface, as you described below. I couldn't spot anything wrong in the second dump's arp replies though.
Has anyone else seen this "protocol 4305 is buggy, dev ath1" message before? I could only find 6-10 year old posts on mailing lists about this topic...
On Sun, Feb 07, 2010 at 09:54:38PM +0100, x@muc.ccc.de wrote:
i then tried to compile the current (r1568) version from svn for the ap. again, the compile worked, but the ap just freezes immediately when i try to load it.
I also tried some Debian stable versions with a 2.6.26 kernel, and you're right: in one of the last maintenance patches, a bug was introduced for kernel versions < 2.6.29. (I made another post with some call traces here: https://lists.open-mesh.org/pipermail/b.a.t.m.a.n/2010-February/002282.html)
Cheers, Linus
Something I just noticed... the router (B.A.T.M.A.N., Orig: Sparklan_1e:61:17 (00:0e:8e:1e:61:17)) is announcing the host 00:00:00:00:00:00, which is odd, isn't it? (see hzoxmj for the athX dump).
I also tried to dig a little deeper to see where this "protocol 4305 is buggy" error comes from. The source is net/core/dev.c, in dev_queue_xmit_nit(), with the following section (it hasn't been altered from 2.6.26 to 2.6.32):
------
        /* skb->nh should be correctly
           set by sender, so that the second statement is
           just protection against buggy protocols.
         */
        skb_reset_mac_header(skb2);

        if (skb_network_header(skb2) < skb2->data ||
            skb2->network_header > skb2->tail) {
                if (net_ratelimit())
                        printk(KERN_CRIT "protocol %04x is "
                               "buggy, dev %s\n",
                               skb2->protocol, dev->name);
                skb_reset_network_header(skb2);
        }
------
So only one of the two conditions in the if statement can cause it. Did we forget to set something in the sk_buff structure in batman-adv?
Cheers, Linus
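To make the quoted check easier to reason about, here is a small userspace C sketch of the same condition. It is only a model: struct mock_skb and network_header_is_bogus() are invented for this illustration and are not the kernel's struct sk_buff or any kernel API. As far as I know, 0x4305 is the ethertype batman-adv uses for its own frames, which would explain why that value shows up as the "protocol".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the sk_buff fields the check looks at.
 * NOT the kernel structure; field names are borrowed for readability. */
struct mock_skb {
    const unsigned char *data;           /* start of valid packet data */
    const unsigned char *tail;           /* end of valid packet data */
    const unsigned char *network_header; /* claimed start of the L3 header */
};

/* Mirrors the condition quoted from dev_queue_xmit_nit(): the network
 * header has to lie within [data, tail]. If it does not, the kernel
 * prints "protocol %04x is buggy" and resets the header. */
static bool network_header_is_bogus(const struct mock_skb *skb)
{
    assert(skb != NULL);
    return skb->network_header < skb->data ||
           skb->network_header > skb->tail;
}
```

In this model, a sender that moves data (for example by prepending its own header) without updating network_header accordingly would trip the first condition. Whether that is what happens in batman-adv 0.2 on 2.6.26 is an open question in this thread, not something the sketch proves.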
Okay, had a closer look at the 2.6.26-kernel issue. Could you please try the patch I've posted here: https://lists.open-mesh.org/pipermail/b.a.t.m.a.n/2010-February/002285.html This should fix the freezing on your router with r1568-maintenance at least.
It would be interesting to know, in case this patch works, whether you still get this "protocol 4305 is buggy" message with the current batman-adv version (though I doubt that the issue has vanished there).
Cheers, Linus
hi!
Linus Lüssing wrote:
please try the patch I've posted here:
i tested the patch, and the box doesn't freeze anymore when loading the module! then i bring up the interfaces (bat0 and ath1) - ok. then i echo ath1 >/proc/net/batman-adv/interfaces - then this happens:
------------[ cut here ]------------
WARNING: at net/core/dev.c:1454 ()
Modules linked in: batman_adv w1_therm cdc_ether usb_storage usbserial usbnet wire ehci_hcd ath_pci ath_hal(P) ip6t_mh nf_nat_tftp nf_conntrack_tftp nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_conntrack_ftp ipt_MASQUERADE iptable_nat nf_nat xt_state nf_conntrack_ipv4 nf_conntrack sd_mod ipt_REJECT xt_TCPMSS ipt_LOG xt_multiport xt_mac xt_limit iptable_mangle iptable_filter ip_tables xt_tcpudp x_tables tun ppp_async ppp_generic slhc crc_ccitt usbcore scsi_mod [last unloaded: batman_adv]
Function entered at [<c0023890>] from [<c0030234>]
Function entered at [<c00301f0>] from [<c013c348>]
 r6:00004305 r5:c7216d80 r4:c7216d80
Function entered at [<c013c2c0>] from [<c013c60c>]
 r7:c71bd000 r6:c7f997c0 r5:c7216d80 r4:c71bd000
Function entered at [<c013c484>] from [<c013f154>]
 r7:bf081ba0 r6:c7f997c0 r5:c7216d80 r4:c71bd000
Function entered at [<c013ef7c>] from [<bf078f8c>]
 r8:bf081ba0 r7:bf081ba0 r6:c7f997c0 r5:c7216d80 r4:c71efe60
Function entered at [<bf078ec8>] from [<bf07900c>]
 r7:c7f997c0 r6:c7f5b200 r5:00000016 r4:c7216d80
Function entered at [<bf078fa8>] from [<bf07917c>]
 r8:00000000 r7:00000000 r6:00000001 r5:c7f997c0 r4:c715dba0
Function entered at [<bf079010>] from [<c003f830>]
 r6:c00401a0 r5:c71e8000 r4:c7167cc0
Function entered at [<c003f784>] from [<c0040248>]
 r5:c7167cc0 r4:c7167cc8
Function entered at [<c00401a0>] from [<c00432b8>]
 r5:c7167cc0 r4:c71e8000
Function entered at [<c0043260>] from [<c0033214>]
 r6:00000000 r5:00000000 r4:00000000
---[ end trace da97f061c7bedcb9 ]---
Unable to handle kernel paging request at virtual address 6e2d6164
pgd = c0004000
[6e2d6164] *pgd=00000000
Internal error: Oops: f5 [#1]
Modules linked in: batman_adv w1_therm cdc_ether usb_storage usbserial usbnet wire ehci_hcd ath_pci ath_hal(P) ip6t_mh nf_nat_tftp nf_conntrack_tftp nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_conntrack_ftp ipt_MASQUERADE iptable_nat nf_nat xt_state nf_conntrack_ipv4 nf_conntrack sd_mod ipt_REJECT xt_TCPMSS ipt_LOG xt_multiport xt_mac xt_limit iptable_mangle iptable_filter ip_tables xt_tcpudp x_tables tun ppp_async ppp_generic slhc crc_ccitt usbcore scsi_mod [last unloaded: batman_adv]
CPU: 0 Tainted: P W (2.6.26.8 #5)
pc : [<c005e634>] lr : [<c0136788>] psr: 80000013
sp : c71e9e8c ip : c71e9e9c fp : c71e9e98
r10: 00000000 r9 : 00000000 r8 : bf081ba0
r7 : c71bd000 r6 : c7f997c0 r5 : c7216d80 r4 : 00000001
r3 : 0000ffff r2 : c71efe60 r1 : c71efe1e r0 : 6e2d6164
Flags: Nzcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel
Control: 000039ff Table: 07ed8000 DAC: 00000017
Process bat_events (pid: 3001, stack limit = 0xc71e8260)
Stack: (0xc71e9e8c to 0xc71ea000)
9e80: c71e9eb0 c71e9e9c c0136788 c005e634 c7216d80
9ea0: c7216d80 c71e9ec4 c71e9eb4 c0136e98 c0136724 c7216d80 c71e9ed8 c71e9ec8
9ec0: c0136598 c0136e00 c71bd000 c71e9ee8 c71e9edc c0136694 c0136590 c71e9f10
9ee0: c71e9eec c013c6ec c0136654 c0136724 c7216d80 c71bd000 c7216d80 c7f997c0
9f00: bf081ba0 c71e9f34 c71e9f14 c013f154 c013c490 c71efe60 c7216d80 c7f997c0
9f20: bf081ba0 bf081ba0 c71e9f54 c71e9f38 bf078f8c c013ef88 c7216d80 00000016
9f40: c7f5b200 c7f997c0 c71e9f78 c71e9f58 bf07900c bf078ed4 c715dba0 c7f997c0
9f60: 00000001 00000000 00000000 c71e9f94 c71e9f7c bf07917c bf078fb4 c7167cc0
9f80: c71e8000 c00401a0 c71e9fac c71e9f98 c003f830 bf07901c c7167cc8 c7167cc0
9fa0: c71e9fd8 c71e9fb0 c0040248 c003f790 00000000 c7f6b200 c00435f4 c71e9fbc
9fc0: c71e9fbc c71e8000 c7167cc0 c71e9ff4 c71e9fdc c00432b8 c00401ac 00000000
9fe0: 00000000 00000000 00000000 c71e9ff8 c0033214 c004326c 00000000 00000000
Backtrace:
Function entered at [<c005e628>] from [<c0136788>]
Function entered at [<c0136718>] from [<c0136e98>]
 r5:c7216d80 r4:c7216d80
Function entered at [<c0136df4>] from [<c0136598>]
 r4:c7216d80
Function entered at [<c0136584>] from [<c0136694>]
 r4:c71bd000
Function entered at [<c0136648>] from [<c013c6ec>]
Function entered at [<c013c484>] from [<c013f154>]
 r7:bf081ba0 r6:c7f997c0 r5:c7216d80 r4:c71bd000
Function entered at [<c013ef7c>] from [<bf078f8c>]
 r8:bf081ba0 r7:bf081ba0 r6:c7f997c0 r5:c7216d80 r4:c71efe60
Function entered at [<bf078ec8>] from [<bf07900c>]
 r7:c7f997c0 r6:c7f5b200 r5:00000016 r4:c7216d80
Function entered at [<bf078fa8>] from [<bf07917c>]
 r8:00000000 r7:00000000 r6:00000001 r5:c7f997c0 r4:c715dba0
Function entered at [<bf079010>] from [<c003f830>]
 r6:c00401a0 r5:c71e8000 r4:c7167cc0
Function entered at [<c003f784>] from [<c0040248>]
 r5:c7167cc0 r4:c7167cc8
Function entered at [<c00401a0>] from [<c00432b8>]
 r5:c7167cc0 r4:c71e8000
Function entered at [<c0043260>] from [<c0033214>]
 r6:00000000 r5:00000000 r4:00000000
Code: e89da800 e1a0c00d e92dd800 e24cb004 (e5903000)
Kernel panic - not syncing: Fatal exception in interrupt
hope this helps in some way.
regards,
Chris
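One cheap thing to try on an oops like the one above is to read the faulting address as text, since a pointer overwritten by string data often looks like printable ASCII. The C sketch below does this for a 32-bit value in big-endian byte order (the box is armeb); addr_to_ascii_be is a name made up for this illustration. This is only a heuristic and may well be coincidence.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Writes the four bytes of addr, most significant byte first (the order in
 * which they sit in memory on a big-endian machine), into out[0..3] and
 * NUL-terminates the result. Non-printable bytes are shown as '.'. */
static void addr_to_ascii_be(uint32_t addr, char out[5])
{
    assert(out != NULL);

    for (int i = 0; i < 4; i++) {
        unsigned char b = (unsigned char)(addr >> (24 - 8 * i));
        out[i] = isprint(b) ? (char)b : '.';
    }
    out[4] = '\0';
}
```

For the faulting address 6e2d6164 (also the value in r0) this yields "n-ad", which happens to be a substring of "batman-adv". That might hint at a string being used where a pointer was expected, but it would need to be confirmed in the source.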
Argh, okay. I guess you already had batman-adv running on your laptops and the wifi interfaces up when starting batman-adv on your router, right?
Could you have a look whether this kernel panic also occurs when you start batman-adv on your router first? I'd like to know if this might be related to the bug stated here [0].
If I remember right, then you were saying that you're using the 8.09.2 release of OpenWRT. Otherwise you could have activated "Compile the kernel with symbol table information" under "make menuconfig" --> "Global build settings" to add some function names to those nice numbers :)
Or does anyone of the others know if/how ksymoops could be useful/used in this situation to extract the function names afterwards?
Cheers, Linus
[0]: https://lists.open-mesh.org/pipermail/b.a.t.m.a.n/2010-February/002281.html
On Thursday 11 February 2010 10:39:09 Linus Lüssing wrote:
If I remember right, then you were saying that you're using the 8.09.2 release of OpenWRT. Otherwise you could have activated "Compile the kernel with symbol table information" under "make menuconfig" --> "Global build settings" to add some function names to those nice numbers :)
You can activate these settings manually: make kernel_menuconfig ---> General setup ---> Configure standard kernel features (for small systems) ---> Load all symbols for debugging/ksymoops
Regards, Marek
hi!
Linus Lüssing wrote:
Argh, okay. I guess you already had batman-adv running on your laptops and the wifi interfaces up when starting batman-adv on your router, right?
possible, i did not check whether bat0/wlan0 was up or down on my laptop (2nd one wasn't running) when i loaded the batman-adv r1568+patch module. so i tried it again, after making sure the laptop (and nothing else) was running batman-adv. but immediately after echo ath1 >/proc/net/batman-adv/interfaces the kernel panicked again. this time, right before the panic, i also got:
foo:<6>batman-adv:Adding interface: ath1
foo:<6>batman-adv:Interface activated: ath1
(i didn't remove the foo: when patching, but i did add the ifdef printk undef printk).
If I remember right, then you were saying that you're using the 8.09.2 release of OpenWRT. Otherwise you could have activated "Compile the kernel with symbol table information" under "make menuconfig" --> "Global build settings" to add some function names to those nice numbers :)
i guess this needs a kernel compile, just compiling the module after setting these options will probably not work or not help? i can't compile a kernel for the box because i wasn't able to find the config and possible patches used to build the cambria kernel, so i'm currently kind of stuck with the released 8.09.2 binary kernel...
regards,
Chris
On Thursday 11 February 2010 22:00:34 x@muc.ccc.de wrote:
i guess this needs a kernel compile, just compiling the module after setting these options will probably not work or not help? i can't compile a kernel for the box because i wasn't able to find the config and possible patches used to build the cambria kernel, so i'm currently kind of stuck with the released 8.09.2 binary kernel...
You can try only recompiling the module. Not sure whether it will work or not. Otherwise we have to fallback to the old-style (printk) debugging. :-)
Cheers, Marek
hi!
Marek Lindner wrote:
You can try only recompiling the module. Not sure whether it will work or not. Otherwise we have to fallback to the old-style (printk) debugging. :-)
ok, tried that, the resulting module actually grew by 32B, but still just addresses, no names, sorry...
(a short excerpt:
Function entered at [<c0023890>] from [<c0030234>]
Function entered at [<c00301f0>] from [<c013c348>]
 r6:00004305 r5:c7001240 r4:c7001240
Function entered at [<c013c2c0>] from [<c013c60c>]
 r7:c71db000 r6:c7d47380 r5:c7001240 r4:c71db000
in case the full output might still help, just drop me a note)
iirc it wasn't easy to reproduce the setup - so if i can help by applying a patch, compile, and test, please just tell me what to do :)
regards,
Chris
Ehm, and another question. You said that there's no bridging involved. However, I see two mac addresses being announced by one of your laptops (B.A.T.M.A.N., Orig: Azurewav_8b:81:18 (00:25:d3:8b:81:18)):
- B.A.T.M.A.N. HNA: 0a:51:61:16:a1:51 (0a:51:61:16:a1:51)
- B.A.T.M.A.N. HNA: 26:46:d1:f8:8c:54 (26:46:d1:f8:8c:54)
Any idea where the first one might come from?
Cheers, Linus
hi!
Linus Lüssing wrote:
Ehm, and another question. You said, that there's no bridging involved. However I see two mac addresses being announced by one of your laptops (B.A.T.M.A.N., Orig: Azurewav_8b:81:18 (00:25:d3:8b:81:18)):
- B.A.T.M.A.N. HNA: 0a:51:61:16:a1:51 (0a:51:61:16:a1:51)
- B.A.T.M.A.N. HNA: 26:46:d1:f8:8c:54 (26:46:d1:f8:8c:54)
Any idea where the first one might come from?
hmm, can't find any of these macs in the setup. i checked the prefixes, and both ouis are local. so i guess they're the macs of bat0 interfaces (and which changed over reboots). one of them probably the local interface, and the other one probably a batman-switched one of the other laptop's bat0 (i tested both with all nodes within each other's reach, and with two nodes not within each other's reach)?
regards,
Chris
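The "both ouis are local" check can be done mechanically: a MAC address is locally administered when bit 0x02 of its first octet is set. A minimal C sketch follows; the function name mac_is_locally_administered is made up for this illustration, and it only parses the first octet of a colon-separated address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Returns 1 if the MAC address string (e.g. "0a:51:61:16:a1:51") has the
 * locally administered bit set in its first octet, 0 if it is a globally
 * unique (vendor-assigned) address, and -1 if the first octet cannot be
 * parsed as hex. */
static int mac_is_locally_administered(const char *mac)
{
    unsigned int first_octet = 0;

    assert(mac != NULL);
    if (sscanf(mac, "%2x", &first_octet) != 1)
        return -1;
    return (first_octet & 0x02u) ? 1 : 0;
}
```

Both announced HNA addresses (0a:... and 26:...) have the bit set, while the originator addresses 00:25:d3:8b:81:18 and 00:0e:8e:1e:61:17 do not, which is consistent with the HNA entries being generated addresses such as those of bat0 interfaces.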
b.a.t.m.a.n@lists.open-mesh.org