I have a B.A.T.M.A.N. mesh successfully configured on x86 hardware.
I can currently ping the internet from the router that is connected via the mesh, but I can't download anything or connect to any web pages.
I see the request for the page going over the bat0 interface and being received at the other end.
I have a feeling this is a basic NAT problem, but I have been unable to find a solution.
I am using OpenWrt.
Any advice or hints?
Thanks
Hi,
I see the request for the page going over the bat0 interface and being received at the other end.
I've a feeling this is a basic nat problem, but have been unable to find a solution.
How is the routing (especially bat0) configured on the node that is connected to the internet? Are you bridging bat0 or doing something else?
Regards, Marek
The bat0 interface is currently bridged as follows:
On the router with internet going into the WAN port (ETH1), BAT0 is bridged with the LAN (ETH0). Internet works correctly through the LAN, so I would assume NAT routing to the WAN is set up correctly?
On the router on the other end of the mesh, BAT0 is bridged with the WAN port (ETH1), so that the router will route WAN traffic through BAT0?
Have I got some of the basics confused somewhere?
I had BATMAN (not advanced) working fine, although I was experiencing what I think is a known problem with OpenWrt, where the gate0 interface would collapse when traffic was sent over the TUN interface for the captive portal. Let's stay on topic with batman advanced for now though, I guess :-)
David Beaumont wrote:
The bat0 interface is currently bridged as follows:
On the router with internet going into the WAN port (ETH1), BAT0 is bridged with the LAN (ETH0). Internet works correctly through the LAN, so I would assume NAT routing to the WAN is set up correctly?
So you have eth0 and bat0(ath0) in a bridge br0 -> eth1 uses NAT to translate packets for/from eth1?
On the router on the other end of the mesh, BAT0 is bridged with the WAN port (ETH1), so that the router will route WAN traffic through BAT0?
What? Why a different internet gateway and why do you bridge the bat0(ath0) interface with the wan interface instead of using nat here?
Maybe there is a misunderstanding somewhere... At least I am extremely confused about your setup.
Best regards, Sven
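For reference, the gateway layout being asked about here (LAN port and bat0 in one bridge, NAT only on the WAN port) could be sketched roughly as follows. This is an illustration only: the interface and bridge names are the ones mentioned in this thread, the function name is made up, and on OpenWrt the network and firewall config would normally set this up rather than hand-run commands. The function just prints the commands so they can be reviewed before anything is executed.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the commands for a gateway node
# where the LAN port and bat0 share one bridge and the WAN port is NATed.
# Interface names are taken from the thread; adjust them for your hardware.
print_gateway_cmds() {
    bridge="$1"   # e.g. br-lan
    lan_if="$2"   # e.g. eth0
    mesh_if="$3"  # e.g. bat0
    wan_if="$4"   # e.g. eth1

    echo "brctl addbr $bridge"
    echo "brctl addif $bridge $lan_if"
    echo "brctl addif $bridge $mesh_if"
    # Masquerade only the traffic that leaves through the WAN port.
    echo "iptables -t nat -A POSTROUTING -o $wan_if -j MASQUERADE"
    # The kernel has to forward packets between the bridge and the WAN port.
    echo "echo 1 > /proc/sys/net/ipv4/ip_forward"
}

print_gateway_cmds br-lan eth0 bat0 eth1
```

The mesh-side node would then simply be another member of the bridged network reached over bat0, with no second NAT layer of its own.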
The bat0 interface is currently bridged as follows:
On the router with internet going into the WAN port (ETH1), BAT0 is bridged with the LAN (ETH0). Internet works correctly through the LAN, so I would assume NAT routing to the WAN is set up correctly?
So you have eth0 and bat0(ath0) in a bridge br0 -> eth1 uses NAT to translate packets for/from eth1?
Yes, this is correct: bat0 and eth0 are bridged to form br-lan, which uses NAT to translate things over to eth1.
On the router on the other end of the mesh, BAT0 is bridged with the WAN port (ETH1), so that the router will route WAN traffic through BAT0?
What? Why a different internet gateway and why do you bridge the bat0(ath0) interface with the wan interface instead of using nat here?
Not sure what you mean by "a different internet gateway".
The way OpenWrt is set up, the WAN network is actually a bridge of bat0 and wifi0 (which is the card that bat0 is running on). Talking this through, it looks like this might be where the problem lies. It is currently set up like this because I was initially trying to run OLSR and BATMAN simultaneously (although I have since simplified to just BATMAN).
My theory was that if bat0 is the WAN, the NAT rules built into OpenWrt would take control and internet traffic would be routed through bat0.
I have also attempted to set up NAT myself at this end, but so far without success, and I lose the ability to even ping the internet.
I think I am close to the solution, as I can ping things; there is just something missing.
Will try un-bridging wifi0 and report on what happens.
Maybe there is a misunderstanding somewhere... At least I am extremely confused about your setup.
Best regards, Sven
No luck :-(
root@Generic:/# cat /proc/net/batman-adv/originators
Originator (#/255) Nexthop [outgoingIF]: Potential nexthops .. . [B.A.T.M.A.N. Adv 0.1 rv1176, MainIF/MAC: ath0/00:0c:42:60:12:cf]
00:0c:42:3a:75:a2 (229) 00:0c:42:3a:75:a2 [ ath0]: 00:0c:42:3a:75:a2 (229)

root@Generic:/# ping google.com
PING google.com (74.125.39.105): 56 data bytes
64 bytes from 74.125.39.105: seq=0 ttl=53 time=124.425 ms
^C
--- google.com ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 124.425/124.425/124.425 ms
root@Generic:/# wget http://www.google.com
Connecting to www.google.com (209.85.135.147:80)
Pings work fine, a wget just pauses and waits indefinitely
On Thursday 12 August 2010 12:16:29 David Beaumont wrote:
nexthops .. . [B.A.T.M.A.N. Adv 0.1 rv1176, MainIF/MAC:
FYI: You use a very old version of batman-adv ...
Pings work fine, a wget just pauses and waits indefinitely
That sounds like an MTU problem. What is the MTU of your wifi interfaces? Here is a document describing the issue: http://www.open-mesh.org/wiki/batman-adv-quick-start-guide
Cheers, Marek
I originally had my MTU values set incorrectly, but I believe I have them correct now.
Router with internet:
ath0 - MTU 1524 (physical wifi device)
bat0 - MTU 1500
br-lan - MTU 1500 (bridged with bat0/eth1)
wifi0 - MTU 1500 (should this be 1524 also?)
Router on the other side of the mesh:
ath0 - MTU 1524 (physical wifi device)
bat0 - MTU 1500
br-lan - MTU 1500
wifi0 - MTU 1500 (should this be 1524 also?)
I am aware that it is quite an old version, although this is the version that is currently in the OpenWrt repo. I will compile a more recent version once I have the basics working, I think.
Thanks for your help so far guys. Two people have suggested MTU issues now, so fingers crossed someone can see something wrong with the output above?
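The values above imply 24 bytes of headroom (1524 on ath0 for 1500 on bat0). A small sketch of that check, assuming the encapsulation overhead is passed in by the caller rather than hard-coded, since it may differ between batman-adv versions; the function name is made up for illustration.

```shell
#!/bin/sh
# Check that a hard interface MTU leaves room for the mesh encapsulation.
# usage: mtu_ok HARDIF_MTU SOFTIF_MTU OVERHEAD
# OVERHEAD is an assumption supplied by the caller; 24 simply matches the
# 1524/1500 pair quoted in this thread.
mtu_ok() {
    hard_mtu="$1"
    soft_mtu="$2"
    overhead="$3"
    if [ "$hard_mtu" -ge $((soft_mtu + overhead)) ]; then
        echo "ok: $hard_mtu >= $soft_mtu + $overhead"
    else
        echo "too small: $hard_mtu < $soft_mtu + $overhead"
        return 1
    fi
}

mtu_ok 1524 1500 24
```

On a live node the first argument could be read from the interface, for example `mtu_ok "$(cat /sys/class/net/ath0/mtu)" 1500 24`.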
On Thursday 12 August 2010 12:41:41 David Beaumont wrote:
I originally had my MTU values set incorrectly, but I believe I have them correct now.
Router with internet:
ath0 - MTU 1524 (physical wifi device)
Router on the other side of the mesh:
ath0 - MTU 1524 (physical wifi device)
Yes, that looks good. The problems you describe sound like MTU or firewall issues, which is why I was asking.
wifi0 - MTU 1500 (should this be 1524 also?)
You should ignore the wifi0 interface and not include it in any configuration. It is of no relevance in your setup.
I am aware that it is quite an old version, although this is the version that is currently in the OpenWrt repo. I will compile a more recent version once I have the basics working, I think.
Ok.
Thanks for your help so far guys. Two people have suggested MTU issues now, so fingers crossed someone can see something wrong with the output above?
Could you post the config of both nodes? That includes:
* ifconfig
* brctl show
* ip route / route -n
* cat /proc/net/batman-adv/interfaces
* iptables (-t nat) -vnL [on the internet-router]
Regards, Marek
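The list above can be gathered in one go with something like the following sketch. It only runs the commands named in the list; the function name and the section labels are made up, and commands that are missing or fail (iptables without root, for instance) are noted and skipped so the rest of the report is still produced.

```shell
#!/bin/sh
# Run each requested diagnostic and label its output.
collect_diag() {
    for cmd in \
        "ifconfig" \
        "brctl show" \
        "ip route" \
        "cat /proc/net/batman-adv/interfaces" \
        "iptables -t nat -vnL"
    do
        echo "== $cmd =="
        # $cmd is left unquoted on purpose so it splits into command + args;
        # every entry above is a fixed string, not user input.
        $cmd 2>&1 || echo "(command failed or not available)"
        echo
    done
}

collect_diag
```

Running `collect_diag > net_diag.txt` on the internet node and `collect_diag > mesh_diag.txt` on the mesh node gives two files that can be attached to a reply.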
Hopefully the attachments come through OK?
net_ is from the router connected to the internet; mesh_ is from the other side of the mesh.
David Beaumont wrote:
Hopefully attachments come though ok?
net_ is from the router connected to the internet mesh_ is the other side of the mesh
About the mesh node:
* Why does ath0 have an IP address, which also conflicts with the IP range of bat0 and br-lan?
* Why does bat0 have an IP address when it is part of br-wan?
* Why does the ath0 device have iptables entries?
About the net node:
* Why does bat0 have an IP address when it is part of br-lan?
Why don't I see masquerade anywhere in the iptables output (-t nat)?
Best regards, Sven
It does appear that I have got somewhat confused with my IP ranges and addresses. Let me try to clear that up now, as ath0 and bat0 certainly don't need an IP address.
Sorry for my oversight on this; I've gotten myself into a bit of a mess trying to resolve this, by the looks of things.
Ah, sorry, I missed the NAT information. Here it is:
mesh_

Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
MASQUERADE  all  --  anywhere             anywhere

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Chain luci_splash_leases (1 references)
target     prot opt source               destination
REDIRECT   tcp  --  anywhere             anywhere            tcp dpt:80 redir ports 8082
DROP       all  --  anywhere             anywhere

Chain luci_splash_portal (0 references)
target     prot opt source               destination
RETURN     udp  --  anywhere             anywhere            udp dpts:33434:33523
RETURN     icmp --  anywhere             anywhere
RETURN     udp  --  anywhere             anywhere            udp dpt:53
luci_splash_leases  all  --  anywhere             anywhere

Chain luci_splash_prerouting (0 references)
target     prot opt source               destination

Chain natfix_ath0 (0 references)
target     prot opt source               destination
ACCEPT     all  --  10.0.0.0/8           10.0.0.0/8

Chain natfix_br-lan (0 references)
target     prot opt source               destination
ACCEPT     all  --  10.2.4.0/24          10.2.4.0/24

Chain natfix_br-wan (0 references)
target     prot opt source               destination
ACCEPT     all  --  192.168.1.0/24       192.168.1.0/24

net_

Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
zone_wan_prerouting  all  --  anywhere             anywhere
zone_lan_prerouting  all  --  anywhere             anywhere
prerouting_rule  all  --  anywhere             anywhere

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
postrouting_rule  all  --  anywhere             anywhere
zone_wan_nat  all  --  anywhere             anywhere

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Chain postrouting_rule (1 references)
target     prot opt source               destination

Chain prerouting_lan (1 references)
target     prot opt source               destination

Chain prerouting_rule (1 references)
target     prot opt source               destination

Chain prerouting_wan (1 references)
target     prot opt source               destination

Chain zone_lan_nat (0 references)
target     prot opt source               destination
MASQUERADE  all  --  anywhere             anywhere

Chain zone_lan_prerouting (1 references)
target     prot opt source               destination
prerouting_lan  all  --  anywhere             anywhere

Chain zone_wan_nat (1 references)
target     prot opt source               destination
MASQUERADE  all  --  anywhere             anywhere

Chain zone_wan_prerouting (1 references)
target     prot opt source               destination
prerouting_wan  all  --  anywhere             anywhere
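
To answer the masquerade question quickly on either node, a filter along these lines can be run over the nat table listing. It is a rough sketch: the function name is made up, and it assumes one rule per line with the target in the first column, as printed by `iptables -t nat -nL` without `-v` (with `-v` the counters come first and it will not match).

```shell
#!/bin/sh
# Read `iptables -t nat -nL` style output on stdin and list every chain
# that contains a MASQUERADE rule, together with the text in parentheses
# from its "Chain ..." header (policy or reference count).
masquerade_chains() {
    awk '
        /^Chain / {
            chain = $2
            info = $0
            sub(/^[^(]*\(/, "", info)   # drop everything up to the opening parenthesis
            sub(/\).*$/, "", info)      # drop the closing parenthesis and the rest
        }
        $1 == "MASQUERADE" { print chain ": " info }
    '
}

# Sample input copied from the net_ listing in this thread.
masquerade_chains <<'EOF'
Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
zone_wan_nat  all  --  anywhere             anywhere

Chain zone_lan_nat (0 references)
target     prot opt source               destination
MASQUERADE  all  --  anywhere             anywhere

Chain zone_wan_nat (1 references)
target     prot opt source               destination
MASQUERADE  all  --  anywhere             anywhere
EOF
```

A chain reported with "0 references" is never jumped to, so its MASQUERADE rule has no effect. On a live node the usage would be `iptables -t nat -nL | masquerade_chains`.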
I have removed the extra IP addresses that were not needed and tried to simplify a few other things; however, I am still in the same position where I can ping but not transfer.
Can anybody see anything wrong with my NAT rules that could be causing this?
On Thursday 12 August 2010 15:14:08 David Beaumont wrote:
I have removed the extra IP addresses that were not needed and tried to simplify a few other things; however, I am still in the same position where I can ping but not transfer.
Can anybody see anything wrong with my NAT rules that could be causing this?
Would you mind posting your new settings?
Regards, Marek
Here are the settings on the mesh node; I will include the internet node in the next post.
Thank you for taking the time to try and work through this with me.
internet node settings
Dave
Anything that looks obviously out of place here?
On Friday 13 August 2010 07:45:07 David Beaumont wrote:
Anything that looks obviously out of place here?
I see nothing obviously wrong here. The best next step is using tcpdump / batctl / wireshark to find out where the packets get dropped. Feel free to share the result with us. :-)
Regards, Marek
Sorry for the late reply; a few things came up over the weekend that I had to attend to.
Here are three tcpdump files from the internet node on bat0 and one on eth1 (the internet port).
Really don't understand what is wrong here :-(
David Beaumont wrote:
Sorry for the late reply; a few things came up over the weekend that I had to attend to.
Here are three tcpdump files from the internet node on bat0 and one on eth1 (the internet port).
Really don't understand what is wrong here :-(
Ok, test plan:
* Find the machine and interface where a response from google.com could be received but which will not forward it to the other interface
* Take a real dump on all interfaces (wan and lan):
  tcpdump -s 0 -i eth1 -w eth1.dump
* When the response packet is forwarded over the lan/bat0 interface but doesn't get to the final machine, then please also create a tcpdump on the receiving machine (real interface and maybe bat0)
* Go to your router and check the MTU of your wan interface
* Try to ping google.com with the maximum size (MTU - 28 bytes, for example MTU 1492):
  ping -M do -s 1464 google.com
* Send a small TCP packet with a small TCP response:
  echo "HEAD / HTTP/1.1\nHost: git.open-mesh.net\n\n"|nc git.open-mesh.net 80
Best regards, Sven
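The "MTU - 28 bytes" figure in the plan is the 20-byte IPv4 header plus the 8-byte ICMP header. A minimal sketch of that arithmetic, with made-up function names; the TCP figure assumes plain 20-byte IPv4 and TCP headers with no options.

```shell
#!/bin/sh
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header.
max_ping_payload() {
    echo $(( $1 - 28 ))
}

# Largest TCP segment payload, assuming 20-byte IPv4 and TCP headers
# without options.
max_tcp_mss() {
    echo $(( $1 - 40 ))
}

for mtu in 1492 1500; do
    echo "mtu $mtu: ping -M do -s $(max_ping_payload "$mtu"), tcp mss $(max_tcp_mss "$mtu")"
done
```

If the largest ping for the WAN MTU passes but TCP transfers still stall, comparing the segment sizes seen in the tcpdump captures against the second figure may help narrow down where packets are lost.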
The plot thickens..
I started producing the tcpdumps that you requested and noticed the following.
On the main internet node, if I ping google.com everything is fine. However, if I ping -s 1464 google.com I do not get a reply, and this isn't even going over the batman interface. So it looks like I have more of a local problem.
To clarify:
ping -s 1464 google.com
results in ping requests being sent and received on ETH1, but not being returned to br-lan
ping google.com
results in ping requests being sent and received on ETH1, and being returned on br-lan
ping -s 84 google.com will work
ping -s 85 google.com will not work.
I've never encountered these issues before, but I think they are the root cause of my problem? As was initially suggested, it looks like an MTU issue; I just need to find where!
echo "HEAD / HTTP/1.1\nHost: git.open-mesh.net\n\n"|nc git.open-mesh.net 80
from the mesh node brings no results, although it works as expected on the internet node.
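One thing that may be worth ruling out with this test: whether echo expands \n depends on the shell, so the request is not necessarily sent the way it is written. A sketch using printf instead, which expands escapes in any POSIX shell, with the CRLF line endings HTTP expects. The function name is made up; the host is the one from the test plan.

```shell
#!/bin/sh
# Build a minimal HTTP/1.1 HEAD request for the given host.
# "Connection: close" asks the server to close the connection after
# answering, so nc does not sit waiting on a kept-alive connection.
head_request() {
    printf 'HEAD / HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n' "$1"
}

# Show the request with control characters made visible.
head_request git.open-mesh.net | od -c
```

The actual test would then be `head_request git.open-mesh.net | nc git.open-mesh.net 80`, run once on the internet node and once on the mesh node.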
I've been busy trying to track down these ping issues, and it appears to be a problem with the actual ping program supplied with OpenWrt rather than a network problem.
I now get the following results from the mesh router:

root@Generic:~# /usr/bin/ping -M do -s 1472 google.com
PING google.com (74.125.39.105) 1472(1500) bytes of data.
72 bytes from fx-in-f105.1e100.net (74.125.39.105): icmp_seq=1 ttl=53 (truncated)
72 bytes from fx-in-f105.1e100.net (74.125.39.105): icmp_seq=2 ttl=53 (truncated)

root@Generic:~# /usr/bin/ping -M do -s 1473 google.com
PING google.com (74.125.39.99) 1473(1501) bytes of data.
From 192.168.1.123 icmp_seq=1 Frag needed and DF set (mtu = 1500)
From 192.168.1.123 icmp_seq=1 Frag needed and DF set (mtu = 1500)

So large pings appear to be going over the batman interface.
However, I am still not getting any web traffic through:

root@Generic:~# echo "HEAD / HTTP/1.1\nHost: git.open-mesh.net\n\n"|nc git.open-mesh.net 80
root@Generic:~# wget http://www.google.com
Connecting to www.google.com (74.125.39.104:80)

What else can I provide to help track down the problem here? :-(
Dave
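When repeating this test from several nodes, the MTU reported in the "Frag needed" errors can be pulled out of the ping output with a small filter such as the one below. It is only a sketch with a made-up function name, and it assumes the error wording shown in the output above.

```shell
#!/bin/sh
# Read ping output on stdin and print the MTU reported in
# "Frag needed and DF set (mtu = N)" errors, one line per distinct value.
frag_needed_mtu() {
    sed -n 's/.*Frag needed and DF set (mtu = \([0-9][0-9]*\)).*/\1/p' | sort -un
}

# Sample input copied from the output in this thread.
frag_needed_mtu <<'EOF'
PING google.com (74.125.39.99) 1473(1501) bytes of data.
From 192.168.1.123 icmp_seq=1 Frag needed and DF set (mtu = 1500)
From 192.168.1.123 icmp_seq=1 Frag needed and DF set (mtu = 1500)
EOF
```

A value lower than expected on one node would point at the hop that sent the error, which is the address after "From" in the ping output.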
Sorry previous message got cut off...
I've been busy trying to track down these ping issues and it appears to be a problem with the actual ping program supplied with open rather than a network problem.
I know get the following results from the mesh router
Generic:~# /usr/bin/ping -M do -s 1472 google.com PING google.com (74.125.39.105) 1472(1500) bytes of data. 72 bytes from fx-in-f105.1e100.net (74.125.39.105): icmp_seq=1 ttl=53 (truncated) 72 bytes from fx-in-f105.1e100.net (74.125.39.105): icmp_seq=2 ttl=53 (truncated)
Generic:~# /usr/bin/ping -M do -s 1473 google.com PING google.com (74.125.39.99) 1473(1501) bytes of data.
From 192.168.1.123 icmp_seq=1 Frag needed and DF set (mtu = 1500) From 192.168.1.123 icmp_seq=1 Frag needed and DF set (mtu = 1500)
So large pings appear to be going over the batman interface.
However still not getting any web traffic through
root@Generic:~# echo "HEAD / HTTP/1.1\nHost: git.open-mesh.net\n\n"|nc git.open-mesh.net 80
root@Generic:~# wget http://www.google.com Connecting to www.google.com (74.125.39.104:80)
What else can i provide to help track down the problem here :-(
Dave
On Tue, Aug 17, 2010 at 11:14 AM, David Beaumont djb31st@gmail.com wrote:
The plot thickens..
i started producing the tcp dumps that you requested to take a look at and noticed the following.
On the main internet node, if i ping google.com everything is fine. However if i ping -s 1464 google.com i do not get a reply, this isn't even going over the batman interface. So it looks like i have more of a local problem.
To clarify
ping -s 1464 google.com
results in ping requests being sent and recieved on ETH1, but not being returned to br-lan
ping google.com
results in ping requests being sent and recieved on ETH1, and being returned on br-lan
ping -s 84 google.com will work ping -s 85 google.com will not work.
I've never encountered these issues before, but i think they are the route cause of my problem? As was initially stated an MTU issue, i just need to find where!
echo "HEAD / HTTP/1.1\nHost: git.open-mesh.net\n\n"|nc git.open-mesh.net 80
from the mesh node brings no results, although works as expected on the internet node.
On Mon, Aug 16, 2010 at 7:32 PM, Sven Eckelmann sven.eckelmann@gmx.de wrote:
David Beaumont wrote:
Sorry for the late reply, a few things came up over the weekend that i had to attend to.
Here are three tcp dump files from the internet node on bat0 and one on eth1 (the internet port)
Really don't understand what is wrong here :-(
Ok, test plan:
* Find the machine and interface were a response from google.com could be received but which will not forward it to the other interface * take a real dump on all interfaces (wan and lan) tcpdump -s 0 -i eth1 -w eth1.dump * when the response packet is forwarded over the lan/bat0 interface but doesn't get to the final machine than please also create a tcpdump on the receiving machine (real interface and maybe bat0) * Go to your router and check mtu of your wan interface * Try to ping google.com with the maximum size (mtu - 28 bytes, for example mtu 1492): ping -M do -s 1464 google.com * Send small tcp packet with small tcp response: echo "HEAD / HTTP/1.1\nHost: git.open-mesh.net\n\n"|nc git.open-mesh.net 80
Best regards, Sven
So large pings appear to be going over the batman interface.
However still not getting any web traffic through
root@Generic:~# echo "HEAD / HTTP/1.1\nHost: git.open-mesh.net\n\n"|nc git.open-mesh.net 80
root@Generic:~# wget http://www.google.com Connecting to www.google.com (74.125.39.104:80)
What else can i provide to help track down the problem here :-(
Dave
On Fri, Aug 20, 2010 at 12:57 PM, David Beaumont djb31st@gmail.com wrote:
Sorry previous message got cut off...
I've been busy trying to track down these ping issues and it appears to be a problem with the actual ping program supplied with open rather than a network problem.
I know get the following results from the mesh router
Generic:~# /usr/bin/ping -M do -s 1472 google.com PING google.com (74.125.39.105) 1472(1500) bytes of data. 72 bytes from fx-in-f105.1e100.net (74.125.39.105): icmp_seq=1 ttl=53 (truncated) 72 bytes from fx-in-f105.1e100.net (74.125.39.105): icmp_seq=2 ttl=53 (truncated)
Generic:~# /usr/bin/ping -M do -s 1473 google.com PING google.com (74.125.39.99) 1473(1501) bytes of data. From 192.168.1.123 icmp_seq=1 Frag needed and DF set (mtu = 1500) From 192.168.1.123 icmp_seq=1 Frag needed and DF set (mtu = 1500)
So large pings appear to be going over the batman interface.
However still not getting any web traffic through
root@Generic:~# echo "HEAD / HTTP/1.1\nHost: git.open-mesh.net\n\n"|nc git.open-mesh.net 80
root@Generic:~# wget http://www.google.com
Connecting to www.google.com (74.125.39.104:80)
What else can i provide to help track down the problem here :-(
Dave
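One thing worth ruling out for the silent `nc` test: depending on the shell, `echo` may not expand `\n` unless `-e` is given, in which case the server never sees a complete request. The sketch below builds the same HEAD request with `printf`, which expands `\r\n` in any POSIX shell. The function name `head_request` and the added `Connection: close` header are assumptions for illustration, not something from the thread.

```shell
#!/bin/sh
# Build a minimal HTTP/1.1 HEAD request with explicit CRLF line endings.
# printf interprets \r and \n in its format string in every POSIX shell.
# "Connection: close" is an added assumption so the server closes the
# connection after answering and nc can exit.
head_request() {
    printf 'HEAD / HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n' "$1"
}

# Show the request text (carriage returns stripped for display only).
head_request git.open-mesh.net | tr -d '\r'

# On a node with network access the request would be piped into nc:
#   head_request git.open-mesh.net | nc git.open-mesh.net 80
```

If this variant also returns nothing from the mesh node, the request format can be excluded as the cause.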
On Tue, Aug 17, 2010 at 11:14 AM, David Beaumont djb31st@gmail.com wrote:
The plot thickens..
I started producing the tcpdumps that you requested and noticed the following.
On the main internet node, if I ping google.com everything is fine. However, if I ping -s 1464 google.com I do not get a reply; this isn't even going over the batman interface. So it looks like I have more of a local problem.
To clarify
ping -s 1464 google.com
results in ping requests being sent and received on ETH1, but not being returned to br-lan
ping google.com
results in ping requests being sent and received on ETH1, and being returned on br-lan
ping -s 84 google.com will work
ping -s 85 google.com will not work.
I've never encountered these issues before, but I think they are the root cause of my problem. As was initially stated, it looks like an MTU issue; I just need to find where!
echo "HEAD / HTTP/1.1\nHost: git.open-mesh.net\n\n"|nc git.open-mesh.net 80
from the mesh node brings no results, although works as expected on the internet node.
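The 84/85 boundary above was found by hand. A minimal sketch of the same search as a bisection is shown below, assuming that every payload below the limit passes and every payload above it fails. All function names are hypothetical; `ping_probe` assumes the iputils `ping` (the one at /usr/bin/ping in this thread), and `fake_probe` only stands in for it so the search can be tried without a network.

```shell
#!/bin/sh
# Find the largest payload size for which a probe command succeeds.
# $1 = probe command, called with the size as its only argument
# $2 = a size known to work, $3 = a size known to fail
largest_working_size() {
    probe=$1 lo=$2 hi=$3
    # Invariant: lo works, hi fails. Stop when they are adjacent.
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        if "$probe" "$mid"; then lo=$mid; else hi=$mid; fi
    done
    echo "$lo"
}

# Real probe (not called here): one echo request with the DF bit set.
# Assumes iputils ping; target host taken from the thread.
ping_probe() {
    ping -M do -c 1 -W 2 -s "$1" google.com >/dev/null 2>&1
}

# Stand-in probe: pretends the path passes payloads up to 84 bytes.
fake_probe() { [ "$1" -le 84 ]; }

largest_working_size fake_probe 56 1472
```

Replacing `fake_probe` with `ping_probe` in the last line runs the search against the real path; a single lost reply would skew the result, so the number is a hint rather than a measurement.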
On Mon, Aug 16, 2010 at 7:32 PM, Sven Eckelmann sven.eckelmann@gmx.de wrote:
David Beaumont wrote:
Sorry for the late reply, a few things came up over the weekend that i had to attend to.
Here are three tcp dump files from the internet node on bat0 and one on eth1 (the internet port)
Really don't understand what is wrong here :-(
On Friday 20 August 2010 11:58:32 David Beaumont wrote:
So large pings appear to be going over the batman interface.
So, first you say that all packets go over the bat interface and that this part works fine. Now you say that large packets will also work... which is no gain of information for the batman-adv related parts.
However still not getting any web traffic through
root@Generic:~# echo "HEAD / HTTP/1.1\nHost: git.open-mesh.net\n\n"|nc git.open-mesh.net 80
root@Generic:~# wget http://www.google.com
Connecting to www.google.com (74.125.39.104:80)
What else can i provide to help track down the problem here :-(
Create a real minimal setup. As minimal as possible. Get that working and then add parts to it (iptables, bridges, ...) until it doesn't work anymore. Check if that is really the part which causes the problem by reducing the complexity of the other parts you already added.
You already told us that it is not related to batman-adv and that the bridge makes problems.
Actually nobody here understands what you are currently trying to achieve with your setup and why all the iptables or maybe ebtables stuff/bridges/... is needed to find a problem.
And why do both mesh and net (for whatever they are used) have a masquerade rule in postrouting?
Simplest setup would be:
* net is a nat router; everything in iptables to accept:
  iptables -F
  iptables -t nat -F
  iptables -t mangle -F
  iptables -X
  iptables -P INPUT ACCEPT
  iptables -P FORWARD ACCEPT
  iptables -P OUTPUT ACCEPT
  masquerade enabled:
  iptables -t nat -A POSTROUTING -o "${OUTIF}" -j MASQUERADE
* configure outif (the thing which has globally routable address)
* enable wired connection between net and mesh by adding them to the same subnet (eth0 on net 192.168.1.1, eth0 on mesh 192.168.1.2)
* Try to ping each other
* test if connection between net and internet works flawless
* test if connection between mesh and indirectly to the internet over net works flawless
* set mtu of eth0 on both sides to 1530
* check if `ping -M do -s 1500` works between both net and mesh
* remove ip addresses of eth0 on both ends (but keep devices up)
* add eth0 on both sides using `batctl if add` to bat0
* set mtu of bat0 to 1500 on both hosts
* give bat0 the same ips which were used before by eth0
* set bat0 up
* check if both hosts finds each other using `batctl o`
* try to ping other host
* try if internet works flawless indirectly from mesh over net
* remove ip from bat0 devices
* add bat0 to a bridge on both ends
* set ips which were used by bat0 to the bridge devices
* set mtu of bridge to 1500
* try to.... I think you can guess the next 1000 steps by yourself
Regards, Sven
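The first part of that list can be condensed into a script for the net node. This is only a dry-run sketch: by default it prints the commands instead of running them, and it skips the intermediate ping tests between the steps. The interface name `eth1` for `OUTIF`, the `/24` prefix and the use of `ip` instead of `ifconfig` are assumptions; the `run` helper and both function names are hypothetical.

```shell
#!/bin/sh
# Dry-run sketch of the minimal setup on the "net" node.
# Commands are only printed unless DRY_RUN=0 is set (then run as root).
DRY_RUN=${DRY_RUN:-1}
OUTIF=${OUTIF:-eth1}                 # assumption: interface with the routable address
MESH_IP=${MESH_IP:-192.168.1.1/24}   # assumption: /24 prefix; .2 on the mesh node

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# Flush all rules, accept everything, masquerade on the outgoing interface.
open_firewall_with_nat() {
    run iptables -F
    run iptables -t nat -F
    run iptables -t mangle -F
    run iptables -X
    run iptables -P INPUT ACCEPT
    run iptables -P FORWARD ACCEPT
    run iptables -P OUTPUT ACCEPT
    run iptables -t nat -A POSTROUTING -o "$OUTIF" -j MASQUERADE
}

# Move the wired link from plain IP on eth0 to batman-adv on bat0.
move_link_to_batman() {
    run ip link set dev eth0 mtu 1530   # headroom for the batman-adv header
    run ip addr flush dev eth0          # remove the addresses, keep the device up
    run batctl if add eth0
    run ip link set dev bat0 mtu 1500
    run ip addr add "$MESH_IP" dev bat0
    run ip link set dev bat0 up
    run batctl o                        # list originators, the other node should appear
}

open_firewall_with_nat
move_link_to_batman
```

Running each function separately and testing connectivity in between keeps the spirit of the list: add one piece, verify, then add the next.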
Hi Sven,
Thanks so much for your patience with this matter and for holding my hand though the process.
I have gone back to basics (i think trying to adapt my current olsr platform over to this was the root cause of the issue) and now i am able to retrieve webpages over the bat interface.
I'll briefly outline the steps i took to get batman working with openwrt Kamikaze
*net router*
configure wan
vi /etc/config/network
config interface wan
	option ifname "eth1"
	option proto dhcp
*the following should be done on both the net and mesh router*
set the ip address of eth0 to 192.168.1.1 (.2 on the mesh)
configure wireless cards in /etc/config/wireless so that they have the same BSSID and channel.
opkg update
opkg install kmod-batman-advanced
reboot
bridge the eth0 and bat0 interfaces (may not be the best way)
Ensure that you can see the other node
cat /proc/net/batman-adv/originators
test pinging the internet
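The bridging step is not spelled out above. One way it might look in /etc/config/network is sketched below as a small generator that prints a `lan` stanza. This is an assumption, not the configuration actually used here: the helper name `lan_bridge_stanza` is made up, the netmask is a guess, and whether bat0 can be listed in `ifname` depends on bat0 already existing when the network is brought up.

```shell
#!/bin/sh
# Print a UCI "lan" stanza that bridges eth0 and bat0.
# $1 = node address (192.168.1.1 on net, 192.168.1.2 on mesh)
# Hypothetical helper; netmask 255.255.255.0 is an assumption.
lan_bridge_stanza() {
    cat <<EOF
config interface lan
	option type bridge
	option ifname "eth0 bat0"
	option proto static
	option ipaddr $1
	option netmask 255.255.255.0
EOF
}

# Stanza for the net router; use 192.168.1.2 on the mesh router.
lan_bridge_stanza 192.168.1.1
```

The output would replace the existing `lan` section by hand; the sketch does not write to any file itself.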
Thanks
b.a.t.m.a.n@lists.open-mesh.org