Dynamic DHCP server assignment and spin-up on batman-adv mesh network


tanner.perkins＠cnftech.com

31 Aug 2021

5:28 p.m.

If this is not the best place to ask questions about mesh networks using the batman-adv kernel module, I apologize; please point me to where I need to be.

I'm looking to set up a distributed mesh network using the batman-adv Linux kernel module. However, I don't want to statically assign IP addresses to all my nodes, so my first thought was to use DHCP. The problem in my scenario is that any node could come and go as it moves in and out of range of the network. Manually designating a single DHCP server, or even a few, therefore isn't realistic, as any of them may drop out of the network at any time. Is there a dynamic way to reassign the DHCP server role to one of the nodes still within the network when the previous DHCP server drops out?

Thanks, -tdev


Linus Lüssing

1 Sep 2021

7:54 p.m.

On Tue, Aug 31, 2021 at 05:28:41PM -0000, tanner.perkins@cnftech.com wrote:

...


Tricky question. DHCP was probably not designed with highly dynamic networks in mind back then.

Typically people run a few DHCP servers on hosts with high availability rather than on mesh nodes that might go offline at "random" times, and then use the batman-adv gateway feature to steer DHCP requests to the "best" one [0]. For some topologies people also make use of the batman-adv "Bridge Loop Avoidance" [1] feature and place the DHCP server(s) on a common LAN backbone to which multiple nodes are connected via cable, which can add some extra fault tolerance. It may also help to set the lease time to something shorter than the usual defaults.
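To put rough numbers on the lease-time suggestion: under the default timers in RFC 2131, a client first tries to renew with the server that issued its lease at half the lease time (T1), and only starts broadcasting to any reachable server at 87.5% of it (T2). A minimal sketch of that arithmetic follows; the lease lengths are illustrative values, not recommendations from this thread.

```python
def dhcp_timers(lease_seconds):
    """Return (T1, T2) in seconds using the RFC 2131 default fractions."""
    t1 = lease_seconds * 0.5      # renew: unicast to the server that issued the lease
    t2 = lease_seconds * 0.875    # rebind: broadcast to any reachable server
    return t1, t2


# Illustrative lease lengths only; pick values that suit your mesh.
for lease in (43200, 600, 120):
    t1, t2 = dhcp_timers(lease)
    print(f"lease={lease}s  renew at {t1:.0f}s  rebind at {t2:.0f}s")
```

With a 12-hour lease a client would not look for another server until 10.5 hours in; with a 2-minute lease it does so after 105 seconds, at the cost of more DHCP traffic crossing the mesh.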

Someone has also been writing DDHCP, a "Distributed DHCP Daemon" [2], but I'm not sure if it is actually used by anyone at the moment. The original idea was to integrate it into the OpenWrt-based mesh routing framework "Gluon" [3], so that every node would be a DHCP server for all its locally connected client devices, and the DHCP requests from the client devices would be filtered from entering the mesh directly. The DDHCP servers would then organize leases among themselves. There hasn't been a PR for Gluon yet, but it was tested (and developed) at Freifunk Kiel at some point. If you were to try it out, I'd be very interested to hear what your experiences with it are.
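One way to picture servers "organizing leases among themselves" is to split a shared address pool into blocks and let each node serve its local clients from a block it owns. The sketch below only illustrates that idea. It is not ddhcpd's actual protocol, and `BlockTable`, the node names and the 10.0.0.0/24 pool are all hypothetical; a real daemon has to agree on block ownership over the network rather than through one in-memory table.

```python
import ipaddress


def split_pool(pool_cidr, block_prefix):
    """Split the shared pool into equally sized address blocks."""
    return list(ipaddress.ip_network(pool_cidr).subnets(new_prefix=block_prefix))


class BlockTable:
    """Hypothetical shared view of which node currently owns which block."""

    def __init__(self, blocks):
        self.owner = {block: None for block in blocks}

    def claim(self, node_id):
        """Give node_id the first free block, or None if the pool is exhausted."""
        for block, owner in self.owner.items():
            if owner is None:
                self.owner[block] = node_id
                return block
        return None

    def release_missing(self, alive_nodes):
        """Free blocks whose owner has dropped out of the mesh."""
        for block, owner in self.owner.items():
            if owner is not None and owner not in alive_nodes:
                self.owner[block] = None


# Usage: two nodes claim blocks, node-a disappears, node-c reuses its block.
table = BlockTable(split_pool("10.0.0.0/24", 28))
a = table.claim("node-a")
b = table.claim("node-b")
table.release_missing({"node-b"})
c = table.claim("node-c")
print(a, b, c)
```

Because each node only hands out addresses from its own block, client DHCP requests never need to cross the mesh, which matches the filtering described above.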

Next there is also AHCP [4], which I think was built for/with Babel. But I don't know how it actually works, or whether it could be useful on layer 2 at all or is only usable with a layer 3 mesh routing protocol.

Regards, Linus

[0]: https://www.open-mesh.org/projects/batman-adv/wiki/Gateways
[2]: https://github.com/sargon/ddhcpd
[3]: https://www.open-mesh.org/projects/batman-adv/wiki/Bridge-loop-avoidance-II#...
[4]: https://www.irif.fr/~jch/software/ahcp/


Linus Lüssing

8:12 p.m.

On Wed, Sep 01, 2021 at 09:54:36PM +0200, Linus Lüssing wrote:

...


Sorry, links and numberings are off, should have been:

[0]: https://www.open-mesh.org/projects/batman-adv/wiki/Gateways
[1]: https://www.open-mesh.org/projects/batman-adv/wiki/Bridge-loop-avoidance-II#...
[2]: https://github.com/sargon/ddhcpd
[3]: https://github.com/freifunk-gluon/gluon/
[4]: https://www.irif.fr/~jch/software/ahcp/

Steve Newcomb

2 Sep 2021

1:02 p.m.

On 8/31/21 1:28 PM, tanner.perkins@cnftech.com wrote:

...

I don't want to statically assign IP addresses to all my nodes, so my first thought was to use DHCP. The problem in my scenario is that any node could come and go as it moves in and out of range of the network. Manually designating a single DHCP server, or even a few, therefore isn't realistic, as any of them may drop out of the network at any time. Is there a dynamic way to reassign the DHCP server role to one of the nodes still within the network when the previous DHCP server drops out?

After trying everything to cope with reliability problems that ensued when I attempted to assign MAC addresses to the mesh radios, I ultimately found myself forced to adopt a hybrid (partly-static and partly-dynamic) approach. At last I have reliable performance, but I was unable to avoid the bookkeeping involved in knowing the MAC address of each batman radio, and the IP I had statically assigned to it.

(Aside: The native MAC address of a 5 GHz radio in an Archer A7 or C7 is evidently an emergent property. If you assign a different one, the mesh will work for a while, and then die mysteriously. Perhaps this has to do with the fact that the hardware in combination with the stock QCA firmware is running an 802.11s network, and BATMAN is running on top of that 802.11s network. Maybe if you just use the hardware's native MAC, the two implementations don't clash over the question of what the MAC address is. That's a guess. Anyway, it works quite well that way, but not the other way.)

As annoying as the bookkeeping is, it's doable, and one has to set the credentials of each node in any event; it's just another tiny chore when configuring each node. I've now scripted the whole configuration process in self-defense, because it takes too much mental energy to think each node through individually and then debug it.

The dynamic part is that each 2.4 GHz radio can administer its own LAN, providing its own independent DHCP server. Originally I had a single DHCP server for the whole mesh (actually for several meshes), but this proved impractical over the long term. More to the point, I had no choice but to make each node's LAN self-sufficient, because only then could I go near where a mysteriously offline node was, near some building to which I had no access, and then reliably get into the node and find out what happened via 2.4 GHz wifi in the normal way (which for me is an ssh tunnel).
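The hybrid scheme above (a static mesh IP per radio MAC, plus an independent DHCP-served LAN per node) lends itself to being generated from a single inventory. The following is a rough sketch of that bookkeeping, not the actual script used here; the MAC addresses, the 10.10.0.x mesh IPs, the 192.168.0.0/16 LAN pool and the function name `lan_plan` are all placeholders.

```python
import ipaddress

# Hypothetical inventory: mesh-radio MAC -> statically assigned mesh IP.
# Replace the placeholder MACs with the radios' native addresses.
NODES = {
    "02:00:00:00:00:01": "10.10.0.1",
    "02:00:00:00:00:02": "10.10.0.2",
}


def lan_plan(nodes, lan_pool="192.168.0.0/16"):
    """Give every node its own /24 LAN with a DHCP range served by that node."""
    subnets = ipaddress.ip_network(lan_pool).subnets(new_prefix=24)
    plan = {}
    # Sort by MAC so the same inventory always produces the same plan.
    for (mac, mesh_ip), lan in zip(sorted(nodes.items()), subnets):
        hosts = list(lan.hosts())
        plan[mac] = {
            "mesh_ip": mesh_ip,
            "lan_subnet": str(lan),
            "lan_gateway": str(hosts[0]),                      # the node itself
            "dhcp_range": (str(hosts[99]), str(hosts[199])),   # .100 to .200
        }
    return plan


plan = lan_plan(NODES)
for mac, cfg in plan.items():
    print(mac, cfg)
```

Since every LAN gets a distinct subnet and its own DHCP range, a node that has lost its mesh links can still hand out addresses to a laptop standing next to it. Note that inserting a new MAC into the middle of the sorted order would shift later nodes' subnets, so a real inventory would probably record each node's subnet explicitly.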

Steve Newcomb


b.a.t.m.a.n@lists.open-mesh.org
