Hi,
after the last mail of David S. Miller, it was more than clear that I am not the right person to speak on behalf of the batman-adv community. I have the same opinion about the always changing protocol as he does. I first struggled with it after getting the batman dissector merged in wireshark, but it has confronted everyone since the merge into the Linux kernel began. Andrew already told us that we would need to get some kind of stable on-the-wire format and some kind of backward compatibility, but it doesn't look like we will get there soon.
I don't know who currently follows which vision, but I know that all those ideas make the protocol incompatible with the versions before (and this will happen again and again and again ...). I don't have an answer for how it could be done differently while still solving the current problems. That means I can represent neither the position of the ongoing protocol developments nor the position of backward compatibility.
So, I doubt that I can present the changes to David S. Miller any longer when I am sure that he doesn't like the patches for the same reasons I don't like them. It is better that I don't stand in the way as a "grumbling old man" [2] and free the position for someone who can really speak on behalf of the currently active developers of batman-adv.
So the first thing I did was a big cleanup. Most of my git repositories are gone now:
* ecsv/batctl-rebase / ecsv/master-rebase
  The development was changed to git some time ago. This makes those repositories relatively useless when the development is done in the current branch structure. This means that next gets bugfixes/cleanups which can still be merged by David, and then the branch is merged into master instead of cherry-picking patches around. After a release, the next branch (or better, the tag of the release) has to be merged into master (without changing the SOURCE_VERSION in main.h to something other than "devel"), then the next branch merges the master branch and the new development cycle is started in next.
* ecsv/debian/batctl ecsv/debian/batmand
  Those are only backups. I already have backups in different places - so no need to store them again on open-mesh.
* ecsv/viking
  That was a private repository - nobody seems to have noticed it. I doubt that those maintainers will merge it soon and I will probably just send them the patches over the ml instead of doing a pull request again.
* ecsv/git-conversation-svn
  Project to convert the open-mesh svn to smaller git repositories. Seems to have worked, but is now useless.
* ecsv/hash_regression
  Just to show that the hash regression was actually a regression in Linus's code. Funny but useless :)
* ecsv/post-commit-daemon
  A daemon which started svn hooks at a later point. Those build scripts on open-mesh were gruesome, but I think that we killed them. This makes the script useless for us.
* ecsv/wireshark-batman-adv ecsv/wireshark-batman
  Both dissectors are merged in wireshark... somewhat. At least parts of v12/13 are now supported and it has now been noticed that the v14 patches cannot be applied out of order... but I am full of hope that the patches will enter the wireshark svn sometime in the future ("where no man has gone before..."). But it is not necessary to have the plugins in an external module. It can't be built against an older version of wireshark and new versions of wireshark can't load it due to the conflicting names.
* ecsv/linux-merge
  Ok, this one is not really gone - it can just be found under marek/linux-merge. Maybe it will change its place again.
There are also some scripts which run on a daily basis and are a little bit more verbose.
* /home/batman/linux-next-check/sync-git
  It downloads the linux-next tree and Linus' tree and pushes parts of the changes to linux-merge.git - this should be quite silent. But the ./hooks/manual-hook in /home/batman/linux-next-check/linux-next.git/ is triggered and sends mails about changes in net/batman-adv, Documentation/networking/batman-adv.txt, Documentation/ABI/testing/sysfs-class-net-batman-adv and Documentation/ABI/testing/sysfs-class-net-mesh to the unhappy persons mentioned in /home/batman/linux-next-check/linux-next.git/config (currently Marek and Simon).
* /home/batman/packet.h_check/check.sh
  This one just sends the difference of packet.h in all branches in batctl and batman-adv to the responsible persons (see the TO="..." at the beginning of the file).
* /home/batman/build_test/checkstuff.sh
  This is more or less the build monster. It builds against kernels 2.6.21 to 2.6.39. I know that it will not work directly against 3.0, but it is no big change to fix it. It uses sparse, checkpatch.pl from linux-next and the minimized kernel sources generated through make_all.sh (of course, this one also doesn't know about 3.0). Marek and Simon will see the rcu warnings every morning until Andrew Morton forwards my rcu checkpatch patch or it appears through other channels in linux-next.
The rest should be explained on wiki pages, /srv/git/README or some private mails.
Kind regards,
	Sven

[1] https://lists.open-mesh.org/pipermail/b.a.t.m.a.n/2011-June/005020.html
[2] http://upload.wikimedia.org/wikipedia/en/8/8b/StatlerAndWaldorf.jpg
On Tue, Jun 21, 2011 at 03:12:31PM +0200, Sven Eckelmann wrote:
> Hi,
> after the last mail of David S. Miller, it was more than clear that I am not the right person to speak on behalf of the batman-adv community.
Hi Sven
You have done great work with git, getting RCU correct, cleanup, etc. Even if you don't feel capable of speaking on behalf of the batman-adv community in the direction of David S. Miller, I hope you can stay around and contribute to the project.
> I have the same opinion about the always changing protocol as he does. I first struggled with it after getting the batman dissector merged in wireshark, but it is confronting everyone since the merge into the linux kernel began. Andrew already told us that we would need to get some kind of stable on the wire format and some kind of backward compatibility, but it doesn't look like we get there soon.
There are some big changes in the pipeline. NDP from Linus, HNA/TT changes from Antonio, Multicast from Linus and Simon, Linus's GSoC work, other things I've forgotten...
Here are a few ideas for discussion:
Explain to David that these changes are in the pipeline. Explain what benefits they bring. And probably most importantly, try to promise they will all arrive at once. This might mean delaying some features for a while, but it will upset compatibility the least. After that, there will not be any non-compatible changes for "a long time". We should discuss here what "a long time" means, e.g. 4 kernel cycles, 8 kernel cycles, etc.
At the same time as getting these big changes ready to go, it would be good to ensure that we have options to make backward compatible changes during this "long time". I would suggest we document all the available reserved bits we have in the different messages. Ensure they are always set to 0 in the current implementation when creating messages and always ignored when receiving messages. Also, for messages which are forwarded, maybe it makes sense to ensure that a well defined number of the reserved bits get forwarded as they are received and the rest get reset back to 0. Also, received messages of unknown type are silently dropped. Maybe, unknown messages with a particular bit set could be forwarded using the routing table, using an originator address in a well known location in the message.
The point of this is to put infrastructure in place to allow the protocol to be extended without breaking compatibility. It will limit what can be added as new features, cause more headaches while figuring out how to implement something using only this infrastructure, but will keep a lot of people happy they don't need a flag day when upgrading their kernel to an incompatible batman-adv version.
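To make the reserved-bit idea a bit more concrete, here is a minimal sketch. Everything in it is an assumption for discussion: the struct layout, the EX_* names and values, the "forward even if unknown" bit and the TTL of 50 are made up and are not taken from packet.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical message layout, for illustration only; not the real packet.h. */
#define EX_TYPE_OGM      0x01  /* a message type this node understands */
#define EX_TYPE_FWD_FLAG 0x80  /* assumed "forward me even if unknown" bit */
#define EX_RSVD_FWD_MASK 0x0f  /* reserved bits preserved when forwarding */

struct ex_msg {
	uint8_t type;
	uint8_t ttl;
	uint8_t reserved;  /* must be 0 on send, ignored on receive */
	uint8_t orig[6];   /* originator address at a well-known offset */
};

enum ex_verdict { EX_DROP, EX_HANDLE, EX_FORWARD };

/* Sender side: every reserved bit starts out as 0. */
void ex_msg_init(struct ex_msg *m, uint8_t type, const uint8_t orig[6])
{
	memset(m, 0, sizeof(*m));
	m->type = type;
	m->ttl = 50;  /* arbitrary starting TTL for this sketch */
	memcpy(m->orig, orig, 6);
}

/* Receiver side: decide by type only; reserved bits never influence this. */
enum ex_verdict ex_msg_classify(const struct ex_msg *m)
{
	if (m->type == EX_TYPE_OGM)
		return EX_HANDLE;
	if (m->type & EX_TYPE_FWD_FLAG)
		return EX_FORWARD;  /* unknown, but routable via m->orig */
	return EX_DROP;             /* unknown: silently dropped */
}

/* Forwarder: keep the agreed subset of reserved bits, clear the rest.
 * Returns 1 if the message may be forwarded, 0 if the TTL is exhausted. */
int ex_msg_prepare_forward(struct ex_msg *m)
{
	if (m->ttl <= 1)
		return 0;
	m->ttl--;
	m->reserved &= EX_RSVD_FWD_MASK;
	return 1;
}
```

An old node running this code would handle what it knows, pass along unknown messages that carry the forward bit, and leave the lower four reserved bits untouched, so a newer node further down the path could still make use of them.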
Andrew
On Tuesday, June 21, 2011 10:01:24 PM Andrew Lunn wrote:
> Explain to David that these changes are in the pipeline. Explain what benefits they bring. And probably most importantly, try to promise they will all arrive at once. This might mean delaying some features for a while, but it will upset compatibility the least. After that, there will not be any none compatible changes for "a long time". We should discussion here what "a long time" means, eg 4 kernel cycles, 8 kernel cycles, etc.
[..]
> The point of this is to put infrastructure in place to allow the protocol to be extended without breaking compatibility. It will limit what can be added as new features, cause more headaches while figuring out how to implement something using only this infrastructure, but will keep a lot of people happy they don't need a flag day when upgrading their kernel to an incompatible batman-adv version.
Merging all "big" features at once does not seem feasible. We still want to be able to deliver something that does not break each and every bit at the same time. I also doubt that David would be happy with a big blob to be merged at once. In case you refer to aggregating compatibility changes - that is what we did. This patchset not only contained the TT protocol changes but also the TTL header changes we discussed at the WBMv4.
For the upcoming routing protocol changes I propose the following: First we abstract the routing handling and adjust the current routing algo to be usable. Then we add a compile time option to choose this algo or the older one (afaik the wireless folks do the same with their rate control algorithm). The new algo can be marked as experimental and be completed step by step.
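A rough sketch of how such an abstraction could look, only to illustrate the idea. The struct, the function names and the EX_USE_NEW_ALGO switch are hypothetical and do not exist in batman-adv; both metric functions are simple stand-ins, not the real routing code. In the kernel the switch would be a Kconfig option rather than a plain define.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interface; names are illustrative, not existing batman-adv code. */
struct ex_routing_algo {
	const char *name;
	int experimental;
	/* returns a link quality metric for a neighbor, higher is better */
	unsigned int (*metric)(unsigned int received, unsigned int expected);
};

/* Stand-in for the established algorithm: share of received packets, 0..255. */
unsigned int ex_legacy_metric(unsigned int received, unsigned int expected)
{
	if (expected == 0)
		return 0;
	if (received > expected)
		received = expected;
	return received * 255 / expected;
}

/* Stand-in for a new algorithm: same inputs, but squares the ratio so that
 * lossy links are penalized more strongly. */
unsigned int ex_new_metric(unsigned int received, unsigned int expected)
{
	unsigned int q = ex_legacy_metric(received, expected);

	return q * q / 255;
}

const struct ex_routing_algo ex_algo_legacy = { "legacy", 0, ex_legacy_metric };
const struct ex_routing_algo ex_algo_new = { "new", 1, ex_new_metric };

/* Compile-time choice, e.g. cc -DEX_USE_NEW_ALGO. */
#ifdef EX_USE_NEW_ALGO
#define EX_DEFAULT_ALGO (&ex_algo_new)
#else
#define EX_DEFAULT_ALGO (&ex_algo_legacy)
#endif

/* The rest of the code only ever talks to the selected algorithm. */
const struct ex_routing_algo *ex_routing_algo_get(void)
{
	return EX_DEFAULT_ALGO;
}
```

The rest of the module would call ex_routing_algo_get()->metric(...) and never reference a concrete algorithm directly, so the experimental one can be completed step by step behind the same interface.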
Cheers, Marek
Hi Marek
> Merging all "big" features at once does not seem feasible. We still want to be able to deliver something that does not break each and every bit at the same time. I also doubt that David would be happy with a big blob to be merged at once.
I guess you need to ask David. Does he prefer one big change, which breaks compatibility once but has a high risk of being buggy? Or does he prefer breaking compatibility for the next X kernel releases until all the features are merged?
> For the upcoming routing protocol changes I propose the following: First we abstract the routing handling and adjust the current routing algo to be usable. Then we add a compile time option to choose this algo or the older one (afaik the wireless folks do the same with their rate control algorithm). The new algo can be marked as experimental and be completed step by step.
I'm not sure the wireless folks example is valid. I doubt changing the rate control algorithm changes the "in the air" protocol. Thus I can mix experimental and well established algos in a wireless net and it will all happily work.
I think maybe a better example is TCP flow control algorithms. The TCP messages stay the same, but how slow start, delayed or repeated ACKs, enlarging the window etc. are handled varies. So TCP Vegas, TCP Reno and TCP NewReno are all compatible with each other and can be mixed on the Internet.
Do you think you can have two routing protocols, side by side, using the same "in the air" protocol and they are compatible with each other? David's request is that the "in the air" protocol is fixed. What is behind the protocol can evolve, but the "in the air" messages have to remain compatible.
Andrew
b.a.t.m.a.n@lists.open-mesh.org