Re: [B.A.T.M.A.N.] race condition with activate_module?

11 Feb 2010


Okay, I could narrow it down a little further: there is a problem
with the num_ifs variable. If activate_module() gets called in
proc_interfaces_write() and the first OGM from a neighbour arrives
after that, but before we've set 'num_ifs = if_num + 1;', then
get_orig_node() does not allocate enough space, which leads to a
kernel panic.

Since num_ifs itself is only used in those two functions, locking
this variable seemed like an easy fix. Nevertheless, I doubt that
simply locking num_ifs is enough: quite a lot of copies of num_ifs
are stored and modified in other functions (if_num, for instance),
which gave me some headaches today :). Any ideas how this problem
could be dealt with instead?
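
To make the locking idea concrete, here is a minimal userspace sketch.
It is not the batman-adv code: the pthread mutex, the struct layout and
all *_sketch names are assumptions made up for illustration, and in the
kernel a spinlock would probably be needed instead of a mutex. The point
is only that the writer updates num_ifs inside the same critical section
as the activation, and the reader sizes its allocation under that lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Userspace stand-ins for illustration only; not the batman-adv sources. */
static pthread_mutex_t ifs_lock = PTHREAD_MUTEX_INITIALIZER;
static int num_ifs;	/* number of interfaces per-node arrays must cover */

struct orig_node_sketch {
	int slots;		/* value of num_ifs this node was sized for */
	unsigned char *seen;	/* one entry per interface (simplified) */
};

/* Models the write path: activation and the num_ifs update form one
 * critical section, so no reader can run between the two steps. */
void add_interface_sketch(int if_num)
{
	pthread_mutex_lock(&ifs_lock);
	/* ... the activate_module() equivalent would run here ... */
	num_ifs = if_num + 1;
	pthread_mutex_unlock(&ifs_lock);
}

/* Models the receive path: read num_ifs under the same lock and
 * remember in the node how many slots were allocated. */
struct orig_node_sketch *get_orig_node_sketch(void)
{
	struct orig_node_sketch *node = malloc(sizeof(*node));

	if (!node)
		return NULL;

	pthread_mutex_lock(&ifs_lock);
	node->slots = num_ifs;
	pthread_mutex_unlock(&ifs_lock);

	node->seen = calloc(node->slots > 0 ? (size_t)node->slots : 1, 1);
	if (!node->seen) {
		free(node);
		return NULL;
	}
	return node;
}

/* Bounds-checked access: a node sized for fewer interfaces rejects the
 * index instead of writing past its array. Returns 0 on success. */
int mark_seen_sketch(struct orig_node_sketch *node, int if_num)
{
	assert(node != NULL);
	if (if_num < 0 || if_num >= node->slots)
		return -1;
	node->seen[if_num] = 1;
	return 0;
}

void free_orig_node_sketch(struct orig_node_sketch *node)
{
	if (node) {
		free(node->seen);
		free(node);
	}
}
```

Note that a node created before a later add_interface_sketch() call
keeps its old size, so it would still have to be resized somewhere.
That is the part where the stored copies of num_ifs come in.
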

The problem can be reproduced easily by adding, for instance, an
"ssleep(3)" in front of "num_ifs = if_num + 1;" in
proc_interfaces_write(). Then insmod the module, connect a running
batman-adv node to the other end of the interface being used and set
those interfaces up. Adding the interface to batman-adv then causes
the kernel panic within those 3 seconds.

Putting the ssleep after "num_ifs = ..." does not cause any kernel
panics on my VM here.

Cheers, Linus

On Mon, Feb 08, 2010 at 08:38:48PM +0100, Linus Lüssing wrote:
...
Hi guys,

I think I've seen this bug a couple of times, but I've never been
able to reproduce it. Now I added a little patch to slow down the
activate_module() procedure, and the bug occurs every time. My
question is: did I make a race condition apparent, or did I introduce
a bug with this patch?

Cheers, Linus
