On Mon, 7 Oct 2019 12:04:16 +0200, Marco Elver <elver@google.com> wrote:
> +RCU maintainers
>
> This might be a data-race in RCU itself.
> write to 0xffffffff85a7f140 of 8 bytes by task 7 on cpu 0:
>  rcu_report_exp_cpu_mult+0x4f/0xa0 kernel/rcu/tree_exp.h:244
Here we have:
	raw_spin_lock_irqsave_rcu_node(rnp, flags);
	if (!(rnp->expmask & mask)) {
		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
		return;
	}
	rnp->expmask &= ~mask;
	__rcu_report_exp_rnp(rnp, wake, flags); /* Releases rnp->lock. */
> read to 0xffffffff85a7f140 of 8 bytes by task 7251 on cpu 1:
>  _find_next_bit lib/find_bit.c:39 [inline]
>  find_next_bit+0x57/0xe0 lib/find_bit.c:70
>  sync_rcu_exp_select_node_cpus+0x28e/0x510 kernel/rcu/tree_exp.h:375
and here we have:
	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);

	/* IPI the remaining CPUs for expedited quiescent state. */
	for_each_leaf_node_cpu_mask(rnp, cpu, rnp->expmask) {
The write to rnp->expmask is done under rnp->lock, but on the read side that lock is released before the for loop, so the loop walks rnp->expmask locklessly while rcu_report_exp_cpu_mult() may be clearing bits in it. Should we have something like:
	unsigned long expmask;
	[...]
	expmask = rnp->expmask;
	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);

	/* IPI the remaining CPUs for expedited quiescent state. */
	for_each_leaf_node_cpu_mask(rnp, cpu, expmask) {
?
-- Steve