On Tue, May 21, 2024 at 09:43:56AM +0200, Antonio Quartulli wrote:
Hi,
On 18/05/2024 17:50, Remi Pommarel wrote:
Wiphy should be locked before calling rdev_get_station() (see lockdep assert in ieee80211_get_station()).
Adding the lock is fine as nowadays it is taken in pre_doit and released in post_doit (with some exceptions). Therefore when invoking .get_station from a side path the lock should be taken too.
It was actually a05829a7222e9d10c416dd2dbbf3929fe6646b89 that introduced this requirement AFAICS.
IIUC lock requirement was already there before this commit, only it was on rtnl lock to be taken instead of wiphy one.
This fixes the following kernel NULL dereference:
As already said by Johannes, I am not sure it truly fixes this NULL dereference though.
Have you checked where in ath10k_sta_statistics this is exactly happening? Do you think some sta was partly released and thus fields were NULLified?
ath10k_sta_statistics+0x10 is associated with the arsta->arvif->ar statement. It crashes because arsta->arvif is NULL.
Here is a scenario that explains the crash where the same sta disconnects then reconnects quickly (e.g. OPEN network):
CPU0: CPU1:
batadv_v_elp_periodic_work() queue_work(batadv_v_elp_get_throughput)
ieee80211_del_station() ieee80211_add_station() sta_info_insert() list_add_tail_rcu() ath10k_sta_state() memset(arsta, 0, sizeof(arsta)) batadv_v_elp_get_throughput() cfg80211_get_station() ieee80211_get_station() <-- Find sta with its MAC in list ath10k_sta_statistics() arsta->arvif->ar <-- arsta is still zeroed
In other words if the same sta has enough time to disconnect then reconnect before batadv_v_elp_get_throughput get scheduled, sta could be partially initialized when ath10k_sta_statistics() is called. Locking the wiphy ensure sta is fully initialized if found.
Thanks,