efx->xdp_tx_queue_count is initially initialized to num_possible_cpus() and is later used to allocate and traverse efx->xdp_tx_queues lookup array. However, we may end up not initializing all the array slots with real queues during probing. This results, for example, in a NULL pointer dereference, when running "# ethtool -S <iface>", similar to below
[2570283.664955][T4126959] BUG: kernel NULL pointer dereference, address: 00000000000000f8 [2570283.681283][T4126959] #PF: supervisor read access in kernel mode [2570283.695678][T4126959] #PF: error_code(0x0000) - not-present page [2570283.710013][T4126959] PGD 0 P4D 0 [2570283.721649][T4126959] Oops: 0000 [#1] SMP PTI [2570283.734108][T4126959] CPU: 23 PID: 4126959 Comm: ethtool Tainted: G O 5.10.20-cloudflare-2021.3.1 #1 [2570283.752641][T4126959] Hardware name: <redacted> [2570283.781408][T4126959] RIP: 0010:efx_ethtool_get_stats+0x2ca/0x330 [sfc] [2570283.796073][T4126959] Code: 00 85 c0 74 39 48 8b 95 a8 0f 00 00 48 85 d2 74 2d 31 c0 eb 07 48 8b 95 a8 0f 00 00 48 63 c8 49 83 c4 08 83 c0 01 48 8b 14 ca <48> 8b 92 f8 00 00 00 49 89 54 24 f8 39 85 a0 0f 00 00 77 d7 48 8b [2570283.831259][T4126959] RSP: 0018:ffffb79a77657ce8 EFLAGS: 00010202 [2570283.845121][T4126959] RAX: 0000000000000019 RBX: ffffb799cd0c9280 RCX: 0000000000000018 [2570283.860872][T4126959] RDX: 0000000000000000 RSI: ffff96dd970ce000 RDI: 0000000000000005 [2570283.876525][T4126959] RBP: ffff96dd86f0a000 R08: ffff96dd970ce480 R09: 000000000000005f [2570283.892014][T4126959] R10: ffffb799cd0c9fff R11: ffffb799cd0c9000 R12: ffffb799cd0c94f8 [2570283.907406][T4126959] R13: ffffffffc11b1090 R14: ffff96dd970ce000 R15: ffffffffc11cd66c [2570283.922705][T4126959] FS: 00007fa7723f8740(0000) GS:ffff96f51fac0000(0000) knlGS:0000000000000000 [2570283.938848][T4126959] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [2570283.952524][T4126959] CR2: 00000000000000f8 CR3: 0000001a73e6e006 CR4: 00000000007706e0 [2570283.967529][T4126959] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [2570283.982400][T4126959] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [2570283.997308][T4126959] PKRU: 55555554 [2570284.007649][T4126959] Call Trace: [2570284.017598][T4126959] dev_ethtool+0x1832/0x2830
Fix this by adjusting efx->xdp_tx_queue_count after probing to reflect the true value of initialized slots in efx->xdp_tx_queues.
Signed-off-by: Ignat Korchagin ignat@cloudflare.com Fixes: e26ca4b53582 ("sfc: reduce the number of requested xdp ev queues") Cc: stable@vger.kernel.org # 5.12.x --- drivers/net/ethernet/sfc/efx_channels.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/sfc/efx_channels.c b/drivers/net/ethernet/sfc/efx_channels.c index 1bfeee283ea9..a3ca406a3561 100644 --- a/drivers/net/ethernet/sfc/efx_channels.c +++ b/drivers/net/ethernet/sfc/efx_channels.c @@ -914,6 +914,8 @@ int efx_set_channels(struct efx_nic *efx) } } } + if (xdp_queue_number) + efx->xdp_tx_queue_count = xdp_queue_number;
rc = netif_set_real_num_tx_queues(efx->net_dev, efx->n_tx_channels); if (rc)
Hello:
This patch was applied to netdev/net-next.git (refs/heads/master):
On Tue, 27 Apr 2021 22:09:38 +0100 you wrote:
efx->xdp_tx_queue_count is initially initialized to num_possible_cpus() and is later used to allocate and traverse efx->xdp_tx_queues lookup array. However, we may end up not initializing all the array slots with real queues during probing. This results, for example, in a NULL pointer dereference, when running "# ethtool -S <iface>", similar to below
[2570283.664955][T4126959] BUG: kernel NULL pointer dereference, address: 00000000000000f8 [2570283.681283][T4126959] #PF: supervisor read access in kernel mode [2570283.695678][T4126959] #PF: error_code(0x0000) - not-present page [2570283.710013][T4126959] PGD 0 P4D 0 [2570283.721649][T4126959] Oops: 0000 [#1] SMP PTI [2570283.734108][T4126959] CPU: 23 PID: 4126959 Comm: ethtool Tainted: G O 5.10.20-cloudflare-2021.3.1 #1 [2570283.752641][T4126959] Hardware name: <redacted> [2570283.781408][T4126959] RIP: 0010:efx_ethtool_get_stats+0x2ca/0x330 [sfc] [2570283.796073][T4126959] Code: 00 85 c0 74 39 48 8b 95 a8 0f 00 00 48 85 d2 74 2d 31 c0 eb 07 48 8b 95 a8 0f 00 00 48 63 c8 49 83 c4 08 83 c0 01 48 8b 14 ca <48> 8b 92 f8 00 00 00 49 89 54 24 f8 39 85 a0 0f 00 00 77 d7 48 8b [2570283.831259][T4126959] RSP: 0018:ffffb79a77657ce8 EFLAGS: 00010202 [2570283.845121][T4126959] RAX: 0000000000000019 RBX: ffffb799cd0c9280 RCX: 0000000000000018 [2570283.860872][T4126959] RDX: 0000000000000000 RSI: ffff96dd970ce000 RDI: 0000000000000005 [2570283.876525][T4126959] RBP: ffff96dd86f0a000 R08: ffff96dd970ce480 R09: 000000000000005f [2570283.892014][T4126959] R10: ffffb799cd0c9fff R11: ffffb799cd0c9000 R12: ffffb799cd0c94f8 [2570283.907406][T4126959] R13: ffffffffc11b1090 R14: ffff96dd970ce000 R15: ffffffffc11cd66c [2570283.922705][T4126959] FS: 00007fa7723f8740(0000) GS:ffff96f51fac0000(0000) knlGS:0000000000000000 [2570283.938848][T4126959] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [2570283.952524][T4126959] CR2: 00000000000000f8 CR3: 0000001a73e6e006 CR4: 00000000007706e0 [2570283.967529][T4126959] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [2570283.982400][T4126959] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [2570283.997308][T4126959] PKRU: 55555554 [2570284.007649][T4126959] Call Trace: [2570284.017598][T4126959] dev_ethtool+0x1832/0x2830
[...]
Here is the summary with links: - sfc: adjust efx->xdp_tx_queue_count with the real number of initialized queues https://git.kernel.org/netdev/net-next/c/99ba0ea616aa
You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html
On 27/04/2021 22:09, Ignat Korchagin wrote:
efx->xdp_tx_queue_count is initially initialized to num_possible_cpus() and is later used to allocate and traverse efx->xdp_tx_queues lookup array. However, we may end up not initializing all the array slots with real queues during probing. This results, for example, in a NULL pointer dereference, when running "# ethtool -S <iface>", similar to below
...
diff --git a/drivers/net/ethernet/sfc/efx_channels.c b/drivers/net/ethernet/sfc/efx_channels.c index 1bfeee283ea9..a3ca406a3561 100644 --- a/drivers/net/ethernet/sfc/efx_channels.c +++ b/drivers/net/ethernet/sfc/efx_channels.c @@ -914,6 +914,8 @@ int efx_set_channels(struct efx_nic *efx) } } }
- if (xdp_queue_number)
Wait, why is this guard condition needed? What happens if we had nonzero efx->xdp_tx_queue_count initially, but we end up with no TXQs available for XDP at all (so xdp_queue_number == 0)?
-ed
efx->xdp_tx_queue_count = xdp_queue_number;
rc = netif_set_real_num_tx_queues(efx->net_dev, efx->n_tx_channels); if (rc)
On Thu, Apr 29, 2021 at 3:22 PM Edward Cree ecree.xilinx@gmail.com wrote:
On 27/04/2021 22:09, Ignat Korchagin wrote:
efx->xdp_tx_queue_count is initially initialized to num_possible_cpus() and is later used to allocate and traverse efx->xdp_tx_queues lookup array. However, we may end up not initializing all the array slots with real queues during probing. This results, for example, in a NULL pointer dereference, when running "# ethtool -S <iface>", similar to below
...
diff --git a/drivers/net/ethernet/sfc/efx_channels.c b/drivers/net/ethernet/sfc/efx_channels.c index 1bfeee283ea9..a3ca406a3561 100644 --- a/drivers/net/ethernet/sfc/efx_channels.c +++ b/drivers/net/ethernet/sfc/efx_channels.c @@ -914,6 +914,8 @@ int efx_set_channels(struct efx_nic *efx) } } }
if (xdp_queue_number)
Wait, why is this guard condition needed? What happens if we had nonzero efx->xdp_tx_queue_count initially, but we end up with no TXQs available for XDP at all (so xdp_queue_number == 0)?
-ed
My thoughts were: efx->xdp_tx_queue_count is originally used to allocate efx->xdp_tx_queues. So, if xdp_queue_number ends up being 0, we should keep efx->xdp_tx_queue_count positive not to forget to release efx->xdp_tx_queues (because most checks are efx->xdp_tx_queue_count && efx->xdp_tx_queues).
I'm not familiar enough with SFC internals to definitely say if it is even possible to have xdp_queue_number == 0 while having efx->xdp_tx_queue_count > 0, but my understanding is that it should not be possible due to the checks in the driver init path, when we actually determine the number of queues, channels, events per channel etc.
Ignat
efx->xdp_tx_queue_count = xdp_queue_number; rc = netif_set_real_num_tx_queues(efx->net_dev, efx->n_tx_channels); if (rc)
On 29/04/2021 15:49, Ignat Korchagin wrote:
On Thu, Apr 29, 2021 at 3:22 PM Edward Cree ecree.xilinx@gmail.com wrote:
On 27/04/2021 22:09, Ignat Korchagin wrote:
if (xdp_queue_number)
Wait, why is this guard condition needed? What happens if we had nonzero efx->xdp_tx_queue_count initially, but we end up with no TXQs available for XDP at all (so xdp_queue_number == 0)?
-ed
My thoughts were: efx->xdp_tx_queue_count is originally used to allocate efx->xdp_tx_queues. So, if xdp_queue_number ends up being 0, we should keep efx->xdp_tx_queue_count positive not to forget to release efx->xdp_tx_queues (because most checks are efx->xdp_tx_queue_count && efx->xdp_tx_queues).
Well, we allocated it in this function, so could we not just free it (and NULL it) if we get here with xdp_queue_number == 0? Assuming it even makes sense for those checks to be that conjunction, and not just efx->xdp_tx_queues.
I'm not familiar enough with SFC internals to definitely say if it is even possible to have xdp_queue_number == 0 while having efx->xdp_tx_queue_count > 0
If it's possible for us to get xdp_queue_number != efx->xdp_tx_queue_count at all (which I can't remember exactly how it happens, but I think it's a case of not getting as many VIs back from firmware as we wanted, which happens after the initial determination of numbers of queues & channels), then it's possible that our number of available TXQs is reduced far enough that we don't have any left for XDP. At least, I think so; this part of the driver confuses me too :S
-ed
linux-stable-mirror@lists.linaro.org