The NCSI state machine as it's currently implemented assumes that
transition to the next logical state is performed either explicitly by
calling `schedule_work(&ndp->work)` to re-queue itself or implicitly
after processing the predefined (ndp->pending_req_num) number of
replies. Thus to avoid the configuration FSM from advancing prematurely
and getting out of sync with the process it's essential to not skip
waiting for a reply.
This patch makes the code wait for reception of the Deselect Package
response for the last package probed before proceeding to channel
configuration.
Thanks go to Potin Lai and Cosmo Chou for the initial investigation and
testing.
Fixes: 8e13f70be05e ("net/ncsi: Probe single packages to avoid conflict")
Cc: stable(a)vger.kernel.org
Signed-off-by: Paul Fertser <fercerpav(a)gmail.com>
---
net/ncsi/ncsi-manage.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c
index 5cf55bde366d..bf8e27b84a66 100644
--- a/net/ncsi/ncsi-manage.c
+++ b/net/ncsi/ncsi-manage.c
@@ -1373,6 +1373,12 @@ static void ncsi_probe_channel(struct ncsi_dev_priv *ndp)
nd->state = ncsi_dev_state_probe_package;
break;
case ncsi_dev_state_probe_package:
+ if (ndp->package_probe_id >= 8) {
+ /* Last package probed, finishing */
+ ndp->flags |= NCSI_DEV_PROBED;
+ break;
+ }
+
ndp->pending_req_num = 1;
nca.type = NCSI_PKT_CMD_SP;
@@ -1489,13 +1495,8 @@ static void ncsi_probe_channel(struct ncsi_dev_priv *ndp)
if (ret)
goto error;
- /* Probe next package */
+ /* Probe next package after receiving response */
ndp->package_probe_id++;
- if (ndp->package_probe_id >= 8) {
- /* Probe finished */
- ndp->flags |= NCSI_DEV_PROBED;
- break;
- }
nd->state = ncsi_dev_state_probe_package;
ndp->active_package = NULL;
break;
--
2.34.1
On Wed, Jan 15, 2025 at 12:03:27PM -0700, Keith Busch wrote:
> On Wed, Jan 15, 2025 at 06:10:05PM +0100, Paolo Bonzini wrote:
> > You can implement something like pthread_once():
>
> ...
>
> > Where to put it I don't know. It doesn't belong in
> > include/linux/once.h. I'm okay with arch/x86/kvm/call_once.h and just
> > pull it with #include "call_once.h".
>
> Thanks for the suggestion, I can work with that. As to where to put it,
> I think the new 'struct once' needs to be a member of struct kvm_arch,
> so I've put it in arch/x86/include/asm/.
>
> Here's the result with that folded in. If this is okay, I'll send a v2,
> and can split out the call_once as a prep patch with your attribution if
> you like.
Has there been any progress here? I'm also affected by the crosvm
regression, and it's been backported to the LTS stable kernel.
(CCing the stable and regressions lists to make sure the regression is
tracked.)
#regzbot introduced: d96c77bd4eeb
From: Nick Child <nnac123(a)linux.ibm.com>
From: Nick Child <nnac123(a)linux.ibm.com>
commit 0983d288caf984de0202c66641577b739caad561 upstream.
Below is a summary of how the driver stores a reference to an skb during
transmit:
tx_buff[free_map[consumer_index]]->skb = new_skb;
free_map[consumer_index] = IBMVNIC_INVALID_MAP;
consumer_index ++;
Where variable data looks like this:
free_map == [4, IBMVNIC_INVALID_MAP, IBMVNIC_INVALID_MAP, 0, 3]
consumer_index^
tx_buff == [skb=null, skb=<ptr>, skb=<ptr>, skb=null, skb=null]
The driver has checks to ensure that free_map[consumer_index] pointed to
a valid index but there was no check to ensure that this index pointed
to an unused/null skb address. So, if, by some chance, our free_map and
tx_buff lists become out of sync then we were previously risking an
skb memory leak. This could then cause tcp congestion control to stop
sending packets, eventually leading to ETIMEDOUT.
Therefore, add a conditional to ensure that the skb address is null. If
not then warn the user (because this is still a bug that should be
patched) and free the old pointer to prevent memleak/tcp problems.
Signed-off-by: Nick Child <nnac123(a)linux.ibm.com>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
[Denis: minor fix to resolve merge conflict.]
Signed-off-by: Denis Arefev <arefev(a)swemel.ru>
---
Backport fix for CVE-2024-41066
Link: https://nvd.nist.gov/vuln/detail/CVE-2024-41066
---
drivers/net/ethernet/ibm/ibmvnic.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 84da6ccaf339..439796975cbf 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1625,6 +1625,18 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
(tx_pool->consumer_index + 1) % tx_pool->num_buffers;
tx_buff = &tx_pool->tx_buff[index];
+
+ /* Sanity checks on our free map to make sure it points to an index
+ * that is not being occupied by another skb. If skb memory is
+ * not freed then we see congestion control kick in and halt tx.
+ */
+ if (unlikely(tx_buff->skb)) {
+ dev_warn_ratelimited(dev, "TX free map points to untracked skb (%s %d idx=%d)\n",
+ skb_is_gso(skb) ? "tso_pool" : "tx_pool",
+ queue_num, bufidx);
+ dev_kfree_skb_any(tx_buff->skb);
+ }
+
tx_buff->skb = skb;
tx_buff->data_dma[0] = data_dma_addr;
tx_buff->data_len[0] = skb->len;
--
2.43.0
#regzbot introduced v6.6.69..v6.6.70
#regzbot introduced: ad91a2dacbf8c26a446658cdd55e8324dfeff1e7
We hit this regression when updating our guest vm kernel from 6.6.69
to 6.6.70 -- bisecting, this problem was introduced in
ad91a2dacbf8c26a446658cdd55e8324dfeff1e7 -- net: restrict SO_REUSEPORT
to inet sockets
We're getting a timeout when trying to connect to the vsocket in the
guest VM when launching a kata containers 3.10.1 agent which
unsurprisingly ... uses a vsocket to communicate back to the host.
We updated this commit and added an additional sk_is_vsock check and
recompiled and this works correctly for us.
- if (valbool && !sk_is_inet(sk))
+ if (valbool && !(sk_is_inet(sk) || sk_is_vsock(sk)))
My understanding is limited here so I've added Stefano as he is likely
to better understand what makes sense here.
This commit was backported from v6.13 to v6.12.8..6.12.9.
-Simon
From: "Eric W. Biederman" <ebiederm(a)xmission.com>
[ Upstream commit a3616a3c02722d1edb95acc7fceade242f6553ba ]
In the fpsp040 code when copyin or copyout fails call
force_sigsegv(SIGSEGV) instead of do_exit(SIGSEGV).
This solves a couple of problems. Because do_exit embeds the ptrace
stop PTRACE_EVENT_EXIT a complete stack frame needs to be present for
that to work correctly. There is always the information needed for a
ptrace stop where get_signal is called. So exiting with a signal
solves the ptrace issue.
Further exiting with a signal ensures that all of the threads in a
process are killed not just the thread that malfunctioned. Which
avoids confusing userspace.
To make force_sigsegv(SIGSEGV) work in fpsp040_die modify the code to
save all of the registers and jump to ret_from_exception (which
ultimately calls get_signal) after fpsp040_die returns.
v2: Updated the branches to use gas's pseudo ops that automatically
calculate the best branch instruction to use for the purpose.
v1: https://lkml.kernel.org/r/87a6m8kgtx.fsf_-_@disp2133
Link: https://lkml.kernel.org/r/87tukghjfs.fsf_-_@disp2133
Acked-by: Geert Uytterhoeven <geert(a)linux-m68k.org>
Signed-off-by: "Eric W. Biederman" <ebiederm(a)xmission.com>
Signed-off-by: Finn Thain <fthain(a)linux-m68k.org>
---
arch/m68k/fpsp040/skeleton.S | 3 ++-
arch/m68k/kernel/traps.c | 2 +-
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/m68k/fpsp040/skeleton.S b/arch/m68k/fpsp040/skeleton.S
index 31a9c634c81e..081922c72daa 100644
--- a/arch/m68k/fpsp040/skeleton.S
+++ b/arch/m68k/fpsp040/skeleton.S
@@ -502,7 +502,8 @@ in_ea:
.section .fixup,"ax"
.even
1:
- jbra fpsp040_die
+ jbsr fpsp040_die
+ jbra .Lnotkern
.section __ex_table,"a"
.align 4
diff --git a/arch/m68k/kernel/traps.c b/arch/m68k/kernel/traps.c
index 35f706d836c5..c6f18dc5884b 100644
--- a/arch/m68k/kernel/traps.c
+++ b/arch/m68k/kernel/traps.c
@@ -1155,7 +1155,7 @@ asmlinkage void set_esp0(unsigned long ssp)
*/
asmlinkage void fpsp040_die(void)
{
- do_exit(SIGSEGV);
+ force_sigsegv(SIGSEGV);
}
#ifdef CONFIG_M68KFPU_EMU