On Mon, Jan 15, 2024 at 12:19:06PM +0500, Марк Коренберг wrote:
> Kernel 6.6.9-200.fc39.x86_64
>
> The following bash script demonstrates the problem (run under root):
>
> ```
> #!/bin/bash
>
> set -e -u -x
>
> # Some cleanups
> ip netns delete myspace || :
> ip link del qweqwe1 || :
>
> # The bug happens only with physical interfaces, not with, say, dummy one
> ip link property add dev enp0s20f0u2 altname myname
> ip netns add myspace
> ip link set enp0s20f0u2 netns myspace
>
> # add dummy interface + set the same altname as in background namespace.
> ip link add name qweqwe1 type dummy
> ip link property add dev qweqwe1 altname myname
>
> # Trigger the bug. The kernel will try to return ethernet interface
> # back to root namespace, but it can not, because of conflicting
> # altnames.
> ip netns delete myspace
>
> # now `ip link` will hang forever !!!!!
> ```
>
> I think the problem is obvious, although I don't know how to fix it.
> Remove conflicting altnames for interfaces that return from killed
> namespaces?
As this can only be triggered by root, not much for us to do here,
perhaps discuss it on the netdev mailing list for all network developers
to work on?
> Kernel 6.3.8 (at least) had another bug that allowed duplicate
> altnames; it was fixed in mainline at some point. I have another script
> to trigger that bug on these old kernels. I did not bisect.
If this is an issue on 6.1.y, that would be good to know so that we can
try to fix the issue there if bisection can find it. Care to share the
script so that I can test?
thanks,
greg k-h
From: Stefan Hajnoczi <stefanha(a)redhat.com>
[ Upstream commit b8e0792449928943c15d1af9f63816911d139267 ]
Commit 4e0400525691 ("virtio-blk: support polling I/O") triggers the
following gcc 13 W=1 warnings:
drivers/block/virtio_blk.c: In function ‘init_vq’:
drivers/block/virtio_blk.c:1077:68: warning: ‘%d’ directive output may be truncated writing between 1 and 11 bytes into a region of size 7 [-Wformat-truncation=]
1077 | snprintf(vblk->vqs[i].name, VQ_NAME_LEN, "req_poll.%d", i);
| ^~
drivers/block/virtio_blk.c:1077:58: note: directive argument in the range [-2147483648, 65534]
1077 | snprintf(vblk->vqs[i].name, VQ_NAME_LEN, "req_poll.%d", i);
| ^~~~~~~~~~~~~
drivers/block/virtio_blk.c:1077:17: note: ‘snprintf’ output between 11 and 21 bytes into a destination of size 16
1077 | snprintf(vblk->vqs[i].name, VQ_NAME_LEN, "req_poll.%d", i);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is a false positive because the lower bound -2147483648 is
incorrect. The true range of i is [0, num_vqs - 1] where 0 < num_vqs <
65536.
The code mixes int, unsigned short, and unsigned int types in addition
to using "%d" for an unsigned value. Use unsigned short and "%u"
consistently to solve the compiler warning.
Cc: Suwan Kim <suwan.kim027(a)gmail.com>
Reported-by: kernel test robot <lkp(a)intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312041509.DIyvEt9h-lkp@intel.com/
Signed-off-by: Stefan Hajnoczi <stefanha(a)redhat.com>
Message-Id: <20231204140743.1487843-1-stefanha(a)redhat.com>
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/block/virtio_blk.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index efa5535a8e1d8..3124837aa406f 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -609,12 +609,12 @@ static void virtblk_config_changed(struct virtio_device *vdev)
static int init_vq(struct virtio_blk *vblk)
{
int err;
- int i;
+ unsigned short i;
vq_callback_t **callbacks;
const char **names;
struct virtqueue **vqs;
unsigned short num_vqs;
- unsigned int num_poll_vqs;
+ unsigned short num_poll_vqs;
struct virtio_device *vdev = vblk->vdev;
struct irq_affinity desc = { 0, };
@@ -658,13 +658,13 @@ static int init_vq(struct virtio_blk *vblk)
for (i = 0; i < num_vqs - num_poll_vqs; i++) {
callbacks[i] = virtblk_done;
- snprintf(vblk->vqs[i].name, VQ_NAME_LEN, "req.%d", i);
+ snprintf(vblk->vqs[i].name, VQ_NAME_LEN, "req.%u", i);
names[i] = vblk->vqs[i].name;
}
for (; i < num_vqs; i++) {
callbacks[i] = NULL;
- snprintf(vblk->vqs[i].name, VQ_NAME_LEN, "req_poll.%d", i);
+ snprintf(vblk->vqs[i].name, VQ_NAME_LEN, "req_poll.%u", i);
names[i] = vblk->vqs[i].name;
}
--
2.43.0
On Sat, Jan 13, 2024 at 11:08:00AM -0600, Steve French wrote:
> I thought that it was "safer", since if it were misapplied to a version
> with the new folio rc behavior, it wouldn't regress anything
There are only three versions where this patch can be applied: 6.7, 6.6
and 6.1. AIUI it's a backport from 6.7, it's already applied to 6.6,
and it misapplies to 6.1. So this kind of belt-and-braces approach is
unnecessary.
With 5.10 LTS (e.g., 5.10.206), on a machine using an NVMe device, the
following tracing commands trigger a crash due to a NULL pointer
dereference:
KDIR=/sys/kernel/debug/tracing
echo 1 > $KDIR/tracing_on
echo 1 > $KDIR/events/nvme/enable
echo "Waiting for trace events..."
cat $KDIR/trace_pipe
The backtrace looks something like this:
Call Trace:
<IRQ>
? __die_body+0x6b/0xb0
? __die+0x9e/0xb0
? no_context+0x3eb/0x460
? ttwu_do_activate+0xf0/0x120
? __bad_area_nosemaphore+0x157/0x200
? select_idle_sibling+0x2f/0x410
? bad_area_nosemaphore+0x13/0x20
? do_user_addr_fault+0x2ab/0x360
? exc_page_fault+0x69/0x180
? asm_exc_page_fault+0x1e/0x30
? trace_event_raw_event_nvme_complete_rq+0xba/0x170
? trace_event_raw_event_nvme_complete_rq+0xa3/0x170
nvme_complete_rq+0x168/0x170
nvme_pci_complete_rq+0x16c/0x1f0
nvme_handle_cqe+0xde/0x190
nvme_irq+0x78/0x100
__handle_irq_event_percpu+0x77/0x1e0
handle_irq_event+0x54/0xb0
handle_edge_irq+0xdf/0x230
asm_call_irq_on_stack+0xf/0x20
</IRQ>
common_interrupt+0x9e/0x150
asm_common_interrupt+0x1e/0x40
It looks to me like these two upstream commits were backported to 5.10:
679c54f2de67 ("nvme: use command_id instead of req->tag in trace_nvme_complete_rq()")
e7006de6c238 ("nvme: code command_id with a genctr for use-after-free validation")
But they depend on this upstream commit to initialize the 'cmd' field in
some cases:
f4b9e6c90c57 ("nvme: use driver pdu command for passthrough")
Does it sound like I'm on the right track? 5.15 LTS and later seem to be okay.
On 5.15, attempting to use an ax88179_178a adapter ("0b95:1790 ASIX
Electronics Corp. AX88179 Gigabit Ethernet") started causing crashes.
This did not reproduce on the 6.6 kernel.
The crashes were narrowed down to the following two commits brought
into v5.15.146:
commit d63fafd6cc28 ("net: usb: ax88179_178a: avoid failed operations when device is disconnected")
commit f860413aa00c ("net: usb: ax88179_178a: wol optimizations")
Those two use an uninitialized pointer `dev->driver_priv`.
In later kernels this pointer is initialized in commit 2bcbd3d8a7b4
("net: usb: ax88179_178a: move priv to driver_priv").
Cherry-picking the following two commits fixed the issue for me on 5.15:
commit 9718f9ce5b86 ("net: usb: ax88179_178a: remove redundant init code")
commit 2bcbd3d8a7b4 ("net: usb: ax88179_178a: move priv to driver_priv")
Commit 9718f9ce5b86 ("net: usb: ax88179_178a: remove redundant init code")
was required for the fix to apply cleanly.