The patch titled
Subject: nilfs2: fix missing error check for sb_set_blocksize call
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
nilfs2-fix-missing-error-check-for-sb_set_blocksize-call.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Subject: nilfs2: fix missing error check for sb_set_blocksize call
Date: Wed, 29 Nov 2023 23:15:47 +0900
When mounting a filesystem image with a block size larger than the page
size, nilfs2 repeatedly outputs long error messages with stack traces to
the kernel log, such as the following:
getblk(): invalid block size 8192 requested
logical block size: 512
...
Call Trace:
dump_stack_lvl+0x92/0xd4
dump_stack+0xd/0x10
bdev_getblk+0x33a/0x354
__breadahead+0x11/0x80
nilfs_search_super_root+0xe2/0x704 [nilfs2]
load_nilfs+0x72/0x504 [nilfs2]
nilfs_mount+0x30f/0x518 [nilfs2]
legacy_get_tree+0x1b/0x40
vfs_get_tree+0x18/0xc4
path_mount+0x786/0xa88
__ia32_sys_mount+0x147/0x1a8
__do_fast_syscall_32+0x56/0xc8
do_fast_syscall_32+0x29/0x58
do_SYSENTER_32+0x15/0x18
entry_SYSENTER_32+0x98/0xf1
...
This overloads the system logger. And to make matters worse, it sometimes
crashes the kernel with a memory access violation.
This is because the return value of the sb_set_blocksize() call, which
should be checked for errors, is not checked.
The latter issue is due to out-of-buffer memory being accessed based on a
large block size that caused sb_set_blocksize() to fail for buffers read
with the initial minimum block size that remained unupdated in the
super_block structure.
Since nilfs2 mkfs tool does not accept block sizes larger than the system
page size, this has been overlooked. However, it is possible to create
this situation by intentionally modifying the tool or by passing a
filesystem image created on a system with a large page size to a system
with a smaller page size and mounting it.
Fix this issue by inserting the expected error handling for the call to
sb_set_blocksize().
Link: https://lkml.kernel.org/r/20231129141547.4726-1-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/nilfs2/the_nilfs.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
--- a/fs/nilfs2/the_nilfs.c~nilfs2-fix-missing-error-check-for-sb_set_blocksize-call
+++ a/fs/nilfs2/the_nilfs.c
@@ -716,7 +716,11 @@ int init_nilfs(struct the_nilfs *nilfs,
goto failed_sbh;
}
nilfs_release_super_block(nilfs);
- sb_set_blocksize(sb, blocksize);
+ if (!sb_set_blocksize(sb, blocksize)) {
+ nilfs_err(sb, "bad blocksize %d", blocksize);
+ err = -EINVAL;
+ goto out;
+ }
err = nilfs_load_super_block(nilfs, sb, blocksize, &sbp);
if (err)
_
Patches currently in -mm which might be from konishi.ryusuke(a)gmail.com are
nilfs2-fix-missing-error-check-for-sb_set_blocksize-call.patch
nilfs2-move-page-release-outside-of-nilfs_delete_entry-and-nilfs_set_link.patch
nilfs2-eliminate-staggered-calls-to-kunmap-in-nilfs_rename.patch
+Cc: Cong Wang <cong.wang(a)bytedance.com>
Hi all,
On Tue, Nov 28, 2023 at 09:52:46PM -0500, Sasha Levin wrote:
>
> This is a note to let you know that I've just added the patch titled
>
> bpf: Fix dev's rx stats for bpf_redirect_peer traffic
>
> to the 6.6-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> bpf-fix-dev-s-rx-stats-for-bpf_redirect_peer-traffic.patch
> and it can be found in the queue-6.6 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
Seems like only patch 1, 2 and 5 in this [1] series are selected? We
also need patch 4, upstream commit 6f2684bf2b44 ("veth: Use tstats
per-CPU traffic counters"). Otherwise the fix won't work, and the code
will be wrong [2] .
We should've included a "Depends on patch..." note for stable in the
commit message.
Thanks,
Peilin Ye
[1] https://lore.kernel.org/all/170050562585.4532.1588179408610417971.git-patch…
[2] veth still uses @lstats, but patch 5 makes skb_do_redirect() update
it as @tstats.
I'm seeing this too, but on 6.6.3 (6.6.2 is fine).
Bisected it down to commit 2e8b4e0992e16 ("gcc-plugins: randstruct:
Only warn about true flexible arrays"). Reverting that commit on top
of v6.6.3 makes it go away.
I do wonder if content such as that (which *looks* like it's purely
preparing for future changes) is appropriate for the stable trees.
Cheers,
-- Dan
This is the start of the stable review cycle for the 4.14.331 release.
There are 53 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Mon, 27 Nov 2023 16:30:48 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.331-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.14.331-rc2
Eric Dumazet <edumazet(a)google.com>
net: sched: fix race condition in qdisc_graft()
Dongli Zhang <dongli.zhang(a)oracle.com>
scsi: virtio_scsi: limit number of hw queues by nr_cpu_ids
Kemeng Shi <shikemeng(a)huaweicloud.com>
ext4: remove gdb backup copy for meta bg in setup_new_flex_group_blocks
Kemeng Shi <shikemeng(a)huaweicloud.com>
ext4: correct return value of ext4_convert_meta_bg
Kemeng Shi <shikemeng(a)huaweicloud.com>
ext4: correct offset of gdb backup in non meta_bg group to update_backups
Max Kellermann <max.kellermann(a)ionos.com>
ext4: apply umask if ACL support is disabled
Vikash Garodia <quic_vgarodia(a)quicinc.com>
media: venus: hfi: fix the check to handle session buffer requirement
Sean Young <sean(a)mess.org>
media: sharp: fix sharp encoding
Heiner Kallweit <hkallweit1(a)gmail.com>
i2c: i801: fix potential race in i801_block_transaction_byte_by_byte
Alexander Sverdlin <alexander.sverdlin(a)siemens.com>
net: dsa: lan9303: consequently nested-lock physical MDIO
Takashi Iwai <tiwai(a)suse.de>
ALSA: info: Fix potential deadlock at disconnection
Helge Deller <deller(a)gmx.de>
parisc/pgtable: Do not drop upper 5 address bits of physical address
Helge Deller <deller(a)gmx.de>
parisc: Prevent booting 64-bit kernels on PA1.x machines
Sanjuán García, Jorge <Jorge.SanjuanGarcia(a)duagon.com>
mcb: fix error handling for different scenarios when parsing
Zhihao Cheng <chengzhihao1(a)huawei.com>
jbd2: fix potential data lost in recovering journal raced with synchronizing fs bdev
Herve Codina <herve.codina(a)bootlin.com>
genirq/generic_chip: Make irq_remove_generic_chip() irqdomain aware
Rong Chen <rong.chen(a)amlogic.com>
mmc: meson-gx: Remove setting of CMD_CFG_ERROR
Brian Geffon <bgeffon(a)google.com>
PM: hibernate: Clean up sync_read handling in snapshot_write_next()
Brian Geffon <bgeffon(a)google.com>
PM: hibernate: Use __get_safe_page() rather than touching the list
Dan Carpenter <dan.carpenter(a)linaro.org>
mmc: vub300: fix an error code
Lukas Wunner <lukas(a)wunner.de>
PCI/sysfs: Protect driver's D3cold preference from user space
David Woodhouse <dwmw(a)amazon.co.uk>
hvc/xen: fix error path in xen_hvc_init() to always register frontend driver
Paul Moore <paul(a)paul-moore.com>
audit: don't WARN_ON_ONCE(!current->mm) in audit_exe_compare()
Paul Moore <paul(a)paul-moore.com>
audit: don't take task_lock() in audit_exe_compare() code path
Maciej S. Szmigiero <maciej.szmigiero(a)oracle.com>
KVM: x86: Ignore MSR_AMD64_TW_CFG access
Kees Cook <keescook(a)chromium.org>
randstruct: Fix gcc-plugin performance mode to stay in group
Vikash Garodia <quic_vgarodia(a)quicinc.com>
media: venus: hfi: add checks to perform sanity on queue pointers
Dan Carpenter <dan.carpenter(a)linaro.org>
pwm: Fix double shift bug
Bob Peterson <rpeterso(a)redhat.com>
gfs2: ignore negated quota changes
Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
media: vivid: avoid integer overflow
Rajeshwar R Shinde <coolrrsh(a)gmail.com>
media: gspca: cpia1: shift-out-of-bounds in set_flicker
Axel Lin <axel.lin(a)ingics.com>
i2c: sun6i-p2wi: Prevent potential division by zero
Yi Yang <yiyang13(a)huawei.com>
tty: vcc: Add check for kstrdup() in vcc_probe()
Wenchao Hao <haowenchao2(a)huawei.com>
scsi: libfc: Fix potential NULL pointer dereference in fc_lport_ptp_setup()
Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
atm: iphase: Do PCI error checks on own line
Cezary Rojewski <cezary.rojewski(a)intel.com>
ALSA: hda: Fix possible null-ptr-deref when assigning a stream
Manas Ghandat <ghandatmanas(a)gmail.com>
jfs: fix array-index-out-of-bounds in diAlloc
Manas Ghandat <ghandatmanas(a)gmail.com>
jfs: fix array-index-out-of-bounds in dbFindLeaf
Juntong Deng <juntong.deng(a)outlook.com>
fs/jfs: Add validity check for db_maxag and db_agpref
Juntong Deng <juntong.deng(a)outlook.com>
fs/jfs: Add check for negative db_l2nbperpage
Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
RDMA/hfi1: Use FIELD_GET() to extract Link Width
Lu Jialin <lujialin4(a)huawei.com>
crypto: pcrypt - Fix hungtask for PADATA_RESET
zhujun2 <zhujun2(a)cmss.chinamobile.com>
selftests/efivarfs: create-read: fix a resource leak
Mario Limonciello <mario.limonciello(a)amd.com>
drm/amd: Fix UBSAN array-index-out-of-bounds for Polaris and Tonga
Mario Limonciello <mario.limonciello(a)amd.com>
drm/amd: Fix UBSAN array-index-out-of-bounds for SMU7
Eric Dumazet <edumazet(a)google.com>
net: annotate data-races around sk->sk_dst_pending_confirm
Dmitry Antipov <dmantipov(a)yandex.ru>
wifi: ath10k: fix clang-specific fortify warning
Dmitry Antipov <dmantipov(a)yandex.ru>
wifi: ath9k: fix clang-specific fortify warnings
Ping-Ke Shih <pkshih(a)realtek.com>
wifi: mac80211: don't return unset power in ieee80211_get_tx_power()
Mike Rapoport (IBM) <rppt(a)kernel.org>
x86/mm: Drop the 4 MB restriction on minimal NUMA node memory size
Ronald Wahl <ronald.wahl(a)raritan.com>
clocksource/drivers/timer-atmel-tcb: Fix initialization on SAM9 hardware
Jacky Bai <ping.bai(a)nxp.com>
clocksource/drivers/timer-imx-gpt: Fix potential memory leak
John Stultz <jstultz(a)google.com>
locking/ww_mutex/test: Fix potential workqueue corruption
-------------
Diffstat:
Makefile | 4 ++--
arch/parisc/kernel/entry.S | 7 +++---
arch/parisc/kernel/head.S | 5 ++---
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/include/asm/numa.h | 7 ------
arch/x86/kvm/x86.c | 2 ++
arch/x86/mm/numa.c | 7 ------
crypto/pcrypt.c | 4 ++++
drivers/atm/iphase.c | 20 +++++++++--------
drivers/clocksource/tcb_clksrc.c | 1 +
drivers/clocksource/timer-imx-gpt.c | 18 +++++++++++-----
drivers/gpu/drm/amd/include/pptable.h | 4 ++--
drivers/gpu/drm/amd/powerplay/hwmgr/pptable_v1_0.h | 16 +++++++-------
drivers/i2c/busses/i2c-i801.c | 19 ++++++++--------
drivers/i2c/busses/i2c-sun6i-p2wi.c | 5 +++++
drivers/infiniband/hw/hfi1/pcie.c | 9 ++------
drivers/mcb/mcb-core.c | 1 +
drivers/mcb/mcb-parse.c | 2 +-
drivers/media/platform/qcom/venus/hfi_msgs.c | 2 +-
drivers/media/platform/qcom/venus/hfi_venus.c | 10 +++++++++
drivers/media/platform/vivid/vivid-rds-gen.c | 2 +-
drivers/media/rc/ir-sharp-decoder.c | 8 ++++---
drivers/media/usb/gspca/cpia1.c | 3 +++
drivers/mmc/host/meson-gx-mmc.c | 1 -
drivers/mmc/host/vub300.c | 1 +
drivers/net/dsa/lan9303_mdio.c | 4 ++--
drivers/net/wireless/ath/ath10k/debug.c | 2 +-
drivers/net/wireless/ath/ath9k/debug.c | 2 +-
drivers/net/wireless/ath/ath9k/htc_drv_debug.c | 2 +-
drivers/pci/pci-acpi.c | 2 +-
drivers/pci/pci-sysfs.c | 5 +----
drivers/scsi/libfc/fc_lport.c | 6 ++++++
drivers/scsi/virtio_scsi.c | 1 +
drivers/tty/hvc/hvc_xen.c | 5 +++--
drivers/tty/vcc.c | 16 +++++++++++---
fs/ext4/acl.h | 5 +++++
fs/ext4/resize.c | 19 ++++++----------
fs/gfs2/quota.c | 11 ++++++++++
fs/jbd2/recovery.c | 8 +++++++
fs/jfs/jfs_dmap.c | 23 +++++++++++++++-----
fs/jfs/jfs_imap.c | 5 ++++-
include/linux/pwm.h | 4 ++--
include/net/sock.h | 6 +++---
kernel/audit_watch.c | 9 +++++++-
kernel/irq/generic-chip.c | 25 ++++++++++++++++------
kernel/locking/test-ww_mutex.c | 20 ++++++++++-------
kernel/padata.c | 2 +-
kernel/power/snapshot.c | 16 ++++++--------
net/core/sock.c | 2 +-
net/ipv4/tcp_output.c | 2 +-
net/mac80211/cfg.c | 4 ++++
net/sched/sch_api.c | 5 +++--
scripts/gcc-plugins/randomize_layout_plugin.c | 11 +++++++---
sound/core/info.c | 21 +++++++++++-------
sound/hda/hdac_stream.c | 6 ++++--
tools/testing/selftests/efivarfs/create-read.c | 2 ++
56 files changed, 259 insertions(+), 151 deletions(-)
We found an issue under Android OTA scenario that many BIOs have to do
FEC where the data under dm-verity is 100% complete and no corruption.
Android OTA has many dm-block layers, from upper to lower:
dm-verity
dm-snapshot
dm-origin & dm-cow
dm-linear
ufs
Dm tables have to change 2 times during Android OTA merging process.
When doing table change, the dm-snapshot will be suspended for a while.
During this interval, we found there are many readahead IOs are
submitted to dm_verity from filesystem. Then the kverity works are busy
doing FEC process which cost too much time to finish dm-verity IO. And
cause system stuck.
We add some debug log and find that each readahead IO need around 10s to
finish when this situation occurred. Because here has a IO
amplification:
dm-snapshot suspend
erofs_readahead // 300+ io is submitted
dm_submit_bio (dm_verity)
dm_submit_bio (dm_snapshot)
bio return EIO
bio got nothing, it's empty
verity_end_io
verity_verify_io
forloop range(0, io->n_blocks) // each io->nblocks ~= 20
verity_fec_decode
fec_decode_rsb
fec_read_bufs
forloop range(0, v->fec->rsn) // v->fec->rsn = 253
new_read
submit_bio (dm_snapshot)
end loop
end loop
dm-snapshot resume
Readahead BIO got nothing during dm-snapshot suspended. So all of them
will do FEC.
Each readahead BIO need to do io->n_blocks ~= 20 times verify.
Each block need to do fec, and every block need to do v->fec->rsn = 253
times read.
So during the suspend interval(~200ms), 300 readahead BIO make
300*20*253 IOs on dm-snapshot.
As readahead IO is not required by user space, and to fix this issue,
I think it would be better to pass it to upper layer to handle it.
Cc: stable(a)vger.kernel.org
Fixes: a739ff3f543a ("dm verity: add support for forward error correction")
Signed-off-by: Wu Bo <bo.wu(a)vivo.com>
---
drivers/md/dm-verity-target.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index beec14b6b044..14e58ae70521 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -667,7 +667,9 @@ static void verity_end_io(struct bio *bio)
struct dm_verity_io *io = bio->bi_private;
if (bio->bi_status &&
- (!verity_fec_is_enabled(io->v) || verity_is_system_shutting_down())) {
+ (!verity_fec_is_enabled(io->v) ||
+ verity_is_system_shutting_down() ||
+ (bio->bi_opf & REQ_RAHEAD))) {
verity_finish_io(io, bio->bi_status);
return;
}
--
2.25.1