The patch below does not apply to the 4.190-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.190.y
git checkout FETCH_HEAD
git cherry-pick -x f5eff5591b8f9c5effd25c92c758a127765f74c1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023050655-calamity-status-2655@gregkh' --subject-prefix 'PATCH 4.190.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f5eff5591b8f9c5effd25c92c758a127765f74c1 Mon Sep 17 00:00:00 2001
From: Lukas Wunner <lukas(a)wunner.de>
Date: Tue, 11 Apr 2023 08:21:02 +0200
Subject: [PATCH] PCI: pciehp: Fix AB-BA deadlock between reset_lock and
device_lock
In 2013, commits
2e35afaefe64 ("PCI: pciehp: Add reset_slot() method")
608c388122c7 ("PCI: Add slot reset option to pci_dev_reset()")
amended PCIe hotplug to mask Presence Detect Changed events during a
Secondary Bus Reset. The reset thus no longer causes gratuitous slot
bringdown and bringup.
However the commits neglected to serialize reset with code paths reading
slot registers. For instance, a slot bringup due to an earlier hotplug
event may see the Presence Detect State bit cleared during a concurrent
Secondary Bus Reset.
In 2018, commit
5b3f7b7d062b ("PCI: pciehp: Avoid slot access during reset")
retrofitted the missing locking. It introduced a reset_lock which
serializes a Secondary Bus Reset with other parts of pciehp.
Unfortunately the locking turns out to be overzealous: reset_lock is
held for the entire enumeration and de-enumeration of hotplugged devices,
including driver binding and unbinding.
Driver binding and unbinding acquires device_lock while the reset_lock
of the ancestral hotplug port is held. A concurrent Secondary Bus Reset
acquires the ancestral reset_lock while already holding the device_lock.
The asymmetric locking order in the two code paths can lead to AB-BA
deadlocks.
Michael Haeuptle reports such deadlocks on simultaneous hot-removal and
vfio release (the latter implies a Secondary Bus Reset):
pciehp_ist() # down_read(reset_lock)
pciehp_handle_presence_or_link_change()
pciehp_disable_slot()
__pciehp_disable_slot()
remove_board()
pciehp_unconfigure_device()
pci_stop_and_remove_bus_device()
pci_stop_bus_device()
pci_stop_dev()
device_release_driver()
device_release_driver_internal()
__device_driver_lock() # device_lock()
SYS_munmap()
vfio_device_fops_release()
vfio_device_group_close()
vfio_device_close()
vfio_device_last_close()
vfio_pci_core_close_device()
vfio_pci_core_disable() # device_lock()
__pci_reset_function_locked()
pci_reset_bus_function()
pci_dev_reset_slot_function()
pci_reset_hotplug_slot()
pciehp_reset_slot() # down_write(reset_lock)
Ian May reports the same deadlock on simultaneous hot-removal and an
AER-induced Secondary Bus Reset:
aer_recover_work_func()
pcie_do_recovery()
aer_root_reset()
pci_bus_error_reset()
pci_slot_reset()
pci_slot_lock() # device_lock()
pci_reset_hotplug_slot()
pciehp_reset_slot() # down_write(reset_lock)
Fix by releasing the reset_lock during driver binding and unbinding,
thereby splitting and shrinking the critical section.
Driver binding and unbinding is protected by the device_lock() and thus
serialized with a Secondary Bus Reset. There's no need to additionally
protect it with the reset_lock. However, pciehp does not bind and
unbind devices directly, but rather invokes PCI core functions which
also perform certain enumeration and de-enumeration steps.
The reset_lock's purpose is to protect slot registers, not enumeration
and de-enumeration of hotplugged devices. That would arguably be the
job of the PCI core, not the PCIe hotplug driver. After all, an
AER-induced Secondary Bus Reset may as well happen during boot-time
enumeration of the PCI hierarchy and there's no locking to prevent that
either.
Exempting *de-enumeration* from the reset_lock is relatively harmless:
A concurrent Secondary Bus Reset may foil config space accesses such as
PME interrupt disablement. But if the device is physically gone, those
accesses are pointless anyway. If the device is physically present and
only logically removed through an Attention Button press or the sysfs
"power" attribute, PME interrupts as well as DMA cannot come through
because pciehp_unconfigure_device() disables INTx and Bus Master bits.
That's still protected by the reset_lock in the present commit.
Exempting *enumeration* from the reset_lock also has limited impact:
The exempted call to pci_bus_add_device() may perform device accesses
through pcibios_bus_add_device() and pci_fixup_device() which are now
no longer protected from a concurrent Secondary Bus Reset. Otherwise
there should be no impact.
In essence, the present commit seeks to fix the AB-BA deadlocks while
still retaining a best-effort reset protection for enumeration and
de-enumeration of hotplugged devices -- until a general solution is
implemented in the PCI core.
Link: https://lore.kernel.org/linux-pci/CS1PR8401MB0728FC6FDAB8A35C22BD90EC95F10@…
Link: https://lore.kernel.org/linux-pci/20200615143250.438252-1-ian.may@canonical…
Link: https://lore.kernel.org/linux-pci/ce878dab-c0c4-5bd0-a725-9805a075682d@amd.…
Link: https://lore.kernel.org/linux-pci/ed831249-384a-6d35-0831-70af191e9bce@huaw…
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215590
Fixes: 5b3f7b7d062b ("PCI: pciehp: Avoid slot access during reset")
Link: https://lore.kernel.org/r/fef2b2e9edf245c049a8c5b94743c0f74ff5008a.16811919…
Reported-by: Michael Haeuptle <michael.haeuptle(a)hpe.com>
Reported-by: Ian May <ian.may(a)canonical.com>
Reported-by: Andrey Grodzovsky <andrey2805(a)gmail.com>
Reported-by: Rahul Kumar <rahul.kumar1(a)amd.com>
Reported-by: Jialin Zhang <zhangjialin11(a)huawei.com>
Tested-by: Anatoli Antonovitch <Anatoli.Antonovitch(a)amd.com>
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: stable(a)vger.kernel.org # v4.19+
Cc: Dan Stein <dstein(a)hpe.com>
Cc: Ashok Raj <ashok.raj(a)intel.com>
Cc: Alex Michon <amichon(a)kalrayinc.com>
Cc: Xiongfeng Wang <wangxiongfeng2(a)huawei.com>
Cc: Alex Williamson <alex.williamson(a)redhat.com>
Cc: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Cc: Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy(a)linux.intel.com>
diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
index d17f3bf36f70..ad12515a4a12 100644
--- a/drivers/pci/hotplug/pciehp_pci.c
+++ b/drivers/pci/hotplug/pciehp_pci.c
@@ -63,7 +63,14 @@ int pciehp_configure_device(struct controller *ctrl)
pci_assign_unassigned_bridge_resources(bridge);
pcie_bus_configure_settings(parent);
+
+ /*
+ * Release reset_lock during driver binding
+ * to avoid AB-BA deadlock with device_lock.
+ */
+ up_read(&ctrl->reset_lock);
pci_bus_add_devices(parent);
+ down_read_nested(&ctrl->reset_lock, ctrl->depth);
out:
pci_unlock_rescan_remove();
@@ -104,7 +111,15 @@ void pciehp_unconfigure_device(struct controller *ctrl, bool presence)
list_for_each_entry_safe_reverse(dev, temp, &parent->devices,
bus_list) {
pci_dev_get(dev);
+
+ /*
+ * Release reset_lock during driver unbinding
+ * to avoid AB-BA deadlock with device_lock.
+ */
+ up_read(&ctrl->reset_lock);
pci_stop_and_remove_bus_device(dev);
+ down_read_nested(&ctrl->reset_lock, ctrl->depth);
+
/*
* Ensure that no new Requests will be generated from
* the device.
Recently we found a bug related with ext4 buffer head is fixed by
commit 0b73284c564d("ext4: ext4_read_bh_lock() should submit IO if the
buffer isn't uptodate")[1].
This bug is fixed on some kernel long term versions, such as 5.10 and 5.15.
However, on 5.4 stable version, we can still easily reproduce this bug by
adding some delay after buffer_migrate_lock_buffers() in __buffer_migrate_page()
and do fsstress on the ext4 filesystem. We can get some errors in dmesg like:
EXT4-fs error (device pmem1): __ext4_find_entry:1658: inode #73193:
comm fsstress: reading directory lblock 0
EXT4-fs error (device pmem1): __ext4_find_entry:1658: inode #75334:
comm fsstress: reading directory lblock 0
About how to fix this bug in 5.4 version, currently I have three ideas.
But I don't know which one is better or is there any other feasible way to
fix this bug elegantly based on the 5.4 stable branch?
The first idea comes from this thread[2]. In __buffer_migrate_page(),
we can let it fallback to migrate_page that are not uptodate like
fallback_migrate_page(), those pages that has buffers may probably do
read operation soon. From [3], we can see this solution is not good enough
because there are other places that lock the buffer without doing IO.
I think this solution can be a candidate option to fix if we do not want to
change a lot. Also based on my test results, the ext4 filesystem remains
stable after one week stress test with this patch applied.
The second idea is backport a series of commits from upstream, such as
2d069c0889ef ("ext4: use common helpers in all places reading metadata buffers")
0b73284c564d ("ext4: ext4_read_bh_lock() should submit IO if the buffer isn't uptodate")
79f597842069 ("fs/buffer: remove ll_rw_block() helper")
This will lead to many lines of code change and should be carefully conducted,
but it looks like the most reasonable solution so far.
The third idea is replace trylock_buffer in ll_rw_block() with lock_buffer and
change ll_rw_block() in __breadahead_gfp() to trylock_buffer. However,
this will change the semantic of ll_rw_block(), and will not be suitable for
some readahead circumstances. Besides, the ll_rw_block() has many occurences
among many filesystems other than ext4, I think it is better to limit the
fix in the ext4 filesystem without affecting other filesystems.
Here I send the patch based on the first idea, hope someone can give more ideas
about how to fix this bug in kernel 5.4 version, thanks.
[1] https://lore.kernel.org/linux-mm/20220825080146.2021641-1-chengzhihao1@huaw…
[2] https://lore.kernel.org/all/20220831074629.3755110-1-yi.zhang@huawei.com/T/
[3] https://lore.kernel.org/linux-mm/20220825105704.e46hz6dp6opawsjk@quack3/
Yue Zhao (1):
mm: migrate: buffer_migrate_page_norefs() fallback migrate not
uptodate pages
mm/migrate.c | 33 +++++++++++++++++++++++++++++++++
1 file changed, 33 insertions(+)
--
2.17.1
This triggers a -Wdeclaration-after-statement as the code has changed a
bit since upstream. It might be better to hoist the whole block up, but
this is a smaller change so I went with it.
arch/riscv/mm/init.c:755:16: warning: mixing declarations and code is a C99 extension [-Wdeclaration-after-statement]
unsigned long idx = pgd_index(__fix_to_virt(FIX_FDT));
^
1 warning generated.
Reported-by: kernel test robot <lkp(a)intel.com>
Link: https://lore.kernel.org/oe-kbuild-all/202304300429.SXZOA5up-lkp@intel.com/
Signed-off-by: Palmer Dabbelt <palmer(a)rivosinc.com>
---
I haven't even build tested this one, but it looks simple enough that I figured
I'd just send it. Be warned, though: I broke glibc and missed a merged
conflict yesterday...
---
arch/riscv/mm/init.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index e800d7981e99..8d67f43f1865 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -717,6 +717,7 @@ static void __init setup_vm_final(void)
uintptr_t va, map_size;
phys_addr_t pa, start, end;
u64 i;
+ unsigned long idx;
/**
* MMU is enabled at this point. But page table setup is not complete yet.
@@ -735,7 +736,7 @@ static void __init setup_vm_final(void)
* directly in swapper_pg_dir in addition to the pgd entry that points
* to fixmap_pte.
*/
- unsigned long idx = pgd_index(__fix_to_virt(FIX_FDT));
+ idx = pgd_index(__fix_to_virt(FIX_FDT));
set_pgd(&swapper_pg_dir[idx], early_pg_dir[idx]);
#endif
--
2.40.0
Hi,
Some Pink Sardine platforms have some stability problems with reboot
cycling and it has been root caused to a misconfigured mux for audio.
It's been fixed in this commit:
a4d432e9132c ("ASoC: amd: ps: update the acp clock source.")
Can you please backport this to 6.1.y +
Thanks,
Hi,
A number of laptops with Mediatek wifi the wifi doesn't work unless you
turn off fast boot in BIOS setup.
These laptops all ship with fast boot as the default.
It's been fixed by this commit:
09d4d6da1b65 ("wifi: mt76: mt7921e: Set memory space enable in
PCI_COMMAND if unset")
Can you please bring it to 5.15.y and later?
Thanks,
The driver have a race, experienced only with PREEMPT_RT patchset:
CPU0 | CPU1
==================================================================
qcom_geni_serial_probe |
uart_add_one_port |
| serdev_drv_probe
| qca_serdev_probe
| serdev_device_open
| uart_open
| uart_startup
| qcom_geni_serial_startup
| enable_irq
| __irq_startup
| WARN_ON()
| IRQ not activated
request_threaded_irq |
irq_domain_activate_irq |
The warning:
894000.serial: ttyHS1 at MMIO 0x894000 (irq = 144, base_baud = 0) is a MSM
serial serial0: tty port ttyHS1 registered
WARNING: CPU: 7 PID: 107 at kernel/irq/chip.c:241 __irq_startup+0x78/0xd8
...
qcom_geni_serial 894000.serial: serial engine reports 0 RX bytes in!
Adding UART port triggers probe of child serial devices - serdev and
eventually Qualcomm Bluetooth hci_qca driver. This opens UART port
which enables the interrupt before it got activated in
request_threaded_irq(). The issue originates in commit f3974413cf02
("tty: serial: qcom_geni_serial: Wakeup IRQ cleanup") and discussion on
mailing list [1]. However the above commit does not explain why the
uart_add_one_port() is moved above requesting interrupt.
[1] https://lore.kernel.org/all/5d9f3dfa.1c69fb81.84c4b.30bf@mx.google.com/
Fixes: f3974413cf02 ("tty: serial: qcom_geni_serial: Wakeup IRQ cleanup")
Cc: <stable(a)vger.kernel.org>
Cc: Stephen Boyd <swboyd(a)chromium.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
---
drivers/tty/serial/qcom_geni_serial.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
index 08dc3e2a729c..8582479f0211 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -1664,19 +1664,18 @@ static int qcom_geni_serial_probe(struct platform_device *pdev)
uport->private_data = &port->private_data;
platform_set_drvdata(pdev, port);
- ret = uart_add_one_port(drv, uport);
- if (ret)
- return ret;
-
irq_set_status_flags(uport->irq, IRQ_NOAUTOEN);
ret = devm_request_irq(uport->dev, uport->irq, qcom_geni_serial_isr,
IRQF_TRIGGER_HIGH, port->name, uport);
if (ret) {
dev_err(uport->dev, "Failed to get IRQ ret %d\n", ret);
- uart_remove_one_port(drv, uport);
return ret;
}
+ ret = uart_add_one_port(drv, uport);
+ if (ret)
+ return ret;
+
/*
* Set pm_runtime status as ACTIVE so that wakeup_irq gets
* enabled/disabled from dev_pm_arm_wake_irq during system
--
2.34.1
In binder_transaction_buffer_release() the 'failed_at' offset indicates
the number of objects to clean up. However, this function was changed by
commit 44d8047f1d87 ("binder: use standard functions to allocate fds"),
to release all the objects in the buffer when 'failed_at' is zero.
This introduced an issue when a transaction buffer is released without
any objects having been processed so far. In this case, 'failed_at' is
indeed zero yet it is misinterpreted as releasing the entire buffer.
This leads to use-after-free errors where nodes are incorrectly freed
and subsequently accessed. Such is the case in the following KASAN
report:
==================================================================
BUG: KASAN: slab-use-after-free in binder_thread_read+0xc40/0x1f30
Read of size 8 at addr ffff4faf037cfc58 by task poc/474
CPU: 6 PID: 474 Comm: poc Not tainted 6.3.0-12570-g7df047b3f0aa #5
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x94/0xec
show_stack+0x18/0x24
dump_stack_lvl+0x48/0x60
print_report+0xf8/0x5b8
kasan_report+0xb8/0xfc
__asan_load8+0x9c/0xb8
binder_thread_read+0xc40/0x1f30
binder_ioctl+0xd9c/0x1768
__arm64_sys_ioctl+0xd4/0x118
invoke_syscall+0x60/0x188
[...]
Allocated by task 474:
kasan_save_stack+0x3c/0x64
kasan_set_track+0x2c/0x40
kasan_save_alloc_info+0x24/0x34
__kasan_kmalloc+0xb8/0xbc
kmalloc_trace+0x48/0x5c
binder_new_node+0x3c/0x3a4
binder_transaction+0x2b58/0x36f0
binder_thread_write+0x8e0/0x1b78
binder_ioctl+0x14a0/0x1768
__arm64_sys_ioctl+0xd4/0x118
invoke_syscall+0x60/0x188
[...]
Freed by task 475:
kasan_save_stack+0x3c/0x64
kasan_set_track+0x2c/0x40
kasan_save_free_info+0x38/0x5c
__kasan_slab_free+0xe8/0x154
__kmem_cache_free+0x128/0x2bc
kfree+0x58/0x70
binder_dec_node_tmpref+0x178/0x1fc
binder_transaction_buffer_release+0x430/0x628
binder_transaction+0x1954/0x36f0
binder_thread_write+0x8e0/0x1b78
binder_ioctl+0x14a0/0x1768
__arm64_sys_ioctl+0xd4/0x118
invoke_syscall+0x60/0x188
[...]
==================================================================
In order to avoid these issues, let's always calculate the intended
'failed_at' offset beforehand. This is wrapped in a helper function to
make it clear and convenient.
Fixes: 32e9f56a96d8 ("binder: don't detect sender/target during buffer cleanup")
Reported-by: Zi Fan Tan <zifantan(a)google.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Carlos Llamas <cmllamas(a)google.com>
---
drivers/android/binder.c | 30 ++++++++++++++++++++++++------
1 file changed, 24 insertions(+), 6 deletions(-)
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index fb56bfc45096..6678a862ea84 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -1938,7 +1938,7 @@ static void binder_transaction_buffer_release(struct binder_proc *proc,
bool is_failure)
{
int debug_id = buffer->debug_id;
- binder_size_t off_start_offset, buffer_offset, off_end_offset;
+ binder_size_t off_start_offset, buffer_offset;
binder_debug(BINDER_DEBUG_TRANSACTION,
"%d buffer release %d, size %zd-%zd, failed at %llx\n",
@@ -1950,9 +1950,8 @@ static void binder_transaction_buffer_release(struct binder_proc *proc,
binder_dec_node(buffer->target_node, 1, 0);
off_start_offset = ALIGN(buffer->data_size, sizeof(void *));
- off_end_offset = is_failure && failed_at ? failed_at :
- off_start_offset + buffer->offsets_size;
- for (buffer_offset = off_start_offset; buffer_offset < off_end_offset;
+
+ for (buffer_offset = off_start_offset; buffer_offset < failed_at;
buffer_offset += sizeof(binder_size_t)) {
struct binder_object_header *hdr;
size_t object_size = 0;
@@ -2111,6 +2110,25 @@ static void binder_transaction_buffer_release(struct binder_proc *proc,
}
}
+/* Clean up all the objects in the buffer */
+static inline void binder_release_entire_buffer(struct binder_proc *proc,
+ struct binder_thread *thread,
+ struct binder_buffer *buffer,
+ bool is_failure)
+{
+ binder_size_t off_end_offset;
+
+ off_end_offset = ALIGN(buffer->data_size, sizeof(void *));
+ off_end_offset += buffer->offsets_size;
+
+ /* We always pass the end of the buffer here to make sure that
+ * binder_transaction_buffer_release() loops through all the
+ * objects in the buffer.
+ */
+ binder_transaction_buffer_release(proc, thread, buffer,
+ off_end_offset, is_failure);
+}
+
static int binder_translate_binder(struct flat_binder_object *fp,
struct binder_transaction *t,
struct binder_thread *thread)
@@ -2806,7 +2824,7 @@ static int binder_proc_transaction(struct binder_transaction *t,
t_outdated->buffer = NULL;
buffer->transaction = NULL;
trace_binder_transaction_update_buffer_release(buffer);
- binder_transaction_buffer_release(proc, NULL, buffer, 0, 0);
+ binder_release_entire_buffer(proc, NULL, buffer, false);
binder_alloc_free_buf(&proc->alloc, buffer);
kfree(t_outdated);
binder_stats_deleted(BINDER_STAT_TRANSACTION);
@@ -3775,7 +3793,7 @@ binder_free_buf(struct binder_proc *proc,
binder_node_inner_unlock(buf_node);
}
trace_binder_transaction_buffer_release(buffer);
- binder_transaction_buffer_release(proc, thread, buffer, 0, is_failure);
+ binder_release_entire_buffer(proc, thread, buffer, is_failure);
binder_alloc_free_buf(&proc->alloc, buffer);
}
--
2.40.1.521.gf1e218fcd8-goog
The quilt patch titled
Subject: nilfs2: do not write dirty data after degenerating to read-only
has been removed from the -mm tree. Its filename was
nilfs2-do-not-write-dirty-data-after-degenerating-to-read-only.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Subject: nilfs2: do not write dirty data after degenerating to read-only
Date: Thu, 27 Apr 2023 10:15:26 +0900
According to syzbot's report, mark_buffer_dirty() called from
nilfs_segctor_do_construct() outputs a warning with some patterns after
nilfs2 detects metadata corruption and degrades to read-only mode.
After such read-only degeneration, page cache data may be cleared through
nilfs_clear_dirty_page() which may also clear the uptodate flag for their
buffer heads. However, even after the degeneration, log writes are still
performed by unmount processing etc., which causes mark_buffer_dirty() to
be called for buffer heads without the "uptodate" flag and causes the
warning.
Since any writes should not be done to a read-only file system in the
first place, this fixes the warning in mark_buffer_dirty() by letting
nilfs_segctor_do_construct() abort early if in read-only mode.
This also changes the retry check of nilfs_segctor_write_out() to avoid
unnecessary log write retries if it detects -EROFS that
nilfs_segctor_do_construct() returned.
Link: https://lkml.kernel.org/r/20230427011526.13457-1-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+2af3bc9585be7f23f290(a)syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=2af3bc9585be7f23f290
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/nilfs2/segment.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/fs/nilfs2/segment.c~nilfs2-do-not-write-dirty-data-after-degenerating-to-read-only
+++ a/fs/nilfs2/segment.c
@@ -2041,6 +2041,9 @@ static int nilfs_segctor_do_construct(st
struct the_nilfs *nilfs = sci->sc_super->s_fs_info;
int err;
+ if (sb_rdonly(sci->sc_super))
+ return -EROFS;
+
nilfs_sc_cstage_set(sci, NILFS_ST_INIT);
sci->sc_cno = nilfs->ns_cno;
@@ -2724,7 +2727,7 @@ static void nilfs_segctor_write_out(stru
flush_work(&sci->sc_iput_work);
- } while (ret && retrycount-- > 0);
+ } while (ret && ret != -EROFS && retrycount-- > 0);
}
/**
_
Patches currently in -mm which might be from konishi.ryusuke(a)gmail.com are
The quilt patch titled
Subject: mm: do not reclaim private data from pinned page
has been removed from the -mm tree. Its filename was
mm-do-not-reclaim-private-data-from-pinned-page.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Jan Kara <jack(a)suse.cz>
Subject: mm: do not reclaim private data from pinned page
Date: Fri, 28 Apr 2023 14:41:40 +0200
If the page is pinned, there's no point in trying to reclaim it.
Furthermore if the page is from the page cache we don't want to reclaim
fs-private data from the page because the pinning process may be writing
to the page at any time and reclaiming fs private info on a dirty page can
upset the filesystem (see link below).
Link: https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
Link: https://lkml.kernel.org/r/20230428124140.30166-1-jack@suse.cz
Signed-off-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Reviewed-by: Lorenzo Stoakes <lstoakes(a)gmail.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: John Hubbard <jhubbard(a)nvidia.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Peter Xu <peterx(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmscan.c | 10 ++++++++++
1 file changed, 10 insertions(+)
--- a/mm/vmscan.c~mm-do-not-reclaim-private-data-from-pinned-page
+++ a/mm/vmscan.c
@@ -1967,6 +1967,16 @@ retry:
}
}
+ /*
+ * Folio is unmapped now so it cannot be newly pinned anymore.
+ * No point in trying to reclaim folio if it is pinned.
+ * Furthermore we don't want to reclaim underlying fs metadata
+ * if the folio is pinned and thus potentially modified by the
+ * pinning process as that may upset the filesystem.
+ */
+ if (folio_maybe_dma_pinned(folio))
+ goto activate_locked;
+
mapping = folio_mapping(folio);
if (folio_test_dirty(folio)) {
/*
_
Patches currently in -mm which might be from jack(a)suse.cz are