Use the power state to decide accurately whether we can enter or leave IPS,
and thereby avoid powering on or off twice.
Commit 6bf3a083407b ("wifi: rtw88: add flag check before enter or leave IPS")
was meant to prevent this as well, but it does not handle all cases.
The exception is when WiFi is connected and the system suspends and resumes:
the device is powered on twice, and power-on then fails after resume,
like:
rtw_8723de 0000:03:00.0: failed to poll offset=0x6 mask=0x2 value=0x2
rtw_8723de 0000:03:00.0: mac power on failed
rtw_8723de 0000:03:00.0: failed to power on mac
rtw_8723de 0000:03:00.0: leave idle state failed
rtw_8723de 0000:03:00.0: failed to leave ips state
rtw_8723de 0000:03:00.0: failed to leave idle state
rtw_8723de 0000:03:00.0: failed to send h2c command
To fix this, introduce a new flag RTW_FLAG_POWERON that reflects the power
state, and call rtw_mac_pre_system_cfg() to configure registers properly
between power-off and power-on.
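As a rough illustration of the idea only, the following userspace sketch models
a single "powered on" flag that makes enter/leave IPS idempotent. The names
struct model_dev, model_power_switch(), model_enter_ips() and model_leave_ips()
are hypothetical stand-ins and not driver code; the real driver keeps the state
in rtwdev->flags with set_bit()/clear_bit().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rtw_dev; it only models the power flag. */
struct model_dev {
	bool powered_on;     /* models RTW_FLAG_POWERON */
	int power_on_calls;  /* how often a power-on sequence really ran */
	int power_off_calls; /* how often a power-off sequence really ran */
};

/* Models the power switch: run the sequence, then record the new state. */
static int model_power_switch(struct model_dev *dev, bool on)
{
	if (on)
		dev->power_on_calls++;
	else
		dev->power_off_calls++;
	dev->powered_on = on;
	return 0;
}

/* Entering IPS powers off, but only if the device is currently on. */
int model_enter_ips(struct model_dev *dev)
{
	assert(dev != NULL);
	if (!dev->powered_on)
		return 0;
	return model_power_switch(dev, false);
}

/* Leaving IPS powers on, but only if the device is currently off. */
int model_leave_ips(struct model_dev *dev)
{
	assert(dev != NULL);
	if (dev->powered_on)
		return 0;
	return model_power_switch(dev, true);
}
```

In this model, calling model_leave_ips() twice in a row runs the power-on
sequence only once, because the decision is based on the actual power state
rather than on a separate "inactive PS" flag.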
Reported-by: Paul Gover <pmw.gover(a)yahoo.co.uk>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217016
Fixes: 6bf3a083407b ("wifi: rtw88: add flag check before enter or leave IPS")
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Ping-Ke Shih <pkshih(a)realtek.com>
---
Hi Kalle,
This patch fixes 8723DE failing to power on after system resume. Please
queue it for 6.3.
Thank you
Ping-Ke
---
drivers/net/wireless/realtek/rtw88/coex.c | 2 +-
drivers/net/wireless/realtek/rtw88/mac.c | 10 ++++++++++
drivers/net/wireless/realtek/rtw88/main.h | 2 +-
drivers/net/wireless/realtek/rtw88/ps.c | 4 ++--
drivers/net/wireless/realtek/rtw88/wow.c | 2 +-
5 files changed, 15 insertions(+), 5 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtw88/coex.c b/drivers/net/wireless/realtek/rtw88/coex.c
index 38697237ee5f0..86467d2f8888c 100644
--- a/drivers/net/wireless/realtek/rtw88/coex.c
+++ b/drivers/net/wireless/realtek/rtw88/coex.c
@@ -4056,7 +4056,7 @@ void rtw_coex_display_coex_info(struct rtw_dev *rtwdev, struct seq_file *m)
rtwdev->stats.tx_throughput, rtwdev->stats.rx_throughput);
seq_printf(m, "%-40s = %u/ %u/ %u\n",
"IPS/ Low Power/ PS mode",
- test_bit(RTW_FLAG_INACTIVE_PS, rtwdev->flags),
+ !test_bit(RTW_FLAG_POWERON, rtwdev->flags),
test_bit(RTW_FLAG_LEISURE_PS_DEEP, rtwdev->flags),
rtwdev->lps_conf.mode);
diff --git a/drivers/net/wireless/realtek/rtw88/mac.c b/drivers/net/wireless/realtek/rtw88/mac.c
index 4e5c194aac299..dae64901bac5a 100644
--- a/drivers/net/wireless/realtek/rtw88/mac.c
+++ b/drivers/net/wireless/realtek/rtw88/mac.c
@@ -273,6 +273,11 @@ static int rtw_mac_power_switch(struct rtw_dev *rtwdev, bool pwr_on)
if (rtw_pwr_seq_parser(rtwdev, pwr_seq))
return -EINVAL;
+ if (pwr_on)
+ set_bit(RTW_FLAG_POWERON, rtwdev->flags);
+ else
+ clear_bit(RTW_FLAG_POWERON, rtwdev->flags);
+
return 0;
}
@@ -335,6 +340,11 @@ int rtw_mac_power_on(struct rtw_dev *rtwdev)
ret = rtw_mac_power_switch(rtwdev, true);
if (ret == -EALREADY) {
rtw_mac_power_switch(rtwdev, false);
+
+ ret = rtw_mac_pre_system_cfg(rtwdev);
+ if (ret)
+ goto err;
+
ret = rtw_mac_power_switch(rtwdev, true);
if (ret)
goto err;
diff --git a/drivers/net/wireless/realtek/rtw88/main.h b/drivers/net/wireless/realtek/rtw88/main.h
index 165f299e8e1f9..d4a53d5567451 100644
--- a/drivers/net/wireless/realtek/rtw88/main.h
+++ b/drivers/net/wireless/realtek/rtw88/main.h
@@ -356,7 +356,7 @@ enum rtw_flags {
RTW_FLAG_RUNNING,
RTW_FLAG_FW_RUNNING,
RTW_FLAG_SCANNING,
- RTW_FLAG_INACTIVE_PS,
+ RTW_FLAG_POWERON,
RTW_FLAG_LEISURE_PS,
RTW_FLAG_LEISURE_PS_DEEP,
RTW_FLAG_DIG_DISABLE,
diff --git a/drivers/net/wireless/realtek/rtw88/ps.c b/drivers/net/wireless/realtek/rtw88/ps.c
index 11594940d6b00..996365575f44f 100644
--- a/drivers/net/wireless/realtek/rtw88/ps.c
+++ b/drivers/net/wireless/realtek/rtw88/ps.c
@@ -25,7 +25,7 @@ static int rtw_ips_pwr_up(struct rtw_dev *rtwdev)
int rtw_enter_ips(struct rtw_dev *rtwdev)
{
- if (test_and_set_bit(RTW_FLAG_INACTIVE_PS, rtwdev->flags))
+ if (!test_bit(RTW_FLAG_POWERON, rtwdev->flags))
return 0;
rtw_coex_ips_notify(rtwdev, COEX_IPS_ENTER);
@@ -50,7 +50,7 @@ int rtw_leave_ips(struct rtw_dev *rtwdev)
{
int ret;
- if (!test_and_clear_bit(RTW_FLAG_INACTIVE_PS, rtwdev->flags))
+ if (test_bit(RTW_FLAG_POWERON, rtwdev->flags))
return 0;
rtw_hci_link_ps(rtwdev, false);
diff --git a/drivers/net/wireless/realtek/rtw88/wow.c b/drivers/net/wireless/realtek/rtw88/wow.c
index 89dc595094d5c..16ddee577efec 100644
--- a/drivers/net/wireless/realtek/rtw88/wow.c
+++ b/drivers/net/wireless/realtek/rtw88/wow.c
@@ -592,7 +592,7 @@ static int rtw_wow_leave_no_link_ps(struct rtw_dev *rtwdev)
if (rtw_get_lps_deep_mode(rtwdev) != LPS_DEEP_MODE_NONE)
rtw_leave_lps_deep(rtwdev);
} else {
- if (test_bit(RTW_FLAG_INACTIVE_PS, rtwdev->flags)) {
+ if (!test_bit(RTW_FLAG_POWERON, rtwdev->flags)) {
rtw_wow->ips_enabled = true;
ret = rtw_leave_ips(rtwdev);
if (ret)
--
2.25.1
The quilt patch titled
Subject: mm/MADV_COLLAPSE: set EAGAIN on unexpected page refcount
has been removed from the -mm tree. Its filename was
mm-madv_collapse-set-eagain-on-unexpected-page-refcount.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: "Zach O'Keefe" <zokeefe(a)google.com>
Subject: mm/MADV_COLLAPSE: set EAGAIN on unexpected page refcount
Date: Tue, 24 Jan 2023 17:57:37 -0800
During collapse, in a few places we check to see if a given small page has
any unaccounted references. If the refcount on the page doesn't match our
expectations, there must be an unknown user concurrently interested
in the page, and so it's not safe to move the contents elsewhere.
However, the unaccounted pins are likely an ephemeral state.
In this situation, MADV_COLLAPSE returns -EINVAL when it should return
-EAGAIN. This could cause userspace to conclude that the syscall
failed, when it in fact could succeed by retrying.
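To illustrate why the distinction matters to callers, here is a minimal
userspace sketch of a bounded retry loop that treats -EAGAIN as "try again".
struct fake_region and fake_collapse() are hypothetical stand-ins for a
madvise(..., MADV_COLLAPSE) call on a range with transient page pins; they are
assumptions for illustration and not kernel or libc APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a range whose pages are pinned for a short while. */
struct fake_region {
	int transient_pins; /* attempts that will still see an extra refcount */
};

/* Stand-in for the collapse call: -EAGAIN while pins remain, then 0. */
static int fake_collapse(struct fake_region *r)
{
	if (r->transient_pins > 0) {
		r->transient_pins--;
		return -EAGAIN;
	}
	return 0;
}

/* Retry only on -EAGAIN and give up after max_tries attempts. */
int collapse_with_retry(struct fake_region *r, int max_tries)
{
	int ret = -EAGAIN;
	int i;

	assert(r != NULL);
	for (i = 0; i < max_tries && ret == -EAGAIN; i++)
		ret = fake_collapse(r);
	return ret;
}
```

A caller written this way would never retry if the transient condition were
reported as -EINVAL, which is the situation the patch corrects.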
Link: https://lkml.kernel.org/r/20230125015738.912924-1-zokeefe@google.com
Fixes: 7d8faaf15545 ("mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse")
Signed-off-by: Zach O'Keefe <zokeefe(a)google.com>
Reported-by: Hugh Dickins <hughd(a)google.com>
Acked-by: Hugh Dickins <hughd(a)google.com>
Reviewed-by: Yang Shi <shy828301(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/mm/khugepaged.c~mm-madv_collapse-set-eagain-on-unexpected-page-refcount
+++ a/mm/khugepaged.c
@@ -2611,6 +2611,7 @@ static int madvise_collapse_errno(enum s
case SCAN_CGROUP_CHARGE_FAIL:
return -EBUSY;
/* Resource temporary unavailable - trying again might succeed */
+ case SCAN_PAGE_COUNT:
case SCAN_PAGE_LOCK:
case SCAN_PAGE_LRU:
case SCAN_DEL_PAGE_LRU:
_
The quilt patch titled
Subject: mm/filemap: fix page end in filemap_get_read_batch
has been removed from the -mm tree. Its filename was
mm-filemap-fix-page-end-in-filemap_get_read_batch.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Qian Yingjin <qian(a)ddn.com>
Subject: mm/filemap: fix page end in filemap_get_read_batch
Date: Wed, 8 Feb 2023 10:24:00 +0800
I was running traces of the read code against a RAID storage system to
understand why read requests were being misaligned against the underlying
RAID strips. I found that the page end offset calculation in
filemap_get_read_batch() was off by one.
When a read is submitted with end offset 1048575, the end page for the
read is calculated as 256 when it should be 255. "last_index" is the index
of the page beyond the end of the read, and it should be skipped when
getting a batch of pages for the read in filemap_get_read_batch().
The below simple patch fixes the problem. This code was introduced in
kernel 5.12.
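The arithmetic can be checked with a small userspace sketch. It assumes a
4096-byte page size and a read of 1048576 bytes starting at offset 0, which
gives the end offset 1048575 mentioned above. MODEL_PAGE_SIZE,
MODEL_DIV_ROUND_UP, model_last_index() and model_final_page() are hypothetical
names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096ULL /* assumption: 4 KiB pages */
#define MODEL_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Index of the first page beyond the end of the read (exclusive bound). */
uint64_t model_last_index(uint64_t pos, uint64_t count)
{
	return MODEL_DIV_ROUND_UP(pos + count, MODEL_PAGE_SIZE);
}

/* Index of the last page the read actually touches (inclusive bound). */
uint64_t model_final_page(uint64_t pos, uint64_t count)
{
	assert(count > 0); /* an empty read has no final page */
	return model_last_index(pos, count) - 1;
}
```

With pos = 0 and count = 1048576, model_last_index() yields 256 while the last
byte, 1048575, lives in page 255. Passing the exclusive bound to a function
that expects an inclusive one is the off-by-one described here.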
Link: https://lkml.kernel.org/r/20230208022400.28962-1-coolqyj@163.com
Fixes: cbd59c48ae2b ("mm/filemap: use head pages in generic_file_buffered_read")
Signed-off-by: Qian Yingjin <qian(a)ddn.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/mm/filemap.c~mm-filemap-fix-page-end-in-filemap_get_read_batch
+++ a/mm/filemap.c
@@ -2588,18 +2588,19 @@ static int filemap_get_pages(struct kioc
struct folio *folio;
int err = 0;
+ /* "last_index" is the index of the page beyond the end of the read */
last_index = DIV_ROUND_UP(iocb->ki_pos + iter->count, PAGE_SIZE);
retry:
if (fatal_signal_pending(current))
return -EINTR;
- filemap_get_read_batch(mapping, index, last_index, fbatch);
+ filemap_get_read_batch(mapping, index, last_index - 1, fbatch);
if (!folio_batch_count(fbatch)) {
if (iocb->ki_flags & IOCB_NOIO)
return -EAGAIN;
page_cache_sync_readahead(mapping, ra, filp, index,
last_index - index);
- filemap_get_read_batch(mapping, index, last_index, fbatch);
+ filemap_get_read_batch(mapping, index, last_index - 1, fbatch);
}
if (!folio_batch_count(fbatch)) {
if (iocb->ki_flags & (IOCB_NOWAIT | IOCB_WAITQ))
_
From: Dave Ertman <david.m.ertman(a)intel.com>
RDMA is not supported in ice on a PF that has been added to a bonded
interface. To enforce this, when an interface enters a bond, we unplug
the auxiliary device that supports RDMA functionality. This unplug
currently happens in the context of handling the netdev bonding event.
This event is sent to the ice driver under RTNL context. This is causing
a deadlock where the RDMA driver is waiting for the RTNL lock to complete
the removal.
Defer the unplugging/re-plugging of the auxiliary device to the service
task so that it is not performed under the RTNL lock context.
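The deferral pattern can be sketched in a single-threaded userspace model as
follows. struct model_pf and the model_*() functions are hypothetical names for
illustration, and plain bools stand in for the atomic flag bits, so this shows
only the ordering logic and not the driver's concurrency handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the PF state relevant to the auxiliary device. */
struct model_pf {
	bool plug_requested;   /* models ICE_FLAG_PLUG_AUX_DEV */
	bool unplug_requested; /* models ICE_FLAG_UNPLUG_AUX_DEV */
	bool aux_plugged;      /* whether the auxiliary device exists */
};

/* Event context: only record the request, never touch the device. */
void model_request_plug(struct model_pf *pf)
{
	assert(pf != NULL);
	pf->plug_requested = true;
}

/* Event context: cancel any pending plug and ask for a deferred unplug. */
void model_request_unplug(struct model_pf *pf)
{
	assert(pf != NULL);
	pf->plug_requested = false;
	pf->unplug_requested = true;
}

/* Service task context: handle unplug first, then plug. */
void model_service_task(struct model_pf *pf)
{
	assert(pf != NULL);
	if (pf->unplug_requested) {
		pf->unplug_requested = false;
		pf->aux_plugged = false;
	}
	if (pf->plug_requested) {
		pf->plug_requested = false;
		pf->aux_plugged = true;
	}
}
```

In this model the event handler returns immediately, and the device is only
added or removed when the service task runs, outside the context that holds
the lock.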
Cc: stable(a)vger.kernel.org # 6.1.x
Reported-by: Jaroslav Pulchart <jaroslav.pulchart(a)gooddata.com>
Link: https://lore.kernel.org/netdev/CAK8fFZ6A_Gphw_3-QMGKEFQk=sfCw1Qmq0TVZK3rtAi…
Fixes: 5cb1ebdbc434 ("ice: Fix race condition during interface enslave")
Fixes: 4eace75e0853 ("RDMA/irdma: Report the correct link speed")
Signed-off-by: Dave Ertman <david.m.ertman(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
---
v2:
(Removed from original pull request)
- Reversed order of bit processing in ice_service_task for PLUG/UNPLUG
v1: https://lore.kernel.org/netdev/20230131213703.1347761-2-anthony.l.nguyen@in…
drivers/net/ethernet/intel/ice/ice.h | 14 +++++---------
drivers/net/ethernet/intel/ice/ice_main.c | 19 ++++++++-----------
2 files changed, 13 insertions(+), 20 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 713069f809ec..3cad5e6b2ad1 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -506,6 +506,7 @@ enum ice_pf_flags {
ICE_FLAG_VF_VLAN_PRUNING,
ICE_FLAG_LINK_LENIENT_MODE_ENA,
ICE_FLAG_PLUG_AUX_DEV,
+ ICE_FLAG_UNPLUG_AUX_DEV,
ICE_FLAG_MTU_CHANGED,
ICE_FLAG_GNSS, /* GNSS successfully initialized */
ICE_PF_FLAGS_NBITS /* must be last */
@@ -950,16 +951,11 @@ static inline void ice_set_rdma_cap(struct ice_pf *pf)
*/
static inline void ice_clear_rdma_cap(struct ice_pf *pf)
{
- /* We can directly unplug aux device here only if the flag bit
- * ICE_FLAG_PLUG_AUX_DEV is not set because ice_unplug_aux_dev()
- * could race with ice_plug_aux_dev() called from
- * ice_service_task(). In this case we only clear that bit now and
- * aux device will be unplugged later once ice_plug_aux_device()
- * called from ice_service_task() finishes (see ice_service_task()).
+ /* defer unplug to service task to avoid RTNL lock and
+ * clear PLUG bit so that pending plugs don't interfere
*/
- if (!test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags))
- ice_unplug_aux_dev(pf);
-
+ clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags);
+ set_bit(ICE_FLAG_UNPLUG_AUX_DEV, pf->flags);
clear_bit(ICE_FLAG_RDMA_ENA, pf->flags);
}
#endif /* _ICE_H_ */
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 8ec24f6cf6be..10d1c5b10d2a 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -2316,18 +2316,15 @@ static void ice_service_task(struct work_struct *work)
}
}
- if (test_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags)) {
- /* Plug aux device per request */
- ice_plug_aux_dev(pf);
+ /* unplug aux dev per request, if an unplug request came in
+ * while processing a plug request, this will handle it
+ */
+ if (test_and_clear_bit(ICE_FLAG_UNPLUG_AUX_DEV, pf->flags))
+ ice_unplug_aux_dev(pf);
- /* Mark plugging as done but check whether unplug was
- * requested during ice_plug_aux_dev() call
- * (e.g. from ice_clear_rdma_cap()) and if so then
- * plug aux device.
- */
- if (!test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags))
- ice_unplug_aux_dev(pf);
- }
+ /* Plug aux device per request */
+ if (test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags))
+ ice_plug_aux_dev(pf);
if (test_and_clear_bit(ICE_FLAG_MTU_CHANGED, pf->flags)) {
struct iidc_event *event;
--
2.38.1
On a heavily loaded machine there can be lock contention on the
global buffers lock. Add a percpu list on which to cache buffers
when lock contention is encountered.
When allocating buffers, attempt to use cached buffers first,
before taking the global buffers lock. When freeing buffers,
try to put them back on the global list, but if contention is
encountered, put the buffer on the percpu list.
The length of time a buffer is held on the percpu list is dynamically
adjusted based on lock contention. The hold time is increased
rapidly and ramped down slowly.
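The ramp-up and ramp-down can be followed with a small userspace model of the
counters. The arithmetic mirrors update_contention() in the diff below; struct
model_cache, model_update_contention() and model_lock_uncontended() are
hypothetical names for this sketch, which leaves out the lists and locking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the per-CPU counters; the buffer list is omitted. */
struct model_cache {
	unsigned int contention; /* recent contention level, capped at 9 */
	unsigned int hold;       /* how long buffers stay on the local list */
};

/* Contended lock attempt: raise the level by 3 and extend the hold time. */
void model_update_contention(struct model_cache *c)
{
	assert(c != NULL);
	c->contention += 3;
	if (c->contention > 9)
		c->contention = 9;
	c->hold += 1u << c->contention; /* adds 8, then 64, then 512 */
}

/* Uncontended lock attempt: lower the level by one step only. */
void model_lock_uncontended(struct model_cache *c)
{
	assert(c != NULL);
	if (c->contention)
		c->contention--;
}
```

Three contended attempts in a row take the hold value from 0 to 8, 72 and then
584, while each uncontended attempt lowers the contention level by just one.
That asymmetry is the fast increase and slow decrease described above.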
Fixes: df323337e507 ("apparmor: Use a memory pool instead per-CPU caches")
Link: https://lore.kernel.org/lkml/cfd5cc6f-5943-2e06-1dbe-f4b4ad5c1fa1@canonical…
Signed-off-by: John Johansen <john.johansen(a)canonical.com>
Reported-by: Sergey Senozhatsky <senozhatsky(a)chromium.org>
Signed-off-by: Anil Altinay <aaltinay(a)google.com>
Cc: stable(a)vger.kernel.org
---
security/apparmor/lsm.c | 73 ++++++++++++++++++++++++++++++++++++++---
1 file changed, 68 insertions(+), 5 deletions(-)
diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
index c6728a629437..56b22e2def4c 100644
--- a/security/apparmor/lsm.c
+++ b/security/apparmor/lsm.c
@@ -49,12 +49,19 @@ union aa_buffer {
char buffer[1];
};
+struct aa_local_cache {
+ unsigned int contention;
+ unsigned int hold;
+ struct list_head head;
+};
+
#define RESERVE_COUNT 2
static int reserve_count = RESERVE_COUNT;
static int buffer_count;
static LIST_HEAD(aa_global_buffers);
static DEFINE_SPINLOCK(aa_buffers_lock);
+static DEFINE_PER_CPU(struct aa_local_cache, aa_local_buffers);
/*
* LSM hook functions
@@ -1634,14 +1641,43 @@ static int param_set_mode(const char *val, const struct kernel_param *kp)
return 0;
}
+static void update_contention(struct aa_local_cache *cache)
+{
+ cache->contention += 3;
+ if (cache->contention > 9)
+ cache->contention = 9;
+ cache->hold += 1 << cache->contention; /* 8, 64, 512 */
+}
+
char *aa_get_buffer(bool in_atomic)
{
union aa_buffer *aa_buf;
+ struct aa_local_cache *cache;
bool try_again = true;
gfp_t flags = (GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
+ /* use per cpu cached buffers first */
+ cache = get_cpu_ptr(&aa_local_buffers);
+ if (!list_empty(&cache->head)) {
+ aa_buf = list_first_entry(&cache->head, union aa_buffer, list);
+ list_del(&aa_buf->list);
+ cache->hold--;
+ put_cpu_ptr(&aa_local_buffers);
+ return &aa_buf->buffer[0];
+ }
+ put_cpu_ptr(&aa_local_buffers);
+ if (!spin_trylock(&aa_buffers_lock)) {
+ cache = get_cpu_ptr(&aa_local_buffers);
+ update_contention(cache);
+ put_cpu_ptr(&aa_local_buffers);
+ spin_lock(&aa_buffers_lock);
+ } else {
+ cache = get_cpu_ptr(&aa_local_buffers);
+ if (cache->contention)
+ cache->contention--;
+ put_cpu_ptr(&aa_local_buffers);
+ }
retry:
- spin_lock(&aa_buffers_lock);
if (buffer_count > reserve_count ||
(in_atomic && !list_empty(&aa_global_buffers))) {
aa_buf = list_first_entry(&aa_global_buffers, union aa_buffer,
@@ -1667,6 +1703,7 @@ char *aa_get_buffer(bool in_atomic)
if (!aa_buf) {
if (try_again) {
try_again = false;
+ spin_lock(&aa_buffers_lock);
goto retry;
}
pr_warn_once("AppArmor: Failed to allocate a memory buffer.\n");
@@ -1678,15 +1715,32 @@ char *aa_get_buffer(bool in_atomic)
void aa_put_buffer(char *buf)
{
union aa_buffer *aa_buf;
+ struct aa_local_cache *cache;
if (!buf)
return;
aa_buf = container_of(buf, union aa_buffer, buffer[0]);
- spin_lock(&aa_buffers_lock);
- list_add(&aa_buf->list, &aa_global_buffers);
- buffer_count++;
- spin_unlock(&aa_buffers_lock);
+ cache = get_cpu_ptr(&aa_local_buffers);
+ if (!cache->hold) {
+ put_cpu_ptr(&aa_local_buffers);
+ if (spin_trylock(&aa_buffers_lock)) {
+ list_add(&aa_buf->list, &aa_global_buffers);
+ buffer_count++;
+ spin_unlock(&aa_buffers_lock);
+ cache = get_cpu_ptr(&aa_local_buffers);
+ if (cache->contention)
+ cache->contention--;
+ put_cpu_ptr(&aa_local_buffers);
+ return;
+ }
+ cache = get_cpu_ptr(&aa_local_buffers);
+ update_contention(cache);
+ }
+
+ /* cache in percpu list */
+ list_add(&aa_buf->list, &cache->head);
+ put_cpu_ptr(&aa_local_buffers);
}
/*
@@ -1728,6 +1782,15 @@ static int __init alloc_buffers(void)
union aa_buffer *aa_buf;
int i, num;
+ /*
+ * per cpu set of cached allocated buffers used to help reduce
+ * lock contention
+ */
+ for_each_possible_cpu(i) {
+ per_cpu(aa_local_buffers, i).contention = 0;
+ per_cpu(aa_local_buffers, i).hold = 0;
+ INIT_LIST_HEAD(&per_cpu(aa_local_buffers, i).head);
+ }
/*
* A function may require two buffers at once. Usually the buffers are
* used for a short period of time and are shared. On UP kernel buffers
--
2.39.2.637.g21b0678d19-goog
The patch titled
Subject: nilfs2: fix underflow in second superblock position calculations
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
nilfs2-fix-underflow-in-second-superblock-position-calculations.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Subject: nilfs2: fix underflow in second superblock position calculations
Date: Wed, 15 Feb 2023 07:40:43 +0900
Macro NILFS_SB2_OFFSET_BYTES, which computes the position of the second
superblock, underflows when the argument device size is less than 4096
bytes. Therefore, when using this macro, it is necessary to check in
advance that the device size is not less than a lower limit, or at least
that underflow does not occur.
The current nilfs2 implementation lacks this check, causing out-of-bound
block access when mounting devices smaller than 4096 bytes:
I/O error, dev loop0, sector 36028797018963960 op 0x0:(READ) flags 0x0
phys_seg 1 prio class 2
NILFS (loop0): unable to read secondary superblock (blocksize = 1024)
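The underflow can be reproduced in userspace under one assumption: that the
macro has the form ((devsize >> 12) - 1) << 12, which is consistent with the
sector number in the log above. MODEL_SB2_OFFSET_BYTES, model_sb2_offset() and
model_sb2_offset_checked() are hypothetical names for this sketch and are not
part of nilfs2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed shape of the macro; see the note above. */
#define MODEL_SB2_OFFSET_BYTES(devsize) ((((devsize) >> 12) - 1) << 12)

/* Unchecked: for devsize < 4096 the subtraction wraps around. */
uint64_t model_sb2_offset(uint64_t devsize)
{
	return MODEL_SB2_OFFSET_BYTES(devsize);
}

/* Checked: refuse sizes for which the subtraction would underflow. */
int model_sb2_offset_checked(uint64_t devsize, uint64_t *off)
{
	assert(off != NULL);
	if (devsize < 4096)
		return -1;
	*off = MODEL_SB2_OFFSET_BYTES(devsize);
	return 0;
}
```

For a 1024-byte device the unchecked version yields 0xFFFFFFFFFFFFF000, and
dividing that by a 512-byte sector size gives 36028797018963960, the sector
reported in the I/O error.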
In addition, when trying to resize the filesystem to a size below 4096
bytes, this underflow occurs in nilfs_resize_fs(), passing a huge number
of segments to nilfs_sufile_resize(), corrupting parameters such as the
number of segments in superblocks. This causes excessive loop iterations
in nilfs_sufile_resize() during a subsequent resize ioctl, causing
semaphore ns_segctor_sem to block for a long time and hang the writer
thread:
INFO: task segctord:5067 blocked for more than 143 seconds.
Not tainted 6.2.0-rc8-syzkaller-00015-gf6feea56f66d #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:segctord state:D stack:23456 pid:5067 ppid:2
flags:0x00004000
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5293 [inline]
__schedule+0x1409/0x43f0 kernel/sched/core.c:6606
schedule+0xc3/0x190 kernel/sched/core.c:6682
rwsem_down_write_slowpath+0xfcf/0x14a0 kernel/locking/rwsem.c:1190
nilfs_transaction_lock+0x25c/0x4f0 fs/nilfs2/segment.c:357
nilfs_segctor_thread_construct fs/nilfs2/segment.c:2486 [inline]
nilfs_segctor_thread+0x52f/0x1140 fs/nilfs2/segment.c:2570
kthread+0x270/0x300 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
</TASK>
...
Call Trace:
<TASK>
folio_mark_accessed+0x51c/0xf00 mm/swap.c:515
__nilfs_get_page_block fs/nilfs2/page.c:42 [inline]
nilfs_grab_buffer+0x3d3/0x540 fs/nilfs2/page.c:61
nilfs_mdt_submit_block+0xd7/0x8f0 fs/nilfs2/mdt.c:121
nilfs_mdt_read_block+0xeb/0x430 fs/nilfs2/mdt.c:176
nilfs_mdt_get_block+0x12d/0xbb0 fs/nilfs2/mdt.c:251
nilfs_sufile_get_segment_usage_block fs/nilfs2/sufile.c:92 [inline]
nilfs_sufile_truncate_range fs/nilfs2/sufile.c:679 [inline]
nilfs_sufile_resize+0x7a3/0x12b0 fs/nilfs2/sufile.c:777
nilfs_resize_fs+0x20c/0xed0 fs/nilfs2/super.c:422
nilfs_ioctl_resize fs/nilfs2/ioctl.c:1033 [inline]
nilfs_ioctl+0x137c/0x2440 fs/nilfs2/ioctl.c:1301
...
This fixes these issues by inserting appropriate minimum device size
checks or anti-underflow checks, depending on where the macro is used.
Link: https://lkml.kernel.org/r/0000000000004e1dfa05f4a48e6b@google.com
Link: https://lkml.kernel.org/r/20230214224043.24141-1-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: <syzbot+f0c4082ce5ebebdac63b(a)syzkaller.appspotmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/fs/nilfs2/ioctl.c~nilfs2-fix-underflow-in-second-superblock-position-calculations
+++ a/fs/nilfs2/ioctl.c
@@ -1114,7 +1114,14 @@ static int nilfs_ioctl_set_alloc_range(s
minseg = range[0] + segbytes - 1;
do_div(minseg, segbytes);
+
+ if (range[1] < 4096)
+ goto out;
+
maxseg = NILFS_SB2_OFFSET_BYTES(range[1]);
+ if (maxseg < segbytes)
+ goto out;
+
do_div(maxseg, segbytes);
maxseg--;
--- a/fs/nilfs2/super.c~nilfs2-fix-underflow-in-second-superblock-position-calculations
+++ a/fs/nilfs2/super.c
@@ -409,6 +409,15 @@ int nilfs_resize_fs(struct super_block *
goto out;
/*
+ * Prevent underflow in second superblock position calculation.
+ * The exact minimum size check is done in nilfs_sufile_resize().
+ */
+ if (newsize < 4096) {
+ ret = -ENOSPC;
+ goto out;
+ }
+
+ /*
* Write lock is required to protect some functions depending
* on the number of segments, the number of reserved segments,
* and so forth.
--- a/fs/nilfs2/the_nilfs.c~nilfs2-fix-underflow-in-second-superblock-position-calculations
+++ a/fs/nilfs2/the_nilfs.c
@@ -544,9 +544,15 @@ static int nilfs_load_super_block(struct
{
struct nilfs_super_block **sbp = nilfs->ns_sbp;
struct buffer_head **sbh = nilfs->ns_sbh;
- u64 sb2off = NILFS_SB2_OFFSET_BYTES(bdev_nr_bytes(nilfs->ns_bdev));
+ u64 sb2off, devsize = bdev_nr_bytes(nilfs->ns_bdev);
int valid[2], swp = 0;
+ if (devsize < NILFS_SEG_MIN_BLOCKS * NILFS_MIN_BLOCK_SIZE + 4096) {
+ nilfs_err(sb, "device size too small");
+ return -EINVAL;
+ }
+ sb2off = NILFS_SB2_OFFSET_BYTES(devsize);
+
sbp[0] = nilfs_read_super_block(sb, NILFS_SB_OFFSET_BYTES, blocksize,
&sbh[0]);
sbp[1] = nilfs_read_super_block(sb, sb2off, blocksize, &sbh[1]);
_
Patches currently in -mm which might be from konishi.ryusuke(a)gmail.com are
nilfs2-fix-underflow-in-second-superblock-position-calculations.patch