July 2023 - Linux-stable-mirror

[PATCHv2 v6.5-rc1 0/3] fs: dlm: workarounds and cancellation

by Alexander Aring

Hi, This patch-series trying to avoid issues when plock ops with DLM_PLOCK_FL_CLOSE flag is set sends a reply back which should never be the case. This problem getting more serious when introducing a new plock op and an answer was not expected as I changed in v2 to check on DLM_PLOCK_FL_CLOSE flag for stable as this can also being used to fix the potential issue for older kernels and it does not change the UAPI. For newer user space applications the new flag DLM_PLOCK_FL_NO_REPLY will tell the user space application to never send an result back, it will handle this filter earlier in user space. For older user space software we will filter the result in ther kernel. This requires the behaviour that the flags are the same for the request and the reply which is the case for dlm_controld. Also fix the wrapped string and don't spam the user ignoring replies. - Alex Alexander Aring (3): fs: dlm: ignore DLM_PLOCK_FL_CLOSE flag results fs: dlm: introduce DLM_PLOCK_FL_NO_REPLY flag fs: dlm: allow to F_SETLKW getting interrupted fs/dlm/plock.c | 107 ++++++++++++++++++++++++--------- include/uapi/linux/dlm_plock.h | 2 + 2 files changed, 81 insertions(+), 28 deletions(-) -- 2.31.1

1 year, 11 months

1
4
0 0

[PATCH] exfat: release s_lock before calling dir_emit()

by Sungjong Seo

There is a potential deadlock reported by syzbot as below: ====================================================== WARNING: possible circular locking dependency detected 6.4.0-next-20230707-syzkaller #0 Not tainted ------------------------------------------------------ syz-executor330/5073 is trying to acquire lock: ffff8880218527a0 (&mm->mmap_lock){++++}-{3:3}, at: mmap_read_lock_killable include/linux/mmap_lock.h:151 [inline] ffff8880218527a0 (&mm->mmap_lock){++++}-{3:3}, at: get_mmap_lock_carefully mm/memory.c:5293 [inline] ffff8880218527a0 (&mm->mmap_lock){++++}-{3:3}, at: lock_mm_and_find_vma+0x369/0x510 mm/memory.c:5344 but task is already holding lock: ffff888019f760e0 (&sbi->s_lock){+.+.}-{3:3}, at: exfat_iterate+0x117/0xb50 fs/exfat/dir.c:232 which lock already depends on the new lock. Chain exists of: &mm->mmap_lock --> mapping.invalidate_lock#3 --> &sbi->s_lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&sbi->s_lock); lock(mapping.invalidate_lock#3); lock(&sbi->s_lock); rlock(&mm->mmap_lock); Let's try to avoid above potential deadlock condition by moving dir_emit*() out of sbi->s_lock coverage. Fixes: ca06197382bd ("exfat: add directory operations") Cc: stable(a)vger.kernel.org #v5.7+ Reported-by: syzbot+1741a5d9b79989c10bdc(a)syzkaller.appspotmail.com Link: https://lore.kernel.org/lkml/00000000000078ee7e060066270b@google.com/T/#u Signed-off-by: Sungjong Seo <sj1557.seo(a)samsung.com> --- fs/exfat/dir.c | 27 ++++++++++++--------------- 1 file changed, 12 insertions(+), 15 deletions(-) diff --git a/fs/exfat/dir.c b/fs/exfat/dir.c index 957574180a5e..4e3743341ce7 100644 --- a/fs/exfat/dir.c +++ b/fs/exfat/dir.c @@ -214,7 +214,10 @@ static void exfat_free_namebuf(struct exfat_dentry_namebuf *nb) exfat_init_namebuf(nb); } -/* skip iterating emit_dots when dir is empty */ +/* + * Before calling dir_emit*(), sbi->s_lock should be released + * because page fault can occur in dir_emit*(). + */ #define ITER_POS_FILLED_DOTS (2) static int exfat_iterate(struct file *file, struct dir_context *ctx) { @@ -229,11 +232,10 @@ static int exfat_iterate(struct file *file, struct dir_context *ctx) int err = 0, fake_offset = 0; exfat_init_namebuf(nb); - mutex_lock(&EXFAT_SB(sb)->s_lock); cpos = ctx->pos; if (!dir_emit_dots(file, ctx)) - goto unlock; + goto out; if (ctx->pos == ITER_POS_FILLED_DOTS) { cpos = 0; @@ -245,16 +247,18 @@ static int exfat_iterate(struct file *file, struct dir_context *ctx) /* name buffer should be allocated before use */ err = exfat_alloc_namebuf(nb); if (err) - goto unlock; + goto out; get_new: + mutex_lock(&EXFAT_SB(sb)->s_lock); + if (ei->flags == ALLOC_NO_FAT_CHAIN && cpos >= i_size_read(inode)) goto end_of_dir; err = exfat_readdir(inode, &cpos, &de); if (err) { /* - * At least we tried to read a sector. Move cpos to next sector - * position (should be aligned). + * At least we tried to read a sector. + * Move cpos to next sector position (should be aligned). */ if (err == -EIO) { cpos += 1 << (sb->s_blocksize_bits); @@ -277,16 +281,10 @@ static int exfat_iterate(struct file *file, struct dir_context *ctx) inum = iunique(sb, EXFAT_ROOT_INO); } - /* - * Before calling dir_emit(), sb_lock should be released. - * Because page fault can occur in dir_emit() when the size - * of buffer given from user is larger than one page size. - */ mutex_unlock(&EXFAT_SB(sb)->s_lock); if (!dir_emit(ctx, nb->lfn, strlen(nb->lfn), inum, (de.attr & ATTR_SUBDIR) ? DT_DIR : DT_REG)) - goto out_unlocked; - mutex_lock(&EXFAT_SB(sb)->s_lock); + goto out; ctx->pos = cpos; goto get_new; @@ -294,9 +292,8 @@ static int exfat_iterate(struct file *file, struct dir_context *ctx) if (!cpos && fake_offset) cpos = ITER_POS_FILLED_DOTS; ctx->pos = cpos; -unlock: mutex_unlock(&EXFAT_SB(sb)->s_lock); -out_unlocked: +out: /* * To improve performance, free namebuf after unlock sb_lock. * If namebuf is not allocated, this function do nothing -- 2.25.1

1 year, 11 months

2
1
0 0

[PATCH v6.5-rc1 1/2] fs: dlm: introduce DLM_PLOCK_FL_NO_REPLY flag

by Alexander Aring

This patch introduces a new flag DLM_PLOCK_FL_NO_REPLY in case an dlm plock operation should not send a reply back. Currently this is kind of being handled in DLM_PLOCK_FL_CLOSE, but DLM_PLOCK_FL_CLOSE has more meanings that it will remove all waiters for a specific nodeid/owner values in by doing a unlock operation. In case of an error in dlm user space software e.g. dlm_controld we get an reply with an error back. This cannot be matched because there is no op to match in recv_list. We filter now on DLM_PLOCK_FL_NO_REPLY in case we had an error back as reply. In newer dlm_controld version it will never send a result back when DLM_PLOCK_FL_NO_REPLY is set. This filter is a workaround to handle older dlm_controld versions. Fixes: 901025d2f319 ("dlm: make plock operation killable") Cc: stable(a)vger.kernel.org Signed-off-by: Alexander Aring <aahringo(a)redhat.com> --- fs/dlm/plock.c | 23 +++++++++++++++++++---- include/uapi/linux/dlm_plock.h | 1 + 2 files changed, 20 insertions(+), 4 deletions(-) diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c index 70a4752ed913..7fe9f4b922d3 100644 --- a/fs/dlm/plock.c +++ b/fs/dlm/plock.c @@ -96,7 +96,7 @@ static void do_unlock_close(const struct dlm_plock_info *info) op->info.end = OFFSET_MAX; op->info.owner = info->owner; - op->info.flags |= DLM_PLOCK_FL_CLOSE; + op->info.flags |= (DLM_PLOCK_FL_CLOSE | DLM_PLOCK_FL_NO_REPLY); send_op(op); } @@ -293,7 +293,7 @@ int dlm_posix_unlock(dlm_lockspace_t *lockspace, u64 number, struct file *file, op->info.owner = (__u64)(long) fl->fl_owner; if (fl->fl_flags & FL_CLOSE) { - op->info.flags |= DLM_PLOCK_FL_CLOSE; + op->info.flags |= (DLM_PLOCK_FL_CLOSE | DLM_PLOCK_FL_NO_REPLY); send_op(op); rv = 0; goto out; @@ -392,7 +392,7 @@ static ssize_t dev_read(struct file *file, char __user *u, size_t count, spin_lock(&ops_lock); if (!list_empty(&send_list)) { op = list_first_entry(&send_list, struct plock_op, list); - if (op->info.flags & DLM_PLOCK_FL_CLOSE) + if (op->info.flags & DLM_PLOCK_FL_NO_REPLY) list_del(&op->list); else list_move_tail(&op->list, &recv_list); @@ -407,7 +407,7 @@ static ssize_t dev_read(struct file *file, char __user *u, size_t count, that were generated by the vfs cleaning up for a close (the process did not make an unlock call). */ - if (op->info.flags & DLM_PLOCK_FL_CLOSE) + if (op->info.flags & DLM_PLOCK_FL_NO_REPLY) dlm_release_plock_op(op); if (copy_to_user(u, &info, sizeof(info))) @@ -433,6 +433,21 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count, if (check_version(&info)) return -EINVAL; + /* Some old dlm user space software will send replies back, + * even if DLM_PLOCK_FL_NO_REPLY is set (because the flag is + * new) e.g. if a error occur. We can't match them in recv_list + * because they were never be part of it. We filter it here, + * new dlm user space software will filter it in user space. + * + * In future this handling can be removed. + */ + if (info.flags & DLM_PLOCK_FL_NO_REPLY) { + pr_info("Received unexpected reply from op %d, " + "please update DLM user space software!\n", + info.optype); + return count; + } + /* * The results for waiting ops (SETLKW) can be returned in any * order, so match all fields to find the op. The results for diff --git a/include/uapi/linux/dlm_plock.h b/include/uapi/linux/dlm_plock.h index 63b6c1fd9169..8dfa272c929a 100644 --- a/include/uapi/linux/dlm_plock.h +++ b/include/uapi/linux/dlm_plock.h @@ -25,6 +25,7 @@ enum { }; #define DLM_PLOCK_FL_CLOSE 1 +#define DLM_PLOCK_FL_NO_REPLY 2 struct dlm_plock_info { __u32 version[3]; -- 2.31.1

1 year, 11 months

3
6
0 0

[PATCH] tracing/histograms: Return an error if we fail to add histogram to hist_vars list

by Mohamed Khalfella

If the code fails to add histogram to hist_vars list, then ret should contain error code before jumping to unregister histogram. Cc: stable(a)vger.kernel.org Fixes: 6018b585e8c6 ("tracing/histograms: Add histograms to hist_vars if they have referenced variables") Signed-off-by: Mohamed Khalfella <mkhalfella(a)purestorage.com> --- kernel/trace/trace_events_hist.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c index c8c61381eba4..d06938ae0717 100644 --- a/kernel/trace/trace_events_hist.c +++ b/kernel/trace/trace_events_hist.c @@ -6668,7 +6668,8 @@ static int event_hist_trigger_parse(struct event_command *cmd_ops, goto out_unreg; if (has_hist_vars(hist_data) || hist_data->n_var_refs) { - if (save_hist_vars(hist_data)) + ret = save_hist_vars(hist_data); + if (ret) goto out_unreg; } -- 2.34.1

1 year, 11 months

2
1
0 0

[PATCH v2] tracing/histograms: Return an error if we fail to add histogram to hist_vars list

by Mohamed Khalfella

Commit 6018b585e8c6 ("tracing/histograms: Add histograms to hist_vars if they have referenced variables") added a check to fail histogram creation if save_hist_vars() failed to add histogram to hist_vars list. But the commit failed to set ret to failed return code before jumping to unregister histogram, fix it. Cc: stable(a)vger.kernel.org Fixes: 6018b585e8c6 ("tracing/histograms: Add histograms to hist_vars if they have referenced variables") Signed-off-by: Mohamed Khalfella <mkhalfella(a)purestorage.com> --- kernel/trace/trace_events_hist.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c index c8c61381eba4..d06938ae0717 100644 --- a/kernel/trace/trace_events_hist.c +++ b/kernel/trace/trace_events_hist.c @@ -6668,7 +6668,8 @@ static int event_hist_trigger_parse(struct event_command *cmd_ops, goto out_unreg; if (has_hist_vars(hist_data) || hist_data->n_var_refs) { - if (save_hist_vars(hist_data)) + ret = save_hist_vars(hist_data); + if (ret) goto out_unreg; } -- 2.34.1

1 year, 11 months

1
0
0 0

[PATCH v4] mtd: spi-nor: Correct flags for Winbond w25q128

by Linus Walleij

The Winbond "w25q128" (actual vendor name W25Q128JV) has exactly the same flags as the sibling device "w25q128jv". The devices both require unlocking to enable write access. The actual product naming between devices vs the Linux strings in winbond.c: 0xef4018: "w25q128" W25Q128JV-IN/IQ/JQ 0xef7018: "w25q128jv" W25Q128JV-IM/JM The latter device, "w25q128jv" supports features named DTQ and QPI, otherwise it is the same. Not having the right flags has the annoying side effect that write access does not work. After this patch I can write to the flash on the Inteno XG6846 router. The flash memory also supports dual and quad SPI modes. This does not currently manifest, but by turning on SFDP parsing, the right SPI modes are emitted in /sys/kernel/debug/spi-nor/spi1.0/capabilities for this chip, so we also turn on this. Since we suspect that older chips may be using the same device ID, we need to keep NO_SFDP_FLAGS(SECT_4K) as these older chips may not support SFDP. cat jedec_id ef4018 cat manufacturer winbond cat partname w25q128 hexdump -v -C sfdp 00000000 53 46 44 50 05 01 00 ff 00 05 01 10 80 00 00 ff 00000010 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00000020 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00000030 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00000040 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00000050 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00000060 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00000070 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00000080 e5 20 f9 ff ff ff ff 07 44 eb 08 6b 08 3b 42 bb 00000090 fe ff ff ff ff ff 00 00 ff ff 40 eb 0c 20 0f 52 000000a0 10 d8 00 00 36 02 a6 00 82 ea 14 c9 e9 63 76 33 000000b0 7a 75 7a 75 f7 a2 d5 5c 19 f7 4d ff e9 30 f8 80 Cc: stable(a)vger.kernel.org Suggested-by: Michael Walle <michael(a)walle.cc> Reviewed-by: Michael Walle <michael(a)walle.cc> Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org> --- Changes in v4: - Fix up error in commit message. - Pick up Michael's ACK. - Link to v3: https://lore.kernel.org/r/20230714-spi-nor-winbond-w25q128-v3-1-bdb2192f079… Changes in v3: - Keep NO_SFDP_FLAGS(SECT_4K) around. - Update commit message - Link to v2: https://lore.kernel.org/r/20230712-spi-nor-winbond-w25q128-v2-1-50c9f1d58d6… Changes in v2: - Only add the write access flags. - Use SFDP parsing to properly detect the various available SPI modes. - Link to v1: https://lore.kernel.org/r/20230712-spi-nor-winbond-w25q128-v1-1-f78f3bb42a1… --- drivers/mtd/spi-nor/winbond.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/mtd/spi-nor/winbond.c b/drivers/mtd/spi-nor/winbond.c index 834d6ba5ce70..8f30a67cd27a 100644 --- a/drivers/mtd/spi-nor/winbond.c +++ b/drivers/mtd/spi-nor/winbond.c @@ -121,6 +121,8 @@ static const struct flash_info winbond_nor_parts[] = { { "w25q80bl", INFO(0xef4014, 0, 64 * 1024, 16) NO_SFDP_FLAGS(SECT_4K) }, { "w25q128", INFO(0xef4018, 0, 64 * 1024, 256) + PARSE_SFDP + FLAGS(SPI_NOR_HAS_LOCK | SPI_NOR_HAS_TB) NO_SFDP_FLAGS(SECT_4K) }, { "w25q256", INFO(0xef4019, 0, 64 * 1024, 512) NO_SFDP_FLAGS(SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ) --- base-commit: 06c2afb862f9da8dc5efa4b6076a0e48c3fbaaa5 change-id: 20230711-spi-nor-winbond-w25q128-321a602ee267 Best regards, -- Linus Walleij <linus.walleij(a)linaro.org>

1 year, 11 months

1
0
0 0

[PATCH v3] mtd: spi-nor: Correct flags for Winbond w25q128

by Linus Walleij

The Winbond "w25q128" (actual vendor name W25Q128JV) has exactly the same flags as the sibling device "w25q128jv". The devices both require unlocking to enable write access. The actual product naming between devices vs the Linux strings in winbond.c: 0xef4018: "w25q128" W25Q128JV-IM/JM 0xef7018: "w25q128jv" W25Q128JV-IN/IQ/JQ The latter device, "w25q128jv" supports features named DTQ and QPI, otherwise it is the same. Not having the right flags has the annoying side effect that write access does not work. After this patch I can write to the flash on the Inteno XG6846 router. The flash memory also supports dual and quad SPI modes. This does not currently manifest, but by turning on SFDP parsing, the right SPI modes are emitted in /sys/kernel/debug/spi-nor/spi1.0/capabilities for this chip, so we also turn on this. Since we suspect that older chips may be using the same device ID, we need to keep NO_SFDP_FLAGS(SECT_4K) as these older chips may not support SFDP. cat jedec_id ef4018 cat manufacturer winbond cat partname w25q128 hexdump -v -C sfdp 00000000 53 46 44 50 05 01 00 ff 00 05 01 10 80 00 00 ff 00000010 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00000020 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00000030 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00000040 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00000050 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00000060 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00000070 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00000080 e5 20 f9 ff ff ff ff 07 44 eb 08 6b 08 3b 42 bb 00000090 fe ff ff ff ff ff 00 00 ff ff 40 eb 0c 20 0f 52 000000a0 10 d8 00 00 36 02 a6 00 82 ea 14 c9 e9 63 76 33 000000b0 7a 75 7a 75 f7 a2 d5 5c 19 f7 4d ff e9 30 f8 80 Cc: stable(a)vger.kernel.org Suggested-by: Michael Walle <michael(a)walle.cc> Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org> --- Changes in v3: - Keep NO_SFDP_FLAGS(SECT_4K) around. - Update commit message - Link to v2: https://lore.kernel.org/r/20230712-spi-nor-winbond-w25q128-v2-1-50c9f1d58d6… Changes in v2: - Only add the write access flags. - Use SFDP parsing to properly detect the various available SPI modes. - Link to v1: https://lore.kernel.org/r/20230712-spi-nor-winbond-w25q128-v1-1-f78f3bb42a1… --- drivers/mtd/spi-nor/winbond.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/mtd/spi-nor/winbond.c b/drivers/mtd/spi-nor/winbond.c index 834d6ba5ce70..8f30a67cd27a 100644 --- a/drivers/mtd/spi-nor/winbond.c +++ b/drivers/mtd/spi-nor/winbond.c @@ -121,6 +121,8 @@ static const struct flash_info winbond_nor_parts[] = { { "w25q80bl", INFO(0xef4014, 0, 64 * 1024, 16) NO_SFDP_FLAGS(SECT_4K) }, { "w25q128", INFO(0xef4018, 0, 64 * 1024, 256) + PARSE_SFDP + FLAGS(SPI_NOR_HAS_LOCK | SPI_NOR_HAS_TB) NO_SFDP_FLAGS(SECT_4K) }, { "w25q256", INFO(0xef4019, 0, 64 * 1024, 512) NO_SFDP_FLAGS(SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ) --- base-commit: 06c2afb862f9da8dc5efa4b6076a0e48c3fbaaa5 change-id: 20230711-spi-nor-winbond-w25q128-321a602ee267 Best regards, -- Linus Walleij <linus.walleij(a)linaro.org>

1 year, 11 months

2
2
0 0

+ lib-test_meminit-allocate-pages-up-to-order-max_order.patch added to mm-unstable branch

by Andrew Morton

The patch titled Subject: lib/test_meminit: allocate pages up to order MAX_ORDER has been added to the -mm mm-unstable branch. Its filename is lib-test_meminit-allocate-pages-up-to-order-max_order.patch This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche… This patch will later appear in the mm-unstable branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days ------------------------------------------------------ From: Andrew Donnellan <ajd(a)linux.ibm.com> Subject: lib/test_meminit: allocate pages up to order MAX_ORDER Date: Fri, 14 Jul 2023 11:52:38 +1000 test_pages() tests the page allocator by calling alloc_pages() with different orders up to order 10. However, different architectures and platforms support different maximum contiguous allocation sizes. The default maximum allocation order (MAX_ORDER) is 10, but architectures can use CONFIG_ARCH_FORCE_MAX_ORDER to override this. On platforms where this is less than 10, test_meminit() will blow up with a WARN(). This is expected, so let's not do that. Replace the hardcoded "10" with the MAX_ORDER macro so that we test allocations up to the expected platform limit. Link: https://lkml.kernel.org/r/20230714015238.47931-1-ajd@linux.ibm.com Fixes: 5015a300a522 ("lib: introduce test_meminit module") Signed-off-by: Andrew Donnellan <ajd(a)linux.ibm.com> Reviewed-by: Alexander Potapenko <glider(a)google.com> Cc: Xiaoke Wang <xkernel.wang(a)foxmail.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- lib/test_meminit.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/lib/test_meminit.c~lib-test_meminit-allocate-pages-up-to-order-max_order +++ a/lib/test_meminit.c @@ -93,7 +93,7 @@ static int __init test_pages(int *total_ int failures = 0, num_tests = 0; int i; - for (i = 0; i < 10; i++) + for (i = 0; i <= MAX_ORDER; i++) num_tests += do_alloc_pages_order(i, &failures); REPORT_FAILURES_IN_FN(); _ Patches currently in -mm which might be from ajd(a)linux.ibm.com are lib-test_meminit-allocate-pages-up-to-order-max_order.patch

1 year, 11 months

1
0
0 0

I wish for a prompt response

by Taras Volodymyr

D e a r Sir, I am M r. Taras Volodymyr from U k r a i n e but lived in Russia for many years, I am a successful business?m a n here in Russia as I have been involved oil serving business. Based on what is going here I found you very capable of handling this h u g e business magnitude, there is a genuine need for an investment of a substantial amount in your country. If you are willing to p a r t n e r with me I will advise you to get back to me for proceedings-details on the way forward. I wish for a prompt response from you regarding my letter. Warm regards, Mr. Taras Volodymyr

1 year, 11 months

1
0
0 0

[PATCH] KVM: arm64: vgic-v4: Make the doorbell request robust w.r.t preemption

by Marc Zyngier

Xiang reports that VMs occasionally fail to boot on GICv4.1 systems when running a preemptible kernel, as it is possible that a vCPU is blocked without requesting a doorbell interrupt. The issue is that any preemption that occurs between vgic_v4_put() and schedule() on the block path will mark the vPE as nonresident and *not* request a doorbell irq. This occurs because when the vcpu thread is resumed on its way to block, vcpu_load() will make the vPE resident again. Once the vcpu actually blocks, we don't request a doorbell anymore, and the vcpu won't be woken up on interrupt delivery. Fix it by tracking that we're entering WFI, and key the doorbell request on that flag. This allows us not to make the vPE resident when going through a preempt/schedule cycle, meaning we don't lose any state. Cc: stable(a)vger.kernel.org Fixes: 8e01d9a396e6 ("KVM: arm64: vgic-v4: Move the GICv4 residency flow to be driven by vcpu_load/put") Reported-by: Xiang Chen <chenxiang66(a)hisilicon.com> Suggested-by: Zenghui Yu <yuzenghui(a)huawei.com> Tested-by: Xiang Chen <chenxiang66(a)hisilicon.com> Co-developed-by: Oliver Upton <oliver.upton(a)linux.dev> Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev> Signed-off-by: Marc Zyngier <maz(a)kernel.org> --- arch/arm64/include/asm/kvm_host.h | 2 ++ arch/arm64/kvm/arm.c | 6 ++++-- arch/arm64/kvm/vgic/vgic-v3.c | 2 +- arch/arm64/kvm/vgic/vgic-v4.c | 7 +++++-- include/kvm/arm_vgic.h | 2 +- 5 files changed, 13 insertions(+), 6 deletions(-) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 1e768481f62f..914fc9c26e40 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -817,6 +817,8 @@ struct kvm_vcpu_arch { #define DBG_SS_ACTIVE_PENDING __vcpu_single_flag(sflags, BIT(5)) /* PMUSERENR for the guest EL0 is on physical CPU */ #define PMUSERENR_ON_CPU __vcpu_single_flag(sflags, BIT(6)) +/* WFI instruction trapped */ +#define IN_WFI __vcpu_single_flag(sflags, BIT(7)) /* vcpu entered with HCR_EL2.E2H set */ #define VCPU_HCR_E2H __vcpu_single_flag(oflags, BIT(0)) diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 236c5f1c9090..cf208d30a9ea 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -725,13 +725,15 @@ void kvm_vcpu_wfi(struct kvm_vcpu *vcpu) */ preempt_disable(); kvm_vgic_vmcr_sync(vcpu); - vgic_v4_put(vcpu, true); + vcpu_set_flag(vcpu, IN_WFI); + vgic_v4_put(vcpu); preempt_enable(); kvm_vcpu_halt(vcpu); vcpu_clear_flag(vcpu, IN_WFIT); preempt_disable(); + vcpu_clear_flag(vcpu, IN_WFI); vgic_v4_load(vcpu); preempt_enable(); } @@ -799,7 +801,7 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu) if (kvm_check_request(KVM_REQ_RELOAD_GICv4, vcpu)) { /* The distributor enable bits were changed */ preempt_disable(); - vgic_v4_put(vcpu, false); + vgic_v4_put(vcpu); vgic_v4_load(vcpu); preempt_enable(); } diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c index 49d35618d576..df61ead7c757 100644 --- a/arch/arm64/kvm/vgic/vgic-v3.c +++ b/arch/arm64/kvm/vgic/vgic-v3.c @@ -780,7 +780,7 @@ void vgic_v3_put(struct kvm_vcpu *vcpu) * done a vgic_v4_put) and when running a nested guest (the * vPE was never resident in order to generate a doorbell). */ - WARN_ON(vgic_v4_put(vcpu, false)); + WARN_ON(vgic_v4_put(vcpu)); vgic_v3_vmcr_sync(vcpu); diff --git a/arch/arm64/kvm/vgic/vgic-v4.c b/arch/arm64/kvm/vgic/vgic-v4.c index c1c28fe680ba..339a55194b2c 100644 --- a/arch/arm64/kvm/vgic/vgic-v4.c +++ b/arch/arm64/kvm/vgic/vgic-v4.c @@ -336,14 +336,14 @@ void vgic_v4_teardown(struct kvm *kvm) its_vm->vpes = NULL; } -int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db) +int vgic_v4_put(struct kvm_vcpu *vcpu) { struct its_vpe *vpe = &vcpu->arch.vgic_cpu.vgic_v3.its_vpe; if (!vgic_supports_direct_msis(vcpu->kvm) || !vpe->resident) return 0; - return its_make_vpe_non_resident(vpe, need_db); + return its_make_vpe_non_resident(vpe, !!vcpu_get_flag(vcpu, IN_WFI)); } int vgic_v4_load(struct kvm_vcpu *vcpu) @@ -354,6 +354,9 @@ int vgic_v4_load(struct kvm_vcpu *vcpu) if (!vgic_supports_direct_msis(vcpu->kvm) || vpe->resident) return 0; + if (vcpu_get_flag(vcpu, IN_WFI)) + return 0; + /* * Before making the VPE resident, make sure the redistributor * corresponding to our current CPU expects us here. See the diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h index 9b91a8135dac..765d801d1ddc 100644 --- a/include/kvm/arm_vgic.h +++ b/include/kvm/arm_vgic.h @@ -446,7 +446,7 @@ int kvm_vgic_v4_unset_forwarding(struct kvm *kvm, int irq, int vgic_v4_load(struct kvm_vcpu *vcpu); void vgic_v4_commit(struct kvm_vcpu *vcpu); -int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db); +int vgic_v4_put(struct kvm_vcpu *vcpu); bool vgic_state_is_nested(struct kvm_vcpu *vcpu); -- 2.34.1

1 year, 11 months

3
2
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror July 2023