December 2023 - Linux-stable-mirror

[merged mm-hotfixes-stable] mm-mglru-try-to-stop-at-high-watermarks.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: mm/mglru: try to stop at high watermarks has been removed from the -mm tree. Its filename was mm-mglru-try-to-stop-at-high-watermarks.patch This patch was dropped because it was merged into the mm-hotfixes-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Yu Zhao <yuzhao(a)google.com> Subject: mm/mglru: try to stop at high watermarks Date: Thu, 7 Dec 2023 23:14:05 -0700 The initial MGLRU patchset didn't include the memcg LRU support, and it relied on should_abort_scan(), added by commit f76c83378851 ("mm: multi-gen LRU: optimize multiple memcgs"), to "backoff to avoid overshooting their aggregate reclaim target by too much". Later on when the memcg LRU was added, should_abort_scan() was deemed unnecessary, and the test results [1] showed no side effects after it was removed by commit a579086c99ed ("mm: multi-gen LRU: remove eviction fairness safeguard"). However, that test used memory.reclaim, which sets nr_to_reclaim to SWAP_CLUSTER_MAX. So it can overshoot only by SWAP_CLUSTER_MAX-1 pages, i.e., from nr_reclaimed=nr_to_reclaim-1 to nr_reclaimed=nr_to_reclaim+SWAP_CLUSTER_MAX-1. Compared with the batch size kswapd sets to nr_to_reclaim, SWAP_CLUSTER_MAX is tiny. Therefore that test isn't able to reproduce the worst case scenario, i.e., kswapd overshooting GBs on large systems and "consuming 100% CPU" (see the Closes tag). Bring back a simplified version of should_abort_scan() on top of the memcg LRU, so that kswapd stops when all eligible zones are above their respective high watermarks plus a small delta to lower the chance of KSWAPD_HIGH_WMARK_HIT_QUICKLY. Note that this only applies to order-0 reclaim, meaning compaction-induced reclaim can still run wild (which is a different problem). On Android, launching 55 apps sequentially: Before After Change pgpgin 838377172 802955040 -4% pgpgout 38037080 34336300 -10% [1] https://lore.kernel.org/20221222041905.2431096-1-yuzhao@google.com/ Link: https://lkml.kernel.org/r/20231208061407.2125867-2-yuzhao@google.com Fixes: a579086c99ed ("mm: multi-gen LRU: remove eviction fairness safeguard") Signed-off-by: Yu Zhao <yuzhao(a)google.com> Reported-by: Charan Teja Kalla <quic_charante(a)quicinc.com> Reported-by: Jaroslav Pulchart <jaroslav.pulchart(a)gooddata.com> Closes: https://lore.kernel.org/CAK8fFZ4DY+GtBA40Pm7Nn5xCHy+51w3sfxPqkqpqakSXYyX+Wg… Tested-by: Jaroslav Pulchart <jaroslav.pulchart(a)gooddata.com> Tested-by: Kalesh Singh <kaleshsingh(a)google.com> Cc: Hillf Danton <hdanton(a)sina.com> Cc: Kairui Song <ryncsn(a)gmail.com> Cc: T.J. Mercier <tjmercier(a)google.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/vmscan.c | 36 ++++++++++++++++++++++++++++-------- 1 file changed, 28 insertions(+), 8 deletions(-) --- a/mm/vmscan.c~mm-mglru-try-to-stop-at-high-watermarks +++ a/mm/vmscan.c @@ -4648,20 +4648,41 @@ static long get_nr_to_scan(struct lruvec return try_to_inc_max_seq(lruvec, max_seq, sc, can_swap, false) ? -1 : 0; } -static unsigned long get_nr_to_reclaim(struct scan_control *sc) +static bool should_abort_scan(struct lruvec *lruvec, struct scan_control *sc) { + int i; + enum zone_watermarks mark; + /* don't abort memcg reclaim to ensure fairness */ if (!root_reclaim(sc)) - return -1; + return false; + + if (sc->nr_reclaimed >= max(sc->nr_to_reclaim, compact_gap(sc->order))) + return true; + + /* check the order to exclude compaction-induced reclaim */ + if (!current_is_kswapd() || sc->order) + return false; + + mark = sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING ? + WMARK_PROMO : WMARK_HIGH; + + for (i = 0; i <= sc->reclaim_idx; i++) { + struct zone *zone = lruvec_pgdat(lruvec)->node_zones + i; + unsigned long size = wmark_pages(zone, mark) + MIN_LRU_BATCH; + + if (managed_zone(zone) && !zone_watermark_ok(zone, 0, size, sc->reclaim_idx, 0)) + return false; + } - return max(sc->nr_to_reclaim, compact_gap(sc->order)); + /* kswapd should abort if all eligible zones are safe */ + return true; } static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc) { long nr_to_scan; unsigned long scanned = 0; - unsigned long nr_to_reclaim = get_nr_to_reclaim(sc); int swappiness = get_swappiness(lruvec, sc); /* clean file folios are more likely to exist */ @@ -4683,7 +4704,7 @@ static bool try_to_shrink_lruvec(struct if (scanned >= nr_to_scan) break; - if (sc->nr_reclaimed >= nr_to_reclaim) + if (should_abort_scan(lruvec, sc)) break; cond_resched(); @@ -4744,7 +4765,6 @@ static void shrink_many(struct pglist_da struct lru_gen_folio *lrugen; struct mem_cgroup *memcg; const struct hlist_nulls_node *pos; - unsigned long nr_to_reclaim = get_nr_to_reclaim(sc); bin = first_bin = get_random_u32_below(MEMCG_NR_BINS); restart: @@ -4777,7 +4797,7 @@ restart: rcu_read_lock(); - if (sc->nr_reclaimed >= nr_to_reclaim) + if (should_abort_scan(lruvec, sc)) break; } @@ -4788,7 +4808,7 @@ restart: mem_cgroup_put(memcg); - if (sc->nr_reclaimed >= nr_to_reclaim) + if (!is_a_nulls(pos)) return; /* restart if raced with lru_gen_rotate_memcg() */ _ Patches currently in -mm which might be from yuzhao(a)google.com are

2 years

1
0
0 0

[merged mm-hotfixes-stable] mm-mglru-fix-underprotected-page-cache.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: mm/mglru: fix underprotected page cache has been removed from the -mm tree. Its filename was mm-mglru-fix-underprotected-page-cache.patch This patch was dropped because it was merged into the mm-hotfixes-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Yu Zhao <yuzhao(a)google.com> Subject: mm/mglru: fix underprotected page cache Date: Thu, 7 Dec 2023 23:14:04 -0700 Unmapped folios accessed through file descriptors can be underprotected. Those folios are added to the oldest generation based on: 1. The fact that they are less costly to reclaim (no need to walk the rmap and flush the TLB) and have less impact on performance (don't cause major PFs and can be non-blocking if needed again). 2. The observation that they are likely to be single-use. E.g., for client use cases like Android, its apps parse configuration files and store the data in heap (anon); for server use cases like MySQL, it reads from InnoDB files and holds the cached data for tables in buffer pools (anon). However, the oldest generation can be very short lived, and if so, it doesn't provide the PID controller with enough time to respond to a surge of refaults. (Note that the PID controller uses weighted refaults and those from evicted generations only take a half of the whole weight.) In other words, for a short lived generation, the moving average smooths out the spike quickly. To fix the problem: 1. For folios that are already on LRU, if they can be beyond the tracking range of tiers, i.e., five accesses through file descriptors, move them to the second oldest generation to give them more time to age. (Note that tiers are used by the PID controller to statistically determine whether folios accessed multiple times through file descriptors are worth protecting.) 2. When adding unmapped folios to LRU, adjust the placement of them so that they are not too close to the tail. The effect of this is similar to the above. On Android, launching 55 apps sequentially: Before After Change workingset_refault_anon 25641024 25598972 0% workingset_refault_file 115016834 106178438 -8% Link: https://lkml.kernel.org/r/20231208061407.2125867-1-yuzhao@google.com Fixes: ac35a4902374 ("mm: multi-gen LRU: minimal implementation") Signed-off-by: Yu Zhao <yuzhao(a)google.com> Reported-by: Charan Teja Kalla <quic_charante(a)quicinc.com> Tested-by: Kalesh Singh <kaleshsingh(a)google.com> Cc: T.J. Mercier <tjmercier(a)google.com> Cc: Kairui Song <ryncsn(a)gmail.com> Cc: Hillf Danton <hdanton(a)sina.com> Cc: Jaroslav Pulchart <jaroslav.pulchart(a)gooddata.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- include/linux/mm_inline.h | 23 ++++++++++++++--------- mm/vmscan.c | 2 +- mm/workingset.c | 6 +++--- 3 files changed, 18 insertions(+), 13 deletions(-) --- a/include/linux/mm_inline.h~mm-mglru-fix-underprotected-page-cache +++ a/include/linux/mm_inline.h @@ -232,22 +232,27 @@ static inline bool lru_gen_add_folio(str if (folio_test_unevictable(folio) || !lrugen->enabled) return false; /* - * There are three common cases for this page: - * 1. If it's hot, e.g., freshly faulted in or previously hot and - * migrated, add it to the youngest generation. - * 2. If it's cold but can't be evicted immediately, i.e., an anon page - * not in swapcache or a dirty page pending writeback, add it to the - * second oldest generation. - * 3. Everything else (clean, cold) is added to the oldest generation. + * There are four common cases for this page: + * 1. If it's hot, i.e., freshly faulted in, add it to the youngest + * generation, and it's protected over the rest below. + * 2. If it can't be evicted immediately, i.e., a dirty page pending + * writeback, add it to the second youngest generation. + * 3. If it should be evicted first, e.g., cold and clean from + * folio_rotate_reclaimable(), add it to the oldest generation. + * 4. Everything else falls between 2 & 3 above and is added to the + * second oldest generation if it's considered inactive, or the + * oldest generation otherwise. See lru_gen_is_active(). */ if (folio_test_active(folio)) seq = lrugen->max_seq; else if ((type == LRU_GEN_ANON && !folio_test_swapcache(folio)) || (folio_test_reclaim(folio) && (folio_test_dirty(folio) || folio_test_writeback(folio)))) - seq = lrugen->min_seq[type] + 1; - else + seq = lrugen->max_seq - 1; + else if (reclaiming || lrugen->min_seq[type] + MIN_NR_GENS >= lrugen->max_seq) seq = lrugen->min_seq[type]; + else + seq = lrugen->min_seq[type] + 1; gen = lru_gen_from_seq(seq); flags = (gen + 1UL) << LRU_GEN_PGOFF; --- a/mm/vmscan.c~mm-mglru-fix-underprotected-page-cache +++ a/mm/vmscan.c @@ -4232,7 +4232,7 @@ static bool sort_folio(struct lruvec *lr } /* protected */ - if (tier > tier_idx) { + if (tier > tier_idx || refs == BIT(LRU_REFS_WIDTH)) { int hist = lru_hist_from_seq(lrugen->min_seq[type]); gen = folio_inc_gen(lruvec, folio, false); --- a/mm/workingset.c~mm-mglru-fix-underprotected-page-cache +++ a/mm/workingset.c @@ -313,10 +313,10 @@ static void lru_gen_refault(struct folio * 1. For pages accessed through page tables, hotter pages pushed out * hot pages which refaulted immediately. * 2. For pages accessed multiple times through file descriptors, - * numbers of accesses might have been out of the range. + * they would have been protected by sort_folio(). */ - if (lru_gen_in_fault() || refs == BIT(LRU_REFS_WIDTH)) { - folio_set_workingset(folio); + if (lru_gen_in_fault() || refs >= BIT(LRU_REFS_WIDTH) - 1) { + set_mask_bits(&folio->flags, 0, LRU_REFS_MASK | BIT(PG_workingset)); mod_lruvec_state(lruvec, WORKINGSET_RESTORE_BASE + type, delta); } unlock: _ Patches currently in -mm which might be from yuzhao(a)google.com are

2 years

1
0
0 0

[merged mm-hotfixes-stable] mm-shmem-fix-race-in-shmem_undo_range-w-thp.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: mm/shmem: fix race in shmem_undo_range w/THP has been removed from the -mm tree. Its filename was mm-shmem-fix-race-in-shmem_undo_range-w-thp.patch This patch was dropped because it was merged into the mm-hotfixes-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: David Stevens <stevensd(a)chromium.org> Subject: mm/shmem: fix race in shmem_undo_range w/THP Date: Tue, 18 Apr 2023 17:40:31 +0900 Split folios during the second loop of shmem_undo_range. It's not sufficient to only split folios when dealing with partial pages, since it's possible for a THP to be faulted in after that point. Calling truncate_inode_folio in that situation can result in throwing away data outside of the range being targeted. [akpm(a)linux-foundation.org: tidy up comment layout] Link: https://lkml.kernel.org/r/20230418084031.3439795-1-stevensd@google.com Fixes: b9a8a4195c7d ("truncate,shmem: Handle truncates that split large folios") Signed-off-by: David Stevens <stevensd(a)chromium.org> Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org> Cc: Suleiman Souhlal <suleiman(a)google.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/shmem.c | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-) --- a/mm/shmem.c~mm-shmem-fix-race-in-shmem_undo_range-w-thp +++ a/mm/shmem.c @@ -1080,7 +1080,24 @@ whole_folios: } VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio); - truncate_inode_folio(mapping, folio); + + if (!folio_test_large(folio)) { + truncate_inode_folio(mapping, folio); + } else if (truncate_inode_partial_folio(folio, lstart, lend)) { + /* + * If we split a page, reset the loop so + * that we pick up the new sub pages. + * Otherwise the THP was entirely + * dropped or the target range was + * zeroed, so just continue the loop as + * is. + */ + if (!folio_test_large(folio)) { + folio_unlock(folio); + index = start; + break; + } + } } folio_unlock(folio); } _ Patches currently in -mm which might be from stevensd(a)chromium.org are

2 years

1
0
0 0

[merged mm-hotfixes-stable] revert-selftests-error-out-if-kernel-header-files-are-not-yet-built.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: Revert "selftests: error out if kernel header files are not yet built" has been removed from the -mm tree. Its filename was revert-selftests-error-out-if-kernel-header-files-are-not-yet-built.patch This patch was dropped because it was merged into the mm-hotfixes-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: John Hubbard <jhubbard(a)nvidia.com> Subject: Revert "selftests: error out if kernel header files are not yet built" Date: Fri, 8 Dec 2023 18:01:44 -0800 This reverts commit 9fc96c7c19df ("selftests: error out if kernel header files are not yet built"). It turns out that requiring the kernel headers to be built as a prerequisite to building selftests, does not work in many cases. For example, Peter Zijlstra writes: "My biggest beef with the whole thing is that I simply do not want to use 'make headers', it doesn't work for me. I have a ton of output directories and I don't care to build tools into the output dirs, in fact some of them flat out refuse to work that way (bpf comes to mind)." [1] Therefore, stop erroring out on the selftests build. Additional patches will be required in order to change over to not requiring the kernel headers. [1] https://lore.kernel.org/20231208221007.GO28727@noisy.programming.kicks-ass.… Link: https://lkml.kernel.org/r/20231209020144.244759-1-jhubbard@nvidia.com Fixes: 9fc96c7c19df ("selftests: error out if kernel header files are not yet built") Signed-off-by: John Hubbard <jhubbard(a)nvidia.com> Cc: Anders Roxell <anders.roxell(a)linaro.org> Cc: Muhammad Usama Anjum <usama.anjum(a)collabora.com> Cc: David Hildenbrand <david(a)redhat.com> Cc: Peter Xu <peterx(a)redhat.com> Cc: Jonathan Corbet <corbet(a)lwn.net> Cc: Nathan Chancellor <nathan(a)kernel.org> Cc: Shuah Khan <shuah(a)kernel.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Cc: Marcos Paulo de Souza <mpdesouza(a)suse.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- tools/testing/selftests/Makefile | 21 --------------- tools/testing/selftests/lib.mk | 40 ++--------------------------- 2 files changed, 4 insertions(+), 57 deletions(-) --- a/tools/testing/selftests/lib.mk~revert-selftests-error-out-if-kernel-header-files-are-not-yet-built +++ a/tools/testing/selftests/lib.mk @@ -44,26 +44,10 @@ endif selfdir = $(realpath $(dir $(filter %/lib.mk,$(MAKEFILE_LIST)))) top_srcdir = $(selfdir)/../../.. -ifeq ("$(origin O)", "command line") - KBUILD_OUTPUT := $(O) +ifeq ($(KHDR_INCLUDES),) +KHDR_INCLUDES := -isystem $(top_srcdir)/usr/include endif -ifneq ($(KBUILD_OUTPUT),) - # Make's built-in functions such as $(abspath ...), $(realpath ...) cannot - # expand a shell special character '~'. We use a somewhat tedious way here. - abs_objtree := $(shell cd $(top_srcdir) && mkdir -p $(KBUILD_OUTPUT) && cd $(KBUILD_OUTPUT) && pwd) - $(if $(abs_objtree),, \ - $(error failed to create output directory "$(KBUILD_OUTPUT)")) - # $(realpath ...) resolves symlinks - abs_objtree := $(realpath $(abs_objtree)) - KHDR_DIR := ${abs_objtree}/usr/include -else - abs_srctree := $(shell cd $(top_srcdir) && pwd) - KHDR_DIR := ${abs_srctree}/usr/include -endif - -KHDR_INCLUDES := -isystem $(KHDR_DIR) - # The following are built by lib.mk common compile rules. # TEST_CUSTOM_PROGS should be used by tests that require # custom build rule and prevent common build rule use. @@ -74,25 +58,7 @@ TEST_GEN_PROGS := $(patsubst %,$(OUTPUT) TEST_GEN_PROGS_EXTENDED := $(patsubst %,$(OUTPUT)/%,$(TEST_GEN_PROGS_EXTENDED)) TEST_GEN_FILES := $(patsubst %,$(OUTPUT)/%,$(TEST_GEN_FILES)) -all: kernel_header_files $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED) \ - $(TEST_GEN_FILES) - -kernel_header_files: - @ls $(KHDR_DIR)/linux/*.h >/dev/null 2>/dev/null; \ - if [ $$? -ne 0 ]; then \ - RED='\033[1;31m'; \ - NOCOLOR='\033[0m'; \ - echo; \ - echo -e "$${RED}error$${NOCOLOR}: missing kernel header files."; \ - echo "Please run this and try again:"; \ - echo; \ - echo " cd $(top_srcdir)"; \ - echo " make headers"; \ - echo; \ - exit 1; \ - fi - -.PHONY: kernel_header_files +all: $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED) $(TEST_GEN_FILES) define RUN_TESTS BASE_DIR="$(selfdir)"; \ --- a/tools/testing/selftests/Makefile~revert-selftests-error-out-if-kernel-header-files-are-not-yet-built +++ a/tools/testing/selftests/Makefile @@ -155,12 +155,10 @@ ifneq ($(KBUILD_OUTPUT),) abs_objtree := $(realpath $(abs_objtree)) BUILD := $(abs_objtree)/kselftest KHDR_INCLUDES := -isystem ${abs_objtree}/usr/include - KHDR_DIR := ${abs_objtree}/usr/include else BUILD := $(CURDIR) abs_srctree := $(shell cd $(top_srcdir) && pwd) KHDR_INCLUDES := -isystem ${abs_srctree}/usr/include - KHDR_DIR := ${abs_srctree}/usr/include DEFAULT_INSTALL_HDR_PATH := 1 endif @@ -174,7 +172,7 @@ export KHDR_INCLUDES # all isn't the first target in the file. .DEFAULT_GOAL := all -all: kernel_header_files +all: @ret=1; \ for TARGET in $(TARGETS); do \ BUILD_TARGET=$$BUILD/$$TARGET; \ @@ -185,23 +183,6 @@ all: kernel_header_files ret=$$((ret * $$?)); \ done; exit $$ret; -kernel_header_files: - @ls $(KHDR_DIR)/linux/*.h >/dev/null 2>/dev/null; \ - if [ $$? -ne 0 ]; then \ - RED='\033[1;31m'; \ - NOCOLOR='\033[0m'; \ - echo; \ - echo -e "$${RED}error$${NOCOLOR}: missing kernel header files."; \ - echo "Please run this and try again:"; \ - echo; \ - echo " cd $(top_srcdir)"; \ - echo " make headers"; \ - echo; \ - exit 1; \ - fi - -.PHONY: kernel_header_files - run_tests: all @for TARGET in $(TARGETS); do \ BUILD_TARGET=$$BUILD/$$TARGET; \ _ Patches currently in -mm which might be from jhubbard(a)nvidia.com are

2 years

1
0
0 0

[merged mm-hotfixes-stable] mm-damon-core-make-damon_start-waits-until-kdamond_fn-starts.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: mm/damon/core: make damon_start() waits until kdamond_fn() starts has been removed from the -mm tree. Its filename was mm-damon-core-make-damon_start-waits-until-kdamond_fn-starts.patch This patch was dropped because it was merged into the mm-hotfixes-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: SeongJae Park <sj(a)kernel.org> Subject: mm/damon/core: make damon_start() waits until kdamond_fn() starts Date: Fri, 8 Dec 2023 17:50:18 +0000 The cleanup tasks of kdamond threads including reset of corresponding DAMON context's ->kdamond field and decrease of global nr_running_ctxs counter is supposed to be executed by kdamond_fn(). However, commit 0f91d13366a4 ("mm/damon: simplify stop mechanism") made neither damon_start() nor damon_stop() ensure the corresponding kdamond has started the execution of kdamond_fn(). As a result, the cleanup can be skipped if damon_stop() is called fast enough after the previous damon_start(). Especially the skipped reset of ->kdamond could cause a use-after-free. Fix it by waiting for start of kdamond_fn() execution from damon_start(). Link: https://lkml.kernel.org/r/20231208175018.63880-1-sj@kernel.org Fixes: 0f91d13366a4 ("mm/damon: simplify stop mechanism") Signed-off-by: SeongJae Park <sj(a)kernel.org> Reported-by: Jakub Acs <acsjakub(a)amazon.de> Cc: Changbin Du <changbin.du(a)intel.com> Cc: Jakub Acs <acsjakub(a)amazon.de> Cc: <stable(a)vger.kernel.org> # 5.15.x Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- include/linux/damon.h | 2 ++ mm/damon/core.c | 6 ++++++ 2 files changed, 8 insertions(+) --- a/include/linux/damon.h~mm-damon-core-make-damon_start-waits-until-kdamond_fn-starts +++ a/include/linux/damon.h @@ -559,6 +559,8 @@ struct damon_ctx { * update */ unsigned long next_ops_update_sis; + /* for waiting until the execution of the kdamond_fn is started */ + struct completion kdamond_started; /* public: */ struct task_struct *kdamond; --- a/mm/damon/core.c~mm-damon-core-make-damon_start-waits-until-kdamond_fn-starts +++ a/mm/damon/core.c @@ -445,6 +445,8 @@ struct damon_ctx *damon_new_ctx(void) if (!ctx) return NULL; + init_completion(&ctx->kdamond_started); + ctx->attrs.sample_interval = 5 * 1000; ctx->attrs.aggr_interval = 100 * 1000; ctx->attrs.ops_update_interval = 60 * 1000 * 1000; @@ -668,11 +670,14 @@ static int __damon_start(struct damon_ct mutex_lock(&ctx->kdamond_lock); if (!ctx->kdamond) { err = 0; + reinit_completion(&ctx->kdamond_started); ctx->kdamond = kthread_run(kdamond_fn, ctx, "kdamond.%d", nr_running_ctxs); if (IS_ERR(ctx->kdamond)) { err = PTR_ERR(ctx->kdamond); ctx->kdamond = NULL; + } else { + wait_for_completion(&ctx->kdamond_started); } } mutex_unlock(&ctx->kdamond_lock); @@ -1433,6 +1438,7 @@ static int kdamond_fn(void *data) pr_debug("kdamond (%d) starts\n", current->pid); + complete(&ctx->kdamond_started); kdamond_init_intervals_sis(ctx); if (ctx->ops.init) _ Patches currently in -mm which might be from sj(a)kernel.org are selftests-damon-implement-a-python-module-for-test-purpose-damon-sysfs-controls.patch selftests-damon-_damon_sysfs-implement-kdamonds-start-function.patch selftests-damon-_damon_sysfs-implement-updat_schemes_tried_bytes-command.patch selftests-damon-add-a-test-for-update_schemes_tried_regions-sysfs-command.patch selftests-damon-add-a-test-for-update_schemes_tried_regions-hang-bug.patch

2 years

1
0
0 0

[merged mm-hotfixes-stable] kexec-drop-dependency-on-arch_supports_kexec-from-crash_dump.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: kexec: drop dependency on ARCH_SUPPORTS_KEXEC from CRASH_DUMP has been removed from the -mm tree. Its filename was kexec-drop-dependency-on-arch_supports_kexec-from-crash_dump.patch This patch was dropped because it was merged into the mm-hotfixes-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Ignat Korchagin <ignat(a)cloudflare.com> Subject: kexec: drop dependency on ARCH_SUPPORTS_KEXEC from CRASH_DUMP Date: Wed, 29 Nov 2023 22:04:09 +0000 In commit f8ff23429c62 ("kernel/Kconfig.kexec: drop select of KEXEC for CRASH_DUMP") we tried to fix a config regression, where CONFIG_CRASH_DUMP required CONFIG_KEXEC. However, it was not enough at least for arm64 platforms. While further testing the patch with our arm64 config I noticed that CONFIG_CRASH_DUMP is unavailable in menuconfig. This is because CONFIG_CRASH_DUMP still depends on the new CONFIG_ARCH_SUPPORTS_KEXEC introduced in commit 91506f7e5d21 ("arm64/kexec: refactor for kernel/Kconfig.kexec") and on arm64 CONFIG_ARCH_SUPPORTS_KEXEC requires CONFIG_PM_SLEEP_SMP=y, which in turn requires either CONFIG_SUSPEND=y or CONFIG_HIBERNATION=y neither of which are set in our config. Given that we already established that CONFIG_KEXEC (which is a switch for kexec system call itself) is not required for CONFIG_CRASH_DUMP drop CONFIG_ARCH_SUPPORTS_KEXEC dependency as well. The arm64 kernel builds just fine with CONFIG_CRASH_DUMP=y and with both CONFIG_KEXEC=n and CONFIG_KEXEC_FILE=n after f8ff23429c62 ("kernel/Kconfig.kexec: drop select of KEXEC for CRASH_DUMP") and this patch are applied given that the necessary shared bits are included via CONFIG_KEXEC_CORE dependency. [bhe(a)redhat.com: don't export some symbols when CONFIG_MMU=n] Link: https://lkml.kernel.org/r/ZW03ODUKGGhP1ZGU@MiWiFi-R3L-srv [bhe(a)redhat.com: riscv, kexec: fix dependency of two items] Link: https://lkml.kernel.org/r/ZW04G/SKnhbE5mnX@MiWiFi-R3L-srv Link: https://lkml.kernel.org/r/20231129220409.55006-1-ignat@cloudflare.com Fixes: 91506f7e5d21 ("arm64/kexec: refactor for kernel/Kconfig.kexec") Signed-off-by: Ignat Korchagin <ignat(a)cloudflare.com> Signed-off-by: Baoquan He <bhe(a)redhat.com> Acked-by: Baoquan He <bhe(a)redhat.com> Cc: Alexander Gordeev <agordeev(a)linux.ibm.com> Cc: <stable(a)vger.kernel.org> # 6.6+: f8ff234: kernel/Kconfig.kexec: drop select of KEXEC for CRASH_DUMP Cc: <stable(a)vger.kernel.org> # 6.6+ Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- arch/riscv/Kconfig | 4 ++-- arch/riscv/kernel/crash_core.c | 4 +++- kernel/Kconfig.kexec | 1 - 3 files changed, 5 insertions(+), 4 deletions(-) --- a/arch/riscv/Kconfig~kexec-drop-dependency-on-arch_supports_kexec-from-crash_dump +++ a/arch/riscv/Kconfig @@ -685,7 +685,7 @@ config RISCV_BOOT_SPINWAIT If unsure what to do here, say N. config ARCH_SUPPORTS_KEXEC - def_bool MMU + def_bool y config ARCH_SELECTS_KEXEC def_bool y @@ -693,7 +693,7 @@ config ARCH_SELECTS_KEXEC select HOTPLUG_CPU if SMP config ARCH_SUPPORTS_KEXEC_FILE - def_bool 64BIT && MMU + def_bool 64BIT config ARCH_SELECTS_KEXEC_FILE def_bool y --- a/arch/riscv/kernel/crash_core.c~kexec-drop-dependency-on-arch_supports_kexec-from-crash_dump +++ a/arch/riscv/kernel/crash_core.c @@ -5,18 +5,20 @@ void arch_crash_save_vmcoreinfo(void) { - VMCOREINFO_NUMBER(VA_BITS); VMCOREINFO_NUMBER(phys_ram_base); vmcoreinfo_append_str("NUMBER(PAGE_OFFSET)=0x%lx\n", PAGE_OFFSET); vmcoreinfo_append_str("NUMBER(VMALLOC_START)=0x%lx\n", VMALLOC_START); vmcoreinfo_append_str("NUMBER(VMALLOC_END)=0x%lx\n", VMALLOC_END); +#ifdef CONFIG_MMU + VMCOREINFO_NUMBER(VA_BITS); vmcoreinfo_append_str("NUMBER(VMEMMAP_START)=0x%lx\n", VMEMMAP_START); vmcoreinfo_append_str("NUMBER(VMEMMAP_END)=0x%lx\n", VMEMMAP_END); #ifdef CONFIG_64BIT vmcoreinfo_append_str("NUMBER(MODULES_VADDR)=0x%lx\n", MODULES_VADDR); vmcoreinfo_append_str("NUMBER(MODULES_END)=0x%lx\n", MODULES_END); #endif +#endif vmcoreinfo_append_str("NUMBER(KERNEL_LINK_ADDR)=0x%lx\n", KERNEL_LINK_ADDR); vmcoreinfo_append_str("NUMBER(va_kernel_pa_offset)=0x%lx\n", kernel_map.va_kernel_pa_offset); --- a/kernel/Kconfig.kexec~kexec-drop-dependency-on-arch_supports_kexec-from-crash_dump +++ a/kernel/Kconfig.kexec @@ -94,7 +94,6 @@ config KEXEC_JUMP config CRASH_DUMP bool "kernel crash dumps" depends on ARCH_SUPPORTS_CRASH_DUMP - depends on ARCH_SUPPORTS_KEXEC select CRASH_CORE select KEXEC_CORE help _ Patches currently in -mm which might be from ignat(a)cloudflare.com are

2 years

1
0
0 0

[PATCH v2] PM: hibernate: use acquire/release ordering when compress/decompress image

by Hongchen Zhang

When we test S4(suspend to disk) on LoongArch 3A6000 platform, the test case sometimes fails. The dmesg log shows the following error: Invalid LZO compressed length After we dig into the code, we find out that: When compress/decompress the image, the synchronization operation between the control thread and the compress/decompress/crc thread uses relaxed ordering interface, which is unreliable, and the following situation may occur: CPU 0 CPU 1 save_image_lzo lzo_compress_threadfn atomic_set(&d->stop, 1); atomic_read(&data[thr].stop) data[thr].cmp = data[thr].cmp_len; WRITE data[thr].cmp_len Then CPU0 get a old cmp_len and write to disk. When cpu resume from S4, wrong cmp_len is loaded. To maintain data consistency between two threads, we should use the acquire/release ordering interface. So we change atomic_read/atomic_set to atomic_read_acquire/atomic_set_release. Fixes: 081a9d043c98 ("PM / Hibernate: Improve performance of LZO/plain hibernation, checksum image") Cc: stable(a)vger.kernel.org Signed-off-by: Hongchen Zhang <zhanghongchen(a)loongson.cn> Signed-off-by: Weihao Li <liweihao(a)loongson.cn> --- v1 -> v2: 1. add Cc: stable(a)vger.kernel.org in commit log 2. add Fixes: line in commit log --- kernel/power/swap.c | 38 +++++++++++++++++++------------------- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/kernel/power/swap.c b/kernel/power/swap.c index a2cb0babb5ec..d44f5937f1e5 100644 --- a/kernel/power/swap.c +++ b/kernel/power/swap.c @@ -606,11 +606,11 @@ static int crc32_threadfn(void *data) unsigned i; while (1) { - wait_event(d->go, atomic_read(&d->ready) || + wait_event(d->go, atomic_read_acquire(&d->ready) || kthread_should_stop()); if (kthread_should_stop()) { d->thr = NULL; - atomic_set(&d->stop, 1); + atomic_set_release(&d->stop, 1); wake_up(&d->done); break; } @@ -619,7 +619,7 @@ static int crc32_threadfn(void *data) for (i = 0; i < d->run_threads; i++) *d->crc32 = crc32_le(*d->crc32, d->unc[i], *d->unc_len[i]); - atomic_set(&d->stop, 1); + atomic_set_release(&d->stop, 1); wake_up(&d->done); } return 0; @@ -649,12 +649,12 @@ static int lzo_compress_threadfn(void *data) struct cmp_data *d = data; while (1) { - wait_event(d->go, atomic_read(&d->ready) || + wait_event(d->go, atomic_read_acquire(&d->ready) || kthread_should_stop()); if (kthread_should_stop()) { d->thr = NULL; d->ret = -1; - atomic_set(&d->stop, 1); + atomic_set_release(&d->stop, 1); wake_up(&d->done); break; } @@ -663,7 +663,7 @@ static int lzo_compress_threadfn(void *data) d->ret = lzo1x_1_compress(d->unc, d->unc_len, d->cmp + LZO_HEADER, &d->cmp_len, d->wrk); - atomic_set(&d->stop, 1); + atomic_set_release(&d->stop, 1); wake_up(&d->done); } return 0; @@ -798,7 +798,7 @@ static int save_image_lzo(struct swap_map_handle *handle, data[thr].unc_len = off; - atomic_set(&data[thr].ready, 1); + atomic_set_release(&data[thr].ready, 1); wake_up(&data[thr].go); } @@ -806,12 +806,12 @@ static int save_image_lzo(struct swap_map_handle *handle, break; crc->run_threads = thr; - atomic_set(&crc->ready, 1); + atomic_set_release(&crc->ready, 1); wake_up(&crc->go); for (run_threads = thr, thr = 0; thr < run_threads; thr++) { wait_event(data[thr].done, - atomic_read(&data[thr].stop)); + atomic_read_acquire(&data[thr].stop)); atomic_set(&data[thr].stop, 0); ret = data[thr].ret; @@ -850,7 +850,7 @@ static int save_image_lzo(struct swap_map_handle *handle, } } - wait_event(crc->done, atomic_read(&crc->stop)); + wait_event(crc->done, atomic_read_acquire(&crc->stop)); atomic_set(&crc->stop, 0); } @@ -1132,12 +1132,12 @@ static int lzo_decompress_threadfn(void *data) struct dec_data *d = data; while (1) { - wait_event(d->go, atomic_read(&d->ready) || + wait_event(d->go, atomic_read_acquire(&d->ready) || kthread_should_stop()); if (kthread_should_stop()) { d->thr = NULL; d->ret = -1; - atomic_set(&d->stop, 1); + atomic_set_release(&d->stop, 1); wake_up(&d->done); break; } @@ -1150,7 +1150,7 @@ static int lzo_decompress_threadfn(void *data) flush_icache_range((unsigned long)d->unc, (unsigned long)d->unc + d->unc_len); - atomic_set(&d->stop, 1); + atomic_set_release(&d->stop, 1); wake_up(&d->done); } return 0; @@ -1335,7 +1335,7 @@ static int load_image_lzo(struct swap_map_handle *handle, } if (crc->run_threads) { - wait_event(crc->done, atomic_read(&crc->stop)); + wait_event(crc->done, atomic_read_acquire(&crc->stop)); atomic_set(&crc->stop, 0); crc->run_threads = 0; } @@ -1371,7 +1371,7 @@ static int load_image_lzo(struct swap_map_handle *handle, pg = 0; } - atomic_set(&data[thr].ready, 1); + atomic_set_release(&data[thr].ready, 1); wake_up(&data[thr].go); } @@ -1390,7 +1390,7 @@ static int load_image_lzo(struct swap_map_handle *handle, for (run_threads = thr, thr = 0; thr < run_threads; thr++) { wait_event(data[thr].done, - atomic_read(&data[thr].stop)); + atomic_read_acquire(&data[thr].stop)); atomic_set(&data[thr].stop, 0); ret = data[thr].ret; @@ -1421,7 +1421,7 @@ static int load_image_lzo(struct swap_map_handle *handle, ret = snapshot_write_next(snapshot); if (ret <= 0) { crc->run_threads = thr + 1; - atomic_set(&crc->ready, 1); + atomic_set_release(&crc->ready, 1); wake_up(&crc->go); goto out_finish; } @@ -1429,13 +1429,13 @@ static int load_image_lzo(struct swap_map_handle *handle, } crc->run_threads = thr; - atomic_set(&crc->ready, 1); + atomic_set_release(&crc->ready, 1); wake_up(&crc->go); } out_finish: if (crc->run_threads) { - wait_event(crc->done, atomic_read(&crc->stop)); + wait_event(crc->done, atomic_read_acquire(&crc->stop)); atomic_set(&crc->stop, 0); } stop = ktime_get(); -- 2.33.0

2 years

2
2
0 0

[PATCH 5.4 00/67] 5.4.264-rc1 review

by Greg Kroah-Hartman

This is the start of the stable review cycle for the 5.4.264 release. There are 67 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know. Responses should be made by Wed, 13 Dec 2023 18:19:59 +0000. Anything received after that time might be too late. The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.264-rc… or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y and the diffstat can be found below. thanks, greg k-h ------------- Pseudo-Shortlog of commits: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Linux 5.4.264-rc1 Mukesh Ojha <quic_mojha(a)quicinc.com> devcoredump: Send uevent once devcd is ready Mukesh Ojha <quic_mojha(a)quicinc.com> devcoredump : Serialize devcd_del work Paulo Alcantara <pc(a)manguebit.com> smb: client: fix potential NULL deref in parse_dfs_referrals() David Howells <dhowells(a)redhat.com> cifs: Fix non-availability of dedup breaking generic/304 Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Revert "btrfs: add dmesg output for first mount and last unmount of a filesystem" Namhyung Kim <namhyung(a)kernel.org> tools headers UAPI: Sync linux/perf_event.h with the kernel sources Ido Schimmel <idosch(a)nvidia.com> drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group Ido Schimmel <idosch(a)nvidia.com> psample: Require 'CAP_NET_ADMIN' when joining "packets" group Ido Schimmel <idosch(a)nvidia.com> genetlink: add CAP_NET_ADMIN test for multicast bind Ido Schimmel <idosch(a)nvidia.com> netlink: don't call ->netlink_bind with table lock held Pavel Begunkov <asml.silence(a)gmail.com> io_uring/af_unix: disable sending io_uring over sockets Ryusuke Konishi <konishi.ryusuke(a)gmail.com> nilfs2: fix missing error check for sb_set_blocksize call Claudio Imbrenda <imbrenda(a)linux.ibm.com> KVM: s390/mm: Properly reset no-dat Borislav Petkov (AMD) <bp(a)alien8.de> x86/CPU/AMD: Check vendor in the AMD microcode callback Ronald Wahl <ronald.wahl(a)raritan.com> serial: 8250_omap: Add earlycon support for the AM654 UART controller Daniel Mack <daniel(a)zonque.org> serial: sc16is7xx: address RX timeout interrupt errata Arnd Bergmann <arnd(a)arndb.de> ARM: PL011: Fix DMA support RD Babiera <rdbabiera(a)google.com> usb: typec: class: fix typec_altmode_put_partner to put plugs Cameron Williams <cang1(a)live.co.uk> parport: Add support for Brainboxes IX/UC/PX parallel cards Konstantin Aladyshev <aladyshev22(a)gmail.com> usb: gadget: f_hid: fix report descriptor allocation Wenchao Chen <wenchao.chen(a)unisoc.com> mmc: sdhci-sprd: Fix vqmmc not shutting down after the card was pulled Heiner Kallweit <hkallweit1(a)gmail.com> mmc: core: add helpers mmc_regulator_enable/disable_vqmmc Boerge Struempfel <boerge.struempfel(a)gmail.com> gpiolib: sysfs: Fix error handling on failed export Peter Zijlstra <peterz(a)infradead.org> perf: Fix perf_event_validate_size() Namhyung Kim <namhyung(a)kernel.org> perf/core: Add a new read format to get a number of lost samples AngeloGioacchino Del Regno <angelogioacchino.delregno(a)collabora.com> arm64: dts: mediatek: mt8173-evb: Fix regulator-fixed node names Eugen Hristev <eugen.hristev(a)collabora.com> arm64: dts: mediatek: mt7622: fix memory node warning check Daniel Borkmann <daniel(a)iogearbox.net> packet: Move reference count in packet_sock to atomic_long_t Petr Pavlu <petr.pavlu(a)suse.com> tracing: Fix a possible race when disabling buffered events Petr Pavlu <petr.pavlu(a)suse.com> tracing: Fix incomplete locking when disabling buffered events Steven Rostedt (Google) <rostedt(a)goodmis.org> tracing: Always update snapshot buffer size Ryusuke Konishi <konishi.ryusuke(a)gmail.com> nilfs2: prevent WARNING in nilfs_sufile_set_segment_usage() Jason Zhang <jason.zhang(a)rock-chips.com> ALSA: pcm: fix out-of-bounds in snd_pcm_state_names Philipp Zabel <p.zabel(a)pengutronix.de> ARM: dts: imx7: Declare timers compatible with fsl,imx6dl-gpt Anson Huang <Anson.Huang(a)nxp.com> ARM: dts: imx: make gpt node name generic Kunwu Chan <chentao(a)kylinos.cn> ARM: imx: Check return value of devm_kasprintf in imx_mmdc_perf_init Dinghao Liu <dinghao.liu(a)zju.edu.cn> scsi: be2iscsi: Fix a memleak in beiscsi_init_wrb_handle() Petr Pavlu <petr.pavlu(a)suse.com> tracing: Fix a warning when allocating buffered events fails Dinghao Liu <dinghao.liu(a)zju.edu.cn> ASoC: wm_adsp: fix memleak in wm_adsp_buffer_populate Armin Wolf <W_Armin(a)gmx.de> hwmon: (acpi_power_meter) Fix 4.29 MW bug Kalesh AP <kalesh-anakkur.purayil(a)broadcom.com> RDMA/bnxt_re: Correct module description string John Fastabend <john.fastabend(a)gmail.com> bpf: sockmap, updating the sg structure should also update curr Eric Dumazet <edumazet(a)google.com> tcp: do not accept ACK of bytes we never sent Phil Sutter <phil(a)nwl.cc> netfilter: xt_owner: Fix for unsafe access of sk->sk_socket Yonglong Liu <liuyonglong(a)huawei.com> net: hns: fix fake link up on xge port Shigeru Yoshida <syoshida(a)redhat.com> ipv4: ip_gre: Avoid skb_pull() failure in ipgre_xmit() Thomas Reichinger <thomas.reichinger(a)sohard.de> arcnet: restoring support for multiple Sohard Arcnet cards Tong Zhang <ztong0001(a)gmail.com> net: arcnet: com20020 fix error handling Ahmed S. Darwish <a.darwish(a)linutronix.de> net: arcnet: Fix RESET flag handling Randy Dunlap <rdunlap(a)infradead.org> hv_netvsc: rndis_filter needs to select NLS Eric Dumazet <edumazet(a)google.com> ipv6: fix potential NULL deref in fib6_add() Luca Ceresoli <luca.ceresoli(a)bootlin.com> of: dynamic: Fix of_reconfig_get_state_change() return value documentation Rob Herring <robh(a)kernel.org> of: Add missing 'Return' section in kerneldoc comments Rob Herring <robh(a)kernel.org> of: Fix kerneldoc output formatting Lee Jones <lee.jones(a)linaro.org> of: base: Fix some formatting issues and provide missing descriptions Lorenzo Pieralisi <lorenzo.pieralisi(a)arm.com> of/irq: Make of_msi_map_rid() PCI bus agnostic Diana Craciun <diana.craciun(a)oss.nxp.com> of/irq: make of_msi_map_get_device_domain() bus agnostic Lorenzo Pieralisi <lorenzo.pieralisi(a)arm.com> of/iommu: Make of_map_rid() PCI agnostic Lorenzo Pieralisi <lorenzo.pieralisi(a)arm.com> ACPI/IORT: Make iort_msi_map_rid() PCI agnostic Lorenzo Pieralisi <lorenzo.pieralisi(a)arm.com> ACPI/IORT: Make iort_get_device_domain IRQ domain agnostic Ulf Hansson <ulf.hansson(a)linaro.org> of: base: Add of_get_cpu_state_node() to get idle states for a CPU node YuanShang <YuanShang.Mao(a)amd.com> drm/amdgpu: correct chunk_ptr to a pointer to chunk. Masahiro Yamada <masahiroy(a)kernel.org> kconfig: fix memory leak from range properties Alex Pakhunov <alexey.pakhunov(a)spacex.com> tg3: Increment tx_dropped in tg3_tso_bug() Alex Pakhunov <alexey.pakhunov(a)spacex.com> tg3: Move the [rt]x_dropped counters to tg3_napi Jozsef Kadlecsik <kadlec(a)netfilter.org> netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test Thomas Gleixner <tglx(a)linutronix.de> hrtimers: Push pending hrtimers away from outgoing CPU earlier ------------- Diffstat: Makefile | 4 +- arch/arm/boot/dts/imx6qdl.dtsi | 2 +- arch/arm/boot/dts/imx6sl.dtsi | 2 +- arch/arm/boot/dts/imx6sx.dtsi | 2 +- arch/arm/boot/dts/imx6ul.dtsi | 4 +- arch/arm/boot/dts/imx7s.dtsi | 16 +- arch/arm/mach-imx/mmdc.c | 7 +- .../boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts | 2 +- arch/arm64/boot/dts/mediatek/mt7622-rfb1.dts | 2 +- arch/arm64/boot/dts/mediatek/mt8173-evb.dts | 4 +- arch/s390/mm/pgtable.c | 2 +- arch/x86/kernel/cpu/amd.c | 3 + drivers/acpi/arm64/iort.c | 26 +- drivers/base/devcoredump.c | 86 ++++- drivers/gpio/gpiolib-sysfs.c | 15 +- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- drivers/hwmon/acpi_power_meter.c | 4 + drivers/infiniband/hw/bnxt_re/main.c | 2 +- drivers/iommu/of_iommu.c | 4 +- drivers/mmc/core/regulator.c | 41 +++ drivers/mmc/host/sdhci-sprd.c | 25 ++ drivers/net/arcnet/arc-rimi.c | 4 +- drivers/net/arcnet/arcdevice.h | 8 + drivers/net/arcnet/arcnet.c | 66 +++- drivers/net/arcnet/com20020-isa.c | 4 +- drivers/net/arcnet/com20020-pci.c | 119 ++++--- drivers/net/arcnet/com20020_cs.c | 2 +- drivers/net/arcnet/com90io.c | 4 +- drivers/net/arcnet/com90xx.c | 4 +- drivers/net/ethernet/broadcom/tg3.c | 42 ++- drivers/net/ethernet/broadcom/tg3.h | 4 +- drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c | 29 ++ drivers/net/hyperv/Kconfig | 1 + drivers/of/base.c | 394 ++++++++++++--------- drivers/of/dynamic.c | 22 +- drivers/of/fdt.c | 17 +- drivers/of/irq.c | 46 +-- drivers/of/overlay.c | 16 +- drivers/of/platform.c | 10 +- drivers/of/property.c | 66 ++-- drivers/parport/parport_pc.c | 21 ++ drivers/pci/msi.c | 9 +- drivers/scsi/be2iscsi/be_main.c | 1 + drivers/tty/serial/8250/8250_early.c | 1 + drivers/tty/serial/amba-pl011.c | 112 +++--- drivers/tty/serial/sc16is7xx.c | 12 + drivers/usb/gadget/function/f_hid.c | 7 +- drivers/usb/typec/class.c | 5 +- fs/btrfs/disk-io.c | 1 - fs/btrfs/super.c | 5 +- fs/cifs/cifsfs.c | 4 +- fs/cifs/smb2ops.c | 2 + fs/io_uring.c | 101 +----- fs/nilfs2/sufile.c | 44 ++- fs/nilfs2/the_nilfs.c | 6 +- include/linux/acpi_iort.h | 13 +- include/linux/cpuhotplug.h | 1 + include/linux/hrtimer.h | 4 +- include/linux/mmc/host.h | 3 + include/linux/of.h | 75 ++-- include/linux/of_irq.h | 13 +- include/linux/perf_event.h | 2 + include/net/genetlink.h | 3 + include/uapi/linux/perf_event.h | 5 +- kernel/cpu.c | 8 +- kernel/events/core.c | 80 +++-- kernel/events/ring_buffer.c | 5 +- kernel/time/hrtimer.c | 33 +- kernel/trace/trace.c | 42 ++- net/core/drop_monitor.c | 4 +- net/core/filter.c | 19 + net/core/scm.c | 6 + net/ipv4/ip_gre.c | 11 +- net/ipv4/tcp_input.c | 6 +- net/ipv6/ip6_fib.c | 6 +- net/netfilter/ipset/ip_set_core.c | 14 +- net/netfilter/xt_owner.c | 16 +- net/netlink/af_netlink.c | 4 +- net/netlink/genetlink.c | 35 ++ net/packet/af_packet.c | 16 +- net/packet/internal.h | 2 +- net/psample/psample.c | 3 +- scripts/kconfig/symbol.c | 14 +- sound/core/pcm.c | 1 + sound/soc/codecs/wm_adsp.c | 8 +- tools/include/uapi/linux/perf_event.h | 5 +- 86 files changed, 1191 insertions(+), 710 deletions(-)

2 years

7
73
0 0

[PATCH v2 5/6] s390/vfio-ap: reset queues associated with adapter for queue unbound from driver

by Tony Krowiak

When a queue is unbound from the vfio_ap device driver, if that queue is assigned to a guest's AP configuration, its associated adapter is removed because queues are defined to a guest via a matrix of adapters and domains; so, it is not possible to remove a single queue. If an adapter is removed from the guest's AP configuration, all associated queues must be reset to prevent leaking crypto data should any of them be assigned to a different guest or device driver. The one caveat is that if the queue is being removed because the adapter or domain has been removed from the host's AP configuration, then an attempt to reset the queue will fail with response code 01, AP-queue number not valid; so resetting these queues should be skipped. Acked-by: Halil Pasic <pasic(a)linux.ibm.com> Signed-off-by: Tony Krowiak <akrowiak(a)linux.ibm.com> Fixes: 09d31ff78793 ("s390/vfio-ap: hot plug/unplug of AP devices when probed/removed") Cc: <stable(a)vger.kernel.org> --- drivers/s390/crypto/vfio_ap_ops.c | 78 ++++++++++++++++--------------- 1 file changed, 41 insertions(+), 37 deletions(-) diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 11f8f0bcc7ed..e014108067dc 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -935,45 +935,45 @@ static void vfio_ap_mdev_link_adapter(struct ap_matrix_mdev *matrix_mdev, AP_MKQID(apid, apqi)); } +static void collect_queues_to_reset(struct ap_matrix_mdev *matrix_mdev, + unsigned long apid, + struct list_head *qlist) +{ + struct vfio_ap_queue *q; + unsigned long apqi; + + for_each_set_bit_inv(apqi, matrix_mdev->shadow_apcb.aqm, AP_DOMAINS) { + q = vfio_ap_mdev_get_queue(matrix_mdev, AP_MKQID(apid, apqi)); + if (q) + list_add_tail(&q->reset_qnode, qlist); + } +} + +static void reset_queues_for_apid(struct ap_matrix_mdev *matrix_mdev, + unsigned long apid) +{ + struct list_head qlist; + + INIT_LIST_HEAD(&qlist); + collect_queues_to_reset(matrix_mdev, apid, &qlist); + vfio_ap_mdev_reset_qlist(&qlist); +} + static int reset_queues_for_apids(struct ap_matrix_mdev *matrix_mdev, unsigned long *apm_reset) { - struct vfio_ap_queue *q, *tmpq; struct list_head qlist; - unsigned long apid, apqi; - int apqn, ret = 0; + unsigned long apid; if (bitmap_empty(apm_reset, AP_DEVICES)) return 0; INIT_LIST_HEAD(&qlist); - for_each_set_bit_inv(apid, apm_reset, AP_DEVICES) { - for_each_set_bit_inv(apqi, matrix_mdev->shadow_apcb.aqm, - AP_DOMAINS) { - /* - * If the domain is not in the host's AP configuration, - * then resetting it will fail with response code 01 - * (APQN not valid). - */ - if (!test_bit_inv(apqi, - (unsigned long *)matrix_dev->info.aqm)) - continue; - - apqn = AP_MKQID(apid, apqi); - q = vfio_ap_mdev_get_queue(matrix_mdev, apqn); - - if (q) - list_add_tail(&q->reset_qnode, &qlist); - } - } + for_each_set_bit_inv(apid, apm_reset, AP_DEVICES) + collect_queues_to_reset(matrix_mdev, apid, &qlist); - ret = vfio_ap_mdev_reset_qlist(&qlist); - - list_for_each_entry_safe(q, tmpq, &qlist, reset_qnode) - list_del(&q->reset_qnode); - - return ret; + return vfio_ap_mdev_reset_qlist(&qlist); } /** @@ -2199,24 +2199,28 @@ void vfio_ap_mdev_remove_queue(struct ap_device *apdev) matrix_mdev = q->matrix_mdev; if (matrix_mdev) { - vfio_ap_unlink_queue_fr_mdev(q); - - apid = AP_QID_CARD(q->apqn); - apqi = AP_QID_QUEUE(q->apqn); - - /* - * If the queue is assigned to the guest's APCB, then remove - * the adapter's APID from the APCB and hot it into the guest. - */ + /* If the queue is assigned to the guest's AP configuration */ if (test_bit_inv(apid, matrix_mdev->shadow_apcb.apm) && test_bit_inv(apqi, matrix_mdev->shadow_apcb.aqm)) { + /* + * Since the queues are defined via a matrix of adapters + * and domains, it is not possible to hot unplug a + * single queue; so, let's unplug the adapter. + */ clear_bit_inv(apid, matrix_mdev->shadow_apcb.apm); vfio_ap_mdev_update_guest_apcb(matrix_mdev); + reset_queues_for_apid(matrix_mdev, apid); + goto done; } } vfio_ap_mdev_reset_queue(q); flush_work(&q->reset_work); + +done: + if (matrix_mdev) + vfio_ap_unlink_queue_fr_mdev(q); + dev_set_drvdata(&apdev->device, NULL); kfree(q); release_update_locks_for_mdev(matrix_mdev); -- 2.43.0

2 years

1
0
0 0

[PATCH v2 4/6] s390/vfio-ap: reset queues filtered from the guest's AP config

by Tony Krowiak

When filtering the adapters from the configuration profile for a guest to create or update a guest's AP configuration, if the APID of an adapter and the APQI of a domain identify a queue device that is not bound to the vfio_ap device driver, the APID of the adapter will be filtered because an individual APQN can not be filtered due to the fact the APQNs are assigned to an AP configuration as a matrix of APIDs and APQIs. Consequently, a guest will not have access to all of the queues associated with the filtered adapter. If the queues are subsequently made available again to the guest, they should re-appear in a reset state; so, let's make sure all queues associated with an adapter unplugged from the guest are reset. In order to identify the set of queues that need to be reset, let's allow a vfio_ap_queue object to be simultaneously stored in both a hashtable and a list: A hashtable used to store all of the queues assigned to a matrix mdev; and/or, a list used to store a subset of the queues that need to be reset. For example, when an adapter is hot unplugged from a guest, all guest queues associated with that adapter must be reset. Since that may be a subset of those assigned to the matrix mdev, they can be stored in a list that can be passed to the vfio_ap_mdev_reset_queues function. Signed-off-by: Tony Krowiak <akrowiak(a)linux.ibm.com> Acked-by: Halil Pasic <pasic(a)linux.ibm.com> Fixes: 48cae940c31d ("s390/vfio-ap: refresh guest's APCB by filtering AP resources assigned to mdev") Cc: <stable(a)vger.kernel.org> --- drivers/s390/crypto/vfio_ap_ops.c | 171 +++++++++++++++++++------- drivers/s390/crypto/vfio_ap_private.h | 11 +- 2 files changed, 133 insertions(+), 49 deletions(-) diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 26bd4aca497a..11f8f0bcc7ed 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -32,7 +32,8 @@ #define AP_RESET_INTERVAL 20 /* Reset sleep interval (20ms) */ -static int vfio_ap_mdev_reset_queues(struct ap_queue_table *qtable); +static int vfio_ap_mdev_reset_queues(struct ap_matrix_mdev *matrix_mdev); +static int vfio_ap_mdev_reset_qlist(struct list_head *qlist); static struct vfio_ap_queue *vfio_ap_find_queue(int apqn); static const struct vfio_device_ops vfio_ap_matrix_dev_ops; static void vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q); @@ -661,16 +662,23 @@ static bool vfio_ap_mdev_filter_cdoms(struct ap_matrix_mdev *matrix_mdev) * device driver. * * @matrix_mdev: the matrix mdev whose matrix is to be filtered. + * @apm_filtered: a 256-bit bitmap for storing the APIDs filtered from the + * guest's AP configuration that are still in the host's AP + * configuration. * * Note: If an APQN referencing a queue device that is not bound to the vfio_ap * driver, its APID will be filtered from the guest's APCB. The matrix * structure precludes filtering an individual APQN, so its APID will be - * filtered. + * filtered. Consequently, all queues associated with the adapter that + * are in the host's AP configuration must be reset. If queues are + * subsequently made available again to the guest, they should re-appear + * in a reset state * * Return: a boolean value indicating whether the KVM guest's APCB was changed * by the filtering or not. */ -static bool vfio_ap_mdev_filter_matrix(struct ap_matrix_mdev *matrix_mdev) +static bool vfio_ap_mdev_filter_matrix(struct ap_matrix_mdev *matrix_mdev, + unsigned long *apm_filtered) { unsigned long apid, apqi, apqn; DECLARE_BITMAP(prev_shadow_apm, AP_DEVICES); @@ -680,6 +688,7 @@ static bool vfio_ap_mdev_filter_matrix(struct ap_matrix_mdev *matrix_mdev) bitmap_copy(prev_shadow_apm, matrix_mdev->shadow_apcb.apm, AP_DEVICES); bitmap_copy(prev_shadow_aqm, matrix_mdev->shadow_apcb.aqm, AP_DOMAINS); vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->shadow_apcb); + bitmap_clear(apm_filtered, 0, AP_DEVICES); /* * Copy the adapters, domains and control domains to the shadow_apcb @@ -705,8 +714,16 @@ static bool vfio_ap_mdev_filter_matrix(struct ap_matrix_mdev *matrix_mdev) apqn = AP_MKQID(apid, apqi); q = vfio_ap_mdev_get_queue(matrix_mdev, apqn); if (!q || q->reset_status.response_code) { - clear_bit_inv(apid, - matrix_mdev->shadow_apcb.apm); + clear_bit_inv(apid, matrix_mdev->shadow_apcb.apm); + + /* + * If the adapter was previously plugged into + * the guest, let's let the caller know that + * the APID was filtered. + */ + if (test_bit_inv(apid, prev_shadow_apm)) + set_bit_inv(apid, apm_filtered); + break; } } @@ -808,7 +825,7 @@ static void vfio_ap_mdev_remove(struct mdev_device *mdev) mutex_lock(&matrix_dev->guests_lock); mutex_lock(&matrix_dev->mdevs_lock); - vfio_ap_mdev_reset_queues(&matrix_mdev->qtable); + vfio_ap_mdev_reset_queues(matrix_mdev); vfio_ap_mdev_unlink_fr_queues(matrix_mdev); list_del(&matrix_mdev->node); mutex_unlock(&matrix_dev->mdevs_lock); @@ -918,6 +935,47 @@ static void vfio_ap_mdev_link_adapter(struct ap_matrix_mdev *matrix_mdev, AP_MKQID(apid, apqi)); } +static int reset_queues_for_apids(struct ap_matrix_mdev *matrix_mdev, + unsigned long *apm_reset) +{ + struct vfio_ap_queue *q, *tmpq; + struct list_head qlist; + unsigned long apid, apqi; + int apqn, ret = 0; + + if (bitmap_empty(apm_reset, AP_DEVICES)) + return 0; + + INIT_LIST_HEAD(&qlist); + + for_each_set_bit_inv(apid, apm_reset, AP_DEVICES) { + for_each_set_bit_inv(apqi, matrix_mdev->shadow_apcb.aqm, + AP_DOMAINS) { + /* + * If the domain is not in the host's AP configuration, + * then resetting it will fail with response code 01 + * (APQN not valid). + */ + if (!test_bit_inv(apqi, + (unsigned long *)matrix_dev->info.aqm)) + continue; + + apqn = AP_MKQID(apid, apqi); + q = vfio_ap_mdev_get_queue(matrix_mdev, apqn); + + if (q) + list_add_tail(&q->reset_qnode, &qlist); + } + } + + ret = vfio_ap_mdev_reset_qlist(&qlist); + + list_for_each_entry_safe(q, tmpq, &qlist, reset_qnode) + list_del(&q->reset_qnode); + + return ret; +} + /** * assign_adapter_store - parses the APID from @buf and sets the * corresponding bit in the mediated matrix device's APM @@ -958,6 +1016,7 @@ static ssize_t assign_adapter_store(struct device *dev, { int ret; unsigned long apid; + DECLARE_BITMAP(apm_filtered, AP_DEVICES); struct ap_matrix_mdev *matrix_mdev = dev_get_drvdata(dev); mutex_lock(&ap_perms_mutex); @@ -987,8 +1046,10 @@ static ssize_t assign_adapter_store(struct device *dev, vfio_ap_mdev_link_adapter(matrix_mdev, apid); - if (vfio_ap_mdev_filter_matrix(matrix_mdev)) + if (vfio_ap_mdev_filter_matrix(matrix_mdev, apm_filtered)) { vfio_ap_mdev_update_guest_apcb(matrix_mdev); + reset_queues_for_apids(matrix_mdev, apm_filtered); + } ret = count; done: @@ -1019,11 +1080,12 @@ static struct vfio_ap_queue * adapter was assigned. * @matrix_mdev: the matrix mediated device to which the adapter was assigned. * @apid: the APID of the unassigned adapter. - * @qtable: table for storing queues associated with unassigned adapter. + * @qlist: list for storing queues associated with unassigned adapter that + * need to be reset. */ static void vfio_ap_mdev_unlink_adapter(struct ap_matrix_mdev *matrix_mdev, unsigned long apid, - struct ap_queue_table *qtable) + struct list_head *qlist) { unsigned long apqi; struct vfio_ap_queue *q; @@ -1031,11 +1093,10 @@ static void vfio_ap_mdev_unlink_adapter(struct ap_matrix_mdev *matrix_mdev, for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm, AP_DOMAINS) { q = vfio_ap_unlink_apqn_fr_mdev(matrix_mdev, apid, apqi); - if (q && qtable) { + if (q && qlist) { if (test_bit_inv(apid, matrix_mdev->shadow_apcb.apm) && test_bit_inv(apqi, matrix_mdev->shadow_apcb.aqm)) - hash_add(qtable->queues, &q->mdev_qnode, - q->apqn); + list_add_tail(&q->reset_qnode, qlist); } } } @@ -1043,26 +1104,23 @@ static void vfio_ap_mdev_unlink_adapter(struct ap_matrix_mdev *matrix_mdev, static void vfio_ap_mdev_hot_unplug_adapter(struct ap_matrix_mdev *matrix_mdev, unsigned long apid) { - int loop_cursor; - struct vfio_ap_queue *q; - struct ap_queue_table *qtable = kzalloc(sizeof(*qtable), GFP_KERNEL); + struct vfio_ap_queue *q, *tmpq; + struct list_head qlist; - hash_init(qtable->queues); - vfio_ap_mdev_unlink_adapter(matrix_mdev, apid, qtable); + INIT_LIST_HEAD(&qlist); + vfio_ap_mdev_unlink_adapter(matrix_mdev, apid, &qlist); if (test_bit_inv(apid, matrix_mdev->shadow_apcb.apm)) { clear_bit_inv(apid, matrix_mdev->shadow_apcb.apm); vfio_ap_mdev_update_guest_apcb(matrix_mdev); } - vfio_ap_mdev_reset_queues(qtable); + vfio_ap_mdev_reset_qlist(&qlist); - hash_for_each(qtable->queues, loop_cursor, q, mdev_qnode) { + list_for_each_entry_safe(q, tmpq, &qlist, reset_qnode) { vfio_ap_unlink_mdev_fr_queue(q); - hash_del(&q->mdev_qnode); + list_del(&q->reset_qnode); } - - kfree(qtable); } /** @@ -1163,6 +1221,7 @@ static ssize_t assign_domain_store(struct device *dev, { int ret; unsigned long apqi; + DECLARE_BITMAP(apm_filtered, AP_DEVICES); struct ap_matrix_mdev *matrix_mdev = dev_get_drvdata(dev); mutex_lock(&ap_perms_mutex); @@ -1192,8 +1251,10 @@ static ssize_t assign_domain_store(struct device *dev, vfio_ap_mdev_link_domain(matrix_mdev, apqi); - if (vfio_ap_mdev_filter_matrix(matrix_mdev)) + if (vfio_ap_mdev_filter_matrix(matrix_mdev, apm_filtered)) { vfio_ap_mdev_update_guest_apcb(matrix_mdev); + reset_queues_for_apids(matrix_mdev, apm_filtered); + } ret = count; done: @@ -1206,7 +1267,7 @@ static DEVICE_ATTR_WO(assign_domain); static void vfio_ap_mdev_unlink_domain(struct ap_matrix_mdev *matrix_mdev, unsigned long apqi, - struct ap_queue_table *qtable) + struct list_head *qlist) { unsigned long apid; struct vfio_ap_queue *q; @@ -1214,11 +1275,10 @@ static void vfio_ap_mdev_unlink_domain(struct ap_matrix_mdev *matrix_mdev, for_each_set_bit_inv(apid, matrix_mdev->matrix.apm, AP_DEVICES) { q = vfio_ap_unlink_apqn_fr_mdev(matrix_mdev, apid, apqi); - if (q && qtable) { + if (q && qlist) { if (test_bit_inv(apid, matrix_mdev->shadow_apcb.apm) && test_bit_inv(apqi, matrix_mdev->shadow_apcb.aqm)) - hash_add(qtable->queues, &q->mdev_qnode, - q->apqn); + list_add_tail(&q->reset_qnode, qlist); } } } @@ -1226,26 +1286,23 @@ static void vfio_ap_mdev_unlink_domain(struct ap_matrix_mdev *matrix_mdev, static void vfio_ap_mdev_hot_unplug_domain(struct ap_matrix_mdev *matrix_mdev, unsigned long apqi) { - int loop_cursor; - struct vfio_ap_queue *q; - struct ap_queue_table *qtable = kzalloc(sizeof(*qtable), GFP_KERNEL); + struct vfio_ap_queue *q, *tmpq; + struct list_head qlist; - hash_init(qtable->queues); - vfio_ap_mdev_unlink_domain(matrix_mdev, apqi, qtable); + INIT_LIST_HEAD(&qlist); + vfio_ap_mdev_unlink_domain(matrix_mdev, apqi, &qlist); if (test_bit_inv(apqi, matrix_mdev->shadow_apcb.aqm)) { clear_bit_inv(apqi, matrix_mdev->shadow_apcb.aqm); vfio_ap_mdev_update_guest_apcb(matrix_mdev); } - vfio_ap_mdev_reset_queues(qtable); + vfio_ap_mdev_reset_qlist(&qlist); - hash_for_each(qtable->queues, loop_cursor, q, mdev_qnode) { + list_for_each_entry_safe(q, tmpq, &qlist, reset_qnode) { vfio_ap_unlink_mdev_fr_queue(q); - hash_del(&q->mdev_qnode); + list_del(&q->reset_qnode); } - - kfree(qtable); } /** @@ -1600,7 +1657,7 @@ static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev) get_update_locks_for_kvm(kvm); kvm_arch_crypto_clear_masks(kvm); - vfio_ap_mdev_reset_queues(&matrix_mdev->qtable); + vfio_ap_mdev_reset_queues(matrix_mdev); kvm_put_kvm(kvm); matrix_mdev->kvm = NULL; @@ -1736,15 +1793,33 @@ static void vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q) } } -static int vfio_ap_mdev_reset_queues(struct ap_queue_table *qtable) +static int vfio_ap_mdev_reset_queues(struct ap_matrix_mdev *matrix_mdev) { int ret = 0, loop_cursor; struct vfio_ap_queue *q; - hash_for_each(qtable->queues, loop_cursor, q, mdev_qnode) + hash_for_each(matrix_mdev->qtable.queues, loop_cursor, q, mdev_qnode) vfio_ap_mdev_reset_queue(q); - hash_for_each(qtable->queues, loop_cursor, q, mdev_qnode) { + hash_for_each(matrix_mdev->qtable.queues, loop_cursor, q, mdev_qnode) { + flush_work(&q->reset_work); + + if (q->reset_status.response_code) + ret = -EIO; + } + + return ret; +} + +static int vfio_ap_mdev_reset_qlist(struct list_head *qlist) +{ + int ret = 0; + struct vfio_ap_queue *q; + + list_for_each_entry(q, qlist, reset_qnode) + vfio_ap_mdev_reset_queue(q); + + list_for_each_entry(q, qlist, reset_qnode) { flush_work(&q->reset_work); if (q->reset_status.response_code) @@ -1930,7 +2005,7 @@ static ssize_t vfio_ap_mdev_ioctl(struct vfio_device *vdev, ret = vfio_ap_mdev_get_device_info(arg); break; case VFIO_DEVICE_RESET: - ret = vfio_ap_mdev_reset_queues(&matrix_mdev->qtable); + ret = vfio_ap_mdev_reset_queues(matrix_mdev); break; case VFIO_DEVICE_GET_IRQ_INFO: ret = vfio_ap_get_irq_info(arg); @@ -2062,6 +2137,7 @@ int vfio_ap_mdev_probe_queue(struct ap_device *apdev) { int ret; struct vfio_ap_queue *q; + DECLARE_BITMAP(apm_filtered, AP_DEVICES); struct ap_matrix_mdev *matrix_mdev; ret = sysfs_create_group(&apdev->device.kobj, &vfio_queue_attr_group); @@ -2094,15 +2170,17 @@ int vfio_ap_mdev_probe_queue(struct ap_device *apdev) !bitmap_empty(matrix_mdev->aqm_add, AP_DOMAINS)) goto done; - if (vfio_ap_mdev_filter_matrix(matrix_mdev)) + if (vfio_ap_mdev_filter_matrix(matrix_mdev, apm_filtered)) { vfio_ap_mdev_update_guest_apcb(matrix_mdev); + reset_queues_for_apids(matrix_mdev, apm_filtered); + } } done: dev_set_drvdata(&apdev->device, q); release_update_locks_for_mdev(matrix_mdev); - return 0; + return ret; err_remove_group: sysfs_remove_group(&apdev->device.kobj, &vfio_queue_attr_group); @@ -2446,6 +2524,7 @@ void vfio_ap_on_cfg_changed(struct ap_config_info *cur_cfg_info, static void vfio_ap_mdev_hot_plug_cfg(struct ap_matrix_mdev *matrix_mdev) { + DECLARE_BITMAP(apm_filtered, AP_DEVICES); bool filter_domains, filter_adapters, filter_cdoms, do_hotplug = false; mutex_lock(&matrix_mdev->kvm->lock); @@ -2459,7 +2538,7 @@ static void vfio_ap_mdev_hot_plug_cfg(struct ap_matrix_mdev *matrix_mdev) matrix_mdev->adm_add, AP_DOMAINS); if (filter_adapters || filter_domains) - do_hotplug = vfio_ap_mdev_filter_matrix(matrix_mdev); + do_hotplug = vfio_ap_mdev_filter_matrix(matrix_mdev, apm_filtered); if (filter_cdoms) do_hotplug |= vfio_ap_mdev_filter_cdoms(matrix_mdev); @@ -2467,6 +2546,8 @@ static void vfio_ap_mdev_hot_plug_cfg(struct ap_matrix_mdev *matrix_mdev) if (do_hotplug) vfio_ap_mdev_update_guest_apcb(matrix_mdev); + reset_queues_for_apids(matrix_mdev, apm_filtered); + mutex_unlock(&matrix_dev->mdevs_lock); mutex_unlock(&matrix_mdev->kvm->lock); } diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h index 88aff8b81f2f..20eac8b0f0b9 100644 --- a/drivers/s390/crypto/vfio_ap_private.h +++ b/drivers/s390/crypto/vfio_ap_private.h @@ -83,10 +83,10 @@ struct ap_matrix { }; /** - * struct ap_queue_table - a table of queue objects. - * - * @queues: a hashtable of queues (struct vfio_ap_queue). - */ + * struct ap_queue_table - a table of queue objects. + * + * @queues: a hashtable of queues (struct vfio_ap_queue). + */ struct ap_queue_table { DECLARE_HASHTABLE(queues, 8); }; @@ -133,6 +133,8 @@ struct ap_matrix_mdev { * @apqn: the APQN of the AP queue device * @saved_isc: the guest ISC registered with the GIB interface * @mdev_qnode: allows the vfio_ap_queue struct to be added to a hashtable + * @reset_qnode: allows the vfio_ap_queue struct to be added to a list of queues + * that need to be reset * @reset_status: the status from the last reset of the queue * @reset_work: work to wait for queue reset to complete */ @@ -143,6 +145,7 @@ struct vfio_ap_queue { #define VFIO_AP_ISC_INVALID 0xff unsigned char saved_isc; struct hlist_node mdev_qnode; + struct list_head reset_qnode; struct ap_queue_status reset_status; struct work_struct reset_work; }; -- 2.43.0

2 years

1
0
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror December 2023