The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x f2d3155e2a6bac44d16f04415a321e8707d895c6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '167810000922119(a)kroah.com' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
f2d3155e2a6b ("KVM: s390: disable migration mode when dirty tracking is disabled")
ec5c86976674 ("KVM: s390: Skip gfn/size sanity checks on memslot DELETE or FLAGS_ONLY")
cf5b486922dc ("KVM: s390: Use "new" memslot instead of userspace memory region")
106ee47dc633 ("docs: kvm: Convert api.txt to ReST format")
6c972ba685d5 ("docs: kvm: convert devices/vm.txt to ReST")
aff7aeea5483 ("docs: kvm: convert devices/vfio.txt to ReST")
e777a5bd98c6 ("docs: kvm: convert devices/vcpu.txt to ReST")
e94474300361 ("docs: kvm: convert devices/s390_flic.txt to ReST")
05c47036c62e ("docs: kvm: convert devices/mpic.txt to ReST")
c0d1c8a0af59 ("docs: kvm: devices/arm-vgit-v3.txt to ReST")
d371c011fc5e ("docs: kvm: devices/arm-vgic-its.txt to ReST format")
7bd460fc1dfa ("docs: kvm: add arm/pvtime.rst to index.rst")
290a6bb06de9 ("arm64: KVM: Add UAPI notes for swapped registers")
22945688acd4 ("KVM: PPC: Book3S HV: Support reset of secure guest")
008e359c76d8 ("KVM: PPC: Book3S HV: Radix changes for secure guest")
60f0a643aa44 ("KVM: PPC: Book3S HV: Shared pages support for secure guests")
ca9f4942670c ("KVM: PPC: Book3S HV: Support for running secure guests")
a4b28f5c6798 ("Merge remote-tracking branch 'kvmarm/kvm-arm64/stolen-time' into kvmarm-master/next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f2d3155e2a6bac44d16f04415a321e8707d895c6 Mon Sep 17 00:00:00 2001
From: Nico Boehr <nrb(a)linux.ibm.com>
Date: Fri, 27 Jan 2023 15:05:32 +0100
Subject: [PATCH] KVM: s390: disable migration mode when dirty tracking is
disabled
Migration mode is a VM attribute which enables tracking of changes in
storage attributes (PGSTE). It assumes dirty tracking is enabled on all
memslots to keep a dirty bitmap of pages with changed storage attributes.
When enabling migration mode, we currently check that dirty tracking is
enabled for all memslots. However, userspace can disable dirty tracking
without disabling migration mode.
Since migration mode is pointless with dirty tracking disabled, disable
migration mode whenever userspace disables dirty tracking on any slot.
Also update the documentation to clarify that dirty tracking must be
enabled when enabling migration mode, which is already enforced by the
code in kvm_s390_vm_start_migration().
Also highlight in the documentation for KVM_S390_GET_CMMA_BITS that it
can now fail with -EINVAL when dirty tracking is disabled while
migration mode is on. Move all the error codes to a table so this stays
readable.
To disable migration mode, slots_lock should be held, which is taken
in kvm_set_memory_region() and thus held in
kvm_arch_prepare_memory_region().
Restructure the prepare code a bit so all the sanity checking is done
before disabling migration mode. This ensures migration mode isn't
disabled when some sanity check fails.
Cc: stable(a)vger.kernel.org
Fixes: 190df4a212a7 ("KVM: s390: CMMA tracking, ESSA emulation, migration mode")
Signed-off-by: Nico Boehr <nrb(a)linux.ibm.com>
Reviewed-by: Janosch Frank <frankja(a)linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda(a)linux.ibm.com>
Link: https://lore.kernel.org/r/20230127140532.230651-2-nrb@linux.ibm.com
Message-Id: <20230127140532.230651-2-nrb(a)linux.ibm.com>
[frankja(a)linux.ibm.com: fixed commit message typo, moved api.rst error table upwards]
Signed-off-by: Janosch Frank <frankja(a)linux.ibm.com>
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index deb494f759ed..8cd7fd05d53b 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -4449,6 +4449,18 @@ not holding a previously reported uncorrected error).
:Parameters: struct kvm_s390_cmma_log (in, out)
:Returns: 0 on success, a negative value on error
+Errors:
+
+ ====== =============================================================
+ ENOMEM not enough memory can be allocated to complete the task
+ ENXIO if CMMA is not enabled
+ EINVAL if KVM_S390_CMMA_PEEK is not set but migration mode was not enabled
+ EINVAL if KVM_S390_CMMA_PEEK is not set but dirty tracking has been
+ disabled (and thus migration mode was automatically disabled)
+ EFAULT if the userspace address is invalid or if no page table is
+ present for the addresses (e.g. when using hugepages).
+ ====== =============================================================
+
This ioctl is used to get the values of the CMMA bits on the s390
architecture. It is meant to be used in two scenarios:
@@ -4529,12 +4541,6 @@ mask is unused.
values points to the userspace buffer where the result will be stored.
-This ioctl can fail with -ENOMEM if not enough memory can be allocated to
-complete the task, with -ENXIO if CMMA is not enabled, with -EINVAL if
-KVM_S390_CMMA_PEEK is not set but migration mode was not enabled, with
--EFAULT if the userspace address is invalid or if no page table is
-present for the addresses (e.g. when using hugepages).
-
4.108 KVM_S390_SET_CMMA_BITS
----------------------------
diff --git a/Documentation/virt/kvm/devices/vm.rst b/Documentation/virt/kvm/devices/vm.rst
index 60acc39e0e93..147efec626e5 100644
--- a/Documentation/virt/kvm/devices/vm.rst
+++ b/Documentation/virt/kvm/devices/vm.rst
@@ -302,6 +302,10 @@ Allows userspace to start migration mode, needed for PGSTE migration.
Setting this attribute when migration mode is already active will have
no effects.
+Dirty tracking must be enabled on all memslots, else -EINVAL is returned. When
+dirty tracking is disabled on any memslot, migration mode is automatically
+stopped.
+
:Parameters: none
:Returns: -ENOMEM if there is not enough free memory to start migration mode;
-EINVAL if the state of the VM is invalid (e.g. no memory defined);
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index e4890e04b210..cb72f9a09fb3 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -5633,23 +5633,40 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
if (kvm_s390_pv_get_handle(kvm))
return -EINVAL;
- if (change == KVM_MR_DELETE || change == KVM_MR_FLAGS_ONLY)
- return 0;
+ if (change != KVM_MR_DELETE && change != KVM_MR_FLAGS_ONLY) {
+ /*
+ * A few sanity checks. We can have memory slots which have to be
+ * located/ended at a segment boundary (1MB). The memory in userland is
+ * ok to be fragmented into various different vmas. It is okay to mmap()
+ * and munmap() stuff in this slot after doing this call at any time
+ */
- /* A few sanity checks. We can have memory slots which have to be
- located/ended at a segment boundary (1MB). The memory in userland is
- ok to be fragmented into various different vmas. It is okay to mmap()
- and munmap() stuff in this slot after doing this call at any time */
+ if (new->userspace_addr & 0xffffful)
+ return -EINVAL;
- if (new->userspace_addr & 0xffffful)
- return -EINVAL;
+ size = new->npages * PAGE_SIZE;
+ if (size & 0xffffful)
+ return -EINVAL;
- size = new->npages * PAGE_SIZE;
- if (size & 0xffffful)
- return -EINVAL;
+ if ((new->base_gfn * PAGE_SIZE) + size > kvm->arch.mem_limit)
+ return -EINVAL;
+ }
- if ((new->base_gfn * PAGE_SIZE) + size > kvm->arch.mem_limit)
- return -EINVAL;
+ if (!kvm->arch.migration_mode)
+ return 0;
+
+ /*
+ * Turn off migration mode when:
+ * - userspace creates a new memslot with dirty logging off,
+ * - userspace modifies an existing memslot (MOVE or FLAGS_ONLY) and
+ * dirty logging is turned off.
+ * Migration mode expects dirty page logging being enabled to store
+ * its dirty bitmap.
+ */
+ if (change != KVM_MR_DELETE &&
+ !(new->flags & KVM_MEM_LOG_DIRTY_PAGES))
+ WARN(kvm_s390_vm_stop_migration(kvm),
+ "Failed to stop migration mode");
return 0;
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x af1c89ddb74f170eccd5a57001d7317560b638ea
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '167810226715267(a)kroah.com' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
af1c89ddb74f ("ARM: dts: exynos: correct HDMI phy compatible in Exynos4")
73a901d09a21 ("ARM: dts: exynos: Add soc node to exynos4")
9097b4bd9fce ("ARM: dts: exynos: Use pmu label in exynos4412")
e030be47ac48 ("ARM: dts: exynos: Use labels instead of full paths in exynos4210")
e58864515240 ("ARM: dts: exynos: Use pinctrl labels in exynos4412-pinctrl")
88c166cec136 ("ARM: dts: exynos: Use pinctrl labels in exynos4210-pinctrl")
3be1ecf291df ("ARM: dts: exynos: Use lower case hex addresses in node unit addresses")
6351fe9375f4 ("ARM: dts: exynos: Add G3D power domain to Exynos5250")
c0d40bb3ac71 ("ARM: dts: exynos: Add audio power domain to Exynos5250")
9fbb4c096323 ("ARM: dts: exynos: Fix power domain node names for Exynos5250")
528832d4c01a ("ARM: dts: exynos: Add audio power domain support to Exynos542x SoCs")
cdd745c8c76b ("ARM: dts: exynos: Remove duplicate definitions of SSS nodes for Exynos5")
78a68acf3d33 ("ARM: dts: exynos: Switch to dedicated Odroid XU3 sound card binding")
061ae5326613 ("Merge tag 'samsung-dt-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into next/soc")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From af1c89ddb74f170eccd5a57001d7317560b638ea Mon Sep 17 00:00:00 2001
From: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Date: Wed, 25 Jan 2023 10:45:05 +0100
Subject: [PATCH] ARM: dts: exynos: correct HDMI phy compatible in Exynos4
The HDMI phy compatible was missing vendor prefix.
Fixes: ed80d4cab772 ("ARM: dts: add hdmi related nodes for exynos4 SoCs")
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20230125094513.155063-1-krzysztof.kozlowski@linar…
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
diff --git a/arch/arm/boot/dts/exynos4.dtsi b/arch/arm/boot/dts/exynos4.dtsi
index 55afe9972460..d1adaee2af58 100644
--- a/arch/arm/boot/dts/exynos4.dtsi
+++ b/arch/arm/boot/dts/exynos4.dtsi
@@ -605,7 +605,7 @@ i2c_8: i2c@138e0000 {
status = "disabled";
hdmi_i2c_phy: hdmiphy@38 {
- compatible = "exynos4210-hdmiphy";
+ compatible = "samsung,exynos4210-hdmiphy";
reg = <0x38>;
};
};
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x b7625f461da6734a21c38ba6e7558538a116a2e3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '167810211413647(a)kroah.com' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
b7625f461da6 ("btrfs: sysfs: update fs features directory asynchronously")
85e79ec7b78f ("btrfs: zoned: enable metadata over-commit for non-ZNS setup")
c52cc7b7acfb ("btrfs: add a BTRFS_FS_NEED_TRANS_COMMIT flag")
7966a6b5959b ("btrfs: move fs_info::flags enum to fs.h")
fc97a410bd78 ("btrfs: move mount option definitions to fs.h")
0d3a9cf8c306 ("btrfs: convert incompat and compat flag test helpers to macros")
ec8eb376e271 ("btrfs: move BTRFS_FS_STATE* definitions and helpers to fs.h")
9b569ea0be6f ("btrfs: move the printk helpers out of ctree.h")
e118578a8df7 ("btrfs: move assert helpers out of ctree.h")
c7f13d428ea1 ("btrfs: move fs wide helpers out of ctree.h")
63a7cb130718 ("btrfs: auto enable discard=async when possible")
f1e5c6185ca1 ("btrfs: move flush related definitions to space-info.h")
4300c58f8090 ("btrfs: move btrfs on-disk definitions out of ctree.h")
d60d956eb41f ("btrfs: remove unused set/clear_pending_info helpers")
8bb808c6ad91 ("btrfs: don't print stack trace when transaction is aborted due to ENOMEM")
0e75f0054a2a ("btrfs: move fs_info forward declarations to the top of ctree.h")
2103da3b0e3a ("btrfs: move btrfs_swapfile_pin into volumes.h")
c2e79e865b87 ("btrfs: move btrfs_pinned_by_swapfile prototype into volumes.h")
43712116f8c8 ("btrfs: move btrfs_init_async_reclaim_work prototype to space-info.h")
83cf709a89fb ("btrfs: move extent state init and alloc functions to their own file")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b7625f461da6734a21c38ba6e7558538a116a2e3 Mon Sep 17 00:00:00 2001
From: Qu Wenruo <wqu(a)suse.com>
Date: Fri, 13 Jan 2023 19:11:39 +0800
Subject: [PATCH] btrfs: sysfs: update fs features directory asynchronously
[BUG]
Since the introduction of per-fs feature sysfs interface
(/sys/fs/btrfs/<UUID>/features/), the content of that directory is never
updated.
Thus for the following case, that directory will not show the new
features like RAID56:
# mkfs.btrfs -f $dev1 $dev2 $dev3
# mount $dev1 $mnt
# btrfs balance start -f -mconvert=raid5 $mnt
# ls /sys/fs/btrfs/$uuid/features/
extended_iref free_space_tree no_holes skinny_metadata
While after unmount and mount, we got the correct features:
# umount $mnt
# mount $dev1 $mnt
# ls /sys/fs/btrfs/$uuid/features/
extended_iref free_space_tree no_holes raid56 skinny_metadata
[CAUSE]
Because we never really try to update the content of per-fs features/
directory.
We had an attempt to update the features directory dynamically in commit
14e46e04958d ("btrfs: synchronize incompat feature bits with sysfs
files"), but unfortunately it get reverted in commit e410e34fad91
("Revert "btrfs: synchronize incompat feature bits with sysfs files"").
The problem in the original patch is, in the context of
btrfs_create_chunk(), we can not afford to update the sysfs group.
The exported but never utilized function, btrfs_sysfs_feature_update()
is the leftover of such attempt. As even if we go sysfs_update_group(),
new files will need extra memory allocation, and we have no way to
specify the sysfs update to go GFP_NOFS.
[FIX]
This patch will address the old problem by doing asynchronous sysfs
update in the cleaner thread.
This involves the following changes:
- Make __btrfs_(set|clear)_fs_(incompat|compat_ro) helpers to set
BTRFS_FS_FEATURE_CHANGED flag when needed
- Update btrfs_sysfs_feature_update() to use sysfs_update_group()
And drop unnecessary arguments.
- Call btrfs_sysfs_feature_update() in cleaner_kthread
If we have the BTRFS_FS_FEATURE_CHANGED flag set.
- Wake up cleaner_kthread in btrfs_commit_transaction if we have
BTRFS_FS_FEATURE_CHANGED flag
By this, all the previously dangerous call sites like
btrfs_create_chunk() need no new changes, as above helpers would
have already set the BTRFS_FS_FEATURE_CHANGED flag.
The real work happens at cleaner_kthread, thus we pay the cost of
delaying the update to sysfs directory, but the delayed time should be
small enough that end user can not distinguish though it might get
delayed if the cleaner thread is busy with removing subvolumes or
defrag.
CC: stable(a)vger.kernel.org # 4.14+
Reviewed-by: Anand Jain <anand.jain(a)oracle.com>
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 7586a8e9b718..a6f89ac1c086 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1914,6 +1914,9 @@ static int cleaner_kthread(void *arg)
goto sleep;
}
+ if (test_and_clear_bit(BTRFS_FS_FEATURE_CHANGED, &fs_info->flags))
+ btrfs_sysfs_feature_update(fs_info);
+
btrfs_run_delayed_iputs(fs_info);
again = btrfs_clean_one_deleted_snapshot(fs_info);
diff --git a/fs/btrfs/fs.c b/fs/btrfs/fs.c
index 5553e1f8afe8..31c1648bc0b4 100644
--- a/fs/btrfs/fs.c
+++ b/fs/btrfs/fs.c
@@ -24,6 +24,7 @@ void __btrfs_set_fs_incompat(struct btrfs_fs_info *fs_info, u64 flag,
name, flag);
}
spin_unlock(&fs_info->super_lock);
+ set_bit(BTRFS_FS_FEATURE_CHANGED, &fs_info->flags);
}
}
@@ -46,6 +47,7 @@ void __btrfs_clear_fs_incompat(struct btrfs_fs_info *fs_info, u64 flag,
name, flag);
}
spin_unlock(&fs_info->super_lock);
+ set_bit(BTRFS_FS_FEATURE_CHANGED, &fs_info->flags);
}
}
@@ -68,6 +70,7 @@ void __btrfs_set_fs_compat_ro(struct btrfs_fs_info *fs_info, u64 flag,
name, flag);
}
spin_unlock(&fs_info->super_lock);
+ set_bit(BTRFS_FS_FEATURE_CHANGED, &fs_info->flags);
}
}
@@ -90,5 +93,6 @@ void __btrfs_clear_fs_compat_ro(struct btrfs_fs_info *fs_info, u64 flag,
name, flag);
}
spin_unlock(&fs_info->super_lock);
+ set_bit(BTRFS_FS_FEATURE_CHANGED, &fs_info->flags);
}
}
diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index 37b86acfcbcf..3d8156fc8523 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -125,6 +125,12 @@ enum {
*/
BTRFS_FS_NO_OVERCOMMIT,
+ /*
+ * Indicate if we have some features changed, this is mostly for
+ * cleaner thread to update the sysfs interface.
+ */
+ BTRFS_FS_FEATURE_CHANGED,
+
#if BITS_PER_LONG == 32
/* Indicate if we have error/warn message printed on 32bit systems */
BTRFS_FS_32BIT_ERROR,
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 45615ce36498..108aa3876186 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -2272,36 +2272,23 @@ void btrfs_sysfs_del_one_qgroup(struct btrfs_fs_info *fs_info,
* Change per-fs features in /sys/fs/btrfs/UUID/features to match current
* values in superblock. Call after any changes to incompat/compat_ro flags
*/
-void btrfs_sysfs_feature_update(struct btrfs_fs_info *fs_info,
- u64 bit, enum btrfs_feature_set set)
+void btrfs_sysfs_feature_update(struct btrfs_fs_info *fs_info)
{
- struct btrfs_fs_devices *fs_devs;
struct kobject *fsid_kobj;
- u64 __maybe_unused features;
- int __maybe_unused ret;
+ int ret;
if (!fs_info)
return;
- /*
- * See 14e46e04958df74 and e410e34fad913dd, feature bit updates are not
- * safe when called from some contexts (eg. balance)
- */
- features = get_features(fs_info, set);
- ASSERT(bit & supported_feature_masks[set]);
-
- fs_devs = fs_info->fs_devices;
- fsid_kobj = &fs_devs->fsid_kobj;
-
+ fsid_kobj = &fs_info->fs_devices->fsid_kobj;
if (!fsid_kobj->state_initialized)
return;
- /*
- * FIXME: this is too heavy to update just one value, ideally we'd like
- * to use sysfs_update_group but some refactoring is needed first.
- */
- sysfs_remove_group(fsid_kobj, &btrfs_feature_attr_group);
- ret = sysfs_create_group(fsid_kobj, &btrfs_feature_attr_group);
+ ret = sysfs_update_group(fsid_kobj, &btrfs_feature_attr_group);
+ if (ret < 0)
+ btrfs_warn(fs_info,
+ "failed to update /sys/fs/btrfs/%pU/features: %d",
+ fs_info->fs_devices->fsid, ret);
}
int __init btrfs_init_sysfs(void)
diff --git a/fs/btrfs/sysfs.h b/fs/btrfs/sysfs.h
index bacef43f7267..86c7eef12873 100644
--- a/fs/btrfs/sysfs.h
+++ b/fs/btrfs/sysfs.h
@@ -19,8 +19,7 @@ void btrfs_sysfs_remove_device(struct btrfs_device *device);
int btrfs_sysfs_add_fsid(struct btrfs_fs_devices *fs_devs);
void btrfs_sysfs_remove_fsid(struct btrfs_fs_devices *fs_devs);
void btrfs_sysfs_update_sprout_fsid(struct btrfs_fs_devices *fs_devices);
-void btrfs_sysfs_feature_update(struct btrfs_fs_info *fs_info,
- u64 bit, enum btrfs_feature_set set);
+void btrfs_sysfs_feature_update(struct btrfs_fs_info *fs_info);
void btrfs_kobject_uevent(struct block_device *bdev, enum kobject_action action);
int __init btrfs_init_sysfs(void);
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 528efe559866..18329ebcb1cb 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -2464,6 +2464,11 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
wake_up(&fs_info->transaction_wait);
btrfs_trans_state_lockdep_release(fs_info, BTRFS_LOCKDEP_TRANS_UNBLOCKED);
+ /* If we have features changed, wake up the cleaner to update sysfs. */
+ if (test_bit(BTRFS_FS_FEATURE_CHANGED, &fs_info->flags) &&
+ fs_info->cleaner_kthread)
+ wake_up_process(fs_info->cleaner_kthread);
+
ret = btrfs_write_and_wait_transaction(trans);
if (ret) {
btrfs_handle_fs_error(fs_info, ret,
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x b7625f461da6734a21c38ba6e7558538a116a2e3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '167810211211536(a)kroah.com' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
b7625f461da6 ("btrfs: sysfs: update fs features directory asynchronously")
85e79ec7b78f ("btrfs: zoned: enable metadata over-commit for non-ZNS setup")
c52cc7b7acfb ("btrfs: add a BTRFS_FS_NEED_TRANS_COMMIT flag")
7966a6b5959b ("btrfs: move fs_info::flags enum to fs.h")
fc97a410bd78 ("btrfs: move mount option definitions to fs.h")
0d3a9cf8c306 ("btrfs: convert incompat and compat flag test helpers to macros")
ec8eb376e271 ("btrfs: move BTRFS_FS_STATE* definitions and helpers to fs.h")
9b569ea0be6f ("btrfs: move the printk helpers out of ctree.h")
e118578a8df7 ("btrfs: move assert helpers out of ctree.h")
c7f13d428ea1 ("btrfs: move fs wide helpers out of ctree.h")
63a7cb130718 ("btrfs: auto enable discard=async when possible")
f1e5c6185ca1 ("btrfs: move flush related definitions to space-info.h")
4300c58f8090 ("btrfs: move btrfs on-disk definitions out of ctree.h")
d60d956eb41f ("btrfs: remove unused set/clear_pending_info helpers")
8bb808c6ad91 ("btrfs: don't print stack trace when transaction is aborted due to ENOMEM")
0e75f0054a2a ("btrfs: move fs_info forward declarations to the top of ctree.h")
2103da3b0e3a ("btrfs: move btrfs_swapfile_pin into volumes.h")
c2e79e865b87 ("btrfs: move btrfs_pinned_by_swapfile prototype into volumes.h")
43712116f8c8 ("btrfs: move btrfs_init_async_reclaim_work prototype to space-info.h")
83cf709a89fb ("btrfs: move extent state init and alloc functions to their own file")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b7625f461da6734a21c38ba6e7558538a116a2e3 Mon Sep 17 00:00:00 2001
From: Qu Wenruo <wqu(a)suse.com>
Date: Fri, 13 Jan 2023 19:11:39 +0800
Subject: [PATCH] btrfs: sysfs: update fs features directory asynchronously
[BUG]
Since the introduction of per-fs feature sysfs interface
(/sys/fs/btrfs/<UUID>/features/), the content of that directory is never
updated.
Thus for the following case, that directory will not show the new
features like RAID56:
# mkfs.btrfs -f $dev1 $dev2 $dev3
# mount $dev1 $mnt
# btrfs balance start -f -mconvert=raid5 $mnt
# ls /sys/fs/btrfs/$uuid/features/
extended_iref free_space_tree no_holes skinny_metadata
While after unmount and mount, we got the correct features:
# umount $mnt
# mount $dev1 $mnt
# ls /sys/fs/btrfs/$uuid/features/
extended_iref free_space_tree no_holes raid56 skinny_metadata
[CAUSE]
Because we never really try to update the content of per-fs features/
directory.
We had an attempt to update the features directory dynamically in commit
14e46e04958d ("btrfs: synchronize incompat feature bits with sysfs
files"), but unfortunately it get reverted in commit e410e34fad91
("Revert "btrfs: synchronize incompat feature bits with sysfs files"").
The problem in the original patch is, in the context of
btrfs_create_chunk(), we can not afford to update the sysfs group.
The exported but never utilized function, btrfs_sysfs_feature_update()
is the leftover of such attempt. As even if we go sysfs_update_group(),
new files will need extra memory allocation, and we have no way to
specify the sysfs update to go GFP_NOFS.
[FIX]
This patch will address the old problem by doing asynchronous sysfs
update in the cleaner thread.
This involves the following changes:
- Make __btrfs_(set|clear)_fs_(incompat|compat_ro) helpers to set
BTRFS_FS_FEATURE_CHANGED flag when needed
- Update btrfs_sysfs_feature_update() to use sysfs_update_group()
And drop unnecessary arguments.
- Call btrfs_sysfs_feature_update() in cleaner_kthread
If we have the BTRFS_FS_FEATURE_CHANGED flag set.
- Wake up cleaner_kthread in btrfs_commit_transaction if we have
BTRFS_FS_FEATURE_CHANGED flag
By this, all the previously dangerous call sites like
btrfs_create_chunk() need no new changes, as above helpers would
have already set the BTRFS_FS_FEATURE_CHANGED flag.
The real work happens at cleaner_kthread, thus we pay the cost of
delaying the update to sysfs directory, but the delayed time should be
small enough that end user can not distinguish though it might get
delayed if the cleaner thread is busy with removing subvolumes or
defrag.
CC: stable(a)vger.kernel.org # 4.14+
Reviewed-by: Anand Jain <anand.jain(a)oracle.com>
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 7586a8e9b718..a6f89ac1c086 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1914,6 +1914,9 @@ static int cleaner_kthread(void *arg)
goto sleep;
}
+ if (test_and_clear_bit(BTRFS_FS_FEATURE_CHANGED, &fs_info->flags))
+ btrfs_sysfs_feature_update(fs_info);
+
btrfs_run_delayed_iputs(fs_info);
again = btrfs_clean_one_deleted_snapshot(fs_info);
diff --git a/fs/btrfs/fs.c b/fs/btrfs/fs.c
index 5553e1f8afe8..31c1648bc0b4 100644
--- a/fs/btrfs/fs.c
+++ b/fs/btrfs/fs.c
@@ -24,6 +24,7 @@ void __btrfs_set_fs_incompat(struct btrfs_fs_info *fs_info, u64 flag,
name, flag);
}
spin_unlock(&fs_info->super_lock);
+ set_bit(BTRFS_FS_FEATURE_CHANGED, &fs_info->flags);
}
}
@@ -46,6 +47,7 @@ void __btrfs_clear_fs_incompat(struct btrfs_fs_info *fs_info, u64 flag,
name, flag);
}
spin_unlock(&fs_info->super_lock);
+ set_bit(BTRFS_FS_FEATURE_CHANGED, &fs_info->flags);
}
}
@@ -68,6 +70,7 @@ void __btrfs_set_fs_compat_ro(struct btrfs_fs_info *fs_info, u64 flag,
name, flag);
}
spin_unlock(&fs_info->super_lock);
+ set_bit(BTRFS_FS_FEATURE_CHANGED, &fs_info->flags);
}
}
@@ -90,5 +93,6 @@ void __btrfs_clear_fs_compat_ro(struct btrfs_fs_info *fs_info, u64 flag,
name, flag);
}
spin_unlock(&fs_info->super_lock);
+ set_bit(BTRFS_FS_FEATURE_CHANGED, &fs_info->flags);
}
}
diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index 37b86acfcbcf..3d8156fc8523 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -125,6 +125,12 @@ enum {
*/
BTRFS_FS_NO_OVERCOMMIT,
+ /*
+ * Indicate if we have some features changed, this is mostly for
+ * cleaner thread to update the sysfs interface.
+ */
+ BTRFS_FS_FEATURE_CHANGED,
+
#if BITS_PER_LONG == 32
/* Indicate if we have error/warn message printed on 32bit systems */
BTRFS_FS_32BIT_ERROR,
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 45615ce36498..108aa3876186 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -2272,36 +2272,23 @@ void btrfs_sysfs_del_one_qgroup(struct btrfs_fs_info *fs_info,
* Change per-fs features in /sys/fs/btrfs/UUID/features to match current
* values in superblock. Call after any changes to incompat/compat_ro flags
*/
-void btrfs_sysfs_feature_update(struct btrfs_fs_info *fs_info,
- u64 bit, enum btrfs_feature_set set)
+void btrfs_sysfs_feature_update(struct btrfs_fs_info *fs_info)
{
- struct btrfs_fs_devices *fs_devs;
struct kobject *fsid_kobj;
- u64 __maybe_unused features;
- int __maybe_unused ret;
+ int ret;
if (!fs_info)
return;
- /*
- * See 14e46e04958df74 and e410e34fad913dd, feature bit updates are not
- * safe when called from some contexts (eg. balance)
- */
- features = get_features(fs_info, set);
- ASSERT(bit & supported_feature_masks[set]);
-
- fs_devs = fs_info->fs_devices;
- fsid_kobj = &fs_devs->fsid_kobj;
-
+ fsid_kobj = &fs_info->fs_devices->fsid_kobj;
if (!fsid_kobj->state_initialized)
return;
- /*
- * FIXME: this is too heavy to update just one value, ideally we'd like
- * to use sysfs_update_group but some refactoring is needed first.
- */
- sysfs_remove_group(fsid_kobj, &btrfs_feature_attr_group);
- ret = sysfs_create_group(fsid_kobj, &btrfs_feature_attr_group);
+ ret = sysfs_update_group(fsid_kobj, &btrfs_feature_attr_group);
+ if (ret < 0)
+ btrfs_warn(fs_info,
+ "failed to update /sys/fs/btrfs/%pU/features: %d",
+ fs_info->fs_devices->fsid, ret);
}
int __init btrfs_init_sysfs(void)
diff --git a/fs/btrfs/sysfs.h b/fs/btrfs/sysfs.h
index bacef43f7267..86c7eef12873 100644
--- a/fs/btrfs/sysfs.h
+++ b/fs/btrfs/sysfs.h
@@ -19,8 +19,7 @@ void btrfs_sysfs_remove_device(struct btrfs_device *device);
int btrfs_sysfs_add_fsid(struct btrfs_fs_devices *fs_devs);
void btrfs_sysfs_remove_fsid(struct btrfs_fs_devices *fs_devs);
void btrfs_sysfs_update_sprout_fsid(struct btrfs_fs_devices *fs_devices);
-void btrfs_sysfs_feature_update(struct btrfs_fs_info *fs_info,
- u64 bit, enum btrfs_feature_set set);
+void btrfs_sysfs_feature_update(struct btrfs_fs_info *fs_info);
void btrfs_kobject_uevent(struct block_device *bdev, enum kobject_action action);
int __init btrfs_init_sysfs(void);
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 528efe559866..18329ebcb1cb 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -2464,6 +2464,11 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
wake_up(&fs_info->transaction_wait);
btrfs_trans_state_lockdep_release(fs_info, BTRFS_LOCKDEP_TRANS_UNBLOCKED);
+ /* If we have features changed, wake up the cleaner to update sysfs. */
+ if (test_bit(BTRFS_FS_FEATURE_CHANGED, &fs_info->flags) &&
+ fs_info->cleaner_kthread)
+ wake_up_process(fs_info->cleaner_kthread);
+
ret = btrfs_write_and_wait_transaction(trans);
if (ret) {
btrfs_handle_fs_error(fs_info, ret,