A few improvements/fixes for the mm kselftests:
- Patch 1-2 extend support for more build configurations: out-of-tree
$KDIR, cross-compilation, etc.
- Patch 3-4 fix issues in the pagemap_ioctl tests, most importantly that
it does not report failures: ./run_kselftests.sh would report OK
even if some pagemap_ioctl tests fail. That's probably why the issue
in patch 3 went unnoticed.
---
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: David Hildenbrand <david(a)kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Mark Brown <broonie(a)kernel.org>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Shuah Khan <shuah(a)kernel.org>
---
Kevin Brodsky (4):
selftests/mm: remove flaky header check
selftests/mm: pass down full CC and CFLAGS to check_config.sh
selftests/mm: fix faulting-in code in pagemap_ioctl test
selftests/mm: fix exit code in pagemap_ioctl
tools/testing/selftests/mm/Makefile | 6 +-----
tools/testing/selftests/mm/check_config.sh | 3 +--
tools/testing/selftests/mm/pagemap_ioctl.c | 12 ++++++------
3 files changed, 8 insertions(+), 13 deletions(-)
base-commit: 8f0b4cce4481fb22653697cced8d0d04027cb1e8
--
2.51.2
In cgroup v2, a mutual overlap check is required when at least one of two
cpusets is exclusive. However, this check should be relaxed and limited to
cases where both cpusets are exclusive.
This patch ensures that for sibling cpusets A1 (exclusive) and B1
(non-exclusive), change B1 cannot affect A1's exclusivity.
for example. Assume a machine has 4 CPUs (0-3).
root cgroup
/ \
A1 B1
Case 1:
Table 1.1: Before applying the patch
Step | A1's prstate | B1'sprstate |
#1> echo "0-1" > A1/cpuset.cpus | member | member |
#2> echo "root" > A1/cpuset.cpus.partition | root | member |
#3> echo "0" > B1/cpuset.cpus | root invalid | member |
After step #3, A1 changes from "root" to "root invalid" because its CPUs
(0-1) overlap with those requested by B1 (0-3). However, B1 can actually
use CPUs 2-3(from B1's parent), so it would be more reasonable for A1 to
remain as "root."
Table 1.2: After applying the patch
Step | A1's prstate | B1'sprstate |
#1> echo "0-1" > A1/cpuset.cpus | member | member |
#2> echo "root" > A1/cpuset.cpus.partition | root | member |
#3> echo "0" > B1/cpuset.cpus | root | member |
Case 2: (This situation remains unchanged from before)
Table 2.1: Before applying the patch
Step | A1's prstate | B1'sprstate |
#1> echo "0-1" > A1/cpuset.cpus | member | member |
#3> echo "1-2" > B1/cpuset.cpus | member | member |
#2> echo "root" > A1/cpuset.cpus.partition | root invalid | member |
Table 2.2: After applying the patch
Step | A1's prstate | B1'sprstate |
#1> echo "0-1" > A1/cpuset.cpus | member | member |
#3> echo "1-2" > B1/cpuset.cpus | member | member |
#2> echo "root" > A1/cpuset.cpus.partition | root invalid | member |
All other cases remain unaffected. For example, cgroup-v1, both A1 and
B1 are exclusive or non-exlusive.
---
v3 -> v4:
- Adjust the test_cpuset_prt.sh test file to align with the current
behavior.
v2 -> v3:
- Ensure compliance with constraints such as cpuset.cpus.exclusive.
- Link: https://lore.kernel.org/cgroups/20251113131434.606961-1-sunshaojie@kylinos.…
v1 -> v2:
- Keeps the current cgroup v1 behavior unchanged
- Link: https://lore.kernel.org/cgroups/c8e234f4-2c27-4753-8f39-8ae83197efd3@redhat…
---
kernel/cgroup/cpuset-internal.h | 3 ++
kernel/cgroup/cpuset-v1.c | 20 +++++++++
kernel/cgroup/cpuset.c | 43 ++++++++++++++-----
.../selftests/cgroup/test_cpuset_prs.sh | 5 ++-
4 files changed, 58 insertions(+), 13 deletions(-)
--
2.25.1
From: Li RongQing <lirongqing(a)baidu.com>
The softlockup_panic sysctl is currently a binary option: panic immediately
or never panic on soft lockups.
Panicking on any soft lockup, regardless of duration, can be overly
aggressive for brief stalls that may be caused by legitimate operations.
Conversely, never panicking may allow severe system hangs to persist
undetected.
Extend softlockup_panic to accept an integer threshold, allowing the kernel
to panic only when the normalized lockup duration exceeds N watchdog
threshold periods. This provides finer-grained control to distinguish
between transient delays and persistent system failures.
The accepted values are:
- 0: Don't panic (unchanged)
- 1: Panic when duration >= 1 * threshold (20s default, original behavior)
- N > 1: Panic when duration >= N * threshold (e.g., 2 = 40s, 3 = 60s.)
The original behavior is preserved for values 0 and 1, maintaining full
backward compatibility while allowing systems to tolerate brief lockups
while still catching severe, persistent hangs.
Signed-off-by: Li RongQing <lirongqing(a)baidu.com>
---
Documentation/admin-guide/kernel-parameters.txt | 10 +++++-----
arch/arm/configs/aspeed_g5_defconfig | 2 +-
arch/arm/configs/pxa3xx_defconfig | 2 +-
arch/openrisc/configs/or1klitex_defconfig | 2 +-
arch/powerpc/configs/skiroot_defconfig | 2 +-
drivers/gpu/drm/ci/arm.config | 2 +-
drivers/gpu/drm/ci/arm64.config | 2 +-
drivers/gpu/drm/ci/x86_64.config | 2 +-
kernel/watchdog.c | 8 +++++---
lib/Kconfig.debug | 13 +++++++------
tools/testing/selftests/bpf/config | 2 +-
tools/testing/selftests/wireguard/qemu/kernel.config | 2 +-
12 files changed, 26 insertions(+), 23 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index a8d0afd..27c5f96 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -6934,12 +6934,12 @@ Kernel parameters
softlockup_panic=
[KNL] Should the soft-lockup detector generate panics.
- Format: 0 | 1
+ Format: <int>
- A value of 1 instructs the soft-lockup detector
- to panic the machine when a soft-lockup occurs. It is
- also controlled by the kernel.softlockup_panic sysctl
- and CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC, which is the
+ A value of non-zero instructs the soft-lockup detector
+ to panic the machine when a soft-lockup duration exceeds
+ N thresholds. It is also controlled by the kernel.softlockup_panic
+ sysctl and CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC, which is the
respective build-time switch to that functionality.
softlockup_all_cpu_backtrace=
diff --git a/arch/arm/configs/aspeed_g5_defconfig b/arch/arm/configs/aspeed_g5_defconfig
index 2e6ea13..ec558e5 100644
--- a/arch/arm/configs/aspeed_g5_defconfig
+++ b/arch/arm/configs/aspeed_g5_defconfig
@@ -306,7 +306,7 @@ CONFIG_SCHED_STACK_END_CHECK=y
CONFIG_PANIC_ON_OOPS=y
CONFIG_PANIC_TIMEOUT=-1
CONFIG_SOFTLOCKUP_DETECTOR=y
-CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
+CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=1
CONFIG_BOOTPARAM_HUNG_TASK_PANIC=1
CONFIG_WQ_WATCHDOG=y
# CONFIG_SCHED_DEBUG is not set
diff --git a/arch/arm/configs/pxa3xx_defconfig b/arch/arm/configs/pxa3xx_defconfig
index 07d422f..fb272e3 100644
--- a/arch/arm/configs/pxa3xx_defconfig
+++ b/arch/arm/configs/pxa3xx_defconfig
@@ -100,7 +100,7 @@ CONFIG_PRINTK_TIME=y
CONFIG_DEBUG_KERNEL=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_DEBUG_SHIRQ=y
-CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
+CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=1
# CONFIG_SCHED_DEBUG is not set
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_SPINLOCK_SLEEP=y
diff --git a/arch/openrisc/configs/or1klitex_defconfig b/arch/openrisc/configs/or1klitex_defconfig
index fb1eb9a..984b0e3 100644
--- a/arch/openrisc/configs/or1klitex_defconfig
+++ b/arch/openrisc/configs/or1klitex_defconfig
@@ -52,5 +52,5 @@ CONFIG_LSM="lockdown,yama,loadpin,safesetid,integrity,bpf"
CONFIG_PRINTK_TIME=y
CONFIG_PANIC_ON_OOPS=y
CONFIG_SOFTLOCKUP_DETECTOR=y
-CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
+CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=1
CONFIG_BUG_ON_DATA_CORRUPTION=y
diff --git a/arch/powerpc/configs/skiroot_defconfig b/arch/powerpc/configs/skiroot_defconfig
index 2b71a6d..a4114fc 100644
--- a/arch/powerpc/configs/skiroot_defconfig
+++ b/arch/powerpc/configs/skiroot_defconfig
@@ -289,7 +289,7 @@ CONFIG_SCHED_STACK_END_CHECK=y
CONFIG_DEBUG_STACKOVERFLOW=y
CONFIG_PANIC_ON_OOPS=y
CONFIG_SOFTLOCKUP_DETECTOR=y
-CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
+CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=1
CONFIG_HARDLOCKUP_DETECTOR=y
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
CONFIG_WQ_WATCHDOG=y
diff --git a/drivers/gpu/drm/ci/arm.config b/drivers/gpu/drm/ci/arm.config
index 411e814..d7c5167 100644
--- a/drivers/gpu/drm/ci/arm.config
+++ b/drivers/gpu/drm/ci/arm.config
@@ -52,7 +52,7 @@ CONFIG_TMPFS=y
CONFIG_PROVE_LOCKING=n
CONFIG_DEBUG_LOCKDEP=n
CONFIG_SOFTLOCKUP_DETECTOR=n
-CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=n
+CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=0
CONFIG_FW_LOADER_COMPRESS=y
diff --git a/drivers/gpu/drm/ci/arm64.config b/drivers/gpu/drm/ci/arm64.config
index fddfbd4..ea0e307 100644
--- a/drivers/gpu/drm/ci/arm64.config
+++ b/drivers/gpu/drm/ci/arm64.config
@@ -161,7 +161,7 @@ CONFIG_TMPFS=y
CONFIG_PROVE_LOCKING=n
CONFIG_DEBUG_LOCKDEP=n
CONFIG_SOFTLOCKUP_DETECTOR=y
-CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
+CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=1
CONFIG_DETECT_HUNG_TASK=y
diff --git a/drivers/gpu/drm/ci/x86_64.config b/drivers/gpu/drm/ci/x86_64.config
index 8eaba388..7ac98a7 100644
--- a/drivers/gpu/drm/ci/x86_64.config
+++ b/drivers/gpu/drm/ci/x86_64.config
@@ -47,7 +47,7 @@ CONFIG_TMPFS=y
CONFIG_PROVE_LOCKING=n
CONFIG_DEBUG_LOCKDEP=n
CONFIG_SOFTLOCKUP_DETECTOR=y
-CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
+CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=1
CONFIG_DETECT_HUNG_TASK=y
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 0685e3a..a5fa116 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -363,7 +363,7 @@ static struct cpumask watchdog_allowed_mask __read_mostly;
/* Global variables, exported for sysctl */
unsigned int __read_mostly softlockup_panic =
- IS_ENABLED(CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC);
+ CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC;
static bool softlockup_initialized __read_mostly;
static u64 __read_mostly sample_period;
@@ -879,7 +879,9 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
add_taint(TAINT_SOFTLOCKUP, LOCKDEP_STILL_OK);
sys_info(softlockup_si_mask & ~SYS_INFO_ALL_BT);
- if (softlockup_panic)
+ duration = duration / get_softlockup_thresh();
+
+ if (softlockup_panic && duration >= softlockup_panic)
panic("softlockup: hung tasks");
}
@@ -1228,7 +1230,7 @@ static const struct ctl_table watchdog_sysctls[] = {
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE,
+ .extra2 = SYSCTL_INT_MAX,
},
{
.procname = "softlockup_sys_info",
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index ba36939..17a7a77 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1110,13 +1110,14 @@ config SOFTLOCKUP_DETECTOR_INTR_STORM
the CPU stats and the interrupt counts during the "soft lockups".
config BOOTPARAM_SOFTLOCKUP_PANIC
- bool "Panic (Reboot) On Soft Lockups"
+ int "Panic (Reboot) On Soft Lockups"
depends on SOFTLOCKUP_DETECTOR
+ default 0
help
- Say Y here to enable the kernel to panic on "soft lockups",
- which are bugs that cause the kernel to loop in kernel
- mode for more than 20 seconds (configurable using the watchdog_thresh
- sysctl), without giving other tasks a chance to run.
+ Set to a non-zero value N to enable the kernel to panic on "soft
+ lockups", which are bugs that cause the kernel to loop in kernel
+ mode for more than (N * 20 seconds) (configurable using the
+ watchdog_thresh sysctl), without giving other tasks a chance to run.
The panic can be used in combination with panic_timeout,
to cause the system to reboot automatically after a
@@ -1124,7 +1125,7 @@ config BOOTPARAM_SOFTLOCKUP_PANIC
high-availability systems that have uptime guarantees and
where a lockup must be resolved ASAP.
- Say N if unsure.
+ Say 0 if unsure.
config HAVE_HARDLOCKUP_DETECTOR_BUDDY
bool
diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config
index 558839e..2485538 100644
--- a/tools/testing/selftests/bpf/config
+++ b/tools/testing/selftests/bpf/config
@@ -1,6 +1,6 @@
CONFIG_BLK_DEV_LOOP=y
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
-CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
+CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=1
CONFIG_BPF=y
CONFIG_BPF_EVENTS=y
CONFIG_BPF_JIT=y
diff --git a/tools/testing/selftests/wireguard/qemu/kernel.config b/tools/testing/selftests/wireguard/qemu/kernel.config
index 0504c11..bb89d2d 100644
--- a/tools/testing/selftests/wireguard/qemu/kernel.config
+++ b/tools/testing/selftests/wireguard/qemu/kernel.config
@@ -80,7 +80,7 @@ CONFIG_HARDLOCKUP_DETECTOR=y
CONFIG_WQ_WATCHDOG=y
CONFIG_DETECT_HUNG_TASK=y
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
-CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
+CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=1
CONFIG_BOOTPARAM_HUNG_TASK_PANIC=1
CONFIG_PANIC_TIMEOUT=-1
CONFIG_STACKTRACE=y
--
2.9.4
Much work has recently gone into supporting block device integrity data
(sometimes called "metadata") in Linux. Many NVMe devices these days
support metadata transfers and/or automatic protection information
generation and verification. However, ublk devices can't yet advertise
integrity data capabilities. This patch series wires up support for
integrity data in ublk. The ublk feature is referred to as "integrity"
rather than "metadata" to match the block layer's name for it and to
avoid confusion with the existing and unrelated UBLK_IO_F_META.
To advertise support for integrity data, a ublk server fills out the
struct ublk_params's integrity field and sets UBLK_PARAM_TYPE_INTEGRITY.
The struct ublk_param_integrity flags and csum_type fields use the
existing LBMD_PI_* constants from the linux/fs.h UAPI header. The ublk
driver fills out a corresponding struct blk_integrity.
When a request with integrity data is issued to the ublk device, the
ublk driver sets UBLK_IO_F_INTEGRITY in struct ublksrv_io_desc's
op_flags field. This is necessary for a ublk server for which
bi_offload_capable() returns true to distinguish requests with integrity
data from those without.
Integrity data transfers can currently only be performed via the ublk
user copy mechanism. The overhead of zero-copy buffer registration makes
it less appealing for the small transfers typical of integrity data.
Additionally, neither io_uring NVMe passthru nor IORING_RW_ATTR_FLAG_PI
currently allow an io_uring registered buffer for the integrity data.
The ki_pos field of the struct kiocb passed to the user copy
->{read,write}_iter() callback gains a bit UBLKSRV_IO_INTEGRITY_FLAG for
a ublk server to indicate whether to access the request's data or
integrity data.
Not yet supported is an analogue for the IO_INTEGRITY_CHK_*/BIP_CHECK_*
flags to ask the ublk server to verify the guard, reftag, and/or apptag
of a request's protection information. The user copy mechanism currently
forbids a ublk server from reading the data/integrity buffer of a
read-direction request. We could potentially relax this restriction for
integrity data on reads. Alternatively, the ublk driver could verify the
requested fields as part of the user copy operation.
The first 2 commits harden blk_validate_integrity_limits() to reject
nonsensical pi_offset and interval_exp integrity limits.
Caleb Sander Mateos (17):
block: validate pi_offset integrity limit
block: validate interval_exp integrity limit
blk-integrity: take const pointer in blk_integrity_rq()
ublk: move ublk flag check functions earlier
ublk: set UBLK_IO_F_INTEGRITY in ublksrv_io_desc
ublk: add ublk_copy_user_bvec() helper
ublk: split out ublk_user_copy() helper
ublk: inline ublk_check_and_get_req() into ublk_user_copy()
ublk: move offset check out of __ublk_check_and_get_req()
ublk: optimize ublk_user_copy() on daemon task
selftests: ublk: add utility to get block device metadata size
selftests: ublk: add kublk support for integrity params
selftests: ublk: implement integrity user copy in kublk
selftests: ublk: support non-O_DIRECT backing files
selftests: ublk: add integrity data support to loop target
selftests: ublk: add integrity params test
selftests: ublk: add end-to-end integrity test
Stanley Zhang (3):
ublk: add integrity UAPI
ublk: support UBLK_PARAM_TYPE_INTEGRITY in device creation
ublk: implement integrity user copy
block/blk-settings.c | 14 +-
drivers/block/ublk_drv.c | 336 +++++++++++++------
include/linux/blk-integrity.h | 6 +-
include/uapi/linux/ublk_cmd.h | 20 +-
tools/testing/selftests/ublk/Makefile | 6 +-
tools/testing/selftests/ublk/common.c | 4 +-
tools/testing/selftests/ublk/fault_inject.c | 1 +
tools/testing/selftests/ublk/file_backed.c | 61 +++-
tools/testing/selftests/ublk/kublk.c | 85 ++++-
tools/testing/selftests/ublk/kublk.h | 37 +-
tools/testing/selftests/ublk/metadata_size.c | 37 ++
tools/testing/selftests/ublk/null.c | 1 +
tools/testing/selftests/ublk/stripe.c | 6 +-
tools/testing/selftests/ublk/test_common.sh | 10 +
tools/testing/selftests/ublk/test_loop_08.sh | 111 ++++++
tools/testing/selftests/ublk/test_null_04.sh | 166 +++++++++
16 files changed, 765 insertions(+), 136 deletions(-)
create mode 100644 tools/testing/selftests/ublk/metadata_size.c
create mode 100755 tools/testing/selftests/ublk/test_loop_08.sh
create mode 100755 tools/testing/selftests/ublk/test_null_04.sh
--
2.45.2
From: Stanley Zhang <stazhang(a)purestorage.com>
Add UAPI definitions for metadata/integrity support in ublk.
UBLK_PARAM_TYPE_INTEGRITY and struct ublk_param_integrity allow a ublk
server to specify the integrity params of a ublk device.
The ublk driver will set UBLK_IO_F_INTEGRITY in the op_flags field of
struct ublksrv_io_desc for requests with integrity data.
The ublk server uses user copy with UBLKSRV_IO_INTEGRITY_FLAG set in the
offset parameter to access a request's integrity buffer.
Signed-off-by: Stanley Zhang <stazhang(a)purestorage.com>
[csander: drop feature flag and redundant pi_tuple_size field,
add io_desc flag, use block metadata UAPI constants]
Signed-off-by: Caleb Sander Mateos <csander(a)purestorage.com>
---
include/uapi/linux/ublk_cmd.h | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/ublk_cmd.h b/include/uapi/linux/ublk_cmd.h
index ec77dabba45b..5bfb9a0521c3 100644
--- a/include/uapi/linux/ublk_cmd.h
+++ b/include/uapi/linux/ublk_cmd.h
@@ -129,11 +129,15 @@
#define UBLK_QID_BITS 12
#define UBLK_QID_BITS_MASK ((1ULL << UBLK_QID_BITS) - 1)
#define UBLK_MAX_NR_QUEUES (1U << UBLK_QID_BITS)
-#define UBLKSRV_IO_BUF_TOTAL_BITS (UBLK_QID_OFF + UBLK_QID_BITS)
+/* Copy to/from request integrity buffer instead of data buffer */
+#define UBLK_INTEGRITY_FLAG_OFF (UBLK_QID_OFF + UBLK_QID_BITS)
+#define UBLKSRV_IO_INTEGRITY_FLAG (1ULL << UBLK_INTEGRITY_FLAG_OFF)
+
+#define UBLKSRV_IO_BUF_TOTAL_BITS (UBLK_INTEGRITY_FLAG_OFF + 1)
#define UBLKSRV_IO_BUF_TOTAL_SIZE (1ULL << UBLKSRV_IO_BUF_TOTAL_BITS)
/*
* ublk server can register data buffers for incoming I/O requests with a sparse
* io_uring buffer table. The request buffer can then be used as the data buffer
@@ -406,10 +410,12 @@ struct ublksrv_ctrl_dev_info {
*
* ublk server has to check this flag if UBLK_AUTO_BUF_REG_FALLBACK is
* passed in.
*/
#define UBLK_IO_F_NEED_REG_BUF (1U << 17)
+/* Request has an integrity data buffer */
+#define UBLK_IO_F_INTEGRITY (1U << 18)
/*
* io cmd is described by this structure, and stored in share memory, indexed
* by request tag.
*
@@ -598,10 +604,20 @@ struct ublk_param_segment {
__u32 max_segment_size;
__u16 max_segments;
__u8 pad[2];
};
+struct ublk_param_integrity {
+ __u32 flags; /* LBMD_PI_CAP_* from linux/fs.h */
+ __u8 interval_exp;
+ __u8 metadata_size;
+ __u8 pi_offset;
+ __u8 csum_type; /* LBMD_PI_CSUM_* from linux/fs.h */
+ __u8 tag_size;
+ __u8 pad[7];
+};
+
struct ublk_params {
/*
* Total length of parameters, userspace has to set 'len' for both
* SET_PARAMS and GET_PARAMS command, and driver may update len
* if two sides use different version of 'ublk_params', same with
@@ -612,16 +628,18 @@ struct ublk_params {
#define UBLK_PARAM_TYPE_DISCARD (1 << 1)
#define UBLK_PARAM_TYPE_DEVT (1 << 2)
#define UBLK_PARAM_TYPE_ZONED (1 << 3)
#define UBLK_PARAM_TYPE_DMA_ALIGN (1 << 4)
#define UBLK_PARAM_TYPE_SEGMENT (1 << 5)
+#define UBLK_PARAM_TYPE_INTEGRITY (1 << 6)
__u32 types; /* types of parameter included */
struct ublk_param_basic basic;
struct ublk_param_discard discard;
struct ublk_param_devt devt;
struct ublk_param_zoned zoned;
struct ublk_param_dma_align dma;
struct ublk_param_segment seg;
+ struct ublk_param_integrity integrity;
};
#endif
--
2.45.2
Various code assumes that the integrity interval is at least 1 sector
and evenly divides the logical block size. Add these checks to
blk_validate_integrity_limits(). This guards against block drivers that
report invalid interval_exp values.
Signed-off-by: Caleb Sander Mateos <csander(a)purestorage.com>
---
block/blk-settings.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/block/blk-settings.c b/block/blk-settings.c
index d138abc973bb..a9e65dc090da 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -191,12 +191,17 @@ static int blk_validate_integrity_limits(struct queue_limits *lim)
return -EINVAL;
}
break;
}
- if (!bi->interval_exp)
+ if (!bi->interval_exp) {
bi->interval_exp = ilog2(lim->logical_block_size);
+ } else if (bi->interval_exp < SECTOR_SHIFT ||
+ bi->interval_exp > ilog2(lim->logical_block_size)) {
+ pr_warn("invalid interval_exp %u\n", bi->interval_exp);
+ return -EINVAL;
+ }
/*
* The PI generation / validation helpers do not expect intervals to
* straddle multiple bio_vecs. Enforce alignment so that those are
* never generated, and that each buffer is aligned as expected.
--
2.45.2