June 2023 - Linux-stable-mirror

by Limonciello, Mario

Hi, Some (consumer) Phoenix laptops are showing up on the market that don't have discrete TPMs and are choosing not to enable the AMD security processor firmware TPM (fTPM). In these laptops they offer Pluton, and are relying upon Pluton for TPM functionality. This was introduced in kernel 6.3 with: 4d2732882703 ("tpm_crb: Add support for CRB devices based on Pluton") I double checked with a backport of this to 6.1.y and at least basic TPM functionality does work properly. Could this be brought back to 6.1.y so that users with these laptops have TPM working with the LTS kernel? Thanks,

2 years

1
0
0 0

[PATCH v13 3/3] block: add overflow checks for Amiga partition support

by Michael Schmitz

The Amiga partition parser module uses signed int for partition sector address and count, which will overflow for disks larger than 1 TB. Use u64 as type for sector address and size to allow using disks up to 2 TB without LBD support, and disks larger than 2 TB with LBD. The RBD format allows to specify disk sizes up to 2^128 bytes (though native OS limitations reduce this somewhat, to max 2^68 bytes), so check for u64 overflow carefully to protect against overflowing sector_t. Bail out if sector addresses overflow 32 bits on kernels without LBD support. This bug was reported originally in 2012, and the fix was created by the RDB author, Joanne Dow <jdow(a)earthlink.net>. A patch had been discussed and reviewed on linux-m68k at that time but never officially submitted (now resubmitted as patch 1 in this series). This patch adds additional error checking and warning messages. Reported-by: Martin Steigerwald <Martin(a)lichtvoll.de> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=43511 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Message-ID: <201206192146.09327.Martin(a)lichtvoll.de> Cc: <stable(a)vger.kernel.org> # 5.2 Signed-off-by: Michael Schmitz <schmitzmic(a)gmail.com> Reviewed-by: Geert Uytterhoeven <geert(a)linux-m68k.org> Reviewed-by: Christoph Hellwig <hch(a)infradead.org> --- Changes from RFC: - use u64 instead of sector_t, since that may be u32 without LBD support - check multiplication overflows each step - 3 u32 values may exceed u64! - warn against use on AmigaDOS if partition data overflow u32 sector count. - warn if partition CylBlocks larger than what's stored in the RDSK header. - bail out if we were to overflow sector_t (32 or 64 bit). Changes from v1: Kars de Jong: - use defines for magic offsets in DosEnvec struct Geert Uytterhoeven: - use u64 cast for multiplications of u32 numbers - use array3_size for overflow checks - change pr_err to pr_warn - discontinue use of pr_cont - reword log messages - drop redundant nr_sects overflow test - warn against 32 bit overflow for each affected partition - skip partitions that overflow sector_t size instead of aborting scan Changes from v2: - further trim 32 bit overflow test - correct duplicate types.h inclusion introduced in v2 Changes from v3: - split off sector address type fix for independent review - change blksize to unsigned - use check_mul_overflow() instead of array3_size() - rewrite checks to avoid 64 bit divisions in check_mul_overflow() Changes from v5: Geert Uytterhoeven: - correct ineffective u64 cast to avoid 32 bit mult. overflow - fix mult. overflow in partition block address calculation Changes from v6: Geert Uytterhoeven: - don't fail hard on partition block address overflow Changes from v7: - replace bdevname(state->bdev, b) by state->disk->disk_name - drop warn_no_part conditionals - remove remaining warn_no_part Changes from v8: Christoph Hellwig: - whitespace fix, drop unnecessary u64 casts kbuild warning: - sparse warning fix Changes from v9: - revert ineffective sparse warning fix, and rely on change to annotation of rdb_CylBlocks (patch 2 this series) instead. - add Fixes: tags and stable backport prereq --- block/partitions/amiga.c | 103 ++++++++++++++++++++++++++++++++------- 1 file changed, 85 insertions(+), 18 deletions(-) diff --git a/block/partitions/amiga.c b/block/partitions/amiga.c index 85c5c79aae48..ed222b9c901b 100644 --- a/block/partitions/amiga.c +++ b/block/partitions/amiga.c @@ -11,10 +11,18 @@ #define pr_fmt(fmt) fmt #include <linux/types.h> +#include <linux/mm_types.h> +#include <linux/overflow.h> #include <linux/affs_hardblocks.h> #include "check.h" +/* magic offsets in partition DosEnvVec */ +#define NR_HD 3 +#define NR_SECT 5 +#define LO_CYL 9 +#define HI_CYL 10 + static __inline__ u32 checksum_block(__be32 *m, int size) { @@ -31,9 +39,12 @@ int amiga_partition(struct parsed_partitions *state) unsigned char *data; struct RigidDiskBlock *rdb; struct PartitionBlock *pb; - sector_t start_sect, nr_sects; - int blk, part, res = 0; - int blksize = 1; /* Multiplier for disk block size */ + u64 start_sect, nr_sects; + sector_t blk, end_sect; + u32 cylblk; /* rdb_CylBlocks = nr_heads*sect_per_track */ + u32 nr_hd, nr_sect, lo_cyl, hi_cyl; + int part, res = 0; + unsigned int blksize = 1; /* Multiplier for disk block size */ int slot = 1; for (blk = 0; ; blk++, put_dev_sector(sect)) { @@ -41,7 +52,7 @@ int amiga_partition(struct parsed_partitions *state) goto rdb_done; data = read_part_sector(state, blk, &sect); if (!data) { - pr_err("Dev %s: unable to read RDB block %d\n", + pr_err("Dev %s: unable to read RDB block %llu\n", state->disk->disk_name, blk); res = -1; goto rdb_done; @@ -58,12 +69,12 @@ int amiga_partition(struct parsed_partitions *state) *(__be32 *)(data+0xdc) = 0; if (checksum_block((__be32 *)data, be32_to_cpu(rdb->rdb_SummedLongs) & 0x7F)==0) { - pr_err("Trashed word at 0xd0 in block %d ignored in checksum calculation\n", + pr_err("Trashed word at 0xd0 in block %llu ignored in checksum calculation\n", blk); break; } - pr_err("Dev %s: RDB in block %d has bad checksum\n", + pr_err("Dev %s: RDB in block %llu has bad checksum\n", state->disk->disk_name, blk); } @@ -80,10 +91,15 @@ int amiga_partition(struct parsed_partitions *state) blk = be32_to_cpu(rdb->rdb_PartitionList); put_dev_sector(sect); for (part = 1; blk>0 && part<=16; part++, put_dev_sector(sect)) { - blk *= blksize; /* Read in terms partition table understands */ + /* Read in terms partition table understands */ + if (check_mul_overflow(blk, (sector_t) blksize, &blk)) { + pr_err("Dev %s: overflow calculating partition block %llu! Skipping partitions %u and beyond\n", + state->disk->disk_name, blk, part); + break; + } data = read_part_sector(state, blk, &sect); if (!data) { - pr_err("Dev %s: unable to read partition block %d\n", + pr_err("Dev %s: unable to read partition block %llu\n", state->disk->disk_name, blk); res = -1; goto rdb_done; @@ -95,19 +111,70 @@ int amiga_partition(struct parsed_partitions *state) if (checksum_block((__be32 *)pb, be32_to_cpu(pb->pb_SummedLongs) & 0x7F) != 0 ) continue; - /* Tell Kernel about it */ + /* RDB gives us more than enough rope to hang ourselves with, + * many times over (2^128 bytes if all fields max out). + * Some careful checks are in order, so check for potential + * overflows. + * We are multiplying four 32 bit numbers to one sector_t! + */ + + nr_hd = be32_to_cpu(pb->pb_Environment[NR_HD]); + nr_sect = be32_to_cpu(pb->pb_Environment[NR_SECT]); + + /* CylBlocks is total number of blocks per cylinder */ + if (check_mul_overflow(nr_hd, nr_sect, &cylblk)) { + pr_err("Dev %s: heads*sects %u overflows u32, skipping partition!\n", + state->disk->disk_name, cylblk); + continue; + } + + /* check for consistency with RDB defined CylBlocks */ + if (cylblk > be32_to_cpu(rdb->rdb_CylBlocks)) { + pr_warn("Dev %s: cylblk %u > rdb_CylBlocks %u!\n", + state->disk->disk_name, cylblk, + be32_to_cpu(rdb->rdb_CylBlocks)); + } + + /* RDB allows for variable logical block size - + * normalize to 512 byte blocks and check result. + */ + + if (check_mul_overflow(cylblk, blksize, &cylblk)) { + pr_err("Dev %s: partition %u bytes per cyl. overflows u32, skipping partition!\n", + state->disk->disk_name, part); + continue; + } + + /* Calculate partition start and end. Limit of 32 bit on cylblk + * guarantees no overflow occurs if LBD support is enabled. + */ + + lo_cyl = be32_to_cpu(pb->pb_Environment[LO_CYL]); + start_sect = ((u64) lo_cyl * cylblk); + + hi_cyl = be32_to_cpu(pb->pb_Environment[HI_CYL]); + nr_sects = (((u64) hi_cyl - lo_cyl + 1) * cylblk); - nr_sects = ((sector_t)be32_to_cpu(pb->pb_Environment[10]) + 1 - - be32_to_cpu(pb->pb_Environment[9])) * - be32_to_cpu(pb->pb_Environment[3]) * - be32_to_cpu(pb->pb_Environment[5]) * - blksize; if (!nr_sects) continue; - start_sect = (sector_t)be32_to_cpu(pb->pb_Environment[9]) * - be32_to_cpu(pb->pb_Environment[3]) * - be32_to_cpu(pb->pb_Environment[5]) * - blksize; + + /* Warn user if partition end overflows u32 (AmigaDOS limit) */ + + if ((start_sect + nr_sects) > UINT_MAX) { + pr_warn("Dev %s: partition %u (%llu-%llu) needs 64 bit device support!\n", + state->disk->disk_name, part, + start_sect, start_sect + nr_sects); + } + + if (check_add_overflow(start_sect, nr_sects, &end_sect)) { + pr_err("Dev %s: partition %u (%llu-%llu) needs LBD device support, skipping partition!\n", + state->disk->disk_name, part, + start_sect, end_sect); + continue; + } + + /* Tell Kernel about it */ + put_partition(state,slot++,start_sect,nr_sects); { /* Be even more informative to aid mounting */ -- 2.17.1

2 years

1
0
0 0

[PATCH v13 2/3] block: change all __u32 annotations to __be32 in affs_hardblocks.h

by Michael Schmitz

The Amiga partition parser module uses signed int for partition sector address and count, which will overflow for disks larger than 1 TB. Use u64 as type for sector address and size to allow using disks up to 2 TB without LBD support, and disks larger than 2 TB with LBD. The RBD format allows to specify disk sizes up to 2^128 bytes (though native OS limitations reduce this somewhat, to max 2^68 bytes), so check for u64 overflow carefully to protect against overflowing sector_t. This bug was reported originally in 2012, and the fix was created by the RDB author, Joanne Dow <jdow(a)earthlink.net>. A patch had been discussed and reviewed on linux-m68k at that time but never officially submitted (now resubmitted as patch 1 of this series). Patch 3 (this series) adds additional error checking and warning messages. One of the error checks now makes use of the previously unused rdb_CylBlocks field, which causes a 'sparse' warning (cast to restricted __be32). Annotate all 32 bit fields in affs_hardblocks.h as __be32, as the on-disk format of RDB and partition blocks is always big endian. Reported-by: Martin Steigerwald <Martin(a)lichtvoll.de> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=43511 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Message-ID: <201206192146.09327.Martin(a)lichtvoll.de> Cc: <stable(a)vger.kernel.org> # 5.2 Signed-off-by: Michael Schmitz <schmitzmic(a)gmail.com> Reviewed-by: Christoph Hellwig <hch(a)lst.de> Reviewed-by: Geert Uytterhoeven <geert(a)linux-m68k.org> --- Changes from v10: Christoph Hellwig: - change annotation of all __u32 fields to __be32 Changes from v11: Geert Uytterhoeven: - also change annotation of the two __s32 checksum fields Changes from v12: - change patch subject to reflect changes in v11 and v12 --- include/uapi/linux/affs_hardblocks.h | 68 ++++++++++++++-------------- 1 file changed, 34 insertions(+), 34 deletions(-) diff --git a/include/uapi/linux/affs_hardblocks.h b/include/uapi/linux/affs_hardblocks.h index 5e2fb8481252..a5aff2eb5f70 100644 --- a/include/uapi/linux/affs_hardblocks.h +++ b/include/uapi/linux/affs_hardblocks.h @@ -7,42 +7,42 @@ /* Just the needed definitions for the RDB of an Amiga HD. */ struct RigidDiskBlock { - __u32 rdb_ID; + __be32 rdb_ID; __be32 rdb_SummedLongs; - __s32 rdb_ChkSum; - __u32 rdb_HostID; + __be32 rdb_ChkSum; + __be32 rdb_HostID; __be32 rdb_BlockBytes; - __u32 rdb_Flags; - __u32 rdb_BadBlockList; + __be32 rdb_Flags; + __be32 rdb_BadBlockList; __be32 rdb_PartitionList; - __u32 rdb_FileSysHeaderList; - __u32 rdb_DriveInit; - __u32 rdb_Reserved1[6]; - __u32 rdb_Cylinders; - __u32 rdb_Sectors; - __u32 rdb_Heads; - __u32 rdb_Interleave; - __u32 rdb_Park; - __u32 rdb_Reserved2[3]; - __u32 rdb_WritePreComp; - __u32 rdb_ReducedWrite; - __u32 rdb_StepRate; - __u32 rdb_Reserved3[5]; - __u32 rdb_RDBBlocksLo; - __u32 rdb_RDBBlocksHi; - __u32 rdb_LoCylinder; - __u32 rdb_HiCylinder; - __u32 rdb_CylBlocks; - __u32 rdb_AutoParkSeconds; - __u32 rdb_HighRDSKBlock; - __u32 rdb_Reserved4; + __be32 rdb_FileSysHeaderList; + __be32 rdb_DriveInit; + __be32 rdb_Reserved1[6]; + __be32 rdb_Cylinders; + __be32 rdb_Sectors; + __be32 rdb_Heads; + __be32 rdb_Interleave; + __be32 rdb_Park; + __be32 rdb_Reserved2[3]; + __be32 rdb_WritePreComp; + __be32 rdb_ReducedWrite; + __be32 rdb_StepRate; + __be32 rdb_Reserved3[5]; + __be32 rdb_RDBBlocksLo; + __be32 rdb_RDBBlocksHi; + __be32 rdb_LoCylinder; + __be32 rdb_HiCylinder; + __be32 rdb_CylBlocks; + __be32 rdb_AutoParkSeconds; + __be32 rdb_HighRDSKBlock; + __be32 rdb_Reserved4; char rdb_DiskVendor[8]; char rdb_DiskProduct[16]; char rdb_DiskRevision[4]; char rdb_ControllerVendor[8]; char rdb_ControllerProduct[16]; char rdb_ControllerRevision[4]; - __u32 rdb_Reserved5[10]; + __be32 rdb_Reserved5[10]; }; #define IDNAME_RIGIDDISK 0x5244534B /* "RDSK" */ @@ -50,16 +50,16 @@ struct RigidDiskBlock { struct PartitionBlock { __be32 pb_ID; __be32 pb_SummedLongs; - __s32 pb_ChkSum; - __u32 pb_HostID; + __be32 pb_ChkSum; + __be32 pb_HostID; __be32 pb_Next; - __u32 pb_Flags; - __u32 pb_Reserved1[2]; - __u32 pb_DevFlags; + __be32 pb_Flags; + __be32 pb_Reserved1[2]; + __be32 pb_DevFlags; __u8 pb_DriveName[32]; - __u32 pb_Reserved2[15]; + __be32 pb_Reserved2[15]; __be32 pb_Environment[17]; - __u32 pb_EReserved[15]; + __be32 pb_EReserved[15]; }; #define IDNAME_PARTITION 0x50415254 /* "PART" */ -- 2.17.1

2 years

1
0
0 0

[PATCH v13 1/3] block: fix signed int overflow in Amiga partition support

by Michael Schmitz

The Amiga partition parser module uses signed int for partition sector address and count, which will overflow for disks larger than 1 TB. Use sector_t as type for sector address and size to allow using disks up to 2 TB without LBD support, and disks larger than 2 TB with LBD. This bug was reported originally in 2012, and the fix was created by the RDB author, Joanne Dow <jdow(a)earthlink.net>. A patch had been discussed and reviewed on linux-m68k at that time but never officially submitted. This patch differs from Joanne's patch only in its use of sector_t instead of unsigned int. No checking for overflows is done (see patch 3 of this series for that). Reported-by: Martin Steigerwald <Martin(a)lichtvoll.de> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=43511 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Message-ID: <201206192146.09327.Martin(a)lichtvoll.de> Cc: <stable(a)vger.kernel.org> # 5.2 Signed-off-by: Michael Schmitz <schmitzmic(a)gmail.com> Tested-by: Martin Steigerwald <Martin(a)lichtvoll.de> Reviewed-by: Geert Uytterhoeven <geert(a)linux-m68k.org> Reviewed-by: Christoph Hellwig <hch(a)lst.de> --- Changes from v3: - split off change of sector address type as quick fix. - cast to sector_t in sector address calculations. - move overflow checking to separate patch for more thorough review. Changes from v4: Andreas Schwab: - correct cast to sector_t in sector address calculations Changes from v7: Christoph Hellwig - correct style issues Changes from v9: - add Fixes: tags and stable backport prereq --- block/partitions/amiga.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/block/partitions/amiga.c b/block/partitions/amiga.c index 5c8624e26a54..85c5c79aae48 100644 --- a/block/partitions/amiga.c +++ b/block/partitions/amiga.c @@ -31,7 +31,8 @@ int amiga_partition(struct parsed_partitions *state) unsigned char *data; struct RigidDiskBlock *rdb; struct PartitionBlock *pb; - int start_sect, nr_sects, blk, part, res = 0; + sector_t start_sect, nr_sects; + int blk, part, res = 0; int blksize = 1; /* Multiplier for disk block size */ int slot = 1; @@ -96,14 +97,14 @@ int amiga_partition(struct parsed_partitions *state) /* Tell Kernel about it */ - nr_sects = (be32_to_cpu(pb->pb_Environment[10]) + 1 - - be32_to_cpu(pb->pb_Environment[9])) * + nr_sects = ((sector_t)be32_to_cpu(pb->pb_Environment[10]) + 1 - + be32_to_cpu(pb->pb_Environment[9])) * be32_to_cpu(pb->pb_Environment[3]) * be32_to_cpu(pb->pb_Environment[5]) * blksize; if (!nr_sects) continue; - start_sect = be32_to_cpu(pb->pb_Environment[9]) * + start_sect = (sector_t)be32_to_cpu(pb->pb_Environment[9]) * be32_to_cpu(pb->pb_Environment[3]) * be32_to_cpu(pb->pb_Environment[5]) * blksize; -- 2.17.1

2 years

1
0
0 0

[PATCH] HID: amd_sfh: Check that sensors are enabled before set/get report

by Mario Limonciello

A crash was reported in amd-sfh related to hid core initialization before SFH initialization has run. ``` amdtp_hid_request+0x36/0x50 [amd_sfh 2e3095779aada9fdb1764f08ca578ccb14e41fe4] sensor_hub_get_feature+0xad/0x170 [hid_sensor_hub d6157999c9d260a1bfa6f27d4a0dc2c3e2c5654e] hid_sensor_parse_common_attributes+0x217/0x310 [hid_sensor_iio_common 07a7935272aa9c7a28193b574580b3e953a64ec4] hid_gyro_3d_probe+0x7f/0x2e0 [hid_sensor_gyro_3d 9f2eb51294a1f0c0315b365f335617cbaef01eab] platform_probe+0x44/0xa0 really_probe+0x19e/0x3e0 ``` Ensure that sensors have been set up before calling into amd_sfh_get_report() or amd_sfh_set_report(). Cc: stable(a)vger.kernel.org Cc: Linux regression tracking (Thorsten Leemhuis) <regressions(a)leemhuis.info> Fixes: 7bcfdab3f0c6 ("HID: amd_sfh: if no sensors are enabled, clean up") Reported-by: Haochen Tong <linux(a)hexchain.org> Link: https://lore.kernel.org/all/3250319.ancTxkQ2z5@zen/T/ Signed-off-by: Mario Limonciello <mario.limonciello(a)amd.com> --- drivers/hid/amd-sfh-hid/amd_sfh_client.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/hid/amd-sfh-hid/amd_sfh_client.c b/drivers/hid/amd-sfh-hid/amd_sfh_client.c index d9b7b01900b5..88f3d913eaa1 100644 --- a/drivers/hid/amd-sfh-hid/amd_sfh_client.c +++ b/drivers/hid/amd-sfh-hid/amd_sfh_client.c @@ -25,6 +25,9 @@ void amd_sfh_set_report(struct hid_device *hid, int report_id, struct amdtp_cl_data *cli_data = hid_data->cli_data; int i; + if (!cli_data->is_any_sensor_enabled) + return; + for (i = 0; i < cli_data->num_hid_devices; i++) { if (cli_data->hid_sensor_hubs[i] == hid) { cli_data->cur_hid_dev = i; @@ -41,6 +44,9 @@ int amd_sfh_get_report(struct hid_device *hid, int report_id, int report_type) struct request_list *req_list = &cli_data->req_list; int i; + if (!cli_data->is_any_sensor_enabled) + return -ENODEV; + for (i = 0; i < cli_data->num_hid_devices; i++) { if (cli_data->hid_sensor_hubs[i] == hid) { struct request_list *new = kzalloc(sizeof(*new), GFP_KERNEL); -- 2.34.1

2 years

1
0
0 0

[5.15-stable PATCH 0/2] Copy-on-write hwpoison recovery

by Jane Chu

I was able to reproduce crash on 5.15.y kernel during COW, and when the grandchild process attempts a write to a private page inherited from the child process and the private page contains a memory uncorrectable error. The way to reproduce is described in Tony's patch, using his ras-tools/einj_mem_uc. And the patch series fixed the panic issue in 5.15.y. The backport has encountered trivial conflicts due to missing dependencies, details are provided in each patch. Please let me know whether the backport is acceptable. Tony Luck (2): mm, hwpoison: try to recover from copy-on write faults mm, hwpoison: when copy-on-write hits poison, take page offline include/linux/highmem.h | 24 ++++++++++++++++++++++++ include/linux/mm.h | 5 ++++- mm/memory.c | 33 +++++++++++++++++++++++---------- 3 files changed, 51 insertions(+), 11 deletions(-) -- 2.18.4

2 years

2
6
0 0

[RESEND 1/1] linux-5.10/rcu/kvfree: Avoid freeing new kfree_rcu() memory after old grace period

by Suren Baghdasaryan

From: "Uladzislau Rezki (Sony)" <urezki(a)gmail.com> From: "Uladzislau Rezki (Sony)" <urezki(a)gmail.com> commit 5da7cb193db32da783a3f3e77d8b639989321d48 upstream. Memory passed to kvfree_rcu() that is to be freed is tracked by a per-CPU kfree_rcu_cpu structure, which in turn contains pointers to kvfree_rcu_bulk_data structures that contain pointers to memory that has not yet been handed to RCU, along with an kfree_rcu_cpu_work structure that tracks the memory that has already been handed to RCU. These structures track three categories of memory: (1) Memory for kfree(), (2) Memory for kvfree(), and (3) Memory for both that arrived during an OOM episode. The first two categories are tracked in a cache-friendly manner involving a dynamically allocated page of pointers (the aforementioned kvfree_rcu_bulk_data structures), while the third uses a simple (but decidedly cache-unfriendly) linked list through the rcu_head structures in each block of memory. On a given CPU, these three categories are handled as a unit, with that CPU's kfree_rcu_cpu_work structure having one pointer for each of the three categories. Clearly, new memory for a given category cannot be placed in the corresponding kfree_rcu_cpu_work structure until any old memory has had its grace period elapse and thus has been removed. And the kfree_rcu_monitor() function does in fact check for this. Except that the kfree_rcu_monitor() function checks these pointers one at a time. This means that if the previous kfree_rcu() memory passed to RCU had only category 1 and the current one has only category 2, the kfree_rcu_monitor() function will send that current category-2 memory along immediately. This can result in memory being freed too soon, that is, out from under unsuspecting RCU readers. To see this, consider the following sequence of events, in which: o Task A on CPU 0 calls rcu_read_lock(), then uses "from_cset", then is preempted. o CPU 1 calls kfree_rcu(cset, rcu_head) in order to free "from_cset" after a later grace period. Except that "from_cset" is freed right after the previous grace period ended, so that "from_cset" is immediately freed. Task A resumes and references "from_cset"'s member, after which nothing good happens. In full detail: CPU 0 CPU 1 ---------------------- ---------------------- count_memcg_event_mm() |rcu_read_lock() <--- |mem_cgroup_from_task() |// css_set_ptr is the "from_cset" mentioned on CPU 1 |css_set_ptr = rcu_dereference((task)->cgroups) |// Hard irq comes, current task is scheduled out. cgroup_attach_task() |cgroup_migrate() |cgroup_migrate_execute() |css_set_move_task(task, from_cset, to_cset, true) |cgroup_move_task(task, to_cset) |rcu_assign_pointer(.., to_cset) |... |cgroup_migrate_finish() |put_css_set_locked(from_cset) |from_cset->refcount return 0 |kfree_rcu(cset, rcu_head) // free from_cset after new gp |add_ptr_to_bulk_krc_lock() |schedule_delayed_work(&krcp->monitor_work, ..) kfree_rcu_monitor() |krcp->bulk_head[0]'s work attached to krwp->bulk_head_free[] |queue_rcu_work(system_wq, &krwp->rcu_work) |if rwork->rcu.work is not in WORK_STRUCT_PENDING_BIT state, |call_rcu(&rwork->rcu, rcu_work_rcufn) <--- request new gp // There is a perious call_rcu(.., rcu_work_rcufn) // gp end, rcu_work_rcufn() is called. rcu_work_rcufn() |__queue_work(.., rwork->wq, &rwork->work); |kfree_rcu_work() |krwp->bulk_head_free[0] bulk is freed before new gp end!!! |The "from_cset" is freed before new gp end. // the task resumes some time later. |css_set_ptr->subsys[(subsys_id) <--- Caused kernel crash, because css_set_ptr is freed. This commit therefore causes kfree_rcu_monitor() to refrain from moving kfree_rcu() memory to the kfree_rcu_cpu_work structure until the RCU grace period has completed for all three categories. v2: Use helper function instead of inserted code block at kfree_rcu_monitor(). [UR: backport to 5.10-stable] [UR: Added missing need_offload_krc() function] Fixes: 34c881745549 ("rcu: Support kfree_bulk() interface in kfree_rcu()") Fixes: 5f3c8d620447 ("rcu/tree: Maintain separate array for vmalloc ptrs") Reported-by: Mukesh Ojha <quic_mojha(a)quicinc.com> Signed-off-by: Ziwei Dai <ziwei.dai(a)unisoc.com> Reviewed-by: Uladzislau Rezki (Sony) <urezki(a)gmail.com> Tested-by: Uladzislau Rezki (Sony) <urezki(a)gmail.com> Signed-off-by: Paul E. McKenney <paulmck(a)kernel.org> Signed-off-by: Uladzislau Rezki (Sony) <urezki(a)gmail.com> Signed-off-by: Suren Baghdasaryan <surenb(a)google.com> --- Resending per Greg's request. Original posting: https://lore.kernel.org/all/20230418102518.5911-1-urezki@gmail.com/ kernel/rcu/tree.c | 49 +++++++++++++++++++++++++++++++++-------------- 1 file changed, 35 insertions(+), 14 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 30e1d7fedb5f..eec8e2f7537e 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -3281,6 +3281,30 @@ static void kfree_rcu_work(struct work_struct *work) } } +static bool +need_offload_krc(struct kfree_rcu_cpu *krcp) +{ + int i; + + for (i = 0; i < FREE_N_CHANNELS; i++) + if (krcp->bkvhead[i]) + return true; + + return !!krcp->head; +} + +static bool +need_wait_for_krwp_work(struct kfree_rcu_cpu_work *krwp) +{ + int i; + + for (i = 0; i < FREE_N_CHANNELS; i++) + if (krwp->bkvhead_free[i]) + return true; + + return !!krwp->head_free; +} + /* * Schedule the kfree batch RCU work to run in workqueue context after a GP. * @@ -3298,16 +3322,13 @@ static inline bool queue_kfree_rcu_work(struct kfree_rcu_cpu *krcp) for (i = 0; i < KFREE_N_BATCHES; i++) { krwp = &(krcp->krw_arr[i]); - /* - * Try to detach bkvhead or head and attach it over any - * available corresponding free channel. It can be that - * a previous RCU batch is in progress, it means that - * immediately to queue another one is not possible so - * return false to tell caller to retry. - */ - if ((krcp->bkvhead[0] && !krwp->bkvhead_free[0]) || - (krcp->bkvhead[1] && !krwp->bkvhead_free[1]) || - (krcp->head && !krwp->head_free)) { + // Try to detach bulk_head or head and attach it, only when + // all channels are free. Any channel is not free means at krwp + // there is on-going rcu work to handle krwp's free business. + if (need_wait_for_krwp_work(krwp)) + continue; + + if (need_offload_krc(krcp)) { // Channel 1 corresponds to SLAB ptrs. // Channel 2 corresponds to vmalloc ptrs. for (j = 0; j < FREE_N_CHANNELS; j++) { @@ -3334,12 +3355,12 @@ static inline bool queue_kfree_rcu_work(struct kfree_rcu_cpu *krcp) */ queue_rcu_work(system_wq, &krwp->rcu_work); } - - // Repeat if any "free" corresponding channel is still busy. - if (krcp->bkvhead[0] || krcp->bkvhead[1] || krcp->head) - repeat = true; } + // Repeat if any "free" corresponding channel is still busy. + if (need_offload_krc(krcp)) + repeat = true; + return !repeat; } -- 2.41.0.162.gfafddb0af9-goog

2 years

2
6
0 0

[PATCH 6.3] riscv: Link with '-z norelro'

by Nathan Chancellor

This patch fixes a stable only patch, so it has no direct upstream equivalent. After a stable only patch to explicitly handle the '.got' section to handle an orphan section warning from the linker, certain configurations error when linking with ld.lld, which enables relro by default: ld.lld: error: section: .got is not contiguous with other relro sections This has come up with other architectures before, such as arm and arm64 in commit 0cda9bc15dfc ("ARM: 9038/1: Link with '-z norelro'") and commit 3b92fa7485eb ("arm64: link with -z norelro regardless of CONFIG_RELOCATABLE"). Additionally, '-z norelro' is used unconditionally for RISC-V upstream after commit 26e7aacb83df ("riscv: Allow to downgrade paging mode from the command line"), which alluded to this issue for the same reason. Bring 6.3 in line with mainline and link with '-z norelro', which resolves the above link failure. Fixes: e6d1562dd4e9 ("riscv: vmlinux.lds.S: Explicitly handle '.got' section") Reported-by: kernel test robot <lkp(a)intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202306192231.DJmWr6BX-lkp@intel.com/ Signed-off-by: Nathan Chancellor <nathan(a)kernel.org> --- arch/riscv/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile index b05e833a022d..d46b6722710f 100644 --- a/arch/riscv/Makefile +++ b/arch/riscv/Makefile @@ -7,7 +7,7 @@ # OBJCOPYFLAGS := -O binary -LDFLAGS_vmlinux := +LDFLAGS_vmlinux := -z norelro ifeq ($(CONFIG_DYNAMIC_FTRACE),y) LDFLAGS_vmlinux := --no-relax KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY --- base-commit: f2427f9a3730e9a1a11b69f6b767f7f2fad87523 change-id: 20230620-6-3-fix-got-relro-error-lld-397f3112860b Best regards, -- Nathan Chancellor <nathan(a)kernel.org>

2 years

2
1
0 0

[PATCH v3] jfs: jfs_dmap: Validate db_l2nbperpage while mounting

by Siddh Raman Pant

In jfs_dmap.c at line 381, BLKTODMAP is used to get a logical block number inside dbFree(). db_l2nbperpage, which is the log2 number of blocks per page, is passed as an argument to BLKTODMAP which uses it for shifting. Syzbot reported a shift out-of-bounds crash because db_l2nbperpage is too big. This happens because the large value is set without any validation in dbMount() at line 181. Thus, make sure that db_l2nbperpage is correct while mounting. Max number of blocks per page = Page size / Min block size => log2(Max num_block per page) = log2(Page size / Min block size) = log2(Page size) - log2(Min block size) => Max db_l2nbperpage = L2PSIZE - L2MINBLOCKSIZE Reported-and-tested-by: syzbot+d2cd27dcf8e04b232eb2(a)syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?id=2a70a453331db32ed491f5cbb07e81bf2d2257… Cc: stable(a)vger.kernel.org Suggested-by: Dave Kleikamp <dave.kleikamp(a)oracle.com> Signed-off-by: Siddh Raman Pant <code(a)siddh.me> --- Changes in v3: - Fix typo in commit message (number of pages -> number of blocks per page). Changes in v2: - Fix upper bound as pointed out in v1 by Shaggy. - Add an explanation for the same in commit message for completeness. fs/jfs/jfs_dmap.c | 6 ++++++ fs/jfs/jfs_filsys.h | 2 ++ 2 files changed, 8 insertions(+) diff --git a/fs/jfs/jfs_dmap.c b/fs/jfs/jfs_dmap.c index a3eb1e826947..da6a2bc6bf02 100644 --- a/fs/jfs/jfs_dmap.c +++ b/fs/jfs/jfs_dmap.c @@ -178,7 +178,13 @@ int dbMount(struct inode *ipbmap) dbmp_le = (struct dbmap_disk *) mp->data; bmp->db_mapsize = le64_to_cpu(dbmp_le->dn_mapsize); bmp->db_nfree = le64_to_cpu(dbmp_le->dn_nfree); + bmp->db_l2nbperpage = le32_to_cpu(dbmp_le->dn_l2nbperpage); + if (bmp->db_l2nbperpage > L2PSIZE - L2MINBLOCKSIZE) { + err = -EINVAL; + goto err_release_metapage; + } + bmp->db_numag = le32_to_cpu(dbmp_le->dn_numag); if (!bmp->db_numag) { err = -EINVAL; diff --git a/fs/jfs/jfs_filsys.h b/fs/jfs/jfs_filsys.h index b5d702df7111..33ef13a0b110 100644 --- a/fs/jfs/jfs_filsys.h +++ b/fs/jfs/jfs_filsys.h @@ -122,7 +122,9 @@ #define NUM_INODE_PER_IAG INOSPERIAG #define MINBLOCKSIZE 512 +#define L2MINBLOCKSIZE 9 #define MAXBLOCKSIZE 4096 +#define L2MAXBLOCKSIZE 12 #define MAXFILESIZE ((s64)1 << 52) #define JFS_LINK_MAX 0xffffffff -- 2.39.2

2 years

2
1
0 0

[PATCH v2] jfs: jfs_dmap: Validate db_l2nbperpage while mounting

by Siddh Raman Pant

In jfs_dmap.c at line 381, BLKTODMAP is used to get a logical block number inside dbFree(). db_l2nbperpage, which is the log2 number of blocks per page, is passed as an argument to BLKTODMAP which uses it for shifting. Syzbot reported a shift out-of-bounds crash because db_l2nbperpage is too big. This happens because the large value is set without any validation in dbMount() at line 181. Thus, make sure that db_l2nbperpage is correct while mounting. Max number of pages = Page size / Min block size => log2(Max number of pages) = log2(Page size / Min block size) = log2(Page size) - log2(Min block size) => Max db_l2nbperpage = L2PSIZE - L2MINBLOCKSIZE Reported-and-tested-by: syzbot+d2cd27dcf8e04b232eb2(a)syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?id=2a70a453331db32ed491f5cbb07e81bf2d2257… Cc: stable(a)vger.kernel.org Suggested-by: Dave Kleikamp <dave.kleikamp(a)oracle.com> Signed-off-by: Siddh Raman Pant <code(a)siddh.me> --- Changes in v2: - Fix upper bound as pointed out in v1 by Shaggy. - Add an explanation for the same in commit message for completeness. fs/jfs/jfs_dmap.c | 6 ++++++ fs/jfs/jfs_filsys.h | 2 ++ 2 files changed, 8 insertions(+) diff --git a/fs/jfs/jfs_dmap.c b/fs/jfs/jfs_dmap.c index a3eb1e826947..da6a2bc6bf02 100644 --- a/fs/jfs/jfs_dmap.c +++ b/fs/jfs/jfs_dmap.c @@ -178,7 +178,13 @@ int dbMount(struct inode *ipbmap) dbmp_le = (struct dbmap_disk *) mp->data; bmp->db_mapsize = le64_to_cpu(dbmp_le->dn_mapsize); bmp->db_nfree = le64_to_cpu(dbmp_le->dn_nfree); + bmp->db_l2nbperpage = le32_to_cpu(dbmp_le->dn_l2nbperpage); + if (bmp->db_l2nbperpage > L2PSIZE - L2MINBLOCKSIZE) { + err = -EINVAL; + goto err_release_metapage; + } + bmp->db_numag = le32_to_cpu(dbmp_le->dn_numag); if (!bmp->db_numag) { err = -EINVAL; diff --git a/fs/jfs/jfs_filsys.h b/fs/jfs/jfs_filsys.h index b5d702df7111..33ef13a0b110 100644 --- a/fs/jfs/jfs_filsys.h +++ b/fs/jfs/jfs_filsys.h @@ -122,7 +122,9 @@ #define NUM_INODE_PER_IAG INOSPERIAG #define MINBLOCKSIZE 512 +#define L2MINBLOCKSIZE 9 #define MAXBLOCKSIZE 4096 +#define L2MAXBLOCKSIZE 12 #define MAXFILESIZE ((s64)1 << 52) #define JFS_LINK_MAX 0xffffffff -- 2.39.2

2 years

1
1
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror June 2023